Lessons learned from Protocol Buffers, part 4: static interfaces

Warning: During this entire post, I will use the word static to mean “relating to a type instead of an instance”. This isn’t a strictly accurate use but I believe it’s what most developers actually think of when they hear the word.

A few members of the interfaces in Protocol Buffers have no logical reason to act on instances of their types. The message interface has members to return the message’s type descriptor (the PB equivalent of System.Type), the default instance for the message type, and a builder for the message type. The builder interface copies the first two of these, and also has a method to create a builder for a particular field. None of these touch any instance data.

In most cases this doesn’t actually cause any difficulties – we usually have an instance available when we’re in the PB library code, and the generated types have static properties for the default instance and the type descriptor anyway. Even so, it feels messy to have interface members which rely only on the type of the implementation and not on any of the actual data of the instance.

I’ve wondered before now about the possibility of having static members in interfaces – usually when thinking about plug-in architectures – but there’s always been the problem of working out how to specify the type on which to call the members. Variables and other expressions usually refer to values rather than types, and System.Type doesn’t help as it provides no compile-time knowledge of the type being referred to.

There’s one big exception to this, however: generic type parameters. I don’t know why it had never occurred to me before, but this is a great fit for the ability to safely call static methods on types which are unknown at compile-time. Furthermore, it could provide a great way of enforcing the presence of constructors with appropriate signatures, and even operators. Before I get too far ahead of myself, let’s tie the simple case to a concrete example.

Creating builders from nothing

In my previous post I gave an example of a method which ideally wanted to return a new message given a CodedInputStream and an ExtensionRegistry (both types within Protocol Buffers, the details of which are unimportant to this example). The current code looks like this:

private static TMessage BuildImpl<TMessage2, TBuilder> (Func<TBuilder> builderBuilder,
                                                        CodedInputStream input,
                                                        ExtensionRegistry registry)
    where TBuilder : IBuilder<TMessage2, TBuilder>
    where TMessage2 : TMessage, IMessage<TMessage2, TBuilder>
{
    TBuilder builder = builderBuilder();
    input.ReadMessage(builder, registry);
    return builder.Build();
}

For the purposes of this discussion I’ll simplify it a little, making it a generic method in a non-generic type.

private static TMessage BuildImpl<TMessage, TBuilder> (Func<TBuilder> builderBuilder,
                                                       CodedInputStream input,
                                                       ExtensionRegistry registry)
    where TBuilder : IBuilder<TMessage, TBuilder>
    where TMessage : IMessage<TMessage, TBuilder>
{
    TBuilder builder = builderBuilder();
    input.ReadMessage(builder, registry);
    return builder.Build();
}

The first parameter is a function which will return us a builder. We can’t simply add a new() constraint to TBuilder as not all generated builders will have a public constructor. However, we do know that the TMessage type has a CreateBuilder() method because it implements IMessage<TMessage, TBuilder>. Unfortunately we don’t have an instance of TMessage to call CreateBuilder() on! Really, we’d like to be able to change the code to this:

private static TMessage BuildImpl<TMessage, TBuilder> (CodedInputStream input, ExtensionRegistry registry)
    where TBuilder : IBuilder<TMessage, TBuilder>
    where TMessage : IMessage<TMessage, TBuilder>
{
    TBuilder builder = TMessage.CreateBuilder();
    input.ReadMessage(builder, registry);
    return builder.Build();
}

That’s currently impossible, but only because we can’t specify static methods in interfaces. Suppose we could write:

public interface IMessage<TMessage, TBuilder>
    where TMessage : IMessage<TMessage, TBuilder>
    where TBuilder : IBuilder<TMessage, TBuilder>
{
    static TBuilder CreateBuilder();

    // Other methods as before
}

Wouldn’t that be useful? The value would almost entirely be for generic types or methods where the type parameter is constrained to specify the relevant interface, but that could arguably still be very handy.
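
For what it’s worth, later versions of C# (11 onwards, targeting .NET 7 or later) allow something very close to this through static abstract interface members. The following is only a sketch under that assumption, not the Protocol Buffers API: Person, PersonBuilder, Messages and the cut-down IMessage/IBuilder interfaces are hypothetical stand-ins, and the stream and registry parameters are replaced by a simple callback so that the example is self-contained.

```csharp
using System;

// Hypothetical, cut-down stand-ins for the Protocol Buffers interfaces.
public interface IBuilder<TMessage, TBuilder>
    where TMessage : IMessage<TMessage, TBuilder>
    where TBuilder : IBuilder<TMessage, TBuilder>
{
    TMessage Build();
}

public interface IMessage<TMessage, TBuilder>
    where TMessage : IMessage<TMessage, TBuilder>
    where TBuilder : IBuilder<TMessage, TBuilder>
{
    // C# 11 syntax: every implementing type must supply a static CreateBuilder().
    static abstract TBuilder CreateBuilder();
}

// Hypothetical message type whose builder has no public constructor.
public sealed class Person : IMessage<Person, PersonBuilder>
{
    public string Name { get; }
    internal Person(string name) { Name = name; }
    public static PersonBuilder CreateBuilder() => new PersonBuilder();
}

public sealed class PersonBuilder : IBuilder<Person, PersonBuilder>
{
    internal PersonBuilder() { }
    public string Name { get; set; } = "";
    public Person Build() => new Person(Name);
}

public static class Messages
{
    // No Func<TBuilder> parameter: the builder comes from the type parameter itself.
    public static TMessage BuildImpl<TMessage, TBuilder>(Action<TBuilder> populate)
        where TMessage : IMessage<TMessage, TBuilder>
        where TBuilder : IBuilder<TMessage, TBuilder>
    {
        TBuilder builder = TMessage.CreateBuilder();
        populate(builder); // stands in for input.ReadMessage(builder, registry)
        return builder.Build();
    }
}

public static class Program
{
    public static void Main()
    {
        Person person = Messages.BuildImpl<Person, PersonBuilder>(b => b.Name = "Jon");
        Console.WriteLine(person.Name);
    }
}
```

As in the hypothetical syntax above, the static member is only reachable through a constrained type parameter (TMessage.CreateBuilder()); there is nothing to call on the interface itself.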

Operators and constructors

At this point hopefully the idea I mentioned earlier of being able to specify operators and constructors is quite obvious. For instance, we could make all the existing numeric types implement IArithmetic<T> (where T was the same type, e.g. int : IArithmetic<int>):

public interface IArithmetic<T>
{
    static T operator +(T left, T right);
    static T operator -(T left, T right);
    static T operator /(T left, T right);
    static T operator *(T left, T right);
}

// Used in LINQ to Objects, for example:
public static T Sum<T>(this IEnumerable<T> source) where T : IArithmetic<T>
{
    T total = default(T);
    foreach (T element in source)
    {
        total += element; // Interface says we can do total + element
    }
    return total;
}
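
As a rough sketch of how the same idea can be written with static abstract interface members (C# 11 or later is assumed), the following defines a reduced IArithmetic<T> and a Sum method over it. The Metres type, the Zero property and the Arithmetic class are hypothetical, introduced purely for illustration; existing types such as int do not implement this made-up interface.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical interface echoing IArithmetic<T>, written in C# 11 syntax.
public interface IArithmetic<T> where T : IArithmetic<T>
{
    // An explicit additive identity, so Sum need not rely on default(T).
    static abstract T Zero { get; }
    static abstract T operator +(T left, T right);
}

// Hypothetical implementing type.
public readonly struct Metres : IArithmetic<Metres>
{
    public int Value { get; }
    public Metres(int value) { Value = value; }

    public static Metres Zero => new Metres(0);
    public static Metres operator +(Metres left, Metres right) =>
        new Metres(left.Value + right.Value);
}

public static class Arithmetic
{
    public static T Sum<T>(IEnumerable<T> source) where T : IArithmetic<T>
    {
        T total = T.Zero;
        foreach (T element in source)
        {
            total += element; // the interface guarantees total + element exists
        }
        return total;
    }
}

public static class Program
{
    public static void Main()
    {
        Metres total = Arithmetic.Sum(new[] { new Metres(1), new Metres(2), new Metres(3) });
        Console.WriteLine(total.Value); // prints 6
    }
}
```

Using an explicit Zero rather than default(T) is a design choice in this sketch: it avoids assuming that the default value of every implementing type is its additive identity. For the built-in numeric types, .NET 7 and later ship a family of generic math interfaces in System.Numerics (INumber<T> among them) that fill the same role.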

Plug-ins for a particular program could implement IPlugin:

public interface IPlugin
{
    static new (PluginHost host);

    // Normal plug-in members here
    string Title { get; }
}

// Used within a PluginHost like this…
public T CreatePlugin<T>() where T : IPlugin
{
    T plugin = new T(this);
    log.Info("Loaded plugin {0}", plugin.Title);
    return plugin;
}

In fact, I’d imagine it would make sense to define a whole family of IConstructable interfaces, along the same lines as the Func and Action delegate families:

public interface IConstructable
{
    static new();
}

public interface IConstructable<T>
{
    static new(T arg);
}

public interface IConstructable<T1, T2>
{
    static new(T1 arg1, T2 arg2);
}

// etc
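
Constructor signatures still can’t be expressed as constraints, to my knowledge, but a static factory method in the interface gets reasonably close. The sketch below assumes C# 11 static abstract members; the generic IPlugin<TSelf> shape, the Create method and the SpellChecker type are hypothetical names, not part of any real plug-in framework, and Console.WriteLine stands in for the logger.

```csharp
using System;

// Hypothetical plug-in contract: a static factory stands in for "static new(PluginHost host)".
public interface IPlugin<TSelf> where TSelf : IPlugin<TSelf>
{
    static abstract TSelf Create(PluginHost host);

    // Normal plug-in members here
    string Title { get; }
}

public sealed class PluginHost
{
    public T CreatePlugin<T>() where T : IPlugin<T>
    {
        T plugin = T.Create(this); // instead of new T(this)
        Console.WriteLine("Loaded plugin {0}", plugin.Title);
        return plugin;
    }
}

// Hypothetical plug-in; its constructor can stay private.
public sealed class SpellChecker : IPlugin<SpellChecker>
{
    public PluginHost Host { get; }
    private SpellChecker(PluginHost host) { Host = host; }

    public static SpellChecker Create(PluginHost host) => new SpellChecker(host);
    public string Title => "Spell checker";
}

public static class Program
{
    public static void Main()
    {
        var host = new PluginHost();
        SpellChecker plugin = host.CreatePlugin<SpellChecker>();
        Console.WriteLine(ReferenceEquals(plugin.Host, host));
    }
}
```

A factory method is slightly more flexible than a constructor constraint would be, since the implementation is free to return a cached instance, at the cost of one extra member per plug-in type.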

Inheritance raises its ugly head

There’s a fly in the ointment here. Normally, if a base type implements an interface, that means a type derived from it will effectively implement the interface too. With static members, that no longer holds in all cases. You can get away with it for straightforward static methods/properties, in the same way that people often write code such as UTF8Encoding.UTF8 when they really just mean Encoding.UTF8. However, it doesn’t work for constructors – they aren’t inherited, so you can’t guarantee that Banana has a parameterless constructor just because Fruit does.
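
The existing new() constraint already shows the constructor problem in miniature. In this small sketch, Fruit, Banana and Factory are hypothetical types invented for the illustration:

```csharp
using System;

public class Fruit
{
    public Fruit() { }
}

public class Banana : Fruit
{
    // Declaring only this constructor means Banana has no parameterless one,
    // even though its base class does.
    public Banana(int ripeness) { Ripeness = ripeness; }
    public int Ripeness { get; }
}

public static class Factory
{
    public static T Create<T>() where T : new() => new T();
}

public static class Program
{
    public static void Main()
    {
        Fruit fruit = Factory.Create<Fruit>(); // fine: Fruit satisfies new()

        // Banana banana = Factory.Create<Banana>();
        // The line above would not compile: Banana does not satisfy new().

        Console.WriteLine(fruit.GetType().Name);
        // Reflection confirms there is no public parameterless constructor on Banana.
        Console.WriteLine(typeof(Banana).GetConstructor(Type.EmptyTypes) == null);
    }
}
```

Any “static new(...)” member in an interface would have to be checked against each concrete type in the same way, rather than being assumed from the base class.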

This is not only a problem for one concrete class deriving from another, but also for abstract implementations of interfaces. This happens a lot in Protocol Buffers; often the interface is partially implemented by the abstract class, with the final “leaf” classes in the inheritance tree implementing outstanding members and occasionally overriding earlier implementations for the sake of efficiency. Should we be able to specify an abstract static method in the abstract class, making sure that there’s an appropriate implementation by the time we hit a concrete class? As it happens that would be useful elsewhere in the Protocol Buffer library, but I’ll admit it’s slightly messy. I suspect there are ways round all of these issues, even if they might sometimes involve restricting the feature to particular, common use cases. However, every niggle would add its own piece of complexity.

There may well be other issues which would prove challenging – and other interesting aspects such as what explicit interface implementation would mean, if anything, in the context of static members. Language experts may well be able to reel off great lists of problems – I’d be very interested to hear the ensuing discussions.

Conclusion

I believe that static interface members could prove very useful in generic algorithms, particularly if operators and constructors were allowed as well as the existing member types available in interfaces. There are significant sticking points to be carefully considered, and I wouldn’t like to prejudge the outcome of such deliberations in terms of whether the feature would be useful enough to merit the additional language complexity involved. It feels odd to implement an interface member which is effectively only of use when the implementing type is being used as the type argument for a generic type or method, but as developers learn to think more generically that may be less of a restriction than it currently seems.

This post is the last in my current batch talking about Protocol Buffers. There may well be more, but it’s unlikely now that the Protocol Buffer port is almost entirely complete. Most of these posts have involved generics, and the current limitations of what C# allows us to express in them. I do not intend to give the impression that I’m dissatisfied with C# – I’ve just found it interesting to take a look at what lies beyond the current boundaries of the language. The one aspect of these posts which I would definitely like to see addressed is that of covariant return types – the Java implementation of Protocol Buffers is significantly simpler in many ways purely due to this one point.

13 thoughts on “Lessons learned from Protocol Buffers, part 4: static interfaces”

  1. I think that adding static methods to interfaces would be a little bit confusing.
    For example:

    interface IAction
    {
    public static IAction Combine(IAction a, IAction b);
    }

    That looks like we can write something like this:

    IAction first = …
    IAction second = …
    IAction combined = IAction.Combine(first, second);

    It’s not obvious that we want every implementation to have that static method


  2. @susL: You wouldn’t be able to write IAction.Combine(first, second). This would *only* work in the case of a generic method/type which had a type parameter with a constraint of “T : IAction”; you could then write

    IAction combined = T.Combine(first, second).

    And if you didn’t want every implementation to have that static method, you wouldn’t put it in the interface after all. This would only be useful in some scenarios – just like iterator blocks, it wouldn’t be something to use every day.

    Jon


  3. Thanks for your answer, Jon.
    But I think we misunderstood each other. I intended to draw a parallel between static methods in interfaces and classes. I meant that currently we call static methods using syntax “Class.StaticMethod();” and calling static methods of interfaces naturally leads to a syntax “Interface.StaticMethod();”
    I was not speaking in terms of the syntax you’ve proposed.
    And I agree that it’s quite a rare scenario, but I think all scenarios with static methods in interfaces are equally rare :)


  4. Hi Jon,
    Are you aware that the CLR does in fact support static members on interfaces?
    It’s just that C# (and every other .Net language that I know of) does not use this feature.
    You can read a bit about this here:
    http://dotnetjunkies.com/WebLog/malio/archive/2007/03/01/204317.aspx

    Here’s some information from the book “Expert .NET 2.0 IL Assembler” by Serge Lidin:

    “An interface must offer the implementation of its static members—the items shared by all instances of a type—if it has any. Bear in mind, of course, that the definition of static as “shared by all instances” is general for all types and does not imply that interfaces can be instantiated. They cannot be. Interfaces are inherently abstract and cannot even have instance constructors.
    Static members (fields, methods) of an interface are not part of the contract defined by the interface and have no bearing on the types that implement the interface. A type implementing an interface must implement all instance members of the interface, but it has nothing to do with the static members of the interface. Static members of an interface can be accessed directly like static members of any type, and you don’t need an “instance” of an interface (meaning an instance of a class implementing this interface) for that.”

    — Omer.


  5. Hello
    What happens if you define an extension method on the interface and then restrict the generic to the interface which has an extension method defined on it? Wouldn’t that work…

    Hmmm maybe not because extension methods are instance methods… Damn. So consider my post as random thoughts or braindump ;)

    Daniel


  6. Static interfaces would indeed come in handy sometimes. And even if it’s just to force implementers to provide a certain static method. I especially like your IArithmetic example.

    Constructor interfaces are indeed a bit tricky. It doesn’t matter, however, that they are not inherited, as you can’t do (new IConstructable()) anyway.


  7. Hello Jason,

    I have been searching ProtoBufs documentation for hours and I could not find a simple feature: how to write a generic type, which is determined at runtime, similar to “object”. As far as I can see all values need to have specific types. Suppose I want to serialize an object:

    message Object
    {
    message Field
    {
    required string name = 1;
    required string value = 2;
    }

    optional string objectType = 1;
    repeated Field fields = 2;
    }

    How can I make Field.value be of any type (=”object”)?

    I appreciate your answer.


  8. The easiest way might be to just add it as an “unknown field” – I don’t know of anything equivalent to really just “object”.


  9. I would love to have this feature. In particular, the ability to declare that a generic type has a specific constructor signature would be handy.

    class Foo { public Foo(int num, string str) {} }
    List<TOut> Join<T1, T2, TOut>(List<T1> list1, List<T2> list2) where TOut : new(T1, T2) // syntax…
    { return Enumerable.Zip(list1, list2, TOut.new); }

