Category Archives: Google

Hosting ASP.NET Core behind https in Google Kubernetes Engine

Side-note: this may be one of the clumsiest titles I’ve ever written for a blog post. But it does what it says on the tin. Oh, and the space after “ASP” in “ASP .NET Core” everywhere is to avoid auto-linking. While I could use a different dot or a zero-width non-breaking space to avoid it, I’m not sure I trust WordPress to do the right thing with those…

Background

Over the past few weeks, I’ve moved nodatime.org, csharpindepth.com and jonskeet.uk over to Google Kubernetes Engine. (They all used to be hosted on Azure.)

I’ve done this for a few reasons:

  • As my job is primarily making .NET developers more productive on Google Cloud Platform, it feels natural to run my own code there. I want to see where there are friction points, so I can help fix them.
  • I wanted more redundancy, particularly for nodatime.org; Kubernetes felt like a simple way of managing that at a reasonable cost.
  • HTTPS certificate management (via Let’s Encrypt) has been a bit painful for me on Azure; I could have automated more, but that would have taken extra time I don’t have. (It may also have improved since I last looked.)

The first of these is the most important, by a long way. But the HTTPS management part – and then the knock-on effects – is what I’m looking at in this blog post.

Basic hosting

Hosting an ASP .NET Core application in Google Kubernetes Engine (GKE from now on) is really simple, at least once you’ve understood the Kubernetes concepts. I have:

  • A Kubernetes deployment for each application, running the web app itself
  • A Kubernetes service in front of each deployment
  • A Kubernetes ingress exposing the services to the outside world via a Google Cloud load balancer

In each case, the ASP .NET Core application is built with a vanilla Dockerfile which would not look unusual to anyone who’s hosted ASP .NET Core in Docker anywhere else.

I happen to use Google Cloud Build to build the Docker images, and Google Container Registry to host the images, but neither of those are required. (For csharpindepth.com and jonskeet.uk there are simple triggers in Google Cloud Build to build and deploy on GitHub pushes; for nodatime.org it’s a bit more complicated as the documentation build currently has some Windows dependencies. I have a machine at home that polls GitHub every half hour, and pushes the built application to Google Cloud Build for packaging when necessary.)
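
As a flavour of the Cloud Build side, the build file can be as simple as the sketch below. This is illustrative rather than my real configuration – the image name is a placeholder:

# cloudbuild.yaml – illustrative sketch only
steps:
# Build the Docker image using the repository's Dockerfile
- name: 'gcr.io/cloud-builders/docker'
  args: ['build', '-t', 'gcr.io/$PROJECT_ID/csharpindepth', '.']
# Listing the image here makes Cloud Build push it to Container Registry
images: ['gcr.io/$PROJECT_ID/csharpindepth']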

So, that gets HTTP hosting sorted. I dare say there are some aspects I’ve not configured as well as I could have done, but it was reasonably straightforward to get going.

HTTPS with Google-managed certificates

With HTTP working, it’s time to shoot for HTTPS. It’s important to note that the apps I’m talking about are all hobby projects, not commercial ones – I’m already paying for hosting, so I don’t want to have to pay for SSL certificates as well. Enter Let’s Encrypt, of course.

A while ago I used Let’s Encrypt to set up HTTPS on Azure, and while it was free and I didn’t have to write any code, it wasn’t exactly painless. I followed two guides at the same time, because neither of them exactly matched the Azure portal I was looking at. There were lots of bits of information to grab from various different bits of the portal, and it took a couple of attempts to get right… but I got there. I also set up a web job to renew the certificates, but didn’t go through the hoops required to run those web jobs periodically. (There were instructions, but it looked like they’d take a while to work through compared with just manually running the web job every couple of months or so. I decided to take the pragmatic approach, knowing that I was expecting to move to GKE anyway. If Azure had been the expected permanent home for the apps, I’d have gone through the steps and I’m sure they’d have worked fine.) I don’t know which guide I worked through at the time, but if I were starting today I’d probably try Scott Hanselman’s guide.

So, what can I do on Google Cloud Platform instead? I decided to terminate the SSL connection at the load balancer, using Google-managed certificates. To be really clear, these are currently in beta, but have worked well for me so far. Terminating the SSL connection at the load balancer means that the load balancer forwards the request to the Kubernetes service as an HTTP request, not HTTPS. The ASP .NET Core app itself only exposes an HTTP port, so it doesn’t need to know any details of certificates.

The steps to achieve this are simple, assuming you have the Google Cloud SDK (gcloud) installed already:

  • Create the certificate, e.g.
    gcloud beta compute ssl-certificates create nodatime-org --domains nodatime.org
  • Attach the certificate to the load balancer, via the Kubernetes ingress in my case, with an annotation in the ingress metadata (there’s a fuller sketch of the ingress just after this list):
    ingress.gcp.kubernetes.io/pre-shared-cert: "nodatime-org"
  • Apply the modifications to the ingress:
    kubectl apply -f ingress.yaml
  • Wait for the certificate to become valid (the provisioning procedure takes a little while, and I’ve seen some interesting errors while that’s taking place)
  • Enjoy HTTPS support, with auto-renewing certificates!
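
To put that annotation in context, here’s a minimal sketch of what the ingress might look like – assuming a service called nodatime-service, which is a placeholder rather than my real naming:

# ingress.yaml – minimal sketch, not my exact file
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: nodatime-ingress
  annotations:
    ingress.gcp.kubernetes.io/pre-shared-cert: "nodatime-org"
spec:
  backend:
    serviceName: nodatime-service
    servicePort: 80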

There are only two downsides to this that I’ve experienced so far:

  • Currently each certificate can only be associated with a single domain. For example, I have different certificates for nodatime.org, www.nodatime.org and test.nodatime.org. (More about the last of these later.) This is a minor annoyance, but the ingress supports multiple pre-shared certificates, so it has no practical implications for me.
  • I had to accept some downtime on HTTPS when transferring from Azure to GKE, while the certificate was provisioning after I’d transferred the DNS entry. This was a one-time issue of course, and one that wouldn’t affect most users.

Beyond the basics

At this point I had working HTTPS URLs – but any visitor using HTTP would stay that way. (At the time of writing this is still true for csharpindepth.com and jonskeet.uk.) Obviously I’d like to encourage secure browsing, so I’d like to use the two pieces of functionality provided by ASP .NET Core:

  • Redirection of HTTP requests via app.UseHttpsRedirection()
  • HSTS support via app.UseHsts()

I should note here that the Microsoft documentation was fabulously useful throughout. It didn’t quite get me all the way, but it was really close.

Now, I could have just added those calls into the code and deployed straight to production. Local testing would have worked – it would have redirected from localhost:5000 on HTTP to localhost:5001 on HTTPS with no problems. It would also have failed massively for reasons we’ll look at in a minute. Just for a change, I happened to do the right thing…

For hosting changes, always use a test deployment first

In Azure, I had a separate AppService I could deploy to, called nodatimetest. It didn’t have a fancy URL, but it worked okay. That’s where I tested Azure-specific changes before deploying to the real AppService. Unfortunately, it wouldn’t have helped in this situation, as it didn’t have a certificate.

Fortunately, creating a new service in Kubernetes, adding it to the ingress, and creating a managed certificate is so easy that I did do this for the new hosting – and I’m so glad I did so. I use a small script to publish the local ASP .NET Core build to Google Cloud Build which does the Docker packaging, pushes it to Google Container Registry and updates the Kubernetes deployment. As part of that script, I add a small text file containing the current timestamp so I can check that I’m really looking at the deployment I expect. It takes just under two minutes to build, push, package, deploy – not a tight loop you’d want for everyday development, but pretty good for the kind of change that can’t be tested locally.

So, I made the changes to use HTTPS redirection and HSTS, deployed, and… there was no obvious change.

Issue 1: No HTTPS port to redirect to

Remember how the ASP .NET Core app in Kubernetes is only listening on HTTP? That means it doesn’t know which port to redirect users to for HTTPS. Oops. While I guess it would be reasonable to guess 443 if it didn’t know any better, the default of “don’t redirect if you haven’t been told a port” means that your application doesn’t stop working if you get things wrong – it just doesn’t redirect.

This is easily fixed in ConfigureServices:

services.AddHttpsRedirection(options => options.HttpsPort = 443);

… but I’ve added conditional code so it doesn’t do that in the development environment, as otherwise it would try to redirect from localhost:5000 to localhost:443, which wouldn’t work. This is a bit hacky at the moment, which is a common theme – I want to clean up all the configuration at some point quite soon (moving things into appsettings.json as far as possible) but it’s just about hanging together for now.
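
As a sketch, the hacky version is shaped something like this – it assumes an IHostingEnvironment field in Startup, which is an assumption about the shape of the code rather than my exact implementation:

// Minimal sketch, not my actual Startup: assumes an IHostingEnvironment
// field populated from a Startup constructor parameter.
private readonly IHostingEnvironment environment;

public void ConfigureServices(IServiceCollection services)
{
    // In development the app serves HTTPS itself on localhost:5001, and the
    // redirection middleware picks that port up automatically; forcing 443
    // would send browsers to localhost:443, which isn't listening.
    if (!environment.IsDevelopment())
    {
        services.AddHttpsRedirection(options => options.HttpsPort = 443);
    }
    // ... other service configuration
}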

So, make the change, redeploy to test, and… observe infinite redirection in the browser. What?

Issue 2: Forwarding proxied headers

Remember again how the ASP .NET Core app is only listening on HTTP? We want it to behave differently depending on whether the end user made a request to the load balancer on HTTP or HTTPS. That means using headers forwarded from the proxy (in our case the load balancer) to determine the original request scheme. Fortunately, again there’s documentation on hand for this.

There are two parts to configuring this:

  • Configuring the ForwardedHeadersOptions in ConfigureServices
  • Calling app.UseForwardedHeaders() in Configure

(At least, that’s the way that’s documented. I’m sure there are myriad alternatives, but my experience level of ASP .NET Core is such that I’m still in “follow the examples verbatim, changing as little as possible” at the moment.)

I won’t go into the gory details of exactly how many times I messed up the forwarded headers options, but I will say:

  • The example which just changes options.ForwardedHeaders is probably fine if your proxy server is local to the application, but otherwise you will need to do extra work
  • The troubleshooting part of the documentation is spectacularly useful
  • There are warnings logged if you get things wrong, and those logs will help you – but they’re at a debug log level, so you may need to update your logging settings. (I only realized this after I’d fixed the problem, partly thanks to Twitter.)

Lesson to learn: when debugging a problem, turn on debugging logs. Who’d have thought?
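
For reference, here’s one hedged way of turning that logging up – the category name is my assumption based on the middleware living in the Microsoft.AspNetCore.HttpOverrides namespace, rather than something copied from my real code:

public static IWebHost BuildWebHost(string[] args) =>
    WebHost.CreateDefaultBuilder(args)
        // The forwarded-headers warnings are logged at Debug level, so raise
        // just that category rather than turning everything up.
        .ConfigureLogging(logging =>
            logging.AddFilter("Microsoft.AspNetCore.HttpOverrides", LogLevel.Debug))
        .UseStartup<Startup>()
        .Build();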

Configuring this properly is an area where you really need to understand your deployment and how a request reaches you. In my case, the steps are:

  • The user’s HTTPS request is terminated by the load balancer
  • The load balancer makes a request to the Kubernetes service
  • The Kubernetes service makes a request to the application running on one of the suitable nodes

This leads to a relatively complex configuration, as there are two networks to trust (Google Cloud load balancers, and my internal Kubernetes network) and we need to allow two “hops” of proxying. So my configuration code looks like this:

// Assumes using directives for System.Net and Microsoft.AspNetCore.HttpOverrides
// (for IPAddress, IPNetwork and ForwardedHeaders).
services.Configure<ForwardedHeadersOptions>(options =>
{
    options.KnownNetworks.Clear();
    // Google Cloud Platform load balancers
    options.KnownNetworks.Add(new IPNetwork(IPAddress.Parse("130.211.0.0"), 22));
    options.KnownNetworks.Add(new IPNetwork(IPAddress.Parse("35.191.0.0"), 16));
    // GKE service which proxies the request as well.
    options.KnownNetworks.Add(new IPNetwork(IPAddress.Parse("10.0.0.0"), 8));
    options.ForwardedHeaders = ForwardedHeaders.XForwardedFor | ForwardedHeaders.XForwardedProto;
    options.ForwardLimit = 2;
});

(The call to KnownNetworks.Clear() probably isn’t necessary. The default is to include the loopback, which is safe enough to leave in the list.)

Okay, deploy that to the test environment. Everything will work now, right? Well, sort of…

Issue 3: make sure health checks are healthy!

As it happens, when I’d finished fixing issue 2, I needed to help at a birthday party for a family we’re friends with. Still, I went to the party happily knowing everything was fine.

I then came home and found the test deployment was broken. Really broken. “502 Bad Gateway” broken. For both HTTP and HTTPS. This is not good.

I tried adding more logging, but it looked like none of my requests were getting through to the application. I could see in the logs (thank you, Stackdriver!) that requests were being made, always to just “/” on HTTP. They were all being redirected to HTTPS via a 307, as I’d expect.

This confused me for a while. I honestly can’t remember what gave me the lightbulb moment of “Ah, these are load balancer health checks, and it thinks they’re failing!” but I checked with the load balancer in the Google Cloud Console and sure enough, I had multiple working backends, and one broken one – my test backend. The reason I hadn’t seen this before was that I’d only checked the test deployment for a few minutes – not long enough for the load balancer to deem the backend unhealthy.

I was stuck at this point for a little while. I considered reconfiguring the load balancer to make the health check over HTTPS, but I don’t think that could work as the app isn’t serving HTTPS itself – I’d need to persuade it to make the request as if it were a user-generated HTTPS request, with appropriate X-Forwarded-Proto etc headers. However, I saw that I could change which URL the load balancer would check. So how about we add a /healthz URL that would be served directly without being redirected? (The “z” at the end is a bit of Googler heritage. Just /health would be fine too, of course.)

I started thinking about adding custom inline middleware to do this, but fortunately didn’t get too far before realizing that ASP .NET Core provides health checking already… so all I needed to do was add the health check middleware before the HTTPS redirection middleware, and all would be well.

So in ConfigureServices, I added a no-op health check service:

services.AddHealthChecks();

And in Configure I added the middleware at an appropriate spot:

app.UseHealthChecks("/healthz");

After reconfiguring the health check on the load balancer, I could see /healthz requests coming in and receiving 200 (OK) responses… and the load balancer was then happy to use the backend again. Hooray!
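
For the sake of seeing it all in one place, the resulting middleware ordering in Configure is shaped roughly like this – a sketch rather than my exact code, with the MVC call at the end purely illustrative:

public void Configure(IApplicationBuilder app, IHostingEnvironment env)
{
    // Must come early, so the rest of the pipeline sees the original scheme.
    app.UseForwardedHeaders();
    // Served over plain HTTP, before any redirect, so the load balancer
    // health check always receives a 200.
    app.UseHealthChecks("/healthz");
    if (!env.IsDevelopment())
    {
        app.UseHsts();
    }
    app.UseHttpsRedirection();
    app.UseMvc();
}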

After giving the test service long enough to fail, I deployed to production, changed the load balancer health check URL, and all was well. I did the two parts of this quickly enough so that it never failed – a safer approach would have been to add the health check handler but without the HTTPS redirection first, deploy that, change the health check URL, then turn on HTTPS.

But the end result is, all is working! Hooray!

Conclusion

Moving the service in the first place has been a long process, mostly due to a lack of time to spend on it, but the HTTPS redirection has been its own interesting bit of simultaneous pleasure and frustration. I’ve learned a number of lessons along the way:

  • The combination of Google Kubernetes Engine, Google Cloud Load Balancing, Google Cloud Build and Google Container Registry is pretty sweet.
  • Managed SSL certificates are wonderfully easy to use, even if there is a bit of a worrying delay while provisioning
  • It’s really, really important to be able to test deployment changes (such as HTTPS redirection) in an environment which is very similar to production, but which no-one is depending on. (Obviously if you have a site which few people care about anyway, there’s less risk. But as it’s easy to set up a test deployment on GKE, why not do it anyway?)
  • HTTPS redirection caused me three headaches, all predictable:
    • ASP .NET Core needs to know the HTTPS port to redirect to.
    • You need to configure forwarded headers really carefully, and know your deployment model thoroughly.
    • Be aware of health checks! Make sure you leave a test deployment “live” for long enough for the health checks to mark it as unhealthy if you’ve done something wrong, before you deploy to production.
  • When debugging, turn on debug logging. Sounds obvious in retrospect, doesn’t it? (Don’t start trying to copy middleware source code into your own application so you can add logging, rather than using the logging already there…)

I also have some future work to do:

  • There’s a particular URL (http://nodatime.org/tzdb/latest.txt) which is polled by applications in order to spot time zone information changes. That’s the bulk of the traffic to the site. It currently redirects to HTTPS along with everything else, which leads to the total traffic being nearly double what it was before, for no real benefit. I’ve encouraged app authors to use HTTPS instead, but I’ve also filed a feature request against myself to consider serving that particular URL without the redirect. It looks like that’s non-trivial though; there’s a hedged sketch of one possible starting point just after this list.
  • I have a bunch of hard-coded information which should really be in appsettings.json. I want to move all of that, but I need to learn more about the best way of doing it first.
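
If I do tackle the redirect exemption, one possible starting point looks like the sketch below – very much hedged, and ignoring the complications that presumably make it non-trivial in practice (HSTS, for one, means returning browsers will upgrade to HTTPS regardless):

// Apply HTTPS redirection to everything except the polled URL.
app.UseWhen(
    context => context.Request.Path != "/tzdb/latest.txt",
    branch => branch.UseHttpsRedirection());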

All in all, this has been a very positive experience – I hope the details above are useful to anyone else hosting ASP .NET Core apps in Google Kubernetes Engine.

There’s a hole in my abstraction, dear Liza, dear Liza

I had an interesting day at work today. I thought my code had broken… but it turns out it was just a strange corner case which made it work very slowly. Usually when something interesting happens in my code it’s quite hard to blog about it, because of all the confidentiality issues involved. In this case, it’s extremely easy to reproduce the oddity in an entirely vanilla manner. All we need is the Java collections API.

I have a set – a HashSet, in fact. I want to remove some items from it… and many of the items may well not exist. In fact, in our test case, none of the items in the "removals" collection will be in the original set. This sounds – and indeed is – extremely easy to code. After all, we’ve got Set<T>.removeAll to help us, right?

Let’s make this concrete, and look at a little test. We specify the size of the "source" set and the size of the "removals" collection on the command line, and build both of them. The source set contains only non-negative integers; the removals set contains only negative integers. We measure how long it takes to remove all the elements using System.currentTimeMillis(), which isn’t the world’s most accurate stopwatch but is more than adequate in this case, as you’ll see. Here’s the code:

import java.util.*;

public class Test
{
    public static void main(String[] args)
    {
        int sourceSize = Integer.parseInt(args[0]);
        int removalsSize = Integer.parseInt(args[1]);
        
        Set<Integer> source = new HashSet<Integer>();
        Collection<Integer> removals = new ArrayList<Integer>();
        
        for (int i = 0; i < sourceSize; i++)
        {
            source.add(i);
        }
        for (int i = 1; i <= removalsSize; i++)
        {
            removals.add(-i);
        }
        
        long start = System.currentTimeMillis();
        source.removeAll(removals);
        long end = System.currentTimeMillis();
        System.out.println("Time taken: " + (end – start) + "ms");
    }
}

Let’s start off by giving it an easy job: a source set of 100 items, and 100 to remove:

c:\Users\Jon\Test>java Test 100 100
Time taken: 1ms

Okay, so we hadn’t expected it to be slow… clearly we can ramp things up a bit. How about a source of one million items¹ and 300,000 items to remove?

c:\Users\Jon\Test>java Test 1000000 300000
Time taken: 38ms

Hmm. That still seems pretty speedy. Now I feel I’ve been a little bit cruel, asking it to do all that removing. Let’s make it a bit easier – 300,000 source items and 300,000 removals:

c:\Users\Jon\Test>java Test 300000 300000
Time taken: 178131ms

Excuse me? Nearly three minutes? Yikes! Surely it ought to be easier to remove items from a smaller collection than the one we managed in 38ms? Well, it does all make sense, eventually. HashSet<T> extends AbstractSet<T>, which includes this snippet in its documentation for the removeAll method:

This implementation determines which is the smaller of this set and the specified collection, by invoking the size method on each. If this set has fewer elements, then the implementation iterates over this set, checking each element returned by the iterator in turn to see if it is contained in the specified collection. If it is so contained, it is removed from this set with the iterator’s remove method. If the specified collection has fewer elements, then the implementation iterates over the specified collection, removing from this set each element returned by the iterator, using this set’s remove method.

Now that sounds reasonable on the surface of it – iterate through the smaller collection, check for the presence in the bigger collection. However, this is where the abstraction is leaky. Just because we can ask for the presence of an item in a large collection doesn’t mean it’s going to be fast. In our case, the collections are the same size – but checking for the presence of an item in the HashSet is O(1) whereas checking in the ArrayList is O(N)… whereas the cost of iterating is going to be the same for each collection. Basically by choosing to iterate over the HashSet and check for presence in the ArrayList, we’ve got an O(M * N) solution overall instead of an O(N) solution. With 300,000 elements in each collection, that’s on the order of 90 billion list comparisons instead of 300,000 hash lookups – which is how 38ms becomes three minutes. Ouch. The removeAll method is making an "optimization" based on assumptions which just aren’t valid in this case.

Then fix it, dear Henry, dear Henry, dear Henry

There are two simple ways of fixing the problem. The first is to change the type of the collection we’re removing from: just switching ArrayList<Integer> to HashSet<Integer> gets us back down to the 34ms range. We don’t even need to change the declared type of removals.

The second approach is to change the API we use: if we know we want to iterate over removals and perform the lookup in source, that’s easy to do:

for (Integer value : removals)
{
    source.remove(value);
}

In fact, on my machine that performs slightly better than removeAll – it doesn’t need to check the return value of remove on each iteration, which removeAll does in order to return whether or not any items were removed. The above runs in about 28ms. (I’ve tested it with rather larger datasets, and it really is faster than the dual-hash-set approach.)

However, both of these approaches require comments in the source code to explain why we’re not using the most obvious code (a list and removeAll). I can’t complain about the documentation here – it says exactly what it will do. It’s just not obvious that you need to worry about it, until you run into such a problem.

So what should the implementation do? Arguably, it really needs to know what’s cheap in each of the collections it’s dealing with. The idea of probing for performance characteristics before you decide on a strategy is completely anathema to the clean abstraction we like to assume with frameworks like the Java collections API… but maybe in this case it would be a good idea.


¹ Please perform a Dr Evil impression while reading this. I’m watching you through your webcam, and I’ll be disappointed if I don’t see you put your little finger to your mouth.

MVP Again

I’m delighted to be able to announce that I’m now an MVP again.

Google has reconsidered the situation and worked out a compromise: I now receive no significant gifts from Microsoft, and I’m not under NDA with them. While that precludes me from a lot of MVP activities, it removes any concerns to do with Google’s Code of Conduct. Basically my MVP status is truly just a token of Microsoft’s recognition of what I’ve done in the C# community – and that’s fine by me.

When I announced that I’d been advised not to seek renewal, I was amazed at the scale of the reaction in the comments, other blog posts, Twitter and personal email. I was touched by the response of the community. I really love working at Google, and the fact that we could figure out a solution to this situation is definitely one of the things that makes Google such an awesome place to work. Oh, and did I mention that we’re hiring? :)

Anyway, the basic message of this post is: thanks to the community for caring, thanks to Google for reconsidering, and thanks to Microsoft for renewing my award. And they all lived happily ever after…

OS Jam at Google London: C# 4 and the DLR

Last night I presented for the first time at the Google Open Source Jam at our offices in London. The room was packed, but only a very few attendees were C# developers. I know that C# isn’t the most popular language on the Open Source scene, but I was still surprised there weren’t more people using C# for their jobs and hacking on Ruby/Python/etc at night.

All the talks at OSJam are just 5 minutes long, with 2 minutes for questions. I’m really not used to this format, and felt extremely rushed… however, it was still a lot of fun. I used a somewhat different approach to my slides than the normal “bullet points in PowerPoint” – and as it was only short, I thought I might as well effectively repeat the presentation here in digital form. (Apologies if the images are an inconvenient size for you. I tried a few different ones, and this seemed about right. Comments welcome, as I may do a similar thing in the future.)

First slide

Introductory slide. Colleagues forced me to include the askjonskeet.com link.

Second slide

.NET isn’t Open Source. You can debug through a lot of the source code for the framework if you agree to a “reference licence”, but it’s not quite the same thing.

Third slide

.NET isn’t Open Source, but the DLR is. And IronRuby. And IronPython. Yay!

And of course Mono is Open Source: the DLR and Mono play nicely together, and the Mono team is hoping to implement the new C# 4.0 features for the 2.8 release in roughly the same timeframe as Microsoft.

Fourth slide

This is what .NET 4.0 will look like. The DLR will be included in it, despite being open source. IronRuby and IronPython aren’t included, but depend heavily on the DLR. (Currently available versions allow you to use a “standalone” DLR or the one in .NET 4.0b1.)

C# doesn’t really depend on the DLR except for its handling of dynamic. C# is a statically typed language, but C# 4.0 has a new static type called dynamic which you can do just about anything with. (This got a laugh, despite being a simple and mostly accurate summary of the dynamic typing support in C# 4.0.)

Fifth slide

The fundamental point of the DLR is to handle call sites – decide what to do dynamically with little bits of code. Oh, and do it quickly. That’s what the caches are for. They’re really clever – particularly the L0 cache which compiles rules (about the context in which a particular decision is valid) into IL via dynamic methods. Awesome stuff.

I’m sure the DLR does many other snazzy things, but this feels like it’s the core part of it.

Sixth slide

At execution time, the relevant binder is used to work out what a call site should actually do. Unless, that is, the call has a target which implements the shadowy IDynamicMetaObjectProvider interface (winner of “biggest mouthful of a type name” prize, 2009) – in which case, the object is asked to handle the call. Who knows what it will do?

Seventh slide

Beautifully syntax-highlighted C# 4.0 source code showing the dynamic type in action. The method calls on lines 2 and 3 are both dynamic, even though in the latter case it’s just using a static method. Which overload will it pick? It all depends on the type of the actual value at execution time.
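
(The slide itself was an image, so it isn’t reproduced here. The snippet below is an illustrative reconstruction of the same idea rather than the actual slide contents – the method names are made up:)

dynamic d = GetSomeValue();   // line 1: d's static type is dynamic
d.Frobnicate();               // line 2: dispatched at execution time
Console.WriteLine(d);         // line 3: a static method, but the overload
                              // is picked based on d's execution-time type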

If I’d had more time, I’d have demonstrated how the C# compiler preserves the static type information it knows at compile time for the execution time binder to use. This is very cool, but would take far too long to demonstrate in this talk – especially to a bunch of non-C# developers.

Eighth slide

There were a couple of questions, but I can’t remember them offhand. Someone asked me afterwards about how all this worked on non-.NET implementations (i.e. Mono, basically). I gather the DLR itself works, but I don’t know whether C# code compiled in the MS compiler will work at the moment – it embeds references to binder types in Microsoft.CSharp.dll, and I don’t know what the story is about that being supported on Mono.

This is definitely the format I want to use for future presentations. It’s fun to write, fun to present, and I’m sure the “non-professionalism” of it makes it a lot more interesting to watch. Although it’s slower to create text-like slides (such as the first and the last one) this way, the fact that I don’t need to find clip-art or draw boxes with painful user interfaces is a definite win – especially as I’m going to try to be much more image-biased from now on. (I don’t want people reading slides while I’m talking – they should be listening, otherwise it’s just pointless.)

DotNetRocks interview

Last Monday evening I had a chat with the guys from DotNetRocks, and today the show has gone live.

I wouldn’t claim to have said anything particularly earth-shattering, and regular readers will probably be familiar with many of the themes anyway, but I thoroughly enjoyed it and hope you will too. Amongst other things, we talked about:

  • Protocol buffers
  • Implicit typing and anonymous types
  • Why it doesn’t bother me that Office hasn’t been ported to .NET
  • C# 4
  • My wishlist for C#
  • Threading and Parallel Extensions
  • Working for Google
  • How to learn LINQ
  • C# in Depth

Feedback welcome. And yes, I know I sound somewhat like a stereotypical upper-class idiot at times. Unfortunately there’s not a lot I can do about that. Only the “idiot” part is accurate :)

Lessons learned from Protocol Buffers, part 4: static interfaces

Warning: During this entire post, I will use the word static to mean “relating to a type instead of an instance”. This isn’t a strictly accurate use but I believe it’s what most developers actually think of when they hear the word.

A few members of the interfaces in Protocol Buffers have no logical reason to act on instances of their types. The message interface has members to return the message’s type descriptor (the PB equivalent of System.Type), the default instance for the message type, and a builder for the message type. The builder interface copies the first two of these, and also has a method to create a builder for a particular field. None of these touch any instance data.
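
To make that concrete, the message-side members in question look roughly like this – a paraphrase of the real interface rather than verbatim declarations:

public interface IMessage<TMessage, TBuilder> // constraints omitted here
{
    // None of these touch any instance data:
    MessageDescriptor DescriptorForType { get; }
    TMessage DefaultInstanceForType { get; }
    TBuilder CreateBuilderForType();

    // ... plus the members which genuinely act on the instance
}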

In most cases this doesn’t actually cause any difficulties – we usually have an instance available when we’re in the PB library code, and the generated types have static properties for the default instance and the type descriptor anyway. Even so, it feels messy to have interface members which rely only on the type of the implementation and not on any of the actual data of the instance.

I’ve wondered before now about the possibility of having static members in interfaces – usually when thinking about plug-in architectures – but there’s always been the problem of working out how to specify the type on which to call the members. Variables and other expressions usually refer to values rather than types, and System.Type doesn’t help as it provides no compile-time knowledge of the type being referred to.

There’s one big exception to this, however: generic type parameters. I don’t know why it had never occurred to me before, but this is a great fit for the ability to safely call static methods on types which are unknown at compile-time. Furthermore, it could provide a great way of enforcing the presence of constructors with appropriate signatures, and even operators. Before I get too far ahead of myself, let’s tie the simple case to a concrete example.

Creating builders from nothing

In my previous post I gave an example of a method which ideally wanted to return a new message given a CodedInputStream and an ExtensionRegistry (both types within Protocol Buffers, the details of which are unimportant to this example). The current code looks like this:

private static TMessage BuildImpl<TMessage2, TBuilder> (Func<TBuilder> builderBuilder,
                                                        CodedInputStream input,
                                                        ExtensionRegistry registry)
    where TBuilder : IBuilder<TMessage2, TBuilder>
    where TMessage2 : TMessage, IMessage<TMessage2, TBuilder>
{
    TBuilder builder = builderBuilder();
    input.ReadMessage(builder, registry);
    return builder.Build();
}

For the purposes of this discussion I’ll simplify it a little, making it a generic method in a non-generic type.

private static TMessage BuildImpl<TMessage, TBuilder> (Func<TBuilder> builderBuilder,
                                                       CodedInputStream input,
                                                       ExtensionRegistry registry)
    where TBuilder : IBuilder<TMessage, TBuilder>
    where TMessage : IMessage<TMessage, TBuilder>
{
    TBuilder builder = builderBuilder();
    input.ReadMessage(builder, registry);
    return builder.Build();
}

The first parameter is a function which will return us a builder. We can’t simply add a new() constraint to TBuilder as not all generated builders will have a public parameterless constructor. However, we do know that the TMessage type has a CreateBuilder() method because it implements IMessage<TMessage, TBuilder>. Unfortunately we don’t have an instance of TMessage to call CreateBuilder() on! Really, we’d like to be able to change the code to this:

private static TMessage BuildImpl<TMessage, TBuilder> (CodedInputStream input, ExtensionRegistry registry)
    where TBuilder : IBuilder<TMessage, TBuilder>
    where TMessage : IMessage<TMessage, TBuilder>
{
    TBuilder builder = TMessage.CreateBuilder();
    input.ReadMessage(builder, registry);
    return builder.Build();
}

That’s currently impossible, but only because we can’t specify static methods in interfaces. Suppose we could write:

public interface IMessage<TMessage, TBuilder>
    where TMessage : IMessage<TMessage, TBuilder>
    where TBuilder : IBuilder<TMessage, TBuilder>
{
    static TBuilder CreateBuilder();

    // Other methods as before
}

Wouldn’t that be useful? The value would almost entirely be for generic types or methods where the type parameter is constrained to specify the relevant interface, but that could arguably still be very handy.

Operators and constructors

At this point hopefully the idea I mentioned earlier of being able to specify operators and constructors is quite obvious. For instance, we could make all the existing numeric types implement IArithmetic<T> (where T was the same type, e.g. int : IArithmetic<int>):

public interface IArithmetic<T>
{
    static T operator +(T left, T right);
    static T operator -(T left, T right);
    static T operator /(T left, T right);
    static T operator *(T left, T right);
}

// Used in LINQ to Objects, for example:
public static T Sum<T>(this IEnumerable<T> source) where T : IArithmetic<T>
{
    T total = default(T);
    foreach (T element in source)
    {
        total += element; // Interface says we can do total + element
    }
    return total;
}

Plug-ins for a particular program could implement IPlugin:

public interface IPlugin
{
    static new(PluginHost host);

    // Normal plug-in members here
    string Title { get; }
}

// Used within a PluginHost like this…
public T CreatePlugin<T>() where T : IPlugin
{
    T plugin = new T(this);
    log.Info("Loaded plugin {0}", plugin.Title);
    return plugin;
}

In fact, I’d imagine it would make sense to define a whole family of IConstructable interfaces, along the same lines as the Func and Action delegate families:

public interface IConstructable
{
    static new();
}

public interface IConstructable<T>
{
    static new(T arg);
}

public interface IConstructable<T1, T2>
{
    static new(T1 arg1, T2 arg2);
}

// etc
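
Consumption would then parallel today’s new() constraint – again, entirely hypothetical semantics:

// Hypothetical: T is constrained to provide a two-argument constructor
public static T Create<T, T1, T2>(T1 arg1, T2 arg2) where T : IConstructable<T1, T2>
{
    return new T(arg1, arg2);
}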

Inheritance raises its ugly head

There’s a fly in the ointment here. Normally, if a base type implements an interface, that means a type derived from it will effectively implement the interface too. That ceases to hold in all cases. You can get away with it for straightforward static methods/properties, in the same way that people often write code such as UTF8Encoding.UTF8 when they really just mean Encoding.UTF8. However, it doesn’t work for constructors – they aren’t inherited, so you can’t guarantee that Banana has a parameterless constructor just because Fruit does.

This is not only a problem for one concrete class deriving from another, but also for abstract implementations of interfaces. This happens a lot in Protocol Buffers; often the interface is partially implemented by the abstract class, with the final “leaf” classes in the inheritance tree implementing outstanding members and occasionally overriding earlier implementations for the sake of efficiency. Should we be able to specify an abstract static method in the abstract class, making sure that there’s an appropriate implementation by the time we hit a concrete class? As it happens that would be useful elsewhere in the Protocol Buffer library, but I’ll admit it’s slightly messy. I suspect there are ways round all of these issues, even if they might sometimes involve restricting the feature to particular, common use cases. However, every niggle would add its own piece of complexity.

There may well be other issues which would prove challenging – and other interesting aspects such as what explicit interface implementation would mean, if anything, in the context of static members. Language experts may well be able to reel off great lists of problems – I’d be very interested to hear the ensuing discussions.

Conclusion

I believe that static interface members could prove very useful in generic algorithms, particularly if operators and constructors were allowed as well as the existing member types available in interfaces. There are significant sticking points to be carefully considered, and I wouldn’t like to prejudge the outcome of such deliberations in terms of whether the feature would be useful enough to merit the additional language complexity involved. It feels odd to implement an interface member which is effectively only of use when the implementing type is being used as the type argument for a generic type or method, but as developers learn to think more generically that may be less of a restriction than it currently seems.

This post is the last in my current batch talking about Protocol Buffers. There may well be more, but that’s unlikely now that the Protocol Buffer port is almost entirely complete. Most of these posts have involved generics, and the current limitations of what C# allows us to express in them. I do not intend to give the impression that I’m dissatisfied with C# – I’ve just found it interesting to take a look at what lies beyond the current boundaries of the language. The one aspect of these posts which I would definitely like to see addressed is that of covariant return types – the Java implementation of Protocol Buffers is significantly simpler in many ways purely due to this one point.

Lessons learned from Protocol Buffers, part 3: generic type relationships

In part 2 of this series we saw how the message and builder interfaces were self-referential in order to allow the implementation types to be part of the API. That’s one sort of relationship, but in this post we’ll see how the two interfaces relate to each other. If you remember from part 1, every generated message type has a corresponding builder type. As it happens, this is implemented with a nested type, so if you had a Person message, the generated types would be Person and Person.Builder (in a specified namespace, of course).

Without any interfaces involved, this would be very simple. The types would just look like this (with more members, of course):

public class Person
{
    public static Builder CreateBuilder() { … }

    public Builder CreateBuilderForType() { … }

    public class Builder
    {
        public Builder() { … }

        public Person Build() { … }
    }
}

You may well be wondering why there are two methods for creating a builder. The static method is convenient for code which knows it’s dealing with the Person message. The instance method ends up being part of the message interface, which makes it useful for code which can work with any message. In addition, the constructor for Person.Builder is accessible in the C# version. In the original Java code the only way of creating a builder is via the methods in the message class; I decided to remove this restriction for the sake of making the oh-so-readable object initializer syntax available in C# 3.

Redesigning the interfaces to refer to each other

In part 2 we created self-referential interfaces for the message and builder interfaces which looked like this:

public interface IMessage<TMessage> where TMessage : IMessage<TMessage>
{
    …
}

public interface IBuilder<TBuilder> where TBuilder : IBuilder<TBuilder>
{
    …
}

The constraints on the type parameters allow us to make the API very specific, and we can use the same trick again when we relate the builder and message types together. The step where we introduce a new type parameter to each of them is straightforward:

public interface IMessage<TMessage, TBuilder> where TMessage : IMessage<TMessage, TBuilder>
{
    …
}

public interface IBuilder<TMessage, TBuilder> where TBuilder : IBuilder<TMessage, TBuilder>
{
    …
}

Unfortunately without any restrictions on the “foreign” type parameter in each interface, we don’t get enough information to make everything work. We need to tie the two types together more tightly, like this:

public interface IMessage<TMessage, TBuilder>
    where TMessage : IMessage<TMessage, TBuilder>
    where TBuilder : IBuilder<TMessage, TBuilder>
{
    …
}

public interface IBuilder<TMessage, TBuilder>
    where TMessage : IMessage<TMessage, TBuilder>
    where TBuilder : IBuilder<TMessage, TBuilder>
{
    …
}

To make this concrete for Person and Person.Builder we end up with implementations like this:

public class Person : IMessage<Person, Builder>
{
    public static Builder CreateBuilder() { … }

    public Builder CreateBuilderForType() { … }

    public class Builder : IBuilder<Person, Builder>
    {
        public Builder() { … }

        public Person Build() { … }
    }
}

This works, but it’s really ugly. Any generic methods wanting to take a TMessage type parameter implementing IMessage<TMessage, TBuilder> have to also have a TBuilder type parameter, and the two constraints need to be expressed each time. It’s a real pain. In fact, I’ve got an IMessage<TMessage> interface which contains almost nothing in it (and which the more generic interface extends). This allows me to get hold of the message type (and use it in the API), inferring the builder type by reflection. That’s a pain too, frankly. It’s a particular nuisance because when I do infer the builder type, I haven’t actually got any compile-time constraint that lets any other code know that it’s the right builder type for the message type. In one specific case it’s led to this horrific method (in a type generic in TMessage):

private static TMessage BuildImpl<TMessage2, TBuilder> (Func<TBuilder> builderBuilder,
                                                        CodedInputStream input,
                                                        ExtensionRegistry registry)
    where TBuilder : IBuilder<TMessage2, TBuilder>
    where TMessage2 : TMessage, IMessage<TMessage2, TBuilder>
{
    TBuilder builder = builderBuilder();
    input.ReadMessage(builder, registry);
    return builder.Build();
}

Fortunately this is hidden from public view – and the only reason to do it at all is to enable a pleasant API of MessageStreamIterator<TMessage> : IEnumerable<TMessage> where TMessage : IMessage<TMessage>. The result of the evil method above is exactly what the caller is likely to want, otherwise I wouldn’t put up with it. However, that sort of excuse has been coming up far too much in the PB implementation, so I’ve had a quick think about what could be done about it.
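
As an aside, the reflection-based builder inference mentioned a couple of paragraphs back boils down to something like this – a hedged sketch rather than the library’s actual code:

private static Type InferBuilderType(Type messageType)
{
    // Find the IMessage<TMessage, TBuilder> interface implemented by
    // the message type, and pull out its second type argument.
    foreach (Type iface in messageType.GetInterfaces())
    {
        if (iface.IsGenericType &&
            iface.GetGenericTypeDefinition() == typeof(IMessage<,>))
        {
            return iface.GetGenericArguments()[1];
        }
    }
    throw new ArgumentException("Not a generated message type: " + messageType);
}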

Contemplating a more expressive language

I should really prefix this section by saying that I’m not actually suggesting this as a way forward for C# or .NET. (I suspect it would take more work in the CLR as well as just in the language; I don’t know enough about CLR generics to say for sure, but I’d be surprised if this were feasible.) I haven’t encountered many situations where I’ve wanted anything like this, and the extra complexity in the language would be quite high, I suspect. Suppose an interface could contain extra type parameters, including constraints, in the body of the interface:

// Purely imaginary syntax!
public interface IMessage<TMessage> where TMessage : IMessage<TMessage>
{
    <TBuilder> where TBuilder : IBuilder<TBuilder>, TBuilder.TMessage : TMessage

    // Normal methods, which could use TBuilder
}

public interface IBuilder<TBuilder> where TBuilder : IBuilder<TBuilder>
{
    <TMessage> where TMessage : IMessage<TMessage>, TMessage.TBuilder : TBuilder

    // Normal methods, which could use TMessage
}

There are various ways in which the interface implementation could indicate the type of TBuilder. The syntax itself isn’t particularly interesting – it’s the extra information which is conveyed which is the important bit. I’ve dithered between this being a step forward and it not. At first glance it looks no better than having both type parameters in the interface declaration, but I believe it would genuinely make a difference. For instance, the above evil method could be written as:

private static TMessage BuildImpl(Func<TMessage.TBuilder> builderBuilder,
                                  CodedInputStream input,
                                  ExtensionRegistry registry)
{
    TMessage.TBuilder builder = builderBuilder();
    input.ReadMessage(builder, registry);
    return builder.Build();
}

This time there’s no need for the method to be generic, because the type is already generic in the message type. Furthermore, we can call this method with no reflection. All other APIs which have previously had to be specify two type parameters can now just specify the one. Apart from anything else, this leaves more scope for type inference in generic methods – passing either a message or a builder to a generic method happens occasionally, but it’s very rare to pass in both.

We’ve essentially expressed the relationship between the message type and the builder type a little more explicitly, so that we can guarantee it exists (and use it) at compile time. That’s at the heart of the problem to start with – without a second type parameter in the initial interface declaration, in the current language there’s no way of expressing a close relationship with another type.

Conclusion

I don’t think it would be fair to say that C# really lets us down here – it happens not to support a pretty rare scenario, and that’s fair enough. I’d be interested to know whether any other languages allow the same sort of concepts to be expressed more pleasantly. The ugly solution I’ve presented here does at least work, and it’s nearly invisible to most users, who are likely to just reference the concrete generated types. I’m not happy with the verbosity which has become necessary in many places, but it’s in a good cause. It’s interesting to note that the Java API doesn’t use this sort of doubly-generic relationship: again, covariant return types allow the concrete message and builder types to express their APIs directly and still implement a more general interface at the same time.

In the next part I’ll look at another possibility which would make interfaces and generics a more powerful combination: static interface methods.

Google release protocol buffers as an open source project

Yesterday the Google open source blog announced a new open source project, to release one of the core pieces of Google infrastructure: protocol buffers. Basically protocol buffers are a way of encoding structured data in a language-neutral and versioning-friendly fashion. Yes, there are a lot of similar ways of doing similar things, but this happens to be the one we use.

Currently the protocol buffer compiler (which takes .proto files and converts them into source for various languages) supports C++, Java and Python. I’m hoping to add C# to that list reasonably quickly. To start with this will just be in my spare time rather than as a 20% project, but who knows…

It’s really nice to work in an environment where the question of “is there any reason we can’t open source this?” crops up quite frequently. I’m looking forward to the day when a piece of code I’ve developed at Google ends up as part of an open source project.