Category Archives: C#

Ultimate Man Cave: voice automation for my shed

Source code for everything is on Github. It probably won’t be useful to you unless you’ve got very similar hardware to mine, but you may want to just have a look.


Near the end of 2015, we had a new shed built at the back of our garden. The term “shed” is downplaying it somewhat – it’s a garden building, about 7m x 2.5m, with heating, lighting and an ethernet connection from the house.

It’s divided in half, with one half being a normal shed (lawnmower, wheelbarrow, tools etc) and one half being my office for working from home. Both sides are also used for general storage – we have a lot of stuff to sort out from a loft conversion a few years ago.

The shed

It only took about three days of using the shed for me to work out that I wanted remote-controlled lighting. If I’m going out there at 6.30am in winter, it’s pretty dark – so it’s really useful to be able to turn the lights on from the house first, so I can negotiate the muddier bits of the garden, see the keyhole to unlock it etc.

After a little research, this turned out to be pretty easy: MiLight is simple and relatively cheap. The equivalent of $100 got me four lights and a wifi controller box. It only took a few minutes to configure it to talk to my wifi and install the Light Controller Android app, and then I could easily turn my lights on and off from my phone in the house, before stepping outside. Yay. First steps to home automation.

I won’t go into all the details of the rest of the tech in my shed, but the important parts for the purposes of this post are:

Command-line automation

Sometimes, I’m too lazy to reach for my phone when I want to turn on the lights. Very much a first world problem, I realize. And not so much a problem, as an opportunity to see what’s feasible.

So, I looked around the net for code related to MiLight / EasyBulb, and found (amongst other things) Andy Scott’s MiLight.NET library on Github. A small amount of tweaking, and I had a short console app allowing me to run “lights on” or “lights off” which did the obvious thing. Amongst other things, copying this onto an Intel NUC allowed me to turn the lights off via remote desktop when Holly messaged me at the (Google) office to tell me that I’d left them on. It also meant I could schedule a task to turn the lights off at 10.30pm automatically, in case I forgot when I came in.

For a few months, that kept me satisfied… but it was never going to be the final solution.

The next step was to look at other aspects I could automate, and both the amplifier/receiver and the Sonos unit were obvious targets. I knew both had network support, as I already had apps for both on my phone, but I had no idea what the protocols involved were. The amplifier lives in an A/V cabinet, and I normally keep the doors of that shut – so just turning it on, setting the source, and changing the volume either involved getting the phone out or opening the cabinet. Again, could do better.

Sonos supports UPnP/SOAP for control. An old blog post got me started, and then I used Intel Device Spy to work out what else I could easily do. (I don’t have very demanding requirements – just play/pause, set volume, next/previous track is fine.)

It turns out that Onkyo has its own protocol called ISCP (Integra Serial Control Protocol) which has a network binding called eISCP. There’s remarkably good documentation in the form of an Excel spreadsheet, providing more information than I’m ever likely to need.

Implementing both of these was slightly faffy. The eISCP code didn’t work for some time, then started working – presumably with some minor tweak, but it wasn’t clear to me which of the many tweaks I made actually fixed it. The Sonos code worked fairly soon, but was very inelegant for quite a while.

Initially, this was all driven from the command line. I introduced a very simple sort of discovery, separating out controllers from their commands:

public interface IController
{
    string Name { get; }
    IImmutableList<ICommand> Commands { get; }
}

public interface ICommand
{
    string Name { get; }
    string Description { get; }

    void Execute(params string[] arguments);
}

There’s then a Factory class with a static AllControllers property. (I’m not keen on the naming here, but we’ll come to that later.)

The fact that Execute takes a string array is indicative of its use for a command line application – although looking at it now, I might have made it IEnumerable<string> given that I’ll always be skipping the first actual argument which identifies the controller.

Anyway, this allows a very simple command line app which doesn’t know anything about lights, music etc – it just offers you the controllers and commands it finds.

There’s only actually one implementation of IController, called ReflectiveController. You pass it the real controller to wrap, which can be any instance of a type with a description and with public methods which also have descriptions. These descriptions are provided with an attribute. The arguments passed to Execute are then converted to the method parameter types using Convert.ChangeType. Crude but effective.

With this in place, adding a new command to an existing controller is just a matter of adding a public method. Adding a new controller is just a matter of creating a new class with a description, and adding it to the list of controllers in Factory. It’s all really, really simple.
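To make the reflective approach a bit more concrete, here is a minimal sketch of the core idea: find a described public method by name, convert the string arguments with Convert.ChangeType, and invoke it. The names CommandDescriptionAttribute, DemoLighting and ReflectiveInvoker are invented for this illustration, and the real ReflectiveController on GitHub may well be structured differently.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical attribute: the post only says that descriptions come from "an attribute".
[AttributeUsage(AttributeTargets.Class | AttributeTargets.Method)]
public sealed class CommandDescriptionAttribute : Attribute
{
    public string Text { get; }
    public CommandDescriptionAttribute(string text) { Text = text; }
}

// Hypothetical stand-in for a real controller; it just records the last level set.
[CommandDescription("Demo lighting controller")]
public class DemoLighting
{
    public int Level { get; private set; }

    [CommandDescription("Sets the brightness as a percentage")]
    public void SetLevel(int level) { Level = level; }
}

public static class ReflectiveInvoker
{
    // Finds the described public method with the given name, converts each
    // string argument to the declared parameter type, and invokes the method.
    public static void Execute(object target, string commandName, params string[] arguments)
    {
        MethodInfo method = target.GetType()
            .GetMethods(BindingFlags.Public | BindingFlags.Instance)
            .Single(m => m.Name == commandName
                && m.GetCustomAttribute<CommandDescriptionAttribute>() != null);

        ParameterInfo[] parameters = method.GetParameters();
        if (parameters.Length != arguments.Length)
        {
            throw new ArgumentException("Wrong number of arguments for " + commandName);
        }

        object[] converted = parameters
            .Select((parameter, index) => Convert.ChangeType(arguments[index], parameter.ParameterType))
            .ToArray();
        method.Invoke(target, converted);
    }
}

public static class ReflectiveDemo
{
    public static void Main()
    {
        var lights = new DemoLighting();
        ReflectiveInvoker.Execute(lights, "SetLevel", "42");
        Console.WriteLine(lights.Level); // Prints 42
    }
}
```

With something along these lines, a new command really is just a new public method with a description on it.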

Deploy to the Pi!

This was the aim all along, of course – I’ve been wanting to try out Windows IoT edition, and put my Raspberry Pi to good use, and try out Windows UAP to get a feeling for it. (In particular, I want to learn about some of the constraints I’ll run into with Noda Time 2.0.) This project was a fantastic excuse to do all three.

I started off by building the application just on my laptop. This is one of the lovely benefits of universal apps – you can get them working in a convenient environment first, then deploy elsewhere when you’re ready.

In fact, the very first version of the app didn’t have any speech recognition – it just had buttons to turn the lights on or off. I checked that this worked on both my laptop and the Raspberry Pi – it was nice to see that Windows IoT still supports a UI over HDMI, and it all worked fine, first time. A few years ago, this would have been absolutely stunning in itself – but I think we’re starting to take portability for granted.

Voice automation

On to the final steps: adding speech recognition.

I had a bit of a false start, as there are multiple approaches to speech recognition in Windows UAP. Initially I tried using Cortana, but never got that to work. Instead, I went with the Windows.Media.SpeechRecognition library, which worked pretty much immediately. Again, my initial attempt was more complicated than it needed to be, using an SRGS grammar file. This worked, but it was fiddly. When I discovered the SpeechRecognitionListConstraint class, it was beautiful… it’s literally just a list of strings, and the speech recognizer raises an event when any of those strings is recognized.

The code required to start the speech recognition is trivial:

private async void RegisterVoiceActivation(object sender, RoutedEventArgs e)
{
    recognizer = new SpeechRecognizer
    {
        Constraints = { new SpeechRecognitionListConstraint(handlers.Keys) }
    };
    recognizer.ContinuousRecognitionSession.ResultGenerated += HandleVoiceCommand;
    recognizer.StateChanged += HandleStateChange;

    SpeechRecognitionCompilationResult compilationResult = await recognizer.CompileConstraintsAsync();

    if (compilationResult.Status == SpeechRecognitionResultStatus.Success)
    {
        await recognizer.ContinuousRecognitionSession.StartAsync();
    }
    else
    {
        await Dispatcher.RunIdleAsync(_ => lastState.Text = $"Compilation failed: {compilationResult.Status}");
    }
}

Given the way we’re compiling the constraints, I’d be reasonably happy not checking the compilation result, but I just never took that code away after using it for SRGS (where it was very much required).

The HandleVoiceCommand method just checks whether the recognition confidence is above a certain threshold (0.6 at the moment, but I may tweak it down a bit), and if so, it consults a dictionary to find a delegate to invoke. It also updates the UI for diagnostic purposes. The dictionary itself is the only code that knows about the shed controllers, with a using static directive to avoid having Factory. everywhere:

private const string Prefix = "shed ";

private static readonly Dictionary<string, Action> handlers = new Dictionary<string, Action>
{
    { "lights on", Lighting.On },
    { "lights off", Lighting.Off },
    { "music play", Sonos.Play },
    { "music pause", Sonos.Pause },
    { "music mute", () => Sonos.SetVolume(0) },
    { "music quiet", () => Sonos.SetVolume(30) },
    { "music medium", () => Sonos.SetVolume(60) },
    { "music loud", () => Sonos.SetVolume(90) },
    { "music next", Sonos.Next },
    { "music previous", Sonos.Previous },
    { "music restart", Sonos.Restart },
    { "amplifier on", Amplifier.On },
    { "amplifier off", Amplifier.Off },
    { "amplifier mute", () => Amplifier.SetVolume(0) },
    { "amplifier quiet", () => Amplifier.SetVolume(30) },
    { "amplifier medium", () => Amplifier.SetVolume(50) },
    { "amplifier loud", () => Amplifier.SetVolume(60) },
    { "amplifier source pie", () => Amplifier.Source("pi") },
    { "amplifier source sonos", () => Amplifier.Source("sonos") },
    { "amplifier source playstation", () => Amplifier.Source("ps4") }
}.WithKeyPrefix(Prefix);

Here, WithKeyPrefix is just a small extension method to create a new dictionary with a specified prefix to each key.

Just like with the command line version, adding a command is now simply a matter of adding a single entry in this dictionary.
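For anyone wanting to try the same pattern away from the shed hardware, here is a rough sketch of the two small pieces described above: a WithKeyPrefix extension method and a confidence check in front of the dictionary lookup. The VoiceDispatcher class and its TryDispatch method are hypothetical stand-ins for HandleVoiceCommand (which takes Windows speech recognition event arguments rather than a plain string), and the WithKeyPrefix body is a guess at the obvious implementation, not a copy of the real one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DictionaryExtensions
{
    // Returns a new dictionary in which every key has the given prefix prepended.
    public static Dictionary<string, TValue> WithKeyPrefix<TValue>(
        this Dictionary<string, TValue> source, string prefix)
    {
        return source.ToDictionary(pair => prefix + pair.Key, pair => pair.Value);
    }
}

// Hypothetical, platform-independent stand-in for the voice command handler.
public class VoiceDispatcher
{
    private const double ConfidenceThreshold = 0.6; // The value quoted in the post
    private readonly Dictionary<string, Action> handlers;

    public VoiceDispatcher(Dictionary<string, Action> handlers)
    {
        this.handlers = handlers;
    }

    // Runs the handler for the phrase if the confidence is above the threshold
    // and the phrase is known. Returns whether a handler ran.
    public bool TryDispatch(string phrase, double confidence)
    {
        Action handler;
        if (confidence <= ConfidenceThreshold || !handlers.TryGetValue(phrase, out handler))
        {
            return false;
        }
        handler();
        return true;
    }
}

public static class DispatchDemo
{
    public static void Main()
    {
        bool lightsOn = false;
        var handlers = new Dictionary<string, Action>
        {
            { "lights on", () => lightsOn = true },
            { "lights off", () => lightsOn = false }
        }.WithKeyPrefix("shed ");

        var dispatcher = new VoiceDispatcher(handlers);
        Console.WriteLine(dispatcher.TryDispatch("shed lights on", 0.9));  // Prints True
        Console.WriteLine(lightsOn);                                       // Prints True
        Console.WriteLine(dispatcher.TryDispatch("shed lights off", 0.3)); // Prints False
    }
}
```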

Deploy that on my Raspberry Pi, and as if by magic, I can say “shed lights on” and the lights come on, etc. Admittedly after saying “shed music play” it can be quite tricky to launch further actions, as the music interferes with the speech recognition for obvious reasons.

Simple code for the win

I’d like to take a few moments to talk about the code. At this point, you may want to have Github open in another tab to follow along.

There are lots of things about the code which I’d deem pretty unacceptable at work:

  • It uses the service locator pattern instead of dependency injection. I’m not a fan of this in general.
  • I really hate the name Factory – but I haven’t found anything significantly better, yet. (ControllerProvider? I’d call it just Controllers, but that’s the final part of the namespace name…)
  • There are no tests. At all. Not even a test project.
  • There are only a few comments.
  • The IP addresses are hard-coded into Factory. No config files, no discovery, not even names – just IP addresses.
  • There’s no abstraction beyond IController and ICommand. I could potentially have an IVolumeController, IMusicController, ISourceController etc.

None of these bother me, even though the code is “in production” and I’m expecting to use it for a long time. It’s never going to grow large enough for the service locator pattern to be a problem. With so few types involved, a few non-ideal names aren’t going to cause much of a problem. The only tests that matter are the ones involving me saying “shed amplifier on” and the amplifier either turning on or not… there’s very little code here that’s really testable anyway. My device IP addresses are all fixed by my router, so I’d only have to change them if I change that – and I’d still end up changing it in just one place. Extra abstraction wouldn’t actually give me any benefits at the moment.

So yes, basically I’m happy with the code now. It provides me value, and it’s easy to maintain. In particular, adding extra controllers or commands is trivial. I guess what I’m saying is that this is a reminder that not all code is “enterprise software” and even “best practice” rules such as writing no code without tests have their limitations. Context is king.

What next?

My Raspberry Pi 3 has a small touchscreen display on it, which uses the Raspberry Pi SPI for communication. I haven’t yet managed to get this working, but obviously that would be a lovely next step. It’s a bit of a pain changing from Displayport to HDMI to see the UI and check what phrases have been recognized, for example. The display part will definitely be useful – I might use the touch part just for a very few key commands, such as “stop the music, you can’t hear me any more!”

The device I’d most like to control next is the heater. I keep leaving the heating on accidentally, then having to put my shoes on again to go out and just turn the heating off. If the heater plugged in via a regular socket, it would be easy enough to sort out – but unfortunately the power cable goes straight into a box in the wall. I may try to sort this out at some point, but it’s going to be a pain.

The other thing I’d like to do is add the ability to switch monitor inputs using DDC/CI. That could be tricky in terms of getting access to such a low-level API, and also it requires a permanent “live” connection to the monitor – whereas both my HDMI and Displayport connections are switched (by the Onkyo for HDMI, and a KVM for Displayport). I’m still thinking about that one. I could potentially have a secondary output from the NUC to a DVI input on the monitor, then make the NUC listen as a server that the Pi could talk to…


Home automation is fun and simple – but it really, really helps to have a project which will actually be useful to you. I’ve had a few Raspberry Pis sitting around for ages waiting to be used. They’ve always been fun to play with, but now there’s a purpose, and that makes a huge difference…

To base() or not to base(), that is the question

Today I’ve been reviewing the ECMA-334 C# specification, and in particular the section about class instance constructors.

I was struck by this piece in a clause about default constructors:

If a class contains no instance constructor declarations, a default instance constructor is automatically provided. That default constructor simply invokes the parameterless constructor of the direct base class.

I believe this to be incorrect, and indeed it is, as shown here (in C# 6 code for brevity, despite this being the C# 5 spec that I’m reviewing; that’s irrelevant in this case):

using System;

class Base
{
    public int Foo { get; }

    public Base(int foo = 5)
    {
        Foo = foo;
    }
}

class Derived : Base
{
}

class Test
{
    static void Main()
    {
        var d = new Derived();
        Console.WriteLine(d.Foo); // Prints 5
    }
}

Here the default constructor in Derived clearly doesn’t execute a parameterless constructor in Base because there is no parameterless constructor in Base. Instead, it executes the parameterized constructor, providing the default argument value.

So, I considered whether we could reword the standard to something like:

If a class contains no instance constructor declarations, a default instance constructor is automatically provided. That default constructor simply invokes a constructor of the direct base class as if the default constructor contained a constructor initializer of base().

But is that always the case? It turns out it’s not – at least not in Roslyn. There are more interesting optional parameters we can use than just int foo = 5. Let’s have a look:

using System;
using System.Runtime.CompilerServices;

class Base
{
    public string Origin { get; }

    public Base([CallerMemberName] string name = "Unspecified",
                [CallerFilePath] string source = "Unspecified",
                [CallerLineNumber] int line = -1)
    {
        Origin = $"{name} - {source}:{line}";
    }
}

class Derived1 : Base {}
class Derived2 : Base
{
    public Derived2() {}
}
class Derived3 : Base
{
    public Derived3() : base() {}
}

class Test
{
    static void Main()
    {
        Console.WriteLine(new Derived1().Origin);
        Console.WriteLine(new Derived2().Origin);
        Console.WriteLine(new Derived3().Origin);
    }
}

The result is:

Unspecified - Unspecified:-1
Unspecified - Unspecified:-1
.ctor - c:\Users\Jon\Test\Test.cs:23

When base() is explicitly specified, that source location is treated as the “caller” for caller member info attributes. When it’s implicit (including when there’s a default constructor), no source location is made available to the Base constructor.

This is somewhat compiler-specific – and I can imagine different results where the default constructor could specify a name but not source file or line number, and the declared constructor with an implicit call could specify the name and source file but no line number.

I would never suggest using this little tidbit of Roslyn implementation trivia, but it’s fun nonetheless…

“Sideways overriding” with partial methods

First note: this blog post is very much tongue in cheek. I’m not actually planning on using the idea. But it was too fun not to share.

As anyone following my activity on GitHub may be aware, I’ve been doing quite a lot of work on Protocol Buffers recently – in particular, a mostly-new port for proto3. I’ve recently been looking at JSON support, and thinking about how to implement “overriding” ToString() for a few well-known types. I generate partial classes, so that gives me a hook to provide extra functionality. Indeed, I’m planning on using this to provide conversion methods for Timestamp and Duration, for example. However, you can’t really override anything in partial methods.

Refresher on partial methods

While partial classes were introduced in C# 2, partial methods were introduced in C# 3. The idea is that one source file (usually the generated one) can provide a partial method signature, and another source file (usually the manually-written one) can provide an implementation if it wants to. Any part of the source can call the method, and the call will be removed at compile-time if nothing provides an implementation. The fact that the method may not be there leads to some limitations:

  • Partial methods are implicitly private, but you can’t specify an access modifier explicitly
  • Partial methods are always void – they can’t return any values
  • Partial methods cannot have out parameters

(Interestingly, a partial method implementation can be an async method – but with a return type of void, which is never a nice situation to be in.)

There’s more in the spec, but the last two bullets are the important part.

So, suppose I want to override ToString() in the generated code, but provide a mechanism for that override to be “further overridden” effectively, in the manual code for the same class? How do I get the value from an “extra override”? How do I even detect whether or not it’s there?

Side effects to the rescue!

(Now there’s a phrase you never thought you’d hear from me.)

I mentioned before that if a partial method is called but no implementation is provided, the call is removed. That includes all aspects of the call – including the evaluation of the arguments. So if evaluating the argument has a side-effect… we can spot that side effect.

Next, we have to work out how to get a value back from a method. We can’t use the return value, and we can’t use an out parameter. There are two options here: we could either pass a wrapper (e.g. an array with a single element) and allow the “extra override” to populate the wrapper… or we can use a ref parameter. The latter feels ever-so-slightly cleaner to me.

And so the ugly hack is born. The code generator can always generate code like this:

partial void ToStringOverride(bool ignored, ref string value);

public override string ToString()
{
    string value = null;
    bool overridden = false;
    ToStringOverride(overridden = true, ref value);
    return overridden ? value : "Original";
}

For any partial class where the ToStringOverride method isn’t implemented, overridden will still be false, so we’ll fall back to returning "Original". (I would hope that any decent JIT would remove the overridden and value local variables entirely at that point.) Otherwise, we’ll return whatever the method has changed value to.

Here’s a short but complete example:

using System;

// Generated code
partial class UglyHack1
{
    partial void ToStringOverride(bool ignored, ref string value);

    public override string ToString()
    {
        string value = null;
        bool overridden = false;
        ToStringOverride(overridden = true, ref value);
        return overridden ? value : "Original";
    }
}

// Generated code
partial class UglyHack2
{
    partial void ToStringOverride(bool ignored, ref string value);

    public override string ToString()
    {
        string value = null;
        bool overridden = false;
        ToStringOverride(overridden = true, ref value);
        return overridden ? value : "Original";
    }
}

// Manual code
partial class UglyHack2
{
    partial void ToStringOverride(bool ignored, ref string value)
    {
        value = "Different!";
    }
}

class Test
{
    static void Main()
    {
        var g1 = new UglyHack1();
        var g2 = new UglyHack2();

        Console.WriteLine(g1); // Prints Original
        Console.WriteLine(g2); // Prints Different!
    }
}

Horribly ugly, but it works…


Obviously this isn’t really pleasant. Some alternatives:

  • Derive from the generated class in order to override ToString again. Doesn’t work with sealed classes, and will only work if clients create instances of the derived class.
  • Introduce a new interface, and allow manual code to implement it on the partial class. The ToString method can then check this is IMyOtherToString or whatever, and call it appropriately. This introduces another virtual call for no great reason, and exposes the interface to the outside world, which we may not want to do.
  • Don’t override ToString in the generated code at all. Not good if you normally want to override it.
  • Introduce an abstract base class which the generated class derives from. Override ToString() in that base class, possibly calling an abstract member which is then provided in the generated class – but allowing the manual code to override ToString() again.
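As a rough illustration of the last alternative, the sketch below uses a hypothetical GeneratedMessageBase class with an abstract ToDefaultString member. All of the type and member names here are invented for the example; the point is only that the generated code supplies the default text without overriding ToString() itself, so the manual part of the class is still free to do so.

```csharp
using System;

// Hypothetical base class shared by all generated types.
public abstract class GeneratedMessageBase
{
    // Provided by the generated code.
    protected abstract string ToDefaultString();

    public override string ToString() => ToDefaultString();
}

// Generated code: no manual part exists for this class.
public partial class PlainMessage : GeneratedMessageBase
{
    protected override string ToDefaultString() => "Original";
}

// Generated code
public partial class CustomMessage : GeneratedMessageBase
{
    protected override string ToDefaultString() => "Original";
}

// Manual code: overrides ToString() again, which the generated part never did.
public partial class CustomMessage
{
    public override string ToString() => "Different!";
}

public static class BaseClassDemo
{
    public static void Main()
    {
        Console.WriteLine(new PlainMessage());  // Prints Original
        Console.WriteLine(new CustomMessage()); // Prints Different!
    }
}
```

The cost is an extra type in the hierarchy, which may or may not be acceptable for generated code.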


Ugly hacks are fun. But it’s much better to keep them where they belong: in a blog post, not in production code.

Backwards compatibility is (still) hard

At the moment, I’m spending a fair amount of time thinking about a new version of the C# API and codegen for Protocol Buffers, as well as other APIs for interacting with Google services. While that’s the context for this post, I want to make it very clear that this is still a personal post, and should in no way be taken to be “Google’s opinion” on anything. The underlying issue could apply in many other situations, but it’s easiest to describe a concrete scenario.

Context and current state: the builder pattern

The problem I’ve been trying to address is the relative pain of initializing a protobuf message. Protocol buffer messages are declared in a separate schema file (.proto) and then code is generated. The schema declares fields, each of which has a name, a type and a number associated with it. The generated message types are immutable, with builder classes associated with them. So for example, we might start off with a message like this:

message Person {
  string first_name = 1;
  string last_name = 3;
}

And construct a Person object in C# like this:

var person = new Person.Builder { FirstName = "Jon", LastName = "Skeet" }.Build();
// Now person.FirstName and person.LastName are readonly properties

That’s not awful, but it’s not the cleanest code in the world. We can make it slightly simpler using an implicit conversion from the builder type to the message type:

Person person = new Person.Builder { FirstName = "Jon", LastName = "Skeet" };

It’s still not really clean though. Let’s revisit why the builder pattern is useful:

  • We can specify just the properties we want.
  • By deferring the “build” step until after we’ve specified everything, we get mutability without building a lot of intermediate objects.

If only there were another language construct allowing that…

Optional parameters to the rescue!

If we provided a constructor with an optional parameter for each property, we can specify just what we want. So something like:

// Generated constructor (body omitted)
public Person(string firstName = null, string lastName = null)

// Client code
var person = new Person(firstName: "Jon", lastName: "Skeet");

Hooray! That looks much nicer:

  • We can use var (if we want to) because there are no implicit conversions to confuse things.
  • We don’t need to mention a builder at all.
  • Every piece of text in the statement is something we want to express, and we only express it once.

That last point is a lovely place to be in terms of API design – while you still need to worry about naming, ordering and how the syntax fits into bigger expressions, you’ve achieved some sense of “as simple as possible, but no simpler”.

So, that’s all great – except for versioning.

Let’s just add a field at the end…

One of the aims of protocol buffers is to support an evolving schema. (The limitations are different for proto2 and proto3, but that’s a slightly different matter.) So what happens if we add a new field to the message?

message Person {
  string first_name = 1;
  string last_name = 3;
  string title = 4; // Mr, Mrs etc
}

Now we end up with the following constructor:

public Person(string firstName = null, string lastName = null, string title = null)

The code still compiles – but if we try to run our old client code against the new version of the library, it will fail – because the method it refers to no longer exists. So we have source compatibility, but not binary compatibility.
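One way to see why the old binary breaks: the compiler fills in the default values at the call site, so the compiled client refers to a constructor with exactly two string parameters. The sketch below emulates that lookup with reflection, using two hypothetical classes (PersonV1 and PersonV2) to stand in for the generated class before and after the change. It is only an approximation of what the runtime does when it resolves the call.

```csharp
using System;
using System.Reflection;

// Hypothetical stand-in for the generated class before "title" was added.
public sealed class PersonV1
{
    public PersonV1(string firstName = null, string lastName = null) { }
}

// Hypothetical stand-in for the generated class after "title" was added.
public sealed class PersonV2
{
    public PersonV2(string firstName = null, string lastName = null, string title = null) { }
}

public static class BinaryCompatibilityDemo
{
    // Looks for a public constructor whose parameters are exactly (string, string),
    // which is the signature the old client code was compiled against.
    public static bool HasTwoStringConstructor(Type type)
    {
        ConstructorInfo constructor = type.GetConstructor(new[] { typeof(string), typeof(string) });
        return constructor != null;
    }

    public static void Main()
    {
        Console.WriteLine(HasTwoStringConstructor(typeof(PersonV1))); // Prints True
        Console.WriteLine(HasTwoStringConstructor(typeof(PersonV2))); // Prints False
    }
}
```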

Let’s just add a field in the middle…

You may have noticed that I don’t have a field with tag 2 – this is not an accident. Suppose we now add it, for the obvious middle_name field:

message Person {
  string first_name = 1;
  string middle_name = 2;
  string last_name = 3;
  string title = 4; // Mr, Mrs etc
}

Regenerate the code, and we end up with a constructor with 4 parameters:

public Person(
    string firstName = null,
    string middleName = null,
    string lastName = null,
    string title = null)

Just to be clear, this change is entirely fine in protocol buffers – while normally fields are assigned incrementally, it shouldn’t be a breaking change to add a new field “between” existing ones.

Let’s take a look at our client code again:

var person = new Person(firstName: "Jon", lastName: "Skeet");

Yup, that still works – we need to recompile, but we still end up with a Person with the right properties. But that’s not the only code we could have started with. Suppose we’d actually had:

var person = new Person("Jon", "Skeet");

Until this last change that would have been fine – even after we’d added the optional title parameter, the two arguments would still have mapped to firstName and lastName respectively.

Having added the middle_name field, however, the code would still compile with no errors or warnings, but the meaning of the second argument would have changed – it would now map onto the middleName parameter instead of lastName.

Basically, we’d like to stop this code (using positional arguments) from compiling in the first place.
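The silent change in meaning is easy to reproduce. The class below is a hand-written approximation of the regenerated Person type (the real generated code would look different), with the four-parameter constructor from above:

```csharp
using System;

// Hand-written approximation of the regenerated class, for illustration only.
public sealed class Person
{
    public string FirstName { get; }
    public string MiddleName { get; }
    public string LastName { get; }
    public string Title { get; }

    public Person(
        string firstName = null,
        string middleName = null,
        string lastName = null,
        string title = null)
    {
        FirstName = firstName;
        MiddleName = middleName;
        LastName = lastName;
        Title = title;
    }
}

public static class PositionalDemo
{
    public static void Main()
    {
        // Positional arguments: the second one now binds to middleName.
        var positional = new Person("Jon", "Skeet");
        Console.WriteLine(positional.MiddleName);       // Prints Skeet
        Console.WriteLine(positional.LastName == null); // Prints True

        // Named arguments: unaffected by the new parameter.
        var named = new Person(firstName: "Jon", lastName: "Skeet");
        Console.WriteLine(named.LastName);              // Prints Skeet
    }
}
```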

Feature requests and a workaround

The two features we really want from C# here are:

  • Some way of asking the generated code to perform dynamic overload resolution at execution time… not based on dynamic values, but on the basis that the code we’re compiling against may have changed since we compiled. This resolution only needs to be performed once, on first execution (or class load, or whatever) as by the time we’re executing, everything is fixed (the parameter names and types, and the argument names and types). It could be efficient.
  • Some way of forcing any call sites to use named arguments for any optional parameters. (Even though in our case all the parameters are optional, I can easily imagine a case where there are a few required parameters and then the optional ones. Using positional arguments for those required parameters is fine.)

It’s hard (without forking Roslyn :) to implement these feature requests ourselves, but for the second one we can at least have a workaround. Consider the following struct:

public struct DoNotCallThisMethodWithPositionalArguments {}

… and now imagine our generated constructor had been:

public Person(
    DoNotCallThisMethodWithPositionalArguments ignoreMe =
        default(DoNotCallThisMethodWithPositionalArguments),
    string firstName = null,
    string middleName = null,
    string lastName = null,
    string title = null)

Now our constructor call using positional arguments will fail, because there’s no conversion from string to the crazily-named struct. The only “nice” way you can call it is to use named arguments, which is what we wanted. You could call it using positional arguments like this:

var person = new Person(
    new DoNotCallThisMethodWithPositionalArguments(),
    "Jon",
    "Skeet");

(or using default(...) like the constructor declaration) – but at this point the code looks broken, so it’s your own fault if you decide to use it.

The reason for making it a struct rather than a class is to avoid null being convertible to it. Annoyingly, it wouldn’t be hard to make a class that you could never create an actual instance of, but you can’t prevent anyone from creating a value of a struct. Basically, what we really want is a type such that there is no valid expression which is convertible to that type – but aside from static classes (which can’t be used as parameter types) I don’t know of any way of doing that. (I don’t know what would happen if you compiled the Person class using a non-static class as the first parameter, then made that class static and recompiled it. Confusion on the part of the C# compiler, I should think.)
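Putting the workaround together, a complete version might look something like the sketch below. GuardedPerson is a hypothetical, cut-down class used only for this example; the guard struct is the one declared earlier.

```csharp
using System;

public struct DoNotCallThisMethodWithPositionalArguments { }

// Hypothetical, cut-down message class for illustration only.
public sealed class GuardedPerson
{
    public string FirstName { get; }
    public string LastName { get; }

    public GuardedPerson(
        DoNotCallThisMethodWithPositionalArguments ignoreMe =
            default(DoNotCallThisMethodWithPositionalArguments),
        string firstName = null,
        string lastName = null)
    {
        FirstName = firstName;
        LastName = lastName;
    }
}

public static class GuardDemo
{
    public static void Main()
    {
        // Named arguments skip the guard parameter entirely.
        var person = new GuardedPerson(firstName: "Jon", lastName: "Skeet");
        Console.WriteLine(person.FirstName + " " + person.LastName); // Prints Jon Skeet

        // Does not compile: there is no conversion from string to the guard struct.
        // var broken = new GuardedPerson("Jon", "Skeet");

        // Does not compile either, because the guard is a struct rather than a class.
        // var alsoBroken = new GuardedPerson(null, "Jon", "Skeet");
    }
}
```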

Another option (as mentioned in the comments) is to have a “poor man’s” version of the compiler enforcement via a Roslyn Code Diagnostic – add an attribute to any method call where you want “All optional parameters must be specified with named arguments” to apply, and then make the code diagnostic complain if you disobey that. That diagnostic could ship with the Protocol Buffers NuGet package, which would make for a pretty nice experience. Not quite as good as a language feature though :)


Default parameters are a pain in terms of compatibility. For internal company code, it’s often reasonable to only care about source compatibility as you can recompile all calling code before deployment – but for open source projects, binary compatibility within the same major version is the norm.

How useful and common do I think these features would be? Probably not common enough to meet the bar – unless there’s encouragement within comments here, in which case I’m happy to file feature requests on GitHub, of course.

As it happens, I’m currently looking at radical changes to the C# implementation of Protocol Buffers, regretfully losing the immutability aspect due to it raising the barrier to entry. It’s not quite a done deal yet, but assuming that goes ahead, all of this will be mostly irrelevant – for Protocol Buffers. There are plenty of other places where code generation could be more robustly backward-compatible through judicious use of optional-but-please-use-named-arguments parameters though…

Precedence: ordering or grouping?

As I’ve mentioned before, I’m part of the technical group looking at updating the ECMA-334 C# standard to reflect the C# 5 Microsoft specification. I recently made a suggestion that I thought would be uncontroversial, but which caused some discussion – and prompted this “request for comment” post, effectively.

What does the standard say about precedence?

The current proposed standard includes the following text:

The order of evaluation of operators in an expression is determined by the precedence and associativity of the operators (§13.4.2).

Operands in an expression are evaluated from left to right.

When an expression contains multiple operators, the precedence of the operators controls the order in which the individual operators are evaluated. [Note: For example, the expression x + y * z is evaluated as x + (y * z) because the * operator has higher precedence than the binary + operator. end note]

I like the example in the note, but I’m not keen on the rest of the wording. It’s very easy to miss the difference between operands for any given expression always being evaluated left to right, and operators being evaluated in an order determined by precedence.

I’ve always thought about precedence in terms of grouping, not ordering – I think of one operator as “binding tighter” than another rather than as being “executed before” it. When I consider precedence, I mentally apply brackets to group operators and operands together explicitly. Eric Lippert has blogged along similar lines but I wouldn’t want to put words into his mouth by suggesting he agrees with me. Interestingly, he includes:

Order of evaluation rules describe the order in which each operand in an expression is evaluated.

That’s certainly true about the order of evaluation of operands in an expression, but as seen earlier precedence is also specified in terms of “order of evaluation”. To me, that’s what makes the standard confusing.

Importantly though, when I expressed this in a meeting, smarter people than me said that they thought of precedence precisely in terms of order of evaluation. Grouping was just another way of looking at it, but a sort of secondary approach.

What does “order of evaluation” even mean?

Let’s take a closer look at the wording of the standard. It’s fairly clear what “operands in an expression are evaluated from left to right” means (ignoring the possibility that the “left” operand actually occurs to the right of the “right” operand physically due to line breaks). The left operand is completely evaluated, from start to finish, before the right operand is evaluated. Great.

But what about the “order of evaluation of operators”? Here it’s trickier. Does “evaluating” a + b include first evaluating a and b? Should “order of evaluation” mean “order of starting to evaluate”? If so, the standard would actually be inaccurate. Let’s go back to the example of x + y * z. We can view that as a sequence of steps:

  1. Evaluate x.
  2. Evaluate y.
  3. Evaluate z.
  4. Multiply the results of steps 2 and 3.
  5. Add the results of steps 1 and 4.

Note that the multiplication definitely occurs before the addition. So that looks right. But if I rewrite it just a little to give more context (sorry about the bullets; it’s the only way I could get the formatting right):

  • Evaluate x + y * z
    • 1 Evaluate x
    • 2 Evaluate y * z
      • 2a Evaluate y
      • 2b Evaluate z
      • 2c Multiply the results of steps 2a and 2b; this is the result of step 2
    • 3 Add the results of steps 1 and 2; this is the result of the expression

At that point, it’s clear that we’ve started evaluating the + operator before we’ve started evaluating the * operator.
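The two views can be observed at execution time with a rough sketch like the one below. Everything in it (the Num struct, the Operand helper and the log) is hypothetical scaffolding invented for illustration; it only records the order in which operands are evaluated and in which each operator finishes its work.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper type whose operators record when they run.
public struct Num
{
    public static readonly List<string> Log = new List<string>();
    public readonly int Value;

    public Num(int value) { Value = value; }

    public static Num operator +(Num a, Num b)
    {
        Log.Add("+");
        return new Num(a.Value + b.Value);
    }

    public static Num operator *(Num a, Num b)
    {
        Log.Add("*");
        return new Num(a.Value * b.Value);
    }
}

public static class PrecedenceDemo
{
    // Records the operand name at the moment the operand is evaluated.
    static Num Operand(string name, int value)
    {
        Num.Log.Add(name);
        return new Num(value);
    }

    public static string Trace()
    {
        Num.Log.Clear();
        Num result = Operand("x", 1) + Operand("y", 2) * Operand("z", 3);
        return string.Join(" ", Num.Log) + " => " + result.Value;
    }

    public static void Main()
    {
        Console.WriteLine(Trace()); // x y z * + => 7
    }
}
```

The log shows the operands in left-to-right order and the multiplication completing before the addition, which matches the "order of completing" reading. It cannot show when evaluation of the + expression started, which is exactly the ambiguity in the wording.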

Similarly, if we view it as a tree:

    +
   / \
  x   *
     / \
    y   z

… we hit the + node before we hit the * node.

So from the “starting to evaluate” perspective, precedence appears to fall apart. The ordering only makes sense when you start talking about the operator performing its duty with the already-evaluated operands – which is tricky for ??, ?. and ?: which don’t always evaluate all their operands. I suspect that’s fixable though, with careful wording – and I’m influenced by the fact that the term “precedence” is naturally ordering-related (one thing preceding another). Maybe it’s as simple as talking about the order of completing evaluation of operands.

Non-conclusion: over to you

So, what should we do in the standard? Given the range of views on the technical group, I said I’d write this blog post and canvass opinion. Readers: how do you think about precedence? How should the standard talk about precedence? Is any aspect of the existing wording (in the C# 5 specification it’s section 7.3.1; in ECMA-334 4th edition it’s section 14.2.1) particularly helpful or confusing?

The technical group is full of very smart people – all of them smarter than me and with a deeper computer science (and/or C# compiler implementation) background. That makes me simultaneously nervous of proposing changes – but also confident in my role of “interested amateur” in that if I find something confusing, I suspect some other readers will too.

I’m in no way saying it’s wrong to think of precedence in terms of ordering – albeit with a more precise definition of ordering than we’ve got now – but I’m suggesting it’s not the most helpful way of expressing it for readers. Just to be entirely clear, I’m not suggesting any sort of semantic change – if we change the wording of the standard, it would be purely about clarification, with no behavioural change.

I had originally intended to make this blog post as “on the fence” as possible, but the more I’ve looked at it, the more I’ve reinforced my original position – I can only apologise for not being terribly even-handed. I’m very happy to be corrected though, and look forward to reading plenty of comments. Don’t be shy.

When is an identifier not an identifier? (Attack of the Mongolian Vowel Separator)

Here’s a few things you may not be aware of:

  • C# identifiers can include Unicode escape sequences (\u1234 etc)
  • C# identifiers can include Unicode characters in the category “Other, formatting” (Cf) but these are ignored when comparing identifiers for equality
  • The Mongolian Vowel Separator (U+180E) has oscillated between the Cf and Zs categories a couple of times
  • .NET has its own copy of Unicode categories, separate from whatever Win32 might provide
  • Roslyn (built in .NET) uses the .NET Unicode categories, whereas csc.exe (the “old” native C# compiler) uses either the Win32 categories or a built-in copy
  • Neither the .NET table nor the Win32 table necessarily reflects exactly what any one version of the Unicode standard says
  • Compilers can have bugs in

Put them together, and chaos ensues!

How this all started – blame Vladimir

I started looking at this based on a discussion in our ECMA technical group meeting last week, when we were considering the normative references – and in particular, which version of Unicode we were going to target. Currently the ECMA 4th edition spec targets Unicode 4.0 and the Microsoft C# 5 specification targets Unicode 3.0. It’s not clear to me whether any compilers actually take note of this, and moving forward we’d like both the ECMA and Microsoft standards to not specify a particular version of Unicode, effectively encouraging compiler authors to use the most recent one available to them. Despite the wrinkles listed below, I think that makes the most sense for real world uses – it’s crazy to require compilers to ship with their own private copies of Unicode, effectively.

When discussing this, Vladimir Reshetnikov mentioned the Mongolian Vowel Separator (U+180E) which has had an interesting life. It was introduced in Unicode 3.0.0, when it was in the Cf category (“other, formatting”). Then in Unicode 4.0.0 it was moved into the Zs (separator, space) category. In Unicode 6.3.0 it was then moved back to the Cf category.

Of course, my natural inclination was to try to abuse this. My initial aim was to come up with code which behaved differently depending on which version of Unicode the compiler was using. It turned out to be a little more complicated than that, but we’ll assume a hypothetical compiler first, with no bugs, but which obeys whichever version of the Unicode standard we want it to. (Arguably that’s already a bug given the requirements of the current C# specs, but we’ll set that aside.)

Hypothetical example 1: valid or invalid

For simplicity, let’s start with some source code which is all in ASCII:

class MvsTest
{
    static void Main()
    {
        string stringx = "a";
        string\u180ex = "b";
        System.Console.WriteLine(stringx);
    }
}

If the compiler is using Unicode 6.3 or higher (or a version earlier than 4.0) then U+180E is deemed to be in the Cf category, and is therefore valid within an identifier. In that case, it’s fine for it to be escaped as per the code above. At that point, the identifier in the second line of the method is deemed to be “the same as” stringx, so the output is b.

What about a compiler using a version of Unicode between 4.0 and 6.2 (inclusive) though? At that point, U+180E is deemed to be in the Zs category, which makes it a whitespace character. Zs characters are allowed as whitespace within C# programs – but not within identifiers. Once it’s not a valid identifier – and because this isn’t within a character/string literal – it’s invalid to use the Unicode escape sequence, so the code doesn’t compile.
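The category history that drives both outcomes can be written down as a small lookup. This is only a sketch of the hypothetical compiler's view: the MvsHistory class and CategoryIn method are names invented here, and the version boundaries are the ones quoted above (Cf from 3.0, Zs from 4.0, Cf again from 6.3).

```csharp
using System;
using System.Globalization;

public static class MvsHistory
{
    // Category of U+180E in a given Unicode version, per the history described in the text.
    public static UnicodeCategory CategoryIn(Version unicode)
    {
        if (unicode < new Version(3, 0))
        {
            return UnicodeCategory.OtherNotAssigned; // the character had not been introduced yet
        }
        bool isSpace = unicode >= new Version(4, 0) && unicode < new Version(6, 3);
        return isSpace ? UnicodeCategory.SpaceSeparator : UnicodeCategory.Format;
    }

    public static void Main()
    {
        foreach (string v in new[] { "3.0", "4.0", "6.2", "6.3" })
        {
            Console.WriteLine("{0}: {1}", v, CategoryIn(new Version(v)));
        }
    }
}
```

A compiler seeing Format would accept the escaped identifier; one seeing SpaceSeparator would reject it.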

Hypothetical example 2: valid two different ways

We can write the same code without using an escape sequence, however. If you create a regular ASCII file like this:

class MvsTest
{
    static void Main()
    {
        string stringx = "a";
        stringAAAx = "b";
        System.Console.WriteLine(stringx);
    }
}

then open it up in a hex editor and replace the AAA with bytes E1 A0 8E, then you’ve got a file containing the UTF-8 representation of U+180E at the same location as we had the Unicode escape sequence in the first version.

So, a compiler which compiled the first version would still compile this version (assuming you could tell it that the source code was UTF-8), and the results would be the same – it would print b as the second statement of the method would be a simple assignment to the existing variable.

However, a compiler which treats U+180E as whitespace and would therefore treat the first program as an error would accept this program – and treat that second statement as a declaration of a second local variable (x) and assign it an initial value. You might get a warning about the variable being unused, but it’s valid C# and the output is a.
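If you want to double-check the bytes used in the hex editor step, a short sketch like this prints the UTF-8 form of any string. The MvsBytes class and Utf8Hex helper are illustrative names, not part of any library.

```csharp
using System;
using System.Text;

public static class MvsBytes
{
    // Returns the UTF-8 bytes of the text as space-separated hex.
    public static string Utf8Hex(string text)
    {
        return BitConverter.ToString(Encoding.UTF8.GetBytes(text)).Replace("-", " ");
    }

    public static void Main()
    {
        Console.WriteLine(Utf8Hex("\u180e")); // E1 A0 8E
    }
}
```

U+180E needs three bytes in UTF-8, which is why a three-character placeholder such as AAA can be overwritten without changing the file length.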

Reality: the Microsoft compilers

Whenever we talk about the Microsoft C# compiler these days, we need to distinguish between the “native” compiler (csc) and Roslyn (rcsc, although typically I just call it Roslyn).

As it’s written in native code, csc uses whatever Windows supplies for its Unicode character tables – or it embeds it directly in the executable, potentially. (I’ve been scouring MSDN to find a Win32 native function to tell me the Unicode category of a specific code point, and failed so far. It would have been useful…)

Compare that with Roslyn, which is written in C# and (as far as I’m aware) uses char.GetUnicodeCategory – which in turn uses the Unicode tables built into mscorlib.

My experiments suggest that whatever the native compiler uses to get the Unicode category has treated U+180E as a formatting character forever. At least, I’ve tried to find old machines (including VM images) which haven’t had Windows update applied since September 2013 (which is when Unicode 6.3 was published) and they all compile the first program listed above. I’m beginning to suspect that csc might actually have a copy of Unicode 3.0 built into it; it certainly treats U+180E as a formatting character, but doesn’t like either U+0600 or U+00AD within identifiers. (U+0600 wasn’t introduced until Unicode 4.0, but has always been a formatting character; U+00AD was a “dash punctuation” character in Unicode 3.0, and became a formatting character in 4.0.)

The table built into mscorlib has definitely changed over time, however. If you run a simple program such as this:

using System;

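class Test
{
    static void Main()
    {
        // Print the category the BCL table assigns to the Mongolian Vowel Separator.
        Console.WriteLine(char.GetUnicodeCategory('\u180e'));
    }
}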

then running under CLRv2, the result is “SpaceSeparator” whereas running on CLRv4 (at least on a recently-updated system), the result is “Format”.

Of course, Roslyn won’t run under old CLRs, but we have hope by way of – which runs Roslyn in an environment (of uncertain origin – Mono? I’m unsure) which prints “SpaceSeparator” for the above. Sure enough the first program fails to compile – but it’s harder to check the second program, as doesn’t allow you to upload source code, and copy/paste produces some odd results.

Reality: mcs (Mono C# compiler)

Mono’s compiler uses the BCL GetUnicodeCategory code too, which should make it significantly simpler to experiment – but unfortunately, the Mono parser has (at least) two bugs in it:

  • It will allow any Unicode escape sequence in an identifier, whether it’s an escape sequence for a valid identifier part or not. For example, string\u0020x = "" is valid under the Mono compiler. Filed as bug 24968. Source.
  • It doesn’t allow formatting characters within identifiers – it includes characters in classes Mn, Mc, Nd and Pc, but not Cf. Filed as bug 24969. Source.

For this reason, the first program always compiles and prints “b” whereas the second program always fails to compile, regardless of whether U+180E is treated as being in Zs or Cf.

What version is this, anyway?

Next, let’s think about the Unicode data itself. It’s not at all clear which version any particular BCL implementation is actually using. Consider this little program:

using System;

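class Test
{
    static void Main()
    {
        // One line of output per character discussed below.
        Console.WriteLine(char.GetUnicodeCategory('\u00ad'));
        Console.WriteLine(char.GetUnicodeCategory('\u0600'));
        Console.WriteLine(char.GetUnicodeCategory('\u180e'));
    }
}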

On my computer, under CLR v4 this prints “DashPunctuation, Format, Format”, and under both Mono (3.3.0) and CLR v2 it prints “DashPunctuation, Format, SpaceSeparator”.

That’s very odd. It doesn’t correspond with any version of the Unicode standard, as far as I can tell:

  • U+00AD was a Po (other, punctuation) character in Unicode 1.x, then Pd (dash, punctuation) in 2.x and 3.x, and from 4.0 onwards has been Cf.
  • U+0600 was only introduced in Unicode 4.0, and has always been Cf
  • U+180E was introduced as Cf in 3.0, then changed to Zs in 4.0, then back to Cf in 6.3.

So there is no version where the first line agrees with either the second line or the third line. I’m basically a bit baffled by this.

What about nameof and CallerMemberName?

The names of identifiers aren’t only used for comparisons – they’re available as strings without any reflection being involved at all. From C# 5, we’ve had the CallerMemberName attribute, allowing things like:

public static void X\u0600y()
{
    ShowCaller();
}

public static void ShowCaller([CallerMemberName] string caller = null)
{
    Console.WriteLine("Called by {0}", caller);
}

And in C# 6, we can write:

string x\u0600y = "";
Console.WriteLine("nameof = {0}", nameof(x\u0600y));

What should those print? They do just print “Xy” and “xy” as the names (respectively), as if the compiler has simply thrown away the formatting character entirely. But what should they print? Bear in mind that in the second case, we could easily have used nameof(xy) and that would still have compared equal to the declared identifier.
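The observed behaviour is roughly what you would get from dropping every Cf character before using the name. The sketch below imitates that at execution time; it is not the compiler's code, the IdentifierNames class and StripFormatting method are invented names, and it assumes the formatting characters are all in the Basic Multilingual Plane (it inspects one UTF-16 code unit at a time) and that the BCL table classifies them as Format.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class IdentifierNames
{
    // Removes characters the BCL classifies as "Other, formatting" (Cf).
    public static string StripFormatting(string identifier)
    {
        var builder = new StringBuilder();
        foreach (char c in identifier)
        {
            if (char.GetUnicodeCategory(c) != UnicodeCategory.Format)
            {
                builder.Append(c);
            }
        }
        return builder.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(StripFormatting("x\u0600y"));
        // Two spellings count as "equal" identifiers if they match once Cf characters are ignored.
        Console.WriteLine(StripFormatting("x\u0600y") == StripFormatting("xy"));
    }
}
```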

We can’t even say “What’s the name of the member being declared?” because you can overload with “different but equal” identifiers:

public static void Xy() {}
public static void X\u0600y() {}
public static void X\u070fy() {}

What should that print out? I’m sure you’ll be relieved to hear that the C# team has a plan in place – but fundamentally this is one of these “no obvious right answer” scenarios. It gets even weirder when you bring the CLI specification into the mix. Section I.8.5.1 of ECMA-335 6th edition has:

Assemblies shall follow Annex 7 of Technical Report 15 of the Unicode Standard 3.0 governing the set of characters permitted to start and be included in identifiers, available online at Identifiers shall be in the canonical format defined by Unicode Normalization Form C. For CLS purposes, two identifiers are the same if their lowercase mappings (as specified by the Unicode locale-insensitive, one-to-one lowercase mappings) are the same. That is, for two identifiers to be considered different under the CLS they shall differ in more than simply their case. However, in order to override an inherited definition the CLI requires the precise encoding of the original declaration be used.

I would love to explore the impact of this by adding a Cf character into IL, but unfortunately I haven’t worked out a way of affecting the encoding of ilasm, in order to persuade it that my hacked up IL is what I want it to be.


As noted before, text is hard.

It turns out that even when restricting oneself to identifiers, text is hard. Who would’ve thought?

When is a string not a string?

As part of my “work” on the ECMA-334 TC49-TG2 technical group, standardizing C# 5 (which will probably be completed long after C# 6 is out… but it’s a start!) I’ve had the pleasure of being exposed to some of the interesting ways in which Vladimir Reshetnikov has tortured C#. This post highlights one of the issues he’s raised. As usual, it will probably never impact 99.999% of C# developers… but it’s a lovely little problem to look at.

Relevant specifications referenced in this post:

  • The Unicode Standard, version 7.0.0 – in particular, chapter 3
  • C# 5 (Word document)
  • ECMA-335 (CLI specification)

What is a string?

How would you define the string (or System.String) type? I can imagine a number of responses to that question, from vague to pretty specific, and not all well-defined:

  • “Some text”
  • A sequence of characters
  • A sequence of Unicode characters
  • A sequence of 16-bit characters
  • A sequence of UTF-16 code units

The last of these is correct. The C# 5 specification (section 1.3) states:

Character and string processing in C# uses Unicode encoding. The char type represents a UTF-16 code unit, and the string type represents a sequence of UTF-16 code units.

So far, so good. But that’s C#. What about IL? What does that use, and does it matter? It turns out that it does… Strings need to be represented in IL as constants, and the nature of that representation is important, not only in terms of the encoding used, but how the encoded data is interpreted. In particular, a sequence of UTF-16 code units isn’t always representable as a sequence of UTF-8 code units.

I feel ill (formed)

Consider the C# string literal "X\uD800Y". That is a string consisting of three UTF-16 code units:

  • 0x0058 – ‘X’
  • 0xD800 – High surrogate
  • 0x0059 – ‘Y’

That’s fine as a string – it’s even a Unicode string according to the spec (item D80). However, it’s ill-formed (item D84). That’s because the UTF-16 code unit 0xD800 doesn’t map to a Unicode scalar value (item D76) – the set of Unicode scalar values explicitly excludes the high/low surrogate code points.

Just in case you’re new to surrogate pairs: UTF-16 only deals in 16-bit code units, which means it can’t cope with the whole of Unicode (which ranges from U+0000 to U+10FFFF inclusive). If you want to represent a value greater than U+FFFF in UTF-16, you need to use two UTF-16 code units: a high surrogate (in the range 0xD800 to 0xDBFF) followed by a low surrogate (in the range 0xDC00 to 0xDFFF). So a high surrogate on its own makes no sense. It’s a valid UTF-16 code unit in itself, but it only has meaning when followed by a low surrogate.
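The arithmetic behind a surrogate pair is short enough to sketch. The Surrogates class and Encode method below are illustrative names; char.ConvertFromUtf32 is the framework method that does the same job, and is used here only as a cross-check.

```csharp
using System;

public static class Surrogates
{
    // Splits a code point above U+FFFF into a high and a low surrogate.
    public static string Encode(int codePoint)
    {
        if (codePoint < 0x10000 || codePoint > 0x10FFFF)
        {
            throw new ArgumentOutOfRangeException("codePoint");
        }
        int offset = codePoint - 0x10000;              // leaves a 20-bit value
        char high = (char)(0xD800 + (offset >> 10));   // top 10 bits
        char low = (char)(0xDC00 + (offset & 0x3FF));  // bottom 10 bits
        return new string(new[] { high, low });
    }

    public static void Main()
    {
        string pair = Encode(0x10000);
        Console.WriteLine("{0:x4} {1:x4}", (int)pair[0], (int)pair[1]); // d800 dc00
        Console.WriteLine(pair == char.ConvertFromUtf32(0x10000));      // True
    }
}
```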

Show me some code!

So what does this have to do with C#? Well, string constants have to be represented in IL somehow. As it happens, there are two different representations: most of the time, UTF-16 is used, but attribute constructor arguments use UTF-8.

Let’s take an example:

using System;
using System.ComponentModel;
using System.Text;
using System.Linq;

// The same ill-formed constant is used as an attribute argument and as a field.
[Description(Value)]
class Test
{
    const string Value = "X\ud800Y";

    static void Main()
    {
        var description = (DescriptionAttribute)
            typeof(Test).GetCustomAttributes(typeof(DescriptionAttribute), true)[0];
        DumpString("Attribute", description.Description);
        DumpString("Constant", Value);
    }

    static void DumpString(string name, string text)
    {
        var utf16 = text.Select(c => ((uint) c).ToString("x4"));
        Console.WriteLine("{0}: {1}", name, string.Join(" ", utf16));
    }
}

The output of this code (under .NET) is:

Attribute: 0058 fffd fffd 0059
Constant: 0058 d800 0059

As you can see, the “constant” (Test.Value) has been preserved as a sequence of UTF-16 code units, but the attribute property has U+FFFD (the Unicode replacement character which is used to indicate broken data when decoding binary to text). Let’s dig a little deeper and look at the IL for the attribute and the constant:

.custom instance void [System]System.ComponentModel.DescriptionAttribute::.ctor(string)
= ( 01 00 05 58 ED A0 80 59 00 00 )

.field private static literal string Value
= bytearray (58 00 00 D8 59 00 )

The format of the constant (Value) is really simple – it’s just little-endian UTF-16. The format of the attribute is specified in ECMA-335 section II.23.3. Here, the meaning is:

  • Prolog (01 00)
  • Fixed arguments (for specified constructor signature)
    • 05 58 ED A0 80 59 (a single string argument as a SerString)
      • 05 (the length, i.e. 5, as a PackedLen)
      • 58 ED A0 80 59 (the UTF-8-encoded form of the string)
  • Number of named arguments (00 00)
  • Named arguments (there aren’t any)
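As a rough sketch of that layout, the following walks the ten bytes shown in the IL. It is deliberately limited: the AttributeBlob class and its method are invented for illustration, and it only handles a blob with exactly one string argument, a single-byte PackedLen (lengths below 0x80) and no named arguments.

```csharp
using System;

public static class AttributeBlob
{
    // Returns the raw (supposedly UTF-8) bytes of the single string argument.
    public static byte[] ReadSingleStringArgument(byte[] blob)
    {
        if (blob[0] != 0x01 || blob[1] != 0x00)
        {
            throw new FormatException("Missing prolog");
        }
        int length = blob[2];
        if (length >= 0x80)
        {
            // Longer PackedLen forms and the null-string marker are out of scope here.
            throw new NotSupportedException("Only single-byte lengths are handled");
        }
        byte[] text = new byte[length];
        Array.Copy(blob, 3, text, 0, length);

        // The named argument count is a 2-byte little-endian value after the fixed arguments.
        int namedArguments = blob[3 + length] | (blob[4 + length] << 8);
        if (namedArguments != 0)
        {
            throw new NotSupportedException("Named arguments are not handled");
        }
        return text;
    }

    public static void Main()
    {
        byte[] blob = { 0x01, 0x00, 0x05, 0x58, 0xED, 0xA0, 0x80, 0x59, 0x00, 0x00 };
        Console.WriteLine(BitConverter.ToString(ReadSingleStringArgument(blob))); // 58-ED-A0-80-59
    }
}
```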

The interesting part is the “UTF-8-encoded form of the string” here. It’s not valid UTF-8, because the input isn’t a well-formed string. The compiler has taken the high surrogate, determined that there isn’t a low surrogate after it, and just treated it as a value to be encoded in the normal UTF-8 way of encoding anything in the range U+0800 to U+FFFF inclusive.
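The "normal" three-byte pattern is easy to reproduce by hand. This sketch applies it blindly to a single UTF-16 code unit, as the compiler output above suggests happened; the NaiveUtf8 class and EncodeThreeByte method are illustrative names, and a conforming encoder would refuse the surrogate range.

```csharp
using System;

public static class NaiveUtf8
{
    // Applies the 1110xxxx 10xxxxxx 10xxxxxx pattern without checking for surrogates.
    public static byte[] EncodeThreeByte(char codeUnit)
    {
        if (codeUnit < 0x0800)
        {
            throw new ArgumentOutOfRangeException("codeUnit");
        }
        return new[]
        {
            (byte)(0xE0 | (codeUnit >> 12)),         // top 4 bits
            (byte)(0x80 | ((codeUnit >> 6) & 0x3F)), // middle 6 bits
            (byte)(0x80 | (codeUnit & 0x3F))         // bottom 6 bits
        };
    }

    public static void Main()
    {
        Console.WriteLine(BitConverter.ToString(EncodeThreeByte('\ud800'))); // ED-A0-80
        Console.WriteLine(BitConverter.ToString(EncodeThreeByte('\u20ac'))); // E2-82-AC
    }
}
```

For an ordinary character such as U+20AC the result is valid UTF-8; for the lone high surrogate it produces the ED A0 80 sequence seen in the attribute blob.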

It’s worth noting that if we had a full surrogate pair, UTF-8 would encode the single Unicode scalar value being represented, using 4 bytes. For example, if we change the declaration of Value to:

const string Value = "X\ud800\udc00Y";

then the UTF-8 bytes in the IL are 58 F0 90 80 80 59 – where F0 90 80 80 is the UTF-8 encoding for U+10000. That’s a well-formed string, and we get the same value for both the description attribute and the constant.

So in our original example, the string constant (encoded as UTF-16 in the IL) is just decoded without checking whether or not it’s ill-formed, whereas the attribute argument (encoded as UTF-8) is decoded with extra validation, which detects the ill-formed code unit sequence and replaces it.

Encoding behaviour

So which approach is right? According to the Unicode specification (item C10) both could be fine:

When a process interprets a code unit sequence which purports to be in a Unicode character encoding form, it shall treat ill-formed code unit sequences as an error condition and shall not interpret such sequences as characters.


Conformant processes cannot interpret ill-formed code unit sequences. However, the conformance clauses do not prevent processes from operating on code unit sequences that do not purport to be in a Unicode character encoding form. For example, for performance reasons a low-level string operation may simply operate directly on code units, without interpreting them as characters. See, especially, the discussion under D89.

It’s not at all clear to me whether either the attribute argument or the constant value “purports to be in a Unicode character encoding form”. In my experience, very few pieces of documentation or specification are clear about whether they expect a piece of text to be well-formed or not.

Additionally, System.Text.Encoding implementations can often be configured to determine how they behave when encoding or decoding ill-formed data. For example, Encoding.UTF8.GetBytes(Value) returns byte sequence 58 EF BF BD 59 – in other words, it spots the bad data and replaces it with U+FFFD as part of the encoding… so decoding this value will result in X U+FFFD Y with no problems. On the other hand, if you use new UTF8Encoding(true, true).GetBytes(Value), an exception will be thrown. The first constructor argument is whether or not to emit a byte order mark under certain circumstances; the second one is what dictates the encoding behaviour in the face of invalid data, along with the EncoderFallback and DecoderFallback properties.
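Both behaviours are easy to see side by side. In this sketch the EncodingBehaviour class and its two helper methods are invented names; the framework calls are the ones mentioned above.

```csharp
using System;
using System.Text;

public static class EncodingBehaviour
{
    const string Value = "X\ud800Y";

    // Default UTF-8 encoder: replaces the lone surrogate with U+FFFD.
    public static string Lenient()
    {
        return BitConverter.ToString(Encoding.UTF8.GetBytes(Value));
    }

    // Encoder constructed with throwOnInvalidBytes = true: refuses the ill-formed input.
    public static bool StrictThrows()
    {
        try
        {
            new UTF8Encoding(true, true).GetBytes(Value);
            return false;
        }
        catch (ArgumentException) // EncoderFallbackException derives from ArgumentException
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(Lenient());      // 58-EF-BF-BD-59
        Console.WriteLine(StrictThrows()); // True
    }
}
```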

Language behaviour

So should this compile at all? Well, the language specification doesn’t currently prohibit it – but specifications can be changed :)

In fact, both csc and Roslyn do prohibit the use of ill-formed strings with certain attributes. For example, with DllImportAttribute:

[DllImport(Value)]
static extern void Foo();

This gives an error when Value is ill-formed:

error CS0591: Invalid value for argument to 'DllImport' attribute

There may be other attributes this is applied to as well; I’m not sure.

If we take it as read that the ill-formed value won’t be decoded back to its original form when the attribute is instantiated, I think it would be entirely reasonable to make it a compile-time failure – for attributes. (This is assuming that the runtime behaviour can’t be changed to just propagate the ill-formed string.)

What about the constant value though? Should that be allowed? Can it serve any purpose? Well, the precise value I’ve given is probably not terribly helpful – but it could make sense to have a string constant which ends with a high surrogate or starts with a low surrogate… because it can then be combined with another string to form a well-formed UTF-16 string. Of course, you should be very careful about this sort of thing – read the Unicode Technical Report 36 “Security Considerations” for some thoroughly alarming possibilities.


One interesting aspect to all of this is that “string encoding arithmetic” doesn’t behave as you might expect it to. For example, consider this method:

// Bad code!
string SplitEncodeDecodeAndRecombine
    (string input, int splitPoint, Encoding encoding)
{
    byte[] firstPart = encoding.GetBytes(input.Substring(0, splitPoint));
    byte[] secondPart = encoding.GetBytes(input.Substring(splitPoint));
    return encoding.GetString(firstPart) + encoding.GetString(secondPart);
}

You might expect that this would be a no-op so long as everything is non-null and splitPoint is within range… but if you happen to split in the middle of a surrogate pair, it’s not going to be happy. There may well be other potential problems lurking there, depending on things like normalization form – I don’t think so, but at this point I’m unwilling to bet too heavily on string behaviour.

If you think the above code is unrealistic, just imagine partitioning a large body of text, whether that’s across network packets, files, or whatever. You might feel clever for realizing that without a bit of care you’d get binary data split between UTF-16 code units… but even handling that doesn’t save you. Yikes.
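Here is a small sketch of the failure, along with one possible mitigation: nudging the split point back when it would land between a high and a low surrogate. The SplitDemo class and SafeSplitPoint helper are hypothetical, and the helper only addresses surrogate pairs; it makes no attempt to keep combining sequences or other multi-character units together.

```csharp
using System;
using System.Text;

public static class SplitDemo
{
    // Same logic as the "bad code" above.
    public static string Recombine(string input, int splitPoint, Encoding encoding)
    {
        byte[] first = encoding.GetBytes(input.Substring(0, splitPoint));
        byte[] second = encoding.GetBytes(input.Substring(splitPoint));
        return encoding.GetString(first) + encoding.GetString(second);
    }

    // Moves the split point back by one if it would separate a surrogate pair.
    public static int SafeSplitPoint(string input, int splitPoint)
    {
        bool splitsPair = splitPoint > 0 && splitPoint < input.Length
            && char.IsHighSurrogate(input[splitPoint - 1])
            && char.IsLowSurrogate(input[splitPoint]);
        return splitsPair ? splitPoint - 1 : splitPoint;
    }

    public static void Main()
    {
        string input = "X\ud83d\ude00Y"; // X, U+1F600 as a surrogate pair, Y
        Console.WriteLine(Recombine(input, 2, Encoding.UTF8) == input);                        // False
        Console.WriteLine(Recombine(input, SafeSplitPoint(input, 2), Encoding.UTF8) == input); // True
    }
}
```

With the naive split, each half contains a lone surrogate, so each half is encoded with a replacement character and the original text is lost.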

I’m tempted to swear off text data entirely at this point. Floating point is a nightmare, dates and times… well, you know my feelings about those. I wonder what projects are available that only need to deal with integers, and where all operations are guaranteed not to overflow. Let me know if you have any.


Text is hard.