Writing is hard work: what I’ve been up to recently…

Just a brief note to explain what I’ve been up to recently (and why I’ve got about four fun blog posts which I haven’t had time to write up yet). I’m wildly pleased to say that I’m currently writing a C# book for Manning (the same folks who published Groovy in Action).

I can’t give any more details at the moment, but hopefully as we get closer to publication I’ll be able to say more about not just the content but also the writing process and anything interesting I’ve discovered along the way. (Heck, there’s never enough room for everything you might want to include in a book – there’ll no doubt be plenty of left-overs to go round :)

Anyway, it’s hard work but incredibly rewarding. 28-hour days would be really welcome right now, admittedly, but the buzz is fantastic.

Non-volatile reads and Interlocked, and how they interact

Update! I screwed up the code, accidentally making all of the combinations possible. The new code is now in the post – the first few comments are based on the original code.

Recently (May 2007) there’s been a debate on the microsoft.public.dotnet.framework newsgroup about the memory model, non-volatile variables, the Interlocked class, and how they all interact. Consider the following program:

using System;
using System.Threading;

class Program
{
    int x;
    int y;
    
    void Run()
    {
        ThreadStart increment = IncrementVariables;
        new Thread(increment).Start();
        
        int a = x;
        int b = y;
        
        Console.WriteLine ("a={0}, b={1}", a, b);
    }
    
    void IncrementVariables()
    {
        Interlocked.Increment(ref y);
        Interlocked.Increment(ref x);
    }
    
    static void Main()
    {
        new Program().Run();
    }
}

The basic idea is that two variables are read at “roughly the same time” as they’re being incremented on another thread. The increments are each performed using Interlocked.Increment, which introduces a memory barrier – but the variables are read directly, and they’re not volatile. The question is what the program can legitimately print. I’ve put the reads into completely separate statements so that it’s crystal clear what order the IL will put them in. That unfortunately introduces two extra variables, a and b – think of them as “the value of x that I read” and “the value of y that I read” respectively.
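A single run only ever prints one result, so to get a feel for which results turn up in practice you need to repeat the experiment. Here's a minimal sketch of a harness that does that. The names Experiment, RunOnce, Harness and Tally are mine (purely hypothetical), and I've added a Join so that each incrementing thread finishes before the next run starts. It proves nothing about what the memory model allows: never seeing a particular result on one machine doesn't make that result illegal.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

class Experiment
{
    int x;
    int y;

    // Same shape as Program.Run, but returns the result instead of printing it.
    public string RunOnce()
    {
        ThreadStart increment = IncrementVariables;
        Thread thread = new Thread(increment);
        thread.Start();

        int a = x;
        int b = y;

        // Not in the original program: wait for the incrementing thread to finish.
        thread.Join();
        return "a=" + a + ", b=" + b;
    }

    void IncrementVariables()
    {
        Interlocked.Increment(ref y);
        Interlocked.Increment(ref x);
    }
}

static class Harness
{
    // Hypothetical helper: runs the experiment repeatedly and counts each result.
    public static Dictionary<string, int> Tally(int iterations)
    {
        Dictionary<string, int> counts = new Dictionary<string, int>();
        for (int i = 0; i < iterations; i++)
        {
            string outcome = new Experiment().RunOnce();
            int seen;
            counts.TryGetValue(outcome, out seen); // seen is 0 for a new result
            counts[outcome] = seen + 1;
        }
        return counts;
    }

    static void Main()
    {
        foreach (KeyValuePair<string, int> pair in Tally(1000))
        {
            Console.WriteLine("{0}: {1}", pair.Key, pair.Value);
        }
    }
}
```

Each run uses a fresh Experiment instance so that x and y start at zero every time.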

Let’s consider the obvious possible values first:

a=0, b=0

This is very straightforward – the variables are read before the incrementing thread has got going.

a=0, b=1

This time, we read the value of x (and copied the value into a) before the incrementing thread got as far as incrementing x, but by the time we read the value of y it had already been incremented.

a=1, b=1

This time, the incrementing thread does all its work before we get round to reading either of the variables.

So far, so good. The last possibility is the tricky one:

a=1, b=0

This would, at first sight, appear to be impossible. We increment y before we increment x, and we read x before we read y – don’t we? That should prevent this situation.

My contention is that there’s nothing to prevent the JIT from reordering the reads of x and y, effectively turning the middle bit of the code into this:

int b = y;
int a = x;
        
Console.WriteLine ("a={0}, b={1}", a, b);

Now that code could obviously show “a=1, b=0” by reading y before the increments took place and x afterwards.
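To see the difference the read order makes, here's a toy model that enumerates every interleaving of the two threads. It's a sketch under a big assumption: each read and each increment is treated as a single indivisible step, and the steps of each thread happen in the order given. There is no real threading involved, and the names Interleavings, Outcomes and Explore are hypothetical.

```csharp
using System;
using System.Collections.Generic;

static class Interleavings
{
    // Each step is one character:
    //   'X' = main thread reads x into a    'Y' = main thread reads y into b
    //   'y' = Interlocked.Increment(ref y)  'x' = Interlocked.Increment(ref x)
    public static SortedSet<string> Outcomes(string reads)
    {
        SortedSet<string> results = new SortedSet<string>();
        // The incrementing thread always does y first, then x.
        Explore(reads, "yx", 0, 0, 0, 0, 0, 0, results);
        return results;
    }

    // Tries every way of merging the two step sequences, keeping each thread's own order.
    static void Explore(string reads, string incs, int r, int i,
                        int x, int y, int a, int b, SortedSet<string> results)
    {
        if (r == reads.Length && i == incs.Length)
        {
            results.Add("a=" + a + ", b=" + b);
            return;
        }
        if (r < reads.Length)
        {
            // The main thread takes its next step: copy the current x or y.
            if (reads[r] == 'X') Explore(reads, incs, r + 1, i, x, y, x, b, results);
            else Explore(reads, incs, r + 1, i, x, y, a, y, results);
        }
        if (i < incs.Length)
        {
            // The incrementing thread takes its next step.
            if (incs[i] == 'x') Explore(reads, incs, r, i + 1, x + 1, y, a, b, results);
            else Explore(reads, incs, r, i + 1, x, y + 1, a, b, results);
        }
    }

    static void Main()
    {
        Console.WriteLine("Reads in program order: " + string.Join(" | ", Outcomes("XY")));
        Console.WriteLine("Reads swapped:          " + string.Join(" | ", Outcomes("YX")));
    }
}
```

In this model, reading x and then y gives only the three "obvious" results, whereas reading y and then x gives all four, including a=1, b=0.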

The suggestion in the discussion was that the CLR had to honour the interlocked contract by effectively treating all access to these variables as volatile, because they’d been used elsewhere in an Interlocked call. I maintain that’s not only counter-intuitive, but would also require (in the case of public variables) all assemblies which might possibly use Interlocked with the variables to be scanned, which seems infeasible to me.
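If the reading code does care about the ordering, it can say so itself rather than relying on the CLR to infer it from an Interlocked call elsewhere. Here's a sketch, assuming .NET 4.5 or later for Volatile.Read (which didn't exist when this debate took place; Thread.VolatileRead or volatile fields were the options then). The class name OrderedReads and the RunOnce method are hypothetical. As I understand the documented acquire semantics, the read of y can't be moved before the volatile read of x, so a=1, b=0 shouldn't be possible here.

```csharp
using System;
using System.Threading;

class OrderedReads
{
    int x;
    int y;

    // Returns true if this run produced the contested result a=1, b=0.
    public bool RunOnce()
    {
        ThreadStart increment = IncrementVariables;
        Thread thread = new Thread(increment);
        thread.Start();

        // Volatile.Read has acquire semantics: later reads can't move before it,
        // so y is read only after x has been read.
        int a = Volatile.Read(ref x);
        int b = Volatile.Read(ref y);

        thread.Join();
        return a == 1 && b == 0;
    }

    void IncrementVariables()
    {
        Interlocked.Increment(ref y);
        Interlocked.Increment(ref x);
    }

    static void Main()
    {
        int contested = 0;
        for (int i = 0; i < 1000; i++)
        {
            if (new OrderedReads().RunOnce()) contested++;
        }
        Console.WriteLine("a=1, b=0 seen {0} times", contested);
    }
}
```

The point of the sketch is that the ordering requirement sits with the code doing the reads, so nothing needs to know which other assemblies might use Interlocked on the same variables.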

So, what do you all think? I’ll be mailing Joe Duffy to see if he can give his somewhat more expert opinion…