A while ago, I was directed to a disturbing (in my view) post on GrantRi’s blog, to do with the .NET memory model. I’m slightly less disturbed having read his clarification, but there’s a fairly deep problem here. Here’s part of a sample class:
string name;

public void WriteNameLength()
{
    string localName = name;
    if (localName != null)
    {
        Console.WriteLine (localName.Length);
    }
}
Now, other threads may be changing the value of name all over the place, and there’s an issue in WriteNameLength in terms of whether or not it shows the “latest” value, but my question is: can the above throw a NullReferenceException?
It looks like it can’t, because even if name becomes null during the method, surely the value of localName can’t change – it’s either null or it’s not, and we don’t try to dereference it if it’s null.
Unfortunately, it looks from Grant’s blog post as if a JIT should be free to treat the above as:
public void WriteNameLength()
{
    if (name != null)
    {
        Console.WriteLine (name.Length);
    }
}
Now the above clearly can throw an exception, if name becomes null in another thread after the “if” and before the dereference (and if that change is noticed by the thread running WriteNameLength).
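To make the race concrete, here’s a hypothetical two-thread harness (the class name, loop counts and values are all invented for illustration). Whether it ever actually throws depends entirely on whether the JIT performs the rewrite shown above:

using System;
using System.Threading;

class Demo
{
    string name = "hello";

    // Same shape as the method above: takes a local copy before the null check.
    public void WriteNameLength()
    {
        string localName = name;
        if (localName != null)
        {
            Console.WriteLine (localName.Length);
        }
    }

    static void Main()
    {
        Demo demo = new Demo();

        // One thread repeatedly flips the field between null and non-null...
        Thread writer = new Thread(delegate()
        {
            for (int i = 0; i < 100000; i++)
            {
                demo.name = (i % 2 == 0) ? null : "hello";
            }
        });

        // ...while another repeatedly runs the method. If the JIT re-reads
        // 'name' after the null check, this can throw NullReferenceException.
        Thread reader = new Thread(delegate()
        {
            for (int i = 0; i < 100000; i++)
            {
                demo.WriteNameLength();
            }
        });

        writer.Start();
        reader.Start();
        writer.Join();
        reader.Join();
    }
}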
This surprises me – just as it surprised lots of people reading Grant’s blog. It surprised me so much that I checked the CLI specification, and couldn’t work out whether it was correct or not. This is even more worrying – so I mailed Grant, and his (very speedy) reply was along the lines of “I’m not an expert, but it looks to me like the spec is too vague to say for sure whether this is legitimate or not.” (I apologise if I’ve misrepresented the reply – in some ways it doesn’t matter though.)
When trying to write performant, reliable systems, it is surely crucial to have a memory model specification which can be reasoned about. The Java memory model was reasonably well defined before 1.5, and then (after years of detailed discussion) it was updated in a way which I believe was designed to give backward compatibility but lay out very clear rules. Surely the CLI deserves a specification with a similar level of detail – one which both JIT developers and application developers can use to make sure that there are no surprises amongst informed individuals. (There will always be people who write multi-threaded programs while remaining blissfully unaware of the importance of a memory model. It’s very hard to cater for them without crippling JIT optimisation, effectively synchronising all the time. I’m not too worried about that.)
Usually, when I’m writing multi-threaded code, I err on the side of caution – I tend to use locks when I could get away with volatile variables, for instance, just because I need to think slightly less hard to make sure everything’s correct (there’s a sketch of that trade-off below). There are people for whom that’s just not good enough – their performance requirements make every context switch, every locking operation, every optimisation restriction valuable enough to really need to know the details of the memory model. There should be an effort on the part of MS and/or the ECMA committee to clearly and specifically define what the CLI memory model does and doesn’t guarantee. I doubt that anyone reading this blog is in a position to instigate such an effort – but if you are, please give it careful consideration.
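As a postscript, here’s a minimal sketch of the locks-versus-volatile trade-off mentioned above, with invented type and member names – not a definitive pattern, just an illustration. The locked version is easier to reason about; the volatile version avoids taking a lock but demands more careful thought about what volatile actually guarantees.

using System;

// Lock-based version: all access to the field goes through the lock,
// so the reasoning is simple, at the cost of synchronisation.
class NameHolder
{
    readonly object padlock = new object();
    string name;

    public void WriteNameLengthLocked()
    {
        string localName;
        lock (padlock)
        {
            localName = name;
        }
        if (localName != null)
        {
            Console.WriteLine (localName.Length);
        }
    }
}

// Volatile-based version: no lock is taken, but correctness now
// depends on understanding exactly what volatile guarantees.
class VolatileNameHolder
{
    volatile string name;

    public void WriteNameLengthVolatile()
    {
        string localName = name; // volatile read
        if (localName != null)
        {
            Console.WriteLine (localName.Length);
        }
    }
}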
Vance Morrison has a great write-up of the CLR memory model (including changes that were made in 2.0 to strengthen the model) here: http://msdn.microsoft.com/msdnmag/issues/05/10/MemoryModels/
Joe Duffy’s blog has some great memory model tidbits as well: http://www.bluebytesoftware.com/blog/
My two cents: I view lock-free programming with great suspicion. I’d gladly trade cycles for less risk of bizarre “un-reproable” race conditions.
Jon,
Both your code samples have “==” in this “if”. Am I misreading it, or did you intend to use != in the ifs?
John: Thanks for spotting the code issue. Fixed.
Joel: Thanks for the links. It sounds like the .NET 2.0 memory model has been carefully thought out. Now we just need:
1) The model to be part of the next ECMA spec. (This may be in progress – I’ve no idea.)
2) The model to be really, really clear in terms of what it means. Given that it wasn’t obvious that the 1.0 ECMA spec allowed extra reads to be introduced (removing the local variable), I think the memory model should make everything very explicit. (That’s not the same as saying it should prohibit lots of optimisation – it just needs to be clear about what is and isn’t prohibited.) Hopefully there’s more detail available somewhere – I couldn’t see a link on the page to an authoritative definition of the 2.0 memory model.
I believe that this JIT compiler optimization has to do with string interning (the intern pool).
Hi Jon,
As I said back then, that reading of the spec is insane. At the time I discussed this with Chris Brumme and fortunately he agreed with me. In fact, I think he’s one of the people that pushed for the stronger memory model that was ultimately picked for Whidbey (and is described by Vance’s post that Joel pointed to).
Here’s the killer argument for why the x64 JIT team’s interpretation of the spec was bogus:
void Foo(String str)
{
    if (…some string validation…)
        throw new SomeException();

    // do something with validated string
}
If the caller now passes in a string that was read from a field and then another thread subsequently modifies that field, it would be possible for the Foo method to see a different string after having already validated it. The only way to get around this would be to start each method with a memory barrier. Clearly that would be unacceptable.
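To make that concrete, here’s a hypothetical caller (the class name, the field, the specific validation and the definition of SomeException are all invented to fill in the sketch above):

using System;

class SomeException : Exception {}

class Validator
{
    string sharedField; // other threads may overwrite this at any time

    public void Bar()
    {
        // The field is read once here and the value is passed to Foo.
        Foo(sharedField);
    }

    void Foo(String str)
    {
        // Hypothetical stand-in for "…some string validation…".
        if (str == null || str.Length == 0)
            throw new SomeException();

        // If the JIT inlined Foo and were allowed to introduce reads, each
        // use of 'str' could become a fresh read of sharedField, so the
        // string used here might not be the string validated above.
        Console.WriteLine (str.Length);
    }
}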
Vance’s article is the best reference on the 2.0 memory model, as implemented. If you care about CLI conformance, however, then the CLI 2.0 specification – which specifies the “legacy” memory model – is still your best bet. In the implemented commercial CLR’s memory model, reads cannot be introduced. In the specified memory model, they can. In the end, most people only care about getting their code to work on the commercial CLR, so it’s reasonable to depend on the implemented behavior. Allowing the introduction of reads is pretty asinine for lock-free code, but you can avoid all these headaches by simply taking a lock. Most people should stick with heavyweight synchronization like this, and the words “memory model” will never, ever have to enter their minds.
I’ve pushed for specifying our implemented memory model, but there has generally been disinterest in it in the past. I suspect this is because members of the standards committees don’t want to force a change on implementors, but I think it would (at least) make a great informative section, i.e. a recommendation.
–joe