For a while now, I’ve been a big fan of a pattern in C# which mimics Java enums to a certain extent. In general, it’s a lovely pattern. Only after reading a comment on a recent blog post by Eric Lippert did I find out about a horrible flaw. Dubious thanks to John Payson for prompting this post. (And I fully intend to include this in the Abusing C# talk that I give every so often…)
Trigger warning: this post contains code you may wish you’d never seen.
Let’s start off with the happy path – a moment of calm before the storm.
When things go well…
The idea is to create a class – which can be abstract – with a certain fixed set of instances, which are available via static properties on the class itself. The class is not sealed, but it prevents other classes from subclassing it by using a private constructor. Here’s an example, including how it can be used:
using System; public abstract class Operation { public static Operation Add { get; } = new AddOperation(); public static Operation Subtract { get; } = new SubtractOperation(); // Prevent non-nested subclasses. They can't call this constructor, // which is the only way of creating an instance. private Operation() {} public abstract int Apply(int lhs, int rhs); private class AddOperation : Operation { public override int Apply(int lhs, int rhs) { return lhs + rhs; } } private class SubtractOperation : Operation { public override int Apply(int lhs, int rhs) { return lhs - rhs; } } } public class Test { static void Main() { Operation add = Operation.Add; Console.WriteLine(add.Apply(5, 3)); // 8 } }
(Note the use of the new “automatically implemented read-only property” from C# 6. Yum!)
Obviously there are possible variations of this – there could be multiple instances of one subclass, or we could even create instances dynamically if that were useful. The main point is that no other code can create instances or even declare their own subclasses of Operation
. That private constructor stops everything, right? Um…
My eyes! The goggles do nothing!
I’m sure all my readers are aware that every C# constructor either has to chain to another constructor using : this(...)
or chain to a base class constructor, either implicitly calling a constructor without arguments or explicitly using : base(...)
.
What I wasn’t aware of – or rather, I think I’d been aware of once, but ignored – was that constructor initializers can have cycles. In other words, it’s absolutely fine to write code like this:
public class SneakyOperation : Operation { public SneakyOperation() : this(1) {} public SneakyOperation(int i) : this() {} public override int Apply(int lhs, int rhs) { return lhs * rhs; } }
Neither operator attempts to call a base class constructor, so it doesn’t matter that the only base class constructor is private. This code compiles with no issues – not even a warning.
So we’ve already failed in our intention to prevent subclassing from occurring… but this will fail with a StackOverflowException
which will take down the process, so there’s not much harm, right?
Well… suppose we didn’t let it throw a StackOverflowException
. Suppose we triggered our own exception first. There are various ways of doing this. For example, we could use arithmetic:
public SneakyOperation(int x) : this(x, 0) {} // This isn't going to be happy if y is 0... public SneakyOperation(int x, int y) : this(x / y) {}
Or we could have one constructor call another with a delegate which will deliberately throw:
public SneakyOperation(int x) : this(() => { throw new Exception("Bang!"); }) {} public SneakyOperation(Func<int> func) : this(func()) {}
So now, we have a class where you can call the constructor, and an exception will be thrown. But hey, you still can’t get at the instance which was created, right? Well, this is tricky. We genuinely can’t use this
anywhere in code which we expect to execute – there’s no way of reaching a constructor body, because in order to do that we’d have to have some route which took us through the base class constructor, which was can’t access. Field initializers aren’t allow to use this
, either explicitly or implicitly (e.g. by calling instance methods). If there’s a way of circumventing that, I’m not aware of it.
So, we can’t capture a reference to the instance before throwing the exception. And after the exception is thrown, it’s garbage, and therefore unavailable… unless we take it out of the garbage bin, of course. Enter a finalizer…
private static volatile Operation instance; ~SneakyOperation() { // Resurrect the instance. instance = this; }
Now all we have to do is wrap up the "create an object and wait for it to be resurrected" logic in a handy factory method:
public static Operation GetInstance() { try { new SneakyOperation(0); } catch {} // Thoroughly expected Operation local; while ((local = instance) == null) { GC.Collect(); GC.WaitForPendingFinalizers(); } return local; }
Admittedly if multiple threads call GetInstance()
at a time, there’s no guarantee which thread will get which instance, or whether they’ll get the same instance… but that’s fine. There’s very little to guarantee this will ever terminate, in fact… but I don’t care much, as clearly I’m never going to use this code in production. It works on my machine, for the few times I’ve been able to bear running it. Here’s a complete example (just combine it with the Operation
class as before) in case you want to run it yourself:
public class SneakyOperation : Operation { private static volatile Operation instance; private SneakyOperation(int x) : this(() => { throw new Exception(); }) {} private SneakyOperation(Func<int> func) : this(func()) {} public static Operation GetInstance() { try { new SneakyOperation(0); } catch {} Operation local; while ((local = instance) == null) { GC.Collect(); GC.WaitForPendingFinalizers(); } return local; } ~SneakyOperation() { instance = this; } public override int Apply(int lhs, int rhs) { return lhs * rhs; } } public class Test { static void Main() { Operation sneaky = SneakyOperation.GetInstance(); Console.WriteLine(sneaky.Apply(5, 3)); // 15 } }
Vile, isn’t it?
Fixing the problem
The first issue is the cyclic constructors. That’s surely a mistake in the language specification. (It’s not a bug in the compiler, as far as I can tell – I can’t find anything in the spec to prohibit it.) This can be fixed, and it shouldn’t even be too hard to do. There’s nothing conditional in constructor initializers; each constructor chains to exactly one other constructor, known via overload resolution at compile time. Even dynamic
doesn’t break this – constructor initializers can’t be dispatched dynamically. This makes detecting a cycle trivial – from each constructor, just follow the chain of constructor calls until either you reach a base class constructor (success) or you reach one of the constructor declarations already in the chain (failure). I will be pestering Mads for this change – it may be too late for C# 6, but it’s never too early to ask about C# 7. This will be a breaking change – but it will only break code which is tragically broken already, always resulting in either an exception or a stack overflow.
Of course, a C# language change wouldn’t change the validity of such code being created through IL. peverify
appears to be happy with it, for example. Still, a language fix would be a good start.
There’s a second language change which could help to fix this too – the introduction of private abstract/virtual methods. Until you consider nested classes, neither of those make sense – but as soon as you remember that there can be subclasses with access to private members, it makes perfect sense to have them. If Operation
had a private abstract method (probably exposed via a public concrete method) then SneakyOperation
(and any further subclasses) would have to be abstract, as it couldn’t possibly override the method. I’ve already blogged about something similar with public abstract classes containing internal abstract methods, which prevents concrete subclasses outside that assembly… but this would restrict inheritance even further.
Does it reach the usefulness bar required for new language features? Probably not. Would it be valid IL with the obvious meaning? I’ve performed experiments which suggest that it would be valid, but not as restrictive as one would expect – but I could be mistaken. I think there’s distinct merit in the idea, but I’m not expecting to see it come to pass.
Conclusion
So, what can you take away from this blog post? For starters, the fact that the language can probably be abused in far nastier ways than you’ve previously imagined – and that it’s very easy to miss the holes in the spec which allow for that abuse. (It’s not like this hole is a recent one… it’s been open forever.) You should absolutely not take away the idea that this is a recommended coding pattern – although the original pattern (for Operation
) is a lovely one which deserves more widespread recognition.
Finally, you can take away the fact that nothing gets my attention quicker than a new (to me) way of abusing the language.
Sufficently priviledged code could use FormatterServices.GetUninitializedObject to create instance of SneakyOperation without abuse of finalizers
LikeLike
True. I vaguely remembered that there was something like that, but couldn’t remember exactly what it is. Still, the finalizer resurrection hack is arguably more fun ;)
LikeLike
I can’t think of any code I’ve written in C# that just screams for the pattern you used in
Operator
, but it looks extremely similar to how F#’s discriminated unions after they’ve been tossed into a decompiler. Do you know if there’s a proper name for the pattern?LikeLike
I don’t know of a name for it – but it is a very useful pattern when you know about it. As I mentioned before, when you only have a fixed set of values it’s very similar to Java enums.
LikeLike
I guess “Jon Skeet’s Almost Java Enums pattern” will do. Despite the lack of static type checking that comes with unions in F#, it seems like this would be great in C# for the same purposes I use unions otherwise. I’ll have to bookmark this and try it out somewhere so I don’t forget.
LikeLike
I call them Static Enums since the values are all static, but I’m not sure they have a common name. They definitely remind me of Java’s enum.
LikeLike
Looks pretty similar to the type safe enum pattern to me.
LikeLike
Yes, in this incarnation – but it’s pretty flexible in terms of allowing for multiple instances to be created if necessary, etc.
LikeLike
Jon, with All Hallow’s Eve drawing near, I think it very timely of you to post a piece of code that is, quite literally, so horrifying that one can’t look away. Tonight, I may need to sleep with the lights on and the covers up over my head.
Very twisted. :-)
LikeLiked by 1 person
Excellent post, as always! I always thought that looking at bad code can teach you how to write good code. After today, I’m not so sure anymore (The finalizer-resurrect-hack will haunt me forever…). Still, from a learning point-of-view, this is very intresting, thanks!
LikeLike
I’ve used that pattern regularly (once also representing Operators) for a number of years. I’ve always thought of it as the state pattern just missing the transition graph.
LikeLike
One thing ocurred to me though: Why is the instance valid after the finalizer was called by the GC?
LikeLike
That’s just regular object resurrection: if a finalizer takes some action that means the instance is reachable, it’s no longer eligible for garbage collection.
LikeLike
Of yourse, silly me. Here’s an additional reference to your post: http://knowyourmeme.com/memes/the-goggles-do-nothing
LikeLike
The cyclic constructor thing is certainly interesting as a trick to make things compile. As for creating an instance, there is a much simpler way to do that:
using System.Runtime.Serialization;
var sneakyOperation = (SneakyOperation)FormatterServices.GetUninitializedObject(typeof(SneakyOperation));
Incidentally, the GetInstance() method did not complete for me after several minutes of waiting. I’m not sure why, repeated GC.Collect() calls should do the trick, I would have thought.
LikeLike
Yes, Denis mentioned GetUninitializedObject too. I like the fact that the finalizer approach doesn’t require much in the way of permissions though…
When you ran the code, was this under the debugger? I just compiled and ran from a command prompt – I wonder whether that makes a difference.
LikeLike
@helb Because the finalizer is re-assinging the instance to the static variable, hence resurrecting it.
LikeLike
Ugh! Not quite the point of this blog post, but I wish I had this pattern months ago when I was trying to port a Java application over to C#! Enums and EnumSets were the most painful part.
LikeLike
The serialization hack, using GetUninitializedObject, of course also points out the other gotcha with this pattern, which is that if you serialize and then deserialize instances of your locked down class or its subclasses, you will get additional instances which are not the same ones stored in your static fields – although you are at least assured they won’t be instances of classes you didn’t define. Could be a source of subtle bugs though, if code relies on the identity-equality of an operation with Operation.Add for some purpose.
LikeLike
I would hope that proper serialization of a “smart enum” means turning it into its property name (like { “OperationList”: [“Add”, “Add”] } ) rather than some implementation-specific .net serialization mechanism. I love TypeConverters, which you could use to deserialize the strings into the “singleton” references: http://www.hanselman.com/blog/TypeConvertersTheresNotEnoughTypeDescripterGetConverterInTheWorld.aspx
LikeLike
Indeed – and this is the sort of thing that Java enums sort out for you. I generally try to avoid the sort of serialization that gets into trouble with this anyway, mind you :)
LikeLike
Nice article, and amazing abuse as usual :D
It seems that the VB.Net compiler team already take that in count (or it was for another reason) because when I wanted to try it in VB.Net (I often do that to see the behavioural differences between .Net languages) the .ctor cycle was detected and flagged as error
Also there are two little “typos” in the article :
– When you present the Sneaky .ctor which takes a Func you’ve kept the comment associated with the divide by 0 from the previous .ctor
– In the complete example at the end, the generic parameter of Func is missing.
LikeLike
Thanks – fixed both the errors now :) (Or at least, I’ve tried to. WordPress still hasn’t quite got the hang of Markdown, occasionally removing angle brackets in code when it shouldn’t.)
LikeLike
I just tried to mess up with this in newer versions, and it seems it was actually fixed in the compiler. Now I get error CS0768: Constructor ‘SneakyOperation.SneakyOperation(int)’ cannot call itself through another constructor.
LikeLike
Yes – I can’t remember whether it needed a spec change or just a compiler fix, but it’s sorted now.
LikeLike
Since access restrictions in programming languages are simply a tool for detecting violations of encapsulation (in the same way that type declarations are a tool) and have nothing to do with security, there is no reason to care about such abuses, even more so when it comes to IL, and foolhardy to make language changes merely to prevent such abuse (which is not to say the spec/compiler shouldn’t be changed to prevent constructor loops; allowing them is an obvious error).
LikeLiked by 1 person
Here’s one you might like from Marek Safar and VSadov: https://roslyn.codeplex.com/workitem/224
//non-standard bool
//We’re setting a bool’s value to a byte value of 5.
var a = new bool[1];
Buffer.SetByte(a, 0, 5);
//non-standard bool
//We’re setting a bool’s value to a byte value of 10.
var b = new bool[1];
Buffer.SetByte(b, 0, 10);
//Both are true.
Console.WriteLine(a[0]);
Console.WriteLine(b[0]);
//But they are not the same true.
Console.WriteLine(a[0] == b[0]);
LikeLike
Yes, I think I’ve seen something like that before – an alternative way of setting the value is to use a struct with two fields positioned at the same location, IIRC.
LikeLike
Very nasty indeed. I’m not too knowleadgeable of C#, so I have to ask: when does field initialization happen? I’m afraid of what could come next.
LikeLike
I used a variation of the Enum pattern for some Linq work once. Application developer were given a large List of data, which they had to pull subset from, which they would do like this:
var y = Businesses.Where(x=>x.CategoryID == 5 && x.SubCatID ==3).ToList();
So I created a class like:
class Filter
{
public static Func PetStoresOnly { get { return (Item x) => x.CategoryID == 5 && x.SubCatID ==3; } }
// etc….
}
then they could write:
var y = BigList.Where(Filter.PetStoresOnly).ToList();
They thought I’d written an overload which took a enum….
LikeLike
Java-style enums are very useful for implementing State and Strategy Pattern, amongst other things. I’m surprised that C# doesn’t have anything quite like them by now. Sure you can simulate them, but appropriate syntactic sugar would be nice.
LikeLiked by 1 person
just a suggestion for the private abstract methods. One could get something similar with internal abstract which is allowed in the current version. This will at least mitigate the inherit from other assembly case.
LikeLike
i’ve just stumbled upon this patter on SO, but i can’t think of use of it, except just “type safe enum that cant be used with case”. will you give a couple of valuable examples please.
in short: i see the elegance and i like it but i can’t see much value yet
LikeLike
Will see if I can find some examples, but don’t have the time to search exhaustively now. Just bear it in mind as a potential pattern – and consider that unlike a regular C# enum, it can behave polymorphically.
LikeLike
i, looks like, asked something i wasn’t implying. i need not examples as code snippets, but just some real-world application of pattern described in a few words. for me, things for Chain of Responsibility or Adaptor etc are instantly obvious. but for mentioned pattern — nothing. almost! as Strategy and State were both mentioned here. i think these are good examples. especially State becomes so neat!
LikeLike
I think the most obvious application of a type safe enum is the ability to provide a custom ToString/Description for each enum value without needing to resort to attribute+reflection+extension methods. It provides a bit of encapsulation, putting the code related to the enum within the enum itself instead of externally.
Additionally, there aren’t many things that Java does better than C# but I would say enums are one of them (though I understand why they’re simply sugar over integers in C#). Take a look at the “planet” example in the Java tutorial:
https://docs.oracle.com/javase/tutorial/java/javaOO/enum.html
LikeLike
Oh, and there is discussion about integrating pattern matching into switch/case that would allow arbitrary expressions. The feature isn’t well defined yet, but it may be able to be used for matching the type safe enum objects.
https://github.com/dotnet/roslyn/issues/5757
LikeLike