It seems to be quite a long time since I’ve written a genuine "code" blog post. Time to fix that.
This material may well be covered elsewhere – it’s certainly not terrifically original, and I’ve been meaning to post about it for a long time. In particular, I remember mentioning it at CodeMash in 2012. Anyway, the time has now come.
Refresher on array covariance
Just as a bit of background before we delve into the performance aspect, let me remind you what array covariance is, and when it applies. The basic idea is that C# allows a reference conversion from type TDerived[] to type TBase[], so long as:
- TDerived and TBase are both reference types (potentially interfaces)
- There’s a reference conversion from TDerived to TBase (so either TDerived is the same as TBase, or a subclass, or an implementing class etc)
Just to remind you about reference conversions, those are conversions from one reference type to another, where the result (on success) is never a reference to a different object. To quote section 6.1.6 of the C# 5 spec:
Reference conversions, implicit or explicit, never change the referential identity of the object being converted. In other words, while a reference conversion may change the type of the reference, it never changes the type or value of the object being referred to.
So as a simple example, there’s a reference conversion from string to object, therefore there’s a reference conversion from string[] to object[]:
string[] strings = new string[5];
object[] objects = strings;
// strings and objects now refer to the same array object
There is no reference conversion between value type arrays, so you can’t use the same code to convert from int[] to object[].
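As a minimal sketch of what the compiler rejects (the LINQ `Cast` call is one way to get an `object[]` if you genuinely need one, at the cost of boxing and copying every element):

```csharp
using System.Linq;

class CovarianceDemo
{
    static void Main()
    {
        int[] ints = { 1, 2, 3 };
        // object[] objects = ints;  // CS0029: cannot implicitly convert int[] to object[]

        // If an object[] is really required, each element has to be boxed into
        // a new array - this copies the data rather than converting the reference:
        object[] boxed = ints.Cast<object>().ToArray();
    }
}
```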
The nasty part is that every store operation into a reference type array now has to be checked at execution time for type safety. So to extend our sample code very slightly:
object[] objects = strings;
objects[0] = "string"; // This is fine
objects[0] = new Button(); // This will fail
The last line here will fail with an ArrayTypeMismatchException, to avoid storing a Button reference in a String[] object. When I said that every store operation has to be checked, that’s a slight exaggeration: in theory, if the compile-time type is an array with an element type which is a sealed class, the check can be avoided as it can’t fail.
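For a self-contained version of that failure which compiles without any UI references (a plain class hierarchy stands in for Button here):

```csharp
using System;

class Base { }
class Derived : Base { }

class StoreCheckDemo
{
    static void Main()
    {
        string[] strings = new string[1];
        object[] objects = strings;       // legal: array covariance

        objects[0] = "hello";             // fine: a string goes into a string[]

        try
        {
            objects[0] = new Derived();   // the execution-time check fires here
        }
        catch (ArrayTypeMismatchException)
        {
            Console.WriteLine("Store rejected at execution time");
        }
    }
}
```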
Avoiding array covariance
I would rather arrays weren’t covariant in the first place, but there’s not a lot that can be done about that now. However, we can work around this, if we really need to. We know that value type arrays are not covariant… so how about we use a value type array instead, even if we want to store reference types?
All we need is a value type which can store the reference type we need – which is dead easy with a wrapper type:
public struct Wrapper<T> where T : class
{
private readonly T value;
public T Value { get { return value; } }
public Wrapper(T value)
{
this.value = value;
}
public static implicit operator Wrapper<T>(T value)
{
return new Wrapper<T>(value);
}
}
Now if we have a Wrapper<string>[], we can’t assign that to a Wrapper<object>[] variable – the types are incompatible. If that feels a bit clunky, we can put the array into its own type:
public sealed class InvariantArray<T> where T : class
{
private readonly Wrapper<T>[] array;
public InvariantArray(int size)
{
array = new Wrapper<T>[size];
}
public T this[int index]
{
get { return array[index].Value; }
set { array[index] = value; }
}
}
Just to clarify, we now only have value type arrays, but ones where each value is a plain wrapper for a reference. Any attempt to violate type safety is now caught at compile time, and the CLR doesn’t need to validate write operations at execution time.
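A quick sketch of both workarounds in use; the commented-out line is the one the compiler now rejects:

```csharp
class InvarianceDemo
{
    static void Main()
    {
        Wrapper<string>[] wrapped = new Wrapper<string>[10];
        // Wrapper<object>[] broken = wrapped;  // CS0029: no conversion between the array types

        wrapped[0] = "hello";                   // uses the implicit T -> Wrapper<T> conversion
        string first = wrapped[0].Value;

        var invariant = new InvariantArray<string>(10);
        invariant[0] = "world";                 // the indexer hides the wrapping entirely
        string second = invariant[0];
    }
}
```

The implicit conversion operator is what keeps the write syntax clean: without it, every store would need an explicit `new Wrapper<string>(...)`.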
There’s no memory overhead here – aside from the type information at the start, I’d actually expect the contents of a Wrapper<T>[] to be indistinguishable from a T[] in memory.
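One way to check that claim, assuming the System.Runtime.CompilerServices.Unsafe package is available (this should report that the wrapper is exactly one reference wide):

```csharp
using System;
using System.Runtime.CompilerServices;

class SizeDemo
{
    static void Main()
    {
        // A Wrapper<string> holds a single reference, so its size
        // should match the platform pointer size (4 or 8 bytes)
        Console.WriteLine(Unsafe.SizeOf<Wrapper<string>>() == IntPtr.Size);
    }
}
```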
Benchmarking
So, how does it perform? I’ve written a small console app to test it. You can download the full code, but the gist of it is that we use a stopwatch to measure how long it takes to either repeatedly write to an array, or repeatedly read from an array (validating that the value read is non-null, just to prove that we’ve really read something). I’m hoping I haven’t fallen foul of any of the various mistakes in benchmarking which are so easy to make.
The test tries four scenarios:
- object[] (but still storing strings)
- string[]
- Wrapper<string>[]
- InvariantArray<string>
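The core of each measurement looks something like this: a simplified sketch rather than the exact code from the downloadable app, shown here for the `object[]` case (the other three scenarios are the same loops over the other array types):

```csharp
using System;
using System.Diagnostics;

class BenchmarkSketch
{
    const int ArraySize = 100;
    const int Iterations = 100000000;

    static long MeasureWrites(object[] array)
    {
        var stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < Iterations; i++)
        {
            // Each store into an object[] triggers the covariance check
            array[i % ArraySize] = "value";
        }
        return stopwatch.ElapsedMilliseconds;
    }

    static long MeasureReads(object[] array)
    {
        var stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < Iterations; i++)
        {
            // Validate the value read is non-null, so the read can't be elided
            if (array[i % ArraySize] == null)
            {
                throw new InvalidOperationException("Unexpected null");
            }
        }
        return stopwatch.ElapsedMilliseconds;
    }
}
```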
Running against an array size of 100, with 100 million iterations per test, I get the following results on my Thinkpad Twist:
| Array type             | Read time (ms) | Write time (ms) |
| ---------------------- | -------------- | --------------- |
| object[]               | 11842          | 44827           |
| string[]               | 12000          | 40865           |
| Wrapper<string>[]      | 11843          | 29338           |
| InvariantArray<string> | 11825          | 32973           |
That’s just one run, but the results are fairly consistent across runs. The one interesting deviation is the write time for object[] – I’ve observed it sometimes being the same as for string[], but not consistently. I don’t understand this, but it does seem that the JIT isn’t performing the optimization for string[] that it could if it spotted that string is sealed.
Both of the workarounds to avoid array covariance make a noticeable difference to the performance of writing to the array, without affecting read performance. Hooray!
Conclusion
I think it would be a very rare application which noticed a significant performance boost here, but I do like the fact that this is one of those situations where a cleaner design also leads to better performance, without many obvious technical downsides.
That said, I doubt that I’ll actually be using this in real code any time soon – the fact that it’s just "different" to normal C# code is a big downside in itself. Hope you found it interesting though :)