
Making reflection fly and exploring delegates

August 9, 2008 jonskeet 42 Comments

Background

I’ve recently been doing some optimisation work which has proved quite interesting in terms of working with reflection. My efforts porting Google’s Protocol Buffers are now pretty complete in terms of functionality, so I’ve been looking at improving the performance. The basic idea is that you specify your data types in a .proto file, and then generate C# from that. The generated C# allows you to manipulate the data, and serialize/deserialize it. When you generate the code, it can be optimised either for size or speed. The “small” code can end up being much smaller than the “fast” code – but it’s also significantly slower as it uses reflection when serializing and deserializing. My first rough-and-ready benchmark results (using a 130K data file based on Northwind) were slightly terrifying:

Operation	Time (ms)
Deserialize (fast)	5.18
Serialize (fast)	3.96
Deserialize (slow)	429.49
Serialize (slow)	103.67

Far from all of this difference was due to reflection, but it was a significant chunk – and provided the most interesting and challenging optimisation. This post doesn’t show the actual Protocol Buffer code, but demonstrates the three steps I required to radically improve the performance of reflection. The examples I’ve used are chosen just for simplicity.

Converting MethodInfo into a delegate instance

There are lots of things you can do with reflection, obviously – but I’m primarily interested in calling methods, using the associated MethodInfo. This includes setting properties, using the results of the GetGetMethod and GetSetMethod methods of PropertyInfo. We’ll use String.IndexOf(char) as our initial example.

Normally when you’re calling methods with reflection, you call MethodInfo.Invoke. Unfortunately, this proves to be quite slow. If you know the signature of the method at compile-time, you can convert the method into a delegate with that signature using Delegate.CreateDelegate(Type, object, MethodInfo). You simply pass in the delegate type you want to create an instance of, the target of the call (i.e. what the method will be called on), and the method you want to call. It would be nice if there were a generic version of this call to avoid casting the result, but never mind. Here’s a complete example demonstrating how it works:

using System;
using System.Reflection;
public class Test
{
    static void Main()
    {
        MethodInfo method = typeof(string).GetMethod("IndexOf", new Type[] { typeof(char) });

        Func<char, int> converted = (Func<char, int>)
            Delegate.CreateDelegate(typeof(Func<char, int>), "Hello", method);

        Console.WriteLine(converted('l'));
        Console.WriteLine(converted('o'));
        Console.WriteLine(converted('x'));
    }
}

This prints out 2, 4, and -1; exactly what we’d get if we’d called "Hello".IndexOf(...) directly. Now let’s see what the speed differences are…

We’re mostly interested in the time taken to go from the main calling code to the method being called, whether that’s with a direct method call, MethodInfo.Invoke or the delegate. To make IndexOf itself take as little time as possible, I tested it by passing in ‘H’ so it would return 0 immediately. As normal, the test was rough and ready, but here are the results:

Invocation type	Stopwatch ticks per invocation
Direct	0.18
Reflection	120
Delegate	0.20

One important point is that I created a new parameter array for each invocation of the MethodInfo – obviously this is slightly costly in itself, but it mirrors real world usage. The exact numbers don’t matter, but the relative sizes are the important point: using a delegate invocation is only about 10% slower than direct invocation, whereas using reflection takes over 600 times as long. Of course these figures will depend on the method being called – if the direct invocation can be inlined, I’d expect that to make a significant difference in some cases. However, the benefit in converting reflection calls into delegate calls is obvious.
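For anyone wanting to try a similar comparison, here is a rough sketch of a timing harness. It is not the code behind the figures above: the InvocationBenchmark class, the Run and Report helpers and the iteration count are all assumptions made for illustration, and the absolute numbers will vary with the runtime and hardware.

```csharp
using System;
using System.Diagnostics;
using System.Reflection;

public static class InvocationBenchmark
{
    // Runs each kind of invocation 'iterations' times and prints rough timings.
    // Returns the accumulated results so the calls cannot be optimised away.
    public static int Run(int iterations)
    {
        string target = "Hello";
        MethodInfo method = typeof(string).GetMethod("IndexOf", new Type[] { typeof(char) });
        Func<char, int> converted = (Func<char, int>)
            Delegate.CreateDelegate(typeof(Func<char, int>), target, method);

        int sink = 0;

        // Direct call: 'H' is the first character, so IndexOf returns 0 immediately
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            sink += target.IndexOf('H');
        }
        sw.Stop();
        Report("Direct", sw, iterations);

        // Reflection: a fresh argument array per call, mirroring typical usage
        sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            sink += (int) method.Invoke(target, new object[] { 'H' });
        }
        sw.Stop();
        Report("Reflection", sw, iterations);

        // Delegate created once from the MethodInfo, then invoked repeatedly
        sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            sink += converted('H');
        }
        sw.Stop();
        Report("Delegate", sw, iterations);

        return sink;
    }

    static void Report(string name, Stopwatch sw, int iterations)
    {
        Console.WriteLine("{0}: {1:F2} ticks per invocation",
            name, (double) sw.ElapsedTicks / iterations);
    }

    public static void Main()
    {
        // Hypothetical iteration count; tune it for your machine
        Run(1000000);
    }
}
```

A harness this simple ignores JIT warm-up and loop overhead, so it is only good for showing the order-of-magnitude gap, not for precise measurements.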

Now, what about if we wanted to vary the string we were calling IndexOf on?

Interlude: open and closed delegates

When you create a delegate directly in C# using a method group conversion, you (almost) always create an open delegate for static methods and a closed delegate for instance methods. To explain the difference between open and closed delegates, it’s best to start thinking of all methods as being static – but with instance methods having an extra parameter at the start to represent this. In fact, extension methods use exactly this model. Reality is more complicated than that due to polymorphism, but we’ll leave that to one side for the moment.

Going back to our String.IndexOf example, we can start thinking of the signature as being:

static int IndexOf(string target, char c)

At this point it’s easy to explain the difference between open and closed delegates: a closed delegate has a value which it implicitly passes in as the first argument, whereas with an open delegate you specify all the arguments when you invoke the delegate. The implicit first argument is represented by the Delegate.Target property. It’s null for open delegates – which is what you normally get when you create a delegate from a static method directly in C#. Here’s a short program to demonstrate the difference when you create delegate instances using C# directly:

using System;
public class Test
{
    readonly string name;

    public Test(string name)
    {
        this.name = name;
    }

    public void Display()
    {
        Console.WriteLine("Test; name = {0}", name);
    }

    static void StaticMethod()
    {
        Console.WriteLine("Static method");
    }

    static void Main()
    {
        Test foo = new Test("foo");

        Action closed = foo.Display; // closed.Target == foo
        Action open = StaticMethod;  // open.Target == null

        closed();
        open();
    }
}

Before we go back to reflection, I’ll clarify the “almost” I used earlier on. You can’t currently create an open delegate referring to an instance method in C# using method group conversions – but you can create a closed delegate referring to a static method, if it’s an extension method. This makes sense, as extension methods are a strange sort of half-way house between static methods and instance methods – they’re truly static methods which can be used as if they were instance methods. I’ve got an example on my C# in Depth site.
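As a small illustration of that last case, here is a sketch of a closed delegate over an extension method. The Shout extension method and the surrounding class names are made up for this example; it is not the example from the C# in Depth site.

```csharp
using System;

public static class StringExtensions
{
    // Hypothetical extension method: truly static, but callable as if it
    // were an instance method on string
    public static string Shout(this string text)
    {
        return text.ToUpperInvariant() + "!";
    }
}

public static class ExtensionDelegateDemo
{
    public static Func<string> CreateClosed(string target)
    {
        // Method group conversion using extension method syntax: the first
        // argument is bound when the delegate is created, so it is closed
        return target.Shout;
    }

    public static void Main()
    {
        Func<string> closed = CreateClosed("hello");

        Console.WriteLine(closed());               // HELLO!
        Console.WriteLine(closed.Method.IsStatic); // True
        Console.WriteLine(closed.Target);          // the bound first argument
    }
}
```

Note that the delegate type has no parameters at all, even though the underlying static method takes one: the string is supplied from the delegate's target at invocation time.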

Creating open delegates with reflection

Even though C# doesn’t support all the possible combinations of static/instance methods and open/closed delegates directly, Delegate.CreateDelegate has overloads to let you do just that. The signature we used earlier (with parameters Type, object, MethodInfo) always creates a closed delegate. There’s another overload without the middle parameter – and that always creates an open delegate. We can easily modify our earlier example to let us call String.IndexOf(char) varying both the needle and the haystack, so to speak:

using System;
using System.Reflection;
public class Test
{
    static void Main()
    {
        MethodInfo method = typeof(string).GetMethod("IndexOf", new Type[] { typeof(char) });

        Func<string, char, int> converted = (Func<string, char, int>)
            Delegate.CreateDelegate(typeof(Func<string, char, int>), method);

        Console.WriteLine(converted("Hello", 'l'));
        Console.WriteLine(converted("Jon", 'o'));
        Console.WriteLine(converted("Hello", 'n'));
    }
}

This prints 2, 1, -1, as if we’d called "Hello".IndexOf('l'), "Jon".IndexOf('o') and "Hello".IndexOf('n').

This can be a very powerful tool – in particular it’s crucial for my Protocol Buffers port: for a particular type, I can create a delegate which will set a property. I can keep that information around forever, and use the same delegate to set the property to different values on different instances of the type.
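To make the property case concrete, here is a minimal sketch. The Person type is hypothetical and has nothing to do with the Protocol Buffers code; the point is only that a single open delegate built from GetSetMethod can be reused across instances.

```csharp
using System;
using System.Reflection;

// Hypothetical type used only for this illustration
public class Person
{
    public string Name { get; set; }
}

public static class SetterDemo
{
    public static Action<Person, string> CreateNameSetter()
    {
        // The setter is an instance method taking the new value, so as an
        // open delegate it takes (target, value)
        MethodInfo setter = typeof(Person).GetProperty("Name").GetSetMethod();
        return (Action<Person, string>)
            Delegate.CreateDelegate(typeof(Action<Person, string>), setter);
    }

    public static void Main()
    {
        // Create the delegate once...
        Action<Person, string> setName = CreateNameSetter();

        // ...and reuse it for different instances and different values
        Person first = new Person();
        Person second = new Person();
        setName(first, "Jon");
        setName(second, "Holly");

        Console.WriteLine(first.Name);  // Jon
        Console.WriteLine(second.Name); // Holly
    }
}
```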

There’s just one more problem to overcome – and unfortunately this is where things get a little weird.

Adapting delegates for parameter and return types

Due to the way that the Protocol Buffer library works, I often need to call methods or set properties without knowing at compile-time what the parameter types are, or indeed the return type of the method. I can be confident that I’ll always call it with appropriate parameters, but I just don’t know what they’ll be ahead of time. Things are slightly better in terms of the type declaring the method – I know that at compile-time, although only as a generic type parameter. What I do know with confidence is the number of parameters (I’ll just specify a single parameter for our example), and whether or not the method will return a value (we’ll use an example which always returns a value).

What I need is a generic method which has a type parameter T representing the type which implements the method, and which returns a Func<T, object, object> – a delegate instance which lets me pass the target and the argument value, and which will call the method and then return the value in a weakly typed manner. So we’d like this kind of program to work:

using System;
using System.Reflection;
using System.Text;
public class Test
{
    static void Main()
    {
        MethodInfo indexOf = typeof(string).GetMethod("IndexOf", new Type[] { typeof(char) });
        MethodInfo getByteCount = typeof(Encoding).GetMethod("GetByteCount", new Type[] { typeof(string) });

        Func<string, object, object> indexOfFunc = MagicMethod<string>(indexOf);
        Func<Encoding, object, object> getByteCountFunc = MagicMethod<Encoding>(getByteCount);

        Console.WriteLine(indexOfFunc("Hello", 'e'));
        Console.WriteLine(getByteCountFunc(Encoding.UTF8, "Euro sign: \u20ac"));
    }

    static Func<T, object, object> MagicMethod<T>(MethodInfo method)
    {
        // TODO: Implement this method!
        throw new NotImplementedException();
    }
}

Note: I was going to demonstrate this by calling DateTime.AddDays, but for value type instance methods the implicit first parameter is passed by reference, so we’d need a delegate type with a signature of DateTime Foo(ref DateTime original, double days) to call CreateDelegate. It’s feasible, but a bit of a faff. In particular, you can’t use Func as that doesn’t have any by-reference parameters.
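For the curious, the value type case might look something like the sketch below. The AddDaysByRef delegate type is an assumption introduced here to match the signature described in the note; it is not part of the framework or of the Protocol Buffers port.

```csharp
using System;
using System.Reflection;

public static class ValueTypeDelegateDemo
{
    // Hypothetical delegate type: for a value type, the implicit "this"
    // argument of an open instance delegate has to be passed by reference
    public delegate DateTime AddDaysByRef(ref DateTime original, double days);

    public static AddDaysByRef CreateAddDays()
    {
        MethodInfo method = typeof(DateTime).GetMethod("AddDays", new Type[] { typeof(double) });
        return (AddDaysByRef) Delegate.CreateDelegate(typeof(AddDaysByRef), method);
    }

    public static void Main()
    {
        AddDaysByRef addDays = CreateAddDays();

        DateTime start = new DateTime(2008, 8, 9);
        DateTime result = addDays(ref start, 1.5);

        // One and a half days after midnight on August 9th
        Console.WriteLine(result == new DateTime(2008, 8, 10, 12, 0, 0)); // True
    }
}
```

The by-reference target is what makes the calling code awkward: the value has to live in a variable, and a custom delegate type is needed for every method shape.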

Make sure you understand what we’re aiming for here. Notice that we’re not really type-safe – just like we wouldn’t be if we were calling MethodInfo.Invoke. Of course we’d normally want type safety, but in this case it would make the calling code much more complicated, and in some places it might effectively be impossible. So, with the goal in place, we know we need to implement MagicMethod. (It’s not called MagicMethod in the real source code, of course – but frankly it’s quite a tricky method to name sensibly, and at this stage it really does feel like magic.)

The first obvious attempt at implementing MagicMethod would be to use CreateDelegate as we’ve done before, like this:

// Warning: doesn't actually work!
static Func<T, object, object> MagicMethod<T>(MethodInfo method)
{
    return (Func<T, object, object>)
        Delegate.CreateDelegate(typeof(Func<T, object, object>), method);
}

Unfortunately, that fails – the call to CreateDelegate fails with an ArgumentException because the delegate type isn’t right for the method that we’re trying to call. The delegate types don’t have to be exactly right, just compatible (as of .NET 2.0) – but we need an explicit conversion from object to the right parameter type, and a potentially boxing conversion of the return value. We still want to call CreateDelegate though… so somewhere we’re going to have to create a Func<TTarget, TParam, TReturn> where TTarget is a type parameter representing the type of object we’re going to call the method on, TParam is the type of the single parameter the method accepts, and TReturn is the return type of the method.

We could do that directly with reflection, using typeof(Func<,,>) to get the open type (not to be confused with an open delegate!), then calling Type.MakeGenericType to create the right constructed type. We’ll need to do something like that anyway, but it’s actually easier to write another generic method with the right type parameters for this part. That will let us convert the MethodInfo into a delegate, but then what are we going to do with it? How can we convert a Func<TTarget, TParam, TReturn> into a Func<TTarget, object, object>? Well, we need to cast the parameter from object to TParam, and then convert the result from TReturn to object, which may involve boxing. If we were writing a method to do this, it would look something like this:
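Before looking at that method, here is a sketch of the direct typeof(Func<,,>) and MakeGenericType route, to show why it only gets us part of the way. The CreateOpenDelegate helper is a name made up for this illustration.

```csharp
using System;
using System.Reflection;

public static class MakeGenericTypeDemo
{
    public static Delegate CreateOpenDelegate(Type targetType, MethodInfo method)
    {
        // Construct Func<TTarget, TParam, TReturn> from types that are
        // only known at execution time
        Type delegateType = typeof(Func<,,>).MakeGenericType(
            targetType,
            method.GetParameters()[0].ParameterType,
            method.ReturnType);

        return Delegate.CreateDelegate(delegateType, method);
    }

    public static void Main()
    {
        MethodInfo indexOf = typeof(string).GetMethod("IndexOf", new Type[] { typeof(char) });
        Delegate open = CreateOpenDelegate(typeof(string), indexOf);

        // We have built exactly the delegate type we wanted...
        Console.WriteLine(open.GetType() == typeof(Func<string, char, int>)); // True

        // ...but we only hold it as a plain Delegate, so without knowing the
        // type arguments at compile-time we are back to a late-bound call
        Console.WriteLine(open.DynamicInvoke("Hello", 'e')); // 1
    }
}
```

The result is correctly typed at execution time, but the calling code cannot name that type, which is why a generic helper method is the more convenient place to do the work.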

static object CallAndConvert<TTarget, TParam, TReturn>
    (Func<TTarget, TParam, TReturn> func, TTarget target, object param)
{
    // Conversion from TReturn to object is implicit
    return func(target, (TParam) param);
}

We don’t want to execute that code at the moment – we want to create a delegate which will execute it later. The easiest way to do that is to move the code into a lambda expression within a normal method which already has a reference to the Func. That lambda expression will then be converted into a delegate of the type we really want. It may feel like we’re just adding layer upon layer of indirection (and indeed we are) but we’re genuinely making progress. Honest. Here’s the new generic method:

static Func<TTarget, object, object> MagicMethodHelper<TTarget, TParam, TReturn>(MethodInfo method)
{
    // Convert the slow MethodInfo into a fast, strongly typed, open delegate
    Func<TTarget, TParam, TReturn> func = (Func<TTarget, TParam, TReturn>)
        Delegate.CreateDelegate(typeof(Func<TTarget, TParam, TReturn>), method);
    // Now create a more weakly typed delegate which will call the strongly typed one
    Func<TTarget, object, object> ret = (TTarget target, object param) => func(target, (TParam) param);
    return ret;
}

(We could return the lambda expression directly – the ret variable is only present as an attempt to add some clarity.)

We’re now just one step away from having a working program – we need to implement MagicMethod by calling MagicMethodHelper. There’s one obvious problem though – we need three type arguments to call MagicMethodHelper, and we’ve only got one of them in MagicMethod. We know the other two at execution time, based on the parameter type and return type of the MethodInfo we’ve been passed. The fact that we only know them at execution time suggests the next step – we need to use reflection to invoke MagicMethodHelper. We need to fetch the generic method and then supply the type arguments. It’s easier to show this than to describe it:

static Func<T, object, object> MagicMethod<T>(MethodInfo method) where T : class
{
    // First fetch the generic form
    MethodInfo genericHelper = typeof(Test).GetMethod(
        "MagicMethodHelper", BindingFlags.Static | BindingFlags.NonPublic);
    // Now supply the type arguments
    MethodInfo constructedHelper = genericHelper.MakeGenericMethod(
        typeof(T), method.GetParameters()[0].ParameterType, method.ReturnType);

    // Now call it. The null argument is because it’s a static method.
    object ret = constructedHelper.Invoke(null, new object[] { method });

    // Cast the result to the right kind of delegate and return it
    return (Func<T, object, object>) ret;
}

I’ve added the where T : class constraint to make sure (at compile-time) that we don’t run into the problem I mentioned earlier around calling value type methods. It may seem slightly odd that we’re using reflection to call MagicMethodHelper when the whole point of the exercise was to avoid invoking methods by reflection – but we only need to invoke the method once, and we can use the returned delegate many times. Here’s the complete program, ready to compile and run:

using System;
using System.Reflection;
using System.Text;

public class Test
{
    static void Main()
    {
        MethodInfo indexOf = typeof(string).GetMethod("IndexOf", new Type[]{typeof(char)});
        MethodInfo getByteCount = typeof(Encoding).GetMethod("GetByteCount", new Type[]{typeof(string)});

        Func<string, object, object> indexOfFunc = MagicMethod<string>(indexOf);
        Func<Encoding, object, object> getByteCountFunc = MagicMethod<Encoding>(getByteCount);

        Console.WriteLine(indexOfFunc("Hello", 'e'));
        Console.WriteLine(getByteCountFunc(Encoding.UTF8, "Euro sign: \u20ac"));
    }

    static Func<T, object, object> MagicMethod<T>(MethodInfo method) where T : class
    {
        // First fetch the generic form
        MethodInfo genericHelper = typeof(Test).GetMethod("MagicMethodHelper", 
            BindingFlags.Static | BindingFlags.NonPublic);

        // Now supply the type arguments
        MethodInfo constructedHelper = genericHelper.MakeGenericMethod
            (typeof(T), method.GetParameters()[0].ParameterType, method.ReturnType);

        // Now call it. The null argument is because it's a static method.
        object ret = constructedHelper.Invoke(null, new object[] {method});

        // Cast the result to the right kind of delegate and return it
        return (Func<T, object, object>) ret;
    }    

    static Func<TTarget, object, object> MagicMethodHelper<TTarget, TParam, TReturn>(MethodInfo method)
        where TTarget : class
    {
        // Convert the slow MethodInfo into a fast, strongly typed, open delegate
        Func<TTarget, TParam, TReturn> func = (Func<TTarget, TParam, TReturn>)Delegate.CreateDelegate
            (typeof(Func<TTarget, TParam, TReturn>), method);

        // Now create a more weakly typed delegate which will call the strongly typed one
        Func<TTarget, object, object> ret = (TTarget target, object param) => func(target, (TParam) param);
        return ret;
    }
}
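The same pattern carries over to methods which don’t return a value, such as property setters. The sketch below is a hypothetical variant using Action instead of Func; the SetterFactory, CreateSetter and CreateSetterHelper names and the Person type are assumptions for illustration rather than anything from the real Protocol Buffers source.

```csharp
using System;
using System.Reflection;

// Hypothetical type used only for this illustration
public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public static class SetterFactory
{
    public static Action<T, object> CreateSetter<T>(MethodInfo method) where T : class
    {
        // Fetch the generic helper, then supply the type arguments we only
        // know at execution time
        MethodInfo genericHelper = typeof(SetterFactory).GetMethod(
            "CreateSetterHelper", BindingFlags.Static | BindingFlags.NonPublic);
        MethodInfo constructedHelper = genericHelper.MakeGenericMethod(
            typeof(T), method.GetParameters()[0].ParameterType);

        // One reflective call here; the returned delegate is reused afterwards
        return (Action<T, object>) constructedHelper.Invoke(null, new object[] { method });
    }

    static Action<TTarget, object> CreateSetterHelper<TTarget, TParam>(MethodInfo method)
        where TTarget : class
    {
        // Strongly typed open delegate for the void method
        Action<TTarget, TParam> action = (Action<TTarget, TParam>)
            Delegate.CreateDelegate(typeof(Action<TTarget, TParam>), method);

        // Weakly typed wrapper which casts (and possibly unboxes) the argument
        return (TTarget target, object param) => action(target, (TParam) param);
    }

    public static void Main()
    {
        Action<Person, object> setName =
            CreateSetter<Person>(typeof(Person).GetProperty("Name").GetSetMethod());
        Action<Person, object> setAge =
            CreateSetter<Person>(typeof(Person).GetProperty("Age").GetSetMethod());

        Person person = new Person();
        setName(person, "Jon");
        setAge(person, 32); // boxed here, unboxed inside the wrapper

        Console.WriteLine("{0}, {1}", person.Name, person.Age); // Jon, 32
    }
}
```

As with MagicMethod, passing an argument of the wrong type fails at execution time with an InvalidCastException rather than at compile-time.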

Conclusion

This isn’t the kind of thing which I enjoy having in production code. It’s frightfully complicated – we’re finding a method via reflection, invoking a different (and generic) method via reflection in order to turn the first method into a delegate and then return a different delegate which calls it. While I don’t like having “clever” code like this in production, I take immense pleasure from getting it to work in the first place. This is one of the rare occasions where the result makes all the cleverness worth it, too – combined with the other optimisations, my Protocol Buffers port is now much, much faster – the reflection invocations are no longer a bottleneck. (We lose a little bit of efficiency by having one delegate call another, but it’s still massively quicker than using reflection.)

Regardless of the complexity involved later on, the simpler parts of this post (calling Delegate.CreateDelegate where you already know the signature, and the possibility of creating open delegates) are likely to be more widely applicable. By using a delegate instead of MethodInfo, not only are there significant performance improvements, but also a strongly typed way of calling the method. From now on, I’ll certainly be considering whether or not it might be worth using a delegate any time I use reflection.


Making the most of generic type inference

August 6, 2008 jonskeet 2 Comments

Introduction

Specifying type arguments for generic types and methods can be a pain, especially when there are multiple type parameters involved. For instance, imagine having to explicitly specify TOuter, TInner, TKey and TResult for a call to Enumerable.Join! Fortunately the compiler can work out the type arguments most of the time – but only for generic methods. It doesn’t do anything for generic types. However, all is not lost…

Overloading type names by number of type parameters

One feature of C# and .NET which isn’t used terribly often (in my experience) is the ability to use the same type name for different types – so long as they have different numbers of type parameters. This is how System.Nullable and System.Nullable<T> coexist, for example. Like all language features this is open to massive abuse if you have several types with very different semantic meanings, but when applied with discretion it can be very helpful.

We can use this to our advantage when we want to call a constructor or static method of a generic type without specifying the type parameters. The basic idea is that you create a nongeneric type (or just one with fewer type parameters) and then put a generic method in that class. The generic method in the nongeneric class then calls a member in the generic class, using the method’s type parameters as the type arguments for the generic class. (Try getting all of that right after a few drinks!) Now that you’ve got a generic method, you can use type inference to avoid having to explicitly state the type arguments. Fortunately it’s a lot simpler than it sounds…
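In its smallest form, the pattern looks something like the following sketch. Pair and its Of method are hypothetical names chosen for this illustration; they are not part of MiscUtil.

```csharp
using System;

// The generic type: creating it directly means spelling out both type arguments
public sealed class Pair<TFirst, TSecond>
{
    private readonly TFirst first;
    private readonly TSecond second;

    public Pair(TFirst first, TSecond second)
    {
        this.first = first;
        this.second = second;
    }

    public TFirst First { get { return first; } }
    public TSecond Second { get { return second; } }
}

// Same name, no type parameters: a home for a generic factory method
public static class Pair
{
    public static Pair<TFirst, TSecond> Of<TFirst, TSecond>(TFirst first, TSecond second)
    {
        // The method's type parameters become the generic type's type arguments
        return new Pair<TFirst, TSecond>(first, second);
    }
}

public static class PairDemo
{
    public static void Main()
    {
        // Without inference: both type arguments are explicit
        Pair<string, int> explicitPair = new Pair<string, int>("answer", 42);

        // With inference: the compiler works out <string, int> from the arguments
        var inferred = Pair.Of("answer", 42);

        Console.WriteLine(inferred.GetType() == explicitPair.GetType()); // True
    }
}
```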

To show you what I mean, let’s look at a bit of code from my MiscUtil library. (This code isn’t in the latest release drop, but will be in the next one – and this post provides all the important code anyway.)

Projection comparisons

I’ve found the OrderBy method in LINQ very useful, and I wanted to be able to use the same “compare using a projection” idea elsewhere. The IComparer<T> interface is used in various places in the .NET API (List<T>.Sort being an obvious example) but implementing it can be a bit tedious – even though it’s a single method. So, let’s build a ProjectionComparer type which knows how to compare two objects by applying the same projection to both of them, and then using another comparer to compare the results.

There are two types involved – the source of the projection, and the key we’re projecting it to. This naturally suggests a type with two type parameters, TSource and TKey. For instance, when projecting from a Person type to their name, we might have TSource=Person and TKey=string.

The most obvious piece of information we need to create a projection comparer is the projection itself. A delegate is the obvious way of representing this – a Func<TSource, TKey> which can be applied to each item we try to compare. We then need to know how to compare the names (e.g. case-insensitive, ordinal etc) – functionality which is provided by StringComparer in this example, and IComparer<TKey> in general. The Comparer<T>.Default property comes in handy to let us get away without specifying a comparer in many situations.

With those few design decisions, we can implement ProjectionComparer<TSource, TKey> pretty simply:

using System;
using System.Collections.Generic;

public class ProjectionComparer<TSource, TKey> : IComparer<TSource>
{
    private readonly Func<TSource, TKey> projection;
    private readonly IComparer<TKey> comparer;

    public ProjectionComparer(Func<TSource, TKey> projection)
        : this (projection, null)
    {
    }

    public ProjectionComparer(Func<TSource, TKey> projection, IComparer<TKey> comparer)
    {
        if (projection==null)
        {
            throw new ArgumentNullException("projection");
        }
        this.comparer = comparer ?? Comparer<TKey>.Default;
        this.projection = projection;
    }

    public int Compare(TSource x, TSource y)
    {
        // Don’t want to project from nullity
        if (x==null && y==null)
        {
            return 0;
        }
        if (x==null)
        {
            return -1;
        }
        if (y==null)
        {
            return 1;
        }
        return comparer.Compare(projection(x), projection(y));
    }
}

That’s functionally complete, but it’s a bit of a pain to create instances of it. Our previous example would require something like this:

var nameComparer = new ProjectionComparer<Person, string>(person => person.Name);

It’s not bad, but we can do better.

Introducing the nongeneric ProjectionComparer type

The next step is almost as simple as imagining how we want to create instances. We don’t have to use a nongeneric type with the same name as the generic type, but it keeps things consistent, and forms a simple pattern to follow at other times. So, let’s imagine being able to write this:

var nameComparer = ProjectionComparer.Create(person => person.Name);

Unfortunately we can’t quite achieve that. There’s no way for the compiler to know the type of the parameter in the lambda expression. However, we have three options we can use:

// Explicitly type the lambda expression’s parameter
var option1 = ProjectionComparer.Create((Person person) => person.Name);

// Pass in a dummy parameter of the right type
var option2 = ProjectionComparer.Create(dummyPerson, person => person.Name);

// Use a class with one generic type parameter, and infer the other
var option3 = ProjectionComparer<Person>.Create(person => person.Name);

Each of these options is just a way of telling the compiler what TSource should be. The first two are implemented in a totally nongeneric class. The third is implemented in a generic class with a type parameter for TSource but letting the compiler infer TKey. Note that we have to make this split because you can’t explicitly specify some type arguments and let the compiler infer the others. The actual code for these methods is very straightforward indeed. I haven’t included overloads where the comparer is explicitly specified, but it’s very simple to do so if required.

public static class ProjectionComparer
{
    // For option 1
    public static ProjectionComparer<TSource, TKey> Create<TSource, TKey>(Func<TSource, TKey> projection)
    {
        return new ProjectionComparer<TSource, TKey>(projection);
    }

    // For option 2
    public static ProjectionComparer<TSource, TKey> Create<TSource, TKey>(TSource ignored, Func<TSource, TKey> projection)
    {
        return new ProjectionComparer<TSource, TKey>(projection);
    }

}

// For option 3
public static class ProjectionComparer<TSource>
{
    public static ProjectionComparer<TSource, TKey> Create<TKey>(Func<TSource, TKey> projection)
    {
        return new ProjectionComparer<TSource, TKey>(projection);
    }
}
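To see the comparers in use, here is a self-contained sketch which sorts a list with List<T>.Sort. It repeats a condensed version of ProjectionComparer<TSource, TKey> so that it compiles on its own. The Person type, the sample data and the second Create overload (taking an explicit comparer, as mentioned above) are assumptions added for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type and data, used only for this illustration
public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

// Condensed version of the comparer shown earlier
public class ProjectionComparer<TSource, TKey> : IComparer<TSource>
{
    private readonly Func<TSource, TKey> projection;
    private readonly IComparer<TKey> comparer;

    public ProjectionComparer(Func<TSource, TKey> projection, IComparer<TKey> comparer)
    {
        if (projection == null)
        {
            throw new ArgumentNullException("projection");
        }
        this.projection = projection;
        this.comparer = comparer ?? Comparer<TKey>.Default;
    }

    public int Compare(TSource x, TSource y)
    {
        if (x == null && y == null) { return 0; }
        if (x == null) { return -1; }
        if (y == null) { return 1; }
        return comparer.Compare(projection(x), projection(y));
    }
}

public static class ProjectionComparer<TSource>
{
    public static ProjectionComparer<TSource, TKey> Create<TKey>(Func<TSource, TKey> projection)
    {
        return new ProjectionComparer<TSource, TKey>(projection, null);
    }

    // Assumed overload: the same idea with an explicitly specified comparer
    public static ProjectionComparer<TSource, TKey> Create<TKey>(
        Func<TSource, TKey> projection, IComparer<TKey> comparer)
    {
        return new ProjectionComparer<TSource, TKey>(projection, comparer);
    }
}

public static class SortDemo
{
    // Sorts some sample people with the given comparer and returns their names
    public static string SortedNames(IComparer<Person> comparer)
    {
        List<Person> people = new List<Person>
        {
            new Person { Name = "jon", Age = 32 },
            new Person { Name = "Holly", Age = 30 },
            new Person { Name = "Adam", Age = 40 }
        };
        people.Sort(comparer);

        List<string> names = new List<string>();
        foreach (Person person in people)
        {
            names.Add(person.Name);
        }
        return string.Join(", ", names.ToArray());
    }

    public static void Main()
    {
        // TKey is inferred as int
        Console.WriteLine(SortedNames(
            ProjectionComparer<Person>.Create(p => p.Age)));  // Holly, jon, Adam

        // TKey is inferred as string, with a case-insensitive ordinal comparison
        Console.WriteLine(SortedNames(
            ProjectionComparer<Person>.Create(p => p.Name, StringComparer.OrdinalIgnoreCase)));  // Adam, Holly, jon
    }
}
```

In both calls only TSource is written out; the key type comes from the lambda expression's return type.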

Conclusion

There’s nothing particularly difficult in this post, but it’s sometimes easy to forget that the C# compiler can help you out when it comes to filling in type arguments. Of course it only helps when you’re already providing enough information to the compiler with normal method parameters, but it’s still a nice little trick to have up your sleeve when you’re trying to make your APIs that bit more pleasant to use.