Returning from Copenhagen

I’m currently sitting in Copenhagen airport, with two hours to wait before my flight boards. It’s been a tiring day – my feet in particular are aching somewhat, and my throat is quite sort – but I’ve thoroughly enjoyed it. We started the actual talk at about 10am and finished at 4pm, with three breaks totalling about an hour – so I was speaking for five hours. You might have thought that with all that time, it would be easy to go into a lot of detail about C# 2 and 3, but there’s really too much to cover everything. Topics I remember talking about:

  • Generics (assuming everyone knew the basics – talked about static members, constraints)
  • Iterator blocks
  • Anonymous methods
  • Automatic properties
  • Anonymous types
  • Object/collection initializers (really briefly)
  • Lambda expressions
  • Extension methods
  • Query translation (at a very broad level)
  • The basics of LINQ to Objects, including streaming/buffering and deferred/immediate execution
  • Push LINQ and Protocol Buffers
  • C# 4
  • What I’d like to see in C# 5

The whole session was recorded – I’ll post a link when it’s ready, along with slides and code – if you really want to sit through 5 hours of me talking :)

Anyway, I’ve had a great time, and massive thanks to Brian Rasmussen for organising it and letting me stay in his beautiful home, as well as to Microsoft for hosting the event and Google for giving me the go-ahead to do it.

Next stop: London .NET User Group on November 19th, talking about Push LINQ (in more detail).

C# 4.0: dynamic ?

I’ve not played with the VS2010 CTP much yet, and I’ve only looked briefly at the documentation and blogs about the new C# 4.0 dynamic type, but a thought occurred to me: why not have the option of making it generic as a way of saying “I will dynamically support this set of operations”?

As an example of what I mean, suppose you have an interface IMessageRouter like this:

public interface IMessageRouter
    void Send(string message, string destination);

(This is an arbitrary example, by the way. The idea isn’t specifically more suitable for message routing than anything else.)

I may have various implementations, written in various languages (or COM) which support the Send method with those parameters. Some of those implementations actually implement IMessageRouter but some don’t. I’d like to be able to do the following:

dynamic<IMessageRouter> router = GetRouter();

// This is fine (but still invoked dynamically)
router.Send(“message”, “”);
// Compilation error: no such overload
router.Send(“message”, “”, 20);

Intellisense would work, and we’d still have some of the benefits of static typing but without the implementations having to know about your interface. Of course, it would be quite easy to create an implementation of the interface which did exactly this – but now imagine that instead of IMessageRouter we had MessageRouter – a concrete class. In this case the compiler would still restrict the caller to the public API of the class, but it wouldn’t have to be the real class. No checking would be performed by the compiler that your dynamic type actually supported the operations – given that we’re talking about dynamic invocation, that would be impossible to do. It would instead be an “opt-in” restriction the client places on themselves. It could also potentially help with performance – if the binding involved realised that the actual type of the dynamic object natively implemented the interface or was/derived from the class, then no real dynamic calls need be made; just route all directly.

This may all sound a bit fuzzy – I’m extremely sleepy, to be honest – but I think it’s a potentially interesting idea. Thoughts?


Apparently this post wasn’t as clear as it might be. I’m quite happy to keep the currently proposed dynamic type idea as well – I’d like this as an additional way of using dynamic objects.

What other Enumerable extension methods would you like to see?

A few questions on Stack Overflow have suggested to me that there might be some bits missing from LINQ to Objects. There’s the idea of a “zip” operator, which pairs up two sequences, for instance. Or the ability to apply the set operators with projections, e.g. DistinctBy, IntersectBy etc (mirroring OrderBy). They’re easy to implement, but it would be nice to get a list of what people would like to see.

So, what’s missing?

Mapping from a type to an instance of that type

A question came up on Stack Overflow yesterday which I’ve had to deal with myself before now. There are times when it’s helpful to have one value per type, and that value should be an instance of that type. To express it in pseudo-code, you want an IDictionary<typeof(T), T> except with T varying across all possible types. Indeed, this came up in Protocol Buffers at least once, I believe.

.NET generics don’t have any way of expressing this, and you end up with boxing and a cast. I decided to encapsulate this (in MiscUtil of course, although it’s not in a released version yet) so that I could have all the nastiness in a single place, leaving the client code relatively clean. The client code makes calls to generic methods which either take an instance of the type argument or return one. It’s a really simple class, but a potentially useful one:

/// <summary>
/// Map from types to instances of those types, e.g. int to 10 and
/// string to “hi” within the same dictionary. This cannot be done
/// without casting (and boxing for value types) as .NET cannot
/// represent this relationship with generics in their current form.
/// This class encapsulates the nastiness in a single place.
/// </summary>
public class DictionaryByType
    private readonly IDictionary<Type, object> dictionary = new Dictionary<Type, object>();

    /// <summary>
    /// Maps the specified type argument to the given value. If
    /// the type argument already has a value within the dictionary,
    /// ArgumentException is thrown.
    /// </summary>
    public void Add<T>(T value)
        dictionary.Add(typeof(T), value);

    /// <summary>
    /// Maps the specified type argument to the given value. If
    /// the type argument already has a value within the dictionary, it
    /// is overwritten.
    /// </summary>
    public void Put<T>(T value)
        dictionary[typeof(T)] = value;

    /// <summary>
    /// Attempts to fetch a value from the dictionary, throwing a
    /// KeyNotFoundException if the specified type argument has no
    /// entry in the dictionary.
    /// </summary>
    public T Get<T>()
        return (T) dictionary[typeof(T)];

    /// <summary>
    /// Attempts to fetch a value from the dictionary, returning false and
    /// setting the output parameter to the default value for T if it
    /// fails, or returning true and setting the output parameter to the
    /// fetched value if it succeeds.
    /// </summary>
    public bool TryGet<T>(out T value)
        object tmp;
        if (dictionary.TryGetValue(typeof(T), out tmp))
            value = (T) tmp;
            return true;
        value = default(T);
        return false;

It doesn’t implement any of the common collection interfaces, because it would have to do so in a way which exposed the nastiness. I’m tempted to make it implement IEnumerable<KeyValuePair<Type, object>> but even that’s somewhat unpleasant and unlikely to be useful. Easy to add at a later date if necessary.

(I know the XML documentation leaves something to be desired. One day I’ll learn how to really do it properly – currently I fumble around if I’m trying to refer to other types etc within the docs.)

Why boxing doesn’t keep me awake at nights

I’m currently reading the (generally excellent) CLR via C#, and I’ve recently hit the section on boxing. Why is it that authors feel they have to scaremonger about the effects boxing can have on performance?

Here’s a piece of code from the book:

using System;

public sealed class Program {
   public static void Main() {
      Int32 v = 5;   // Create an unboxed value type variable.

      // When compiling the following line, v is boxed
      // three times, wasting time and memory
      Console.WriteLine(“{0}, {1}, {2}”, v, v, v);
      // The lines below have the same result, execute
      // much faster, and use less memory
      Object o = v;

      // No boxing occurs to compile the following line.
      Console.WriteLine(“{0}, {1}, {2}”, o, o, o);

In the text afterwards, he reiterates the point:

This second version executes much faster and allocates less memory from the heap.

This seemed like an overstatement to me, so I thought I’d try it out. Here’s my test application:

using System;
using System.Diagnostics;

public class Test
    const int Iterations = 10000000;
    public static void Main()
        Stopwatch sw = Stopwatch.StartNew();
        for (int i=0; i < Iterations; i++)
            Console.WriteLine(“{0} {1} {2}”, i, i, i);           
            object o = i;
            Console.WriteLine(“{0} {1} {2}”, o, o, o);
            string s = i.ToString();
            Console.WriteLine(“{0} {1} {2}”, s, s, s);
            string.Format(“{0} {1} {2}”, i, i, i);
            object o = i;
            string.Format(“{0} {1} {2}”, o, o, o);
            string s = i.ToString();
            string.Format(“{0} {1} {2}”, s, s, s);
            string.Concat(i, ” “, i, ” “, i);
            object o = i;
            string.Concat(o, ” “, o, ” “, o);
#elif CONCAT_STRINGS           
            string s = i.ToString();
            string.Concat(s, ” “, s, ” “, s);
        Console.Error.WriteLine(“{0}ms”, sw.ElapsedMilliseconds);

I compiled the code with one symbol defined each time, with optimisations and without debug information, and ran it from a command line, writing to nul (i.e. no disk or actual console activity). Here are the results:

Symbol Results (ms) Average (ms)
FORMAT_NO_BOXING 15814 15657
CONCAT_NO_BOXING 11949 12240

So, what do we learn from this? Well, a number of things:

  • As ever, microbenchmarks like this are pretty variable. I tried to do this on a “quiet” machine, but as you can see the results varied quite a lot. (Over two seconds between best and worst for a particular configuration at times!)
  • The difference due to boxing with the original code in the book is basically inside the “noise”
  • The dominant factor of the statement is writing to the console, even when it’s not actually writing to anything real
  • The next most important factor is whether we convert to string once or three times
  • The next most important factor is whether we use String.Format or Concat
  • The least important factor is boxing

Now I don’t want anyone to misunderstand me – I agree that boxing is less efficient than not boxing, where there’s a choice. Sometimes (as here, in my view) the “more efficient” code is slightly less readable – and the efficiency benefit is often negligible compared with other factors. Exactly the same thing happened in Accelerated C# 2008, where a call to Math.Pow(x, 2) was the dominant factor in a program again designed to show the efficiency of avoiding boxing.

The performance scare of boxing is akin to that of exceptions, although I suppose it’s more likely that boxing could cause a real performance concern in an otherwise-well-designed program. It used to be a much more common issue, of course, before generics gave us collections which don’t require boxing/unboxing to add/fetch data.

In short: yes, boxing has a cost. But please look at it in context, and if you’re going to start making claims about how much faster code will run when it avoids boxing, at least provide an example where it actually contributes significantly to the overall execution cost.

DotNetRocks interview

Last Monday evening I had a chat with the guys from DotNetRocks, and today the show has gone live.

I wouldn’t claim to have said anything particularly earth-shattering, and regular readers will probably be familiar with many of the themes anyway, but I thoroughly enjoyed it and hope you will too. Amongst other things, we talked about:

  • Protocol buffers
  • Implicit typing and anonymous types
  • Why it doesn’t bother me that Office hasn’t been ported to .NET
  • C# 4
  • My wishlist for C#
  • Threading and Parallel Extensions
  • Working for Google
  • How to learn LINQ
  • C# in Depth

Feedback welcome. And yes, I know I sound somewhat like a stereotypical upper-class idiot at times. Unfortunately there’s not a lot I can do about that. Only the “idiot” part is accurate :)