Category Archives: CSharpDevCenter

The Snippy Reflector add-in

Those of you who’ve read C# in Depth will know about Snippy – a little tool which makes it easy to build complete programs from small snippets of code.

I’m delighted to say that reader Jason Haley has taken the source code for Snippy and built an add-in for Reflector. This will make it much simpler to answer questions like this one about struct initialization, where you really want to see the IL generated for a snippet. Here’s a screenshot to show what it does:

This is really cool – if you want to dabble to see what the C# compiler does in particular situations, check it out. It comes as just the DLL, or a zipped version. Thanks for putting in all this work, Jason :)

Update: Jason now has his own (more detailed) blog entry too.

Copenhagen C# talk videos now up

The videos from my one day talk about C# in Copenhagen are now on the MSDN community site. There are eight sessions, varying between about 25 minutes and 50 minutes in length. I haven’t had time to watch them yet, but when I do I’ll submit brief summaries so you can quickly get to the bits you’re most interested in. (As far as I’m aware, they’re only available via Silverlight, which I realise isn’t going to be convenient for everyone.)

Feedback is very welcome.

.NET 4.0’s game-changing feature? Maybe contracts…

Update: As Chris Nahr pointed out, there’s a blog post by Melitta Andersen of the BCL team explaining this in more detail.

Obviously I’ve been looking at the proposed C# 4.0 features pretty carefully, and I promise I’ll blog more about them at some later date – but yesterday I watched a PDC video which blew me away.

As ever, a new version of .NET means more than just language changes – Justin van Patten has written an excellent blog post about what to expect in the core of the framework. There are nice things in there – tuples and BigInteger, for example – but it was code contracts that really caught my eye.

Remember Spec#? Well, as far as I can tell the team behind it realised that people don’t really want to have to learn a new language – but if the goodness of Design By Contract can be put into a library, then everyone can use it. Enter CodeContracts.

Actual examples are relatively few and far between at the moment, but the basic idea is that you write your contracts at the start of methods – not in attributes, presumably because that’s too limiting in terms of what you can express – and then a post-build tool will “understand” those contracts, find potential issues, and do a bit of code-rewriting where appropriate (e.g. to move the post-condition testing to the end points of the method). Object invariants can also be expressed as separate methods.

Rather than guess at the syntax in this blog post, I highly recommend you watch the PDC 2008 video on both this and Pex (an intelligent code explorer and test generator). The teams have clearly thought through a lot of significant issues:

  • Contracts can be enforced at runtime, or stripped out for release builds. (I’ll be interested to see whether I can keep the pre-condition checks in the release build, just removing invariants and post-conditions etc.)
  • If you’re stripping out the contracts, you can still have them in a separate assembly – so if you supply a library to someone, they can still have all the Design by Contract goodness available, and see where they’re potentially violating your preconditions.
  • Contracts will be automatically documented in the generated XML documentation file (although this has yet to be implemented, I believe)
  • Interfaces can be associated with contract classes where the contracts are expressed. (They couldn’t be in the interface, as they require method bodies.)
  • Pex will be able to generate tests in MS Test, NUnit and MbUnit. (Hooray! This got a massive cheer at PDC.)

Now I should point out that I haven’t tried any of this – I’ve just watched a video which was very slick and obviously used a well-tested scenario. If this genuinely works, however, I think it could change the way mainstream developers approach coding just as LINQ is changing the way we see data. (Obviously there’s nothing fundamentally new about DbC – but there’s a difference between it existing and it being mainstream.)
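
Just to give a flavour of the general shape (definitely not a claim about the final syntax, and I haven’t compiled a line of it), I’d expect a simple contract to end up looking something like this, with a completely made-up Account class:

// Sketch only – my guess based on the talk, not real tested code
using System.Diagnostics.Contracts;

public class Account
{
    private decimal balance;

    public void Deposit(decimal amount)
    {
        // Precondition, written at the start of the method
        Contract.Requires(amount > 0);
        // Postcondition; the rewriter moves the check to the method's exit points
        Contract.Ensures(balance == Contract.OldValue(balance) + amount);

        balance += amount;
    }

    // Object invariant, expressed as a separate method
    [ContractInvariantMethod]
    private void ObjectInvariant()
    {
        Contract.Invariant(balance >= 0);
    }
}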

I’m really, really excited about this :) Definitely time to boot up the VPC image when I get a moment…

C# 4.0: dynamic<T>?

I’ve not played with the VS2010 CTP much yet, and I’ve only looked briefly at the documentation and blogs about the new C# 4.0 dynamic type, but a thought occurred to me: why not have the option of making it generic as a way of saying “I will dynamically support this set of operations”?

As an example of what I mean, suppose you have an interface IMessageRouter like this:

public interface IMessageRouter
{
    void Send(string message, string destination);
}

(This is an arbitrary example, by the way. The idea isn’t specifically more suitable for message routing than anything else.)

I may have various implementations, written in various languages (or COM) which support the Send method with those parameters. Some of those implementations actually implement IMessageRouter but some don’t. I’d like to be able to do the following:

dynamic<IMessageRouter> router = GetRouter();

// This is fine (but still invoked dynamically)
router.Send("message", "skeet@pobox.com");
// Compilation error: no such overload
router.Send("message", "skeet@pobox.com", 20);

Intellisense would work, and we’d still have some of the benefits of static typing, but without the implementations having to know about your interface. Of course, it would be quite easy to create an implementation of the interface which did exactly this – but now imagine that instead of IMessageRouter we had MessageRouter, a concrete class. In that case the compiler would still restrict the caller to the public API of the class, but it wouldn’t have to be the real class. No checking would be performed by the compiler that your dynamic type actually supported the operations – given that we’re talking about dynamic invocation, that would be impossible to do. It would instead be an “opt-in” restriction the client places on themselves. It could also potentially help with performance – if the binding involved realised that the actual type of the dynamic object natively implemented the interface, or was (or derived from) the class, then no real dynamic calls would need to be made; everything could just be routed directly.
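
To show the kind of hand-written wrapper I mean, here’s a sketch of an IMessageRouter implementation which just delegates to a dynamic target (DynamicMessageRouter is a made-up name, and it obviously relies on the proposed dynamic support being present):

// Sketch only: a hand-written wrapper forwarding calls dynamically
public class DynamicMessageRouter : IMessageRouter
{
    private readonly dynamic target;

    public DynamicMessageRouter(object target)
    {
        this.target = target;
    }

    public void Send(string message, string destination)
    {
        // Bound at execution time; this will blow up at that point
        // if the target has no suitable Send method
        target.Send(message, destination);
    }
}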

This may all sound a bit fuzzy – I’m extremely sleepy, to be honest – but I think it’s a potentially interesting idea. Thoughts?

Update

Apparently this post wasn’t as clear as it might be. I’m quite happy to keep the currently proposed dynamic type idea as well – I’d like this as an additional way of using dynamic objects.

Mapping from a type to an instance of that type

A question came up on Stack Overflow yesterday which I’ve had to deal with myself before now. There are times when it’s helpful to have one value per type, and that value should be an instance of that type. To express it in pseudo-code, you want an IDictionary<typeof(T), T> except with T varying across all possible types. Indeed, this came up in Protocol Buffers at least once, I believe.

.NET generics don’t have any way of expressing this, and you end up with boxing and a cast. I decided to encapsulate this (in MiscUtil of course, although it’s not in a released version yet) so that I could have all the nastiness in a single place, leaving the client code relatively clean. The client code makes calls to generic methods which either take an instance of the type argument or return one. It’s a really simple class, but a potentially useful one:

/// <summary>
/// Map from types to instances of those types, e.g. int to 10 and
/// string to "hi" within the same dictionary. This cannot be done
/// without casting (and boxing for value types) as .NET cannot
/// represent this relationship with generics in their current form.
/// This class encapsulates the nastiness in a single place.
/// </summary>
public class DictionaryByType
{
    private readonly IDictionary<Type, object> dictionary = new Dictionary<Type, object>();

    /// <summary>
    /// Maps the specified type argument to the given value. If
    /// the type argument already has a value within the dictionary,
    /// ArgumentException is thrown.
    /// </summary>
    public void Add<T>(T value)
    {
        dictionary.Add(typeof(T), value);
    }

    /// <summary>
    /// Maps the specified type argument to the given value. If
    /// the type argument already has a value within the dictionary, it
    /// is overwritten.
    /// </summary>
    public void Put<T>(T value)
    {
        dictionary[typeof(T)] = value;
    }

    /// <summary>
    /// Attempts to fetch a value from the dictionary, throwing a
    /// KeyNotFoundException if the specified type argument has no
    /// entry in the dictionary.
    /// </summary>
    public T Get<T>()
    {
        return (T) dictionary[typeof(T)];
    }

    /// <summary>
    /// Attempts to fetch a value from the dictionary, returning false and
    /// setting the output parameter to the default value for T if it
    /// fails, or returning true and setting the output parameter to the
    /// fetched value if it succeeds.
    /// </summary>
    public bool TryGet<T>(out T value)
    {
        object tmp;
        if (dictionary.TryGetValue(typeof(T), out tmp))
        {
            value = (T) tmp;
            return true;
        }
        value = default(T);
        return false;
    }
}
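
Client code then looks something like this (just a usage sketch, reusing the examples from the documentation comment):

DictionaryByType dictionary = new DictionaryByType();
dictionary.Add(10);        // Maps typeof(int) to 10
dictionary.Add("hi");      // Maps typeof(string) to "hi"

int number = dictionary.Get<int>();       // 10
string text = dictionary.Get<string>();   // "hi"

DateTime when;
bool found = dictionary.TryGet(out when); // false; when is default(DateTime)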

It doesn’t implement any of the common collection interfaces, because it would have to do so in a way which exposed the nastiness. I’m tempted to make it implement IEnumerable<KeyValuePair<Type, object>> but even that’s somewhat unpleasant and unlikely to be useful. Easy to add at a later date if necessary.

(I know the XML documentation leaves something to be desired. One day I’ll learn how to really do it properly – currently I fumble around if I’m trying to refer to other types etc within the docs.)

Why boxing doesn’t keep me awake at nights

I’m currently reading the (generally excellent) CLR via C#, and I’ve recently hit the section on boxing. Why is it that authors feel they have to scaremonger about the effects boxing can have on performance?

Here’s a piece of code from the book:

using System;

public sealed class Program {
   public static void Main() {
      Int32 v = 5;   // Create an unboxed value type variable.

#if INEFFICIENT
      // When compiling the following line, v is boxed
      // three times, wasting time and memory
      Console.WriteLine("{0}, {1}, {2}", v, v, v);
#else
      // The lines below have the same result, execute
      // much faster, and use less memory
      Object o = v;

      // No boxing occurs to compile the following line.
      Console.WriteLine("{0}, {1}, {2}", o, o, o);
#endif
   }
}

In the text afterwards, he reiterates the point:

This second version executes much faster and allocates less memory from the heap.

This seemed like an overstatement to me, so I thought I’d try it out. Here’s my test application:

using System;
using System.Diagnostics;

public class Test
{
    const int Iterations = 10000000;
   
    public static void Main()
    {
        Stopwatch sw = Stopwatch.StartNew();
        for (int i=0; i < Iterations; i++)
        {
#if CONSOLE_WITH_BOXING
            Console.WriteLine("{0} {1} {2}", i, i, i);
#elif CONSOLE_NO_BOXING
            object o = i;
            Console.WriteLine("{0} {1} {2}", o, o, o);
#elif CONSOLE_STRINGS
            string s = i.ToString();
            Console.WriteLine("{0} {1} {2}", s, s, s);
#elif FORMAT_WITH_BOXING
            string.Format("{0} {1} {2}", i, i, i);
#elif FORMAT_NO_BOXING
            object o = i;
            string.Format("{0} {1} {2}", o, o, o);
#elif FORMAT_STRINGS
            string s = i.ToString();
            string.Format("{0} {1} {2}", s, s, s);
#elif CONCAT_WITH_BOXING
            string.Concat(i, " ", i, " ", i);
#elif CONCAT_NO_BOXING
            object o = i;
            string.Concat(o, " ", o, " ", o);
#elif CONCAT_STRINGS
            string s = i.ToString();
            string.Concat(s, " ", s, " ", s);
#endif
        }
        sw.Stop();
        Console.Error.WriteLine("{0}ms", sw.ElapsedMilliseconds);
    }
}
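
For what it’s worth, each configuration was built and run with command lines along these lines (reconstructed from memory, so treat the exact switches as approximate):

csc /optimize+ /debug- /define:CONSOLE_WITH_BOXING Test.cs
Test > nul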

I compiled the code with one symbol defined each time, with optimisations and without debug information, and ran it from a command line, writing to nul (i.e. no disk or actual console activity). Here are the results:

Symbol               Results (ms)           Average (ms)
CONSOLE_WITH_BOXING  33054, 33898, 33381    33444
CONSOLE_NO_BOXING    34638, 32423, 33294    33451
CONSOLE_STRINGS      29259, 29071, 26683    28337
FORMAT_WITH_BOXING   17143, 18100, 16389    17210
FORMAT_NO_BOXING     15814, 15936, 15222    15657
FORMAT_STRINGS       9178, 9077, 8742       8999
CONCAT_WITH_BOXING   12056, 14304, 11329    12563
CONCAT_NO_BOXING     11949, 13145, 11628    12240
CONCAT_STRINGS       5833, 6263, 5713       5936

So, what do we learn from this? Well, a number of things:

  • As ever, microbenchmarks like this are pretty variable. I tried to do this on a “quiet” machine, but as you can see the results varied quite a lot. (Over two seconds between best and worst for a particular configuration at times!)
  • The difference due to boxing with the original code in the book is basically inside the “noise”
  • The dominant factor of the statement is writing to the console, even when it’s not actually writing to anything real
  • The next most important factor is whether we convert to string once or three times
  • The next most important factor is whether we use String.Format or Concat
  • The least important factor is boxing

Now I don’t want anyone to misunderstand me – I agree that boxing is less efficient than not boxing, where there’s a choice. Sometimes (as here, in my view) the “more efficient” code is slightly less readable – and the efficiency benefit is often negligible compared with other factors. Exactly the same thing happened in Accelerated C# 2008, where a call to Math.Pow(x, 2) was the dominant factor in a program again designed to show the efficiency of avoiding boxing.

The performance scare of boxing is akin to that of exceptions, although I suppose it’s more likely that boxing could cause a real performance concern in an otherwise-well-designed program. It used to be a much more common issue, of course, before generics gave us collections which don’t require boxing/unboxing to add/fetch data.

In short: yes, boxing has a cost. But please look at it in context, and if you’re going to start making claims about how much faster code will run when it avoids boxing, at least provide an example where it actually contributes significantly to the overall execution cost.