Category Archives: Evil code

“Magic” null argument testing

Warning: here be dragons. I don’t think this is the right way to check for null arguments, but it was an intriguing idea.

Today on Stack Overflow, I answered a question about checking null arguments. The questioner was already using an extension similar to my own one in MiscUtil, allowing code like this:

public void DoSomething(string name)
{
    name.ThrowIfNull("name");

    // Normal code here
}

That’s all very well, but it’s annoying to have to repeat the name part. Now in an ideal world, I’d say it would be nice to add an attribute to the parameter and have the check performed automatically (and when PostSharp works with .NET 4.0, I’m going to give that a go, mixing Code Contracts and AOP…) – but for the moment, how far can we go with extension methods?

I stand by my answer from that question – the code above is the simplest way to achieve the goal for the moment… but another answer raised the interesting prospect of combining anonymous types, extension methods, generics, reflection and manually-created expression trees. Now that’s a recipe for hideous code… but it actually works.

The idea is to allow code like this:

public void DoSomething(string name, string canBeNull, int foo, Stream input)
{
    new { name, input }.CheckNotNull();

    // Normal code here
}

That should check name and input, in that order, and throw an appropriate ArgumentNullException – including parameter name – if one of them is null. It uses the fact that projection initializers in anonymous types use the primary expression’s name as the property name in the generated type, and the value of that expression ends up in the instance. Therefore, given an instance of the anonymous type initializer like the above, we have both the name and value despite having only typed it in once.

Now obviously this could be done with normal reflection – but that we be slow as heck. No, we want to effectively find the properties once, and generate strongly typed delegates to perform the property access. That sounds like a job for Delegate.CreateDelegate, but it’s not quite that simple… to create the delegate, we’d need to know (at compile time) what the property type is. We could do that with another generic type, but we can do better than that. All we really need to know about the value is whether or not it’s null. So given a "container" type T, we’d like a bunch of delegates, one for each property, returning whether that property is null for a specified instance – i.e. a Func<T, bool>. And how do we build delegates at execution time with custom logic? We use expression trees…

I’ve now implemented this, along with a brief set of unit tests. The irony is that the tests took longer than the implementation (which isn’t very unusual) – and so did writing it up in this blog post. I’m not saying that it couldn’t be improved (and indeed in .NET 4.0 I could probably make the delegate throw the relevant exception itself) but it works! I haven’t benchmarked it, but I’d expect it to be nearly as fast as manual tests – insignificant in methods that do real work. (The same wouldn’t be true using reflection every time, of course.)

The full project including test cases is now available, but here’s the (almost completely uncommented) "production" code.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using System.Linq.Expressions;

public static class Extensions
{
    public static void CheckNotNull<T>(this T container) where T : class
    {
        if (container == null)
        {
            throw new ArgumentNullException("container");
        }
        NullChecker<T>.Check(container);
    }

    private static class NullChecker<T> where T : class
    {
        private static readonly List<Func<T, bool>> checkers;
        private static readonly List<string> names;

        static NullChecker()
        {
            checkers = new List<Func<T, bool>>();
            names = new List<string>();
            // We can’t rely on the order of the properties, but we
            // can rely on the order of the constructor parameters
            // in an anonymous type – and that there’ll only be
            // one constructor.
            foreach (string name in typeof(T).GetConstructors()[0]
                                             .GetParameters()
                                             .Select(p => p.Name))
            {
                names.Add(name);
                PropertyInfo property = typeof(T).GetProperty(name);
                // I’ve omitted a lot of error checking, but here’s
                // at least one bit…
                if (property.PropertyType.IsValueType)
                {
                    throw new ArgumentException
                        ("Property " + property + " is a value type");
                }
                ParameterExpression param = Expression.Parameter(typeof(T), "container");
                Expression propertyAccess = Expression.Property(param, property);
                Expression nullValue = Expression.Constant(null, property.PropertyType);
                Expression equality = Expression.Equal(propertyAccess, nullValue);
                var lambda = Expression.Lambda<Func<T, bool>>(equality, param);
                checkers.Add(lambda.Compile());
            }
        }

        internal static void Check(T item)
        {
            for (int i = 0; i < checkers.Count; i++)
            {
                if (checkers[i](item))
                {
                    throw new ArgumentNullException(names[i]);
                }
            }
        }
    }
}

Oh, and just as a miracle – the expression tree worked first time. I’m no Marc Gravell, but I’m clearly improving :)

Update: Marc Gravell pointed out that the order of the results of Type.GetProperties isn’t guaranteed – something I should have remembered myself. However, the order of the constructor parameters will be the same as in the anonymous type initialization expression, so I’ve updated the code above to reflect that. Marc also showed how it could almost all be put into a single expression tree which returns either null (for no error) or the name of the "failing" parameter. Very clever :)

API design: choosing between non-ideal options

So, UnconstrainedMelody is coming on quite nicely. It now has quite a few useful options for flags enums, "normal enums" and delegates. However, there are two conflicting limitations which leave a couple of options. (Other related answers on Stack Overflow have suggested alternative approaches, basically.)

Currently, most of the enums code is in two classes: Flags and Enums. Both are non-generic: the methods within them are generic methods, so they have type parameters (and constraints). The main benefit of this is that generic type inference only applies to generic methods, and I definitely want that for extension methods and anywhere else it makes sense.

The drawback is that properties can’t be generic. That means my API is entirely expressed in terms of methods, which can be a pain. The option to work around this is to have a generic type which properties in. This adds confusion and guesswork – what call is where?

To recap, the options are:

// Option 1 (current): all methods in a nongeneric class:
// Some calls which are logically properties end up
// as methods…
IList<Foo> foos = Enums.GetValues<Foo>();
// Type infererence for extenion methods
// Note that we couldn’t have a Description property
// as we don’t have extension properties
string firstDescription = foos[0].GetDescription();
        
// Option 2: Use just a generic type:
// Now we can use a property…
IList<Foo> foos = Enums<Foo>.Values;
// But we can’t use type inference
string firstDescription = Enums<Foo>.GetDescription(foos[0]);
        
// Option 3: Use a mixture (Enums and Enums<T>):
IList<Foo> foos = Enums<Foo>.Values;
// All looks good…
string firstDescription = foos[0].GetDescription();
// … but the user has to know when to use which class

All of these are somewhat annoying. If we only put extension methods into the nongeneric class, then I guess users would never need to really think about that – they’d pretty much always be calling the methods via the extension method syntactic sugar anyway. It still feels like a pretty arbitrary split though.

Any thoughts? Which is more important – conceptual complexity, or the idiomatic client code you end up with once that complexity has been mastered? Is it reasonable to make design decisions like this around what is essentially a single piece of syntactic sugar (extension methods)?

(By the way, if anyone ever wanted justification for extension properties, I think this is a good example… Description feels like it really should be a property.)

Evil Code of the Day: variance and overloading

(Note that this kind of breakage was mentioned a long time ago in Eric Lippert’s blog, although not in this exact form.)

Whenever a conversion becomes available where it wasn’t before, overload resolution can change its behaviour. From C# 1 to C# 2 this happened due to delegate variance with method group conversions – now the same thing is true for generic variance for interfaces.

What does the following code print?

using System;
using System.Collections.Generic;

class Base
{
    public void Foo(IEnumerable<string> strings)
    {
        Console.WriteLine(“Strings”);
    }
}

class Derived : Base
{
    public void Foo(IEnumerable<object> objects)
    {
        Console.WriteLine(“Objects”);
    }
}

class Test
{
    static void Main()
    {
        List<string> strings = new List<string>();
        new Derived().Foo(strings);
    }
}

The correct answer is “it depends on which version of C# and .NET framework you’re using.”

If you’re using C# 4.0 and .NET 4.0, then IEnumerable<T> is covariant: there’s an implicit conversion from IEnumerable<string> to IEnumerable<object>, so the derived overload is used.

If you’re using C# 4.0 but .NET 3.5 or earlier then the compiler still knows about variance in general, but the interface in the framework doesn’t have the appropriate metadata to indicate it, so there’s no conversion available, and the base class overload is used.

If you’re using C# 3.0 or earlier then the compiler doesn’t know about generic variance at all, so again the base class overload is used.

So, this is a breaking change, and a fairly subtle one at that – and unlike the method group conversion in .NET 2.0, the compiler in .NET 4.0 beta 1 doesn’t issue a warning about it. I’ll edit this post when there’s an appropriate Connect ticket about it…

In general though, I’d say it’s worth avoiding overloading a method declared in a base class unless you really have to. In particular, overloading it using the same number of parameters but more general ones seems to be a recipe for unreadable code.