“Magic” null argument testing

Warning: here be dragons. I don’t think this is the right way to check for null arguments, but it was an intriguing idea.

Today on Stack Overflow, I answered a question about checking null arguments. The questioner was already using an extension similar to my own one in MiscUtil, allowing code like this:

public void DoSomething(string name)
{
    name.ThrowIfNull("name");

    // Normal code here
}

That’s all very well, but it’s annoying to have to repeat the name part. Now in an ideal world, I’d say it would be nice to add an attribute to the parameter and have the check performed automatically (and when PostSharp works with .NET 4.0, I’m going to give that a go, mixing Code Contracts and AOP…) – but for the moment, how far can we go with extension methods?

I stand by my answer from that question – the code above is the simplest way to achieve the goal for the moment… but another answer raised the interesting prospect of combining anonymous types, extension methods, generics, reflection and manually-created expression trees. Now that’s a recipe for hideous code… but it actually works.

The idea is to allow code like this:

public void DoSomething(string name, string canBeNull, int foo, Stream input)
{
    new { name, input }.CheckNotNull();

    // Normal code here
}

That should check name and input, in that order, and throw an appropriate ArgumentNullException – including parameter name – if one of them is null. It uses the fact that projection initializers in anonymous types use the primary expression’s name as the property name in the generated type, and the value of that expression ends up in the instance. Therefore, given an instance of the anonymous type initializer like the above, we have both the name and value despite having only typed it in once.

Now obviously this could be done with normal reflection – but that we be slow as heck. No, we want to effectively find the properties once, and generate strongly typed delegates to perform the property access. That sounds like a job for Delegate.CreateDelegate, but it’s not quite that simple… to create the delegate, we’d need to know (at compile time) what the property type is. We could do that with another generic type, but we can do better than that. All we really need to know about the value is whether or not it’s null. So given a "container" type T, we’d like a bunch of delegates, one for each property, returning whether that property is null for a specified instance – i.e. a Func<T, bool>. And how do we build delegates at execution time with custom logic? We use expression trees…

I’ve now implemented this, along with a brief set of unit tests. The irony is that the tests took longer than the implementation (which isn’t very unusual) – and so did writing it up in this blog post. I’m not saying that it couldn’t be improved (and indeed in .NET 4.0 I could probably make the delegate throw the relevant exception itself) but it works! I haven’t benchmarked it, but I’d expect it to be nearly as fast as manual tests – insignificant in methods that do real work. (The same wouldn’t be true using reflection every time, of course.)

The full project including test cases is now available, but here’s the (almost completely uncommented) "production" code.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using System.Linq.Expressions;

public static class Extensions
{
    public static void CheckNotNull<T>(this T container) where T : class
    {
        if (container == null)
        {
            throw new ArgumentNullException("container");
        }
        NullChecker<T>.Check(container);
    }

    private static class NullChecker<T> where T : class
    {
        private static readonly List<Func<T, bool>> checkers;
        private static readonly List<string> names;

        static NullChecker()
        {
            checkers = new List<Func<T, bool>>();
            names = new List<string>();
            // We can’t rely on the order of the properties, but we
            // can rely on the order of the constructor parameters
            // in an anonymous type – and that there’ll only be
            // one constructor.
            foreach (string name in typeof(T).GetConstructors()[0]
                                             .GetParameters()
                                             .Select(p => p.Name))
            {
                names.Add(name);
                PropertyInfo property = typeof(T).GetProperty(name);
                // I’ve omitted a lot of error checking, but here’s
                // at least one bit…
                if (property.PropertyType.IsValueType)
                {
                    throw new ArgumentException
                        ("Property " + property + " is a value type");
                }
                ParameterExpression param = Expression.Parameter(typeof(T), "container");
                Expression propertyAccess = Expression.Property(param, property);
                Expression nullValue = Expression.Constant(null, property.PropertyType);
                Expression equality = Expression.Equal(propertyAccess, nullValue);
                var lambda = Expression.Lambda<Func<T, bool>>(equality, param);
                checkers.Add(lambda.Compile());
            }
        }

        internal static void Check(T item)
        {
            for (int i = 0; i < checkers.Count; i++)
            {
                if (checkers[i](item))
                {
                    throw new ArgumentNullException(names[i]);
                }
            }
        }
    }
}

Oh, and just as a miracle – the expression tree worked first time. I’m no Marc Gravell, but I’m clearly improving :)

Update: Marc Gravell pointed out that the order of the results of Type.GetProperties isn’t guaranteed – something I should have remembered myself. However, the order of the constructor parameters will be the same as in the anonymous type initialization expression, so I’ve updated the code above to reflect that. Marc also showed how it could almost all be put into a single expression tree which returns either null (for no error) or the name of the "failing" parameter. Very clever :)

51 thoughts on ““Magic” null argument testing”

  1. @Jonathan: You’re more than welcome to it should you wish, but I take *no* responsibility for it :)

    @Kirill: That still has duplication, and I suspect it’s less efficient. Interesting though.

    @James: Yes, I could use Predicate. I tend to think in terms of Func and Action these days though, just as a matter of habit.

    Like

  2. Clever idea, but why not just:

    // ASSUME: T is an anonymous type (properties all public; no indexers)
    private static class NullChecker where T : class
    {
    private static readonly List properties;

    static NullChecker()
    {
    foreach (PropertyInfo property in typeof(T).GetProperties())
    {
    if (property.PropertyType.IsValueType)
    throw new ArgumentException(“Property ” + property + ” is a value type”);

    // Omitted access tests; indexed properties etc …
    properties.Add(property);
    }
    }

    internal static void Check(T item)
    {
    foreach (PropertyInfo property in properties)
    {
    if (property.GetValue(item, null) == null)
    throw new ArgumentNullException(property.Name);
    }
    }
    }

    Like

  3. @James using Func is a better idea because delegates do not exhibit strutural type compatibility and, since 3.0, more code talks Func than Predicate (especially Linq).

    @jon – shame it allocates the container every time. It is the one reason not to use it for me compared to the MiscUtils one.

    Like

  4. @Jacob: That will use reflection to access the property every time – much, *much* slower than a compiled delegate.

    @Shuggy: Yes, it’s a shame it allocates the container on every call – but I don’t expect that’s particularly expensive. I’m just not keen on it as a horrible abuse of anonymous types… although I could be won round ;)

    Like

  5. Interesting ideas. A thought; rather than build lots of delegates – build one that finds the first bad argument. I don’t know how well this will paste, but…

    private static class NullChecker where T : class
    {
    private static readonly Func nullSeeker;

    static NullChecker()
    {
    Expression body = Expression.Constant(null, typeof(string));
    var param = Expression.Parameter(typeof(T), “obj”);
    foreach (PropertyInfo property in typeof(T).GetProperties())
    {
    Type propType = property.PropertyType;
    if (propType.IsValueType && Nullable.GetUnderlyingType(propType) == null)
    {
    continue; // can’t be null
    }
    body = Expression.Condition(
    Expression.Equal(
    Expression.Property(param, property),
    Expression.Constant(null, propType)),
    Expression.Constant(property.Name, typeof(string)),
    body);
    }
    nullSeeker = Expression.Lambda<Func>(body, param).Compile();
    }

    internal static void Check(T item)
    {
    string nullArg = nullSeeker(item);
    if (!string.IsNullOrEmpty(nullArg))
    {
    throw new ArgumentNullException(nullArg);
    }
    }
    }

    Like

  6. I like it. Still EVIL though.

    These days I’m still happy with the name.ThrowIfNull(“name”); approach, together with a R# live template to automagically sync the parameter name. I’m also using an InvokerParameterAttribute that R# understands and, again automagically, changes the string every time I use the rename refactoring on the parameter.

    Like

  7. @Marc: Trust you to make it more complicated ;) Unless I’m mistaken, that will reverse the order of the tests – I’d want to call .Reverse() on the result of GetProperties to make it pass my current unit tests. Very nice idea though…

    Like

  8. @Marc: You’re right! How could I have forgotten that? No matter – it’s easily remedied. I just need to take the constructor parameters, which *will* be in order, and get their names instead :) Will edit the code…

    Like

  9. Sorry, but this is taking really simple stuff too far. C# provides this handy keyword called ‘null’. Using it, you can write the following simple, efficient, and easy to understand code:

    if (arg == null) throw new ArgumentNullException();

    Despite how simple this solution is, I agree that it’s a good idea to wrap it in 40 layers of abstraction for no reason whatsoever.

    Like

  10. @Z: To do your callers a favour, you should specify the parameter name as an argument to ArgumentNullException. Doing so introduces redundancy, which means you can get out of sync. Furthermore, many coding conventions (including the one I use) really don’t like “if” without braces – so your code becomes four lines per parameter. When you’re validating four parameters, that’s a big chunk of code.

    So while I do think this is ugly in various ways, and I prefer a simple wrapper of arg.ThrowIfNull(“arg”), I dispute your claim that there is “no reason whatsoever”.

    Like

  11. There’s another benefit to the anonymous type — the call is guaranteed to bind to an extension method.

    Whereas when I slip a secret virtual ThrowIfNull method into some random Noda class (that is a community effort, right?), you’ll suddenly be throwing NullReferenceException from your unit tests instead of ArgumentNullException.

    Like

  12. While this is indeed cool, from a code maintainability standpoint, this is a disaster.

    What’s wrong with a tiny function that checks to see if the parameter is null that takes the parameter name? Is it wasteful typing? Maybe. Is the wasted typing static information? Of course. Can some dude/dudette fresh from college figure out what is going on? Absolutely.

    I don’t have a problem with complex solutions to complex problems, however, this is hardly a complex problem.

    Like

  13. @Robert: As I’ve mentioned many times, I’d take the trade-off of the ThrowIfNull extension method with the explicit parameter name for the moment – although I really do want to get attributes working with Code Contracts soon.

    The bit which pains me is that without extra tools like ReSharper, renaming the parameter is going to turn the line of code into a bug: the exception thrown will have the wrong parameter name. Is that a serious bug? No – but I think it’s reasonable to feel it as a pain point, and look for ways round it.

    I don’t think this is a *good* way round it, although it’s been a lot of fun to play with. I would like there to be a *better* way round it. I thought it would be worth blogging about it though – with the caveats of course.

    Like

  14. I don’t like the name “CheckNotNull”. It indicates that it’ll check, and the result is not clear. “ThrowIfNull” is better because the result of a null is clearer – an exception (type not known, but at least you know you have to catch it).

    I’m not suggesting “ThrowArgumentNullExceptionIfNull()” but something that indicates there is an expectation rather than just an inspection (check). Maybe “MustNotBeNull”?

    Like

  15. Don’t forget that your assembly (and thus the amount of compiled machine code) can get considerably bigger when using anonymous types throughout the code base like this. In the long run this can negatively impact performance.

    Like

  16. @Steven: Yes, that’s true.

    @Bernhard: This was more about the technology than making it ready for real use.

    @Steven: That still has the duplication of the parameter name as a string as far as I can see. If I don’t care about that, I’ll just use foo.ThrowIfNull(“foo”).

    Like

  17. @Jocob,Marc: Since when can static classes have generic type parameters?

    I made something similar, using expressions.

    At least this was you can ‘print’ the expression on failure, and have some info directly at your disposal.

    Like

  18. @leppie: The trouble is that evaluating the expression tree is likely to be a lot more expensive than my solution here.

    Oh, and static classes have always been able to be generic, as far as I’m aware :)

    Like

  19. @jon: Ah right, thanks :) I was thinking about static classes cant be generic if they contain extension methods. error CS1106: Extension methods must be defined in a non-generic static class.

    You can use expressions in DEBUG mode, and switch to delegates for RELEASE mode, if the speed (or disclosure of sensitive information) bothers you. Or just turn it off if you are confident the ‘condition’ cannot be violated.

    Like

  20. It’s fine. One of those ‘syntax brilliance/abusing something’ thingies. Going to use it just to piss off everyone around me. ^^

    Like

  21. @jon: Not so. I mean this.

    CheckNoNull(
    #if DEBUG
    Expression
    #else
    Action
    #endif
    action) { … }

    Or decorate with ConditionalAttribute if return type is void, and let the compiler sort it out.

    Unless you mean in the source code (not the built assembly)…

    Like

  22. @leppie: I’m not really with you – the point is if you want to still get argument checking in a release build (which I would) but make it fast (which I would) then there’s no benefit I can see in using an expression tree solution as well.

    Like

  23. Can someone else confirm that the Magic Null testing is much way longer than a traditional check?

    I’ve made a little benchmark where I do 100000000 calls to 2 methods with the 3 same parameters: 2 strings and 1 stream. The first method does a normal check whereas the second uses the Magic one.

    It takes around 700ms to do the 100000000 calls with the first method, and 19500ms with the second. (reduced to around 14500ms with Marc Gravell’s code)

    Obviously, it takes much longer time to operate :/

    I have another question though: how would you throw the relevant exception in the delegate in C# 4.0 ?

    Like

  24. @Mike: Did your method do anything else? It’s possible that you’re getting inlining in one case and not the other. I’d argue that in most cases, 20 seconds for 100 million checks is acceptable… but I’ll agree it’s a little disappointing.

    Out of interest, have you benchmarked any of the other suggested solutions?

    I might have a go at benchmarking it myself, in which case I’ll update this post.

    As for .NET 4.0 – you’d use Expression.Throw: http://msdn.microsoft.com/en-us/library/system.linq.expressions.expression.throw(VS.100).aspx

    Like

  25. @skeet : no, my methods don’t do anything, and I did only test your code and Marc’s one.

    I’ve changed them to write the two strings in parameters inside the stream and see what’s going on, and here are my results, notably less disapointing ;)

    For 10 millions iterations :
    Normal check : 5357ms
    “Magic” check : 6299ms

    Indeed, there was some inlining in my previous benchmark.

    Here are my tests :

    class Program
    {
    private const int iterations = 10000000;

    static void Main(string[] args)
    {
    Stopwatch sw = new Stopwatch();

    Console.Write(“Normal Test : “);

    sw.Start();

    for (int i = 0; i < iterations; i++)
    {
    using (Stream stream = new MemoryStream())
    NormalTest("Foo", "Bar", stream);
    }

    sw.Stop();

    Console.Write(" " + sw.ElapsedMilliseconds);

    sw.Reset();

    Console.WriteLine();

    Console.Write("Magic Test : ");

    sw.Start();

    for (int i = 0; i < iterations; i++)
    {
    using (Stream stream = new MemoryStream())
    MagicTest("Foo", "Bar", stream);
    }

    sw.Stop();

    Console.Write(" " + sw.ElapsedMilliseconds);

    Console.Read();
    }

    private static void NormalTest(string text, string text2, Stream stream)
    {
    if (text == null)
    throw new ArgumentNullException("text");

    if (text2 == null)
    throw new ArgumentNullException("text2");

    if (stream == null)
    throw new ArgumentNullException("stream");

    UnicodeEncoding utf = new UnicodeEncoding();
    byte[] b1 = utf.GetBytes(text);

    stream.Write(b1, 0, b1.Length);

    byte[] b2 = utf.GetBytes(text2);
    stream.Write(b2, 0, b2.Length);
    }

    private static void MagicTest(string text, string text2, Stream stream)
    {
    new { text, text2, stream }.CheckNotNull();

    UnicodeEncoding utf = new UnicodeEncoding();
    byte[] b1 = utf.GetBytes(text);

    stream.Write(b1, 0, b1.Length);

    byte[] b2 = utf.GetBytes(text2);
    stream.Write(b2, 0, b2.Length);
    }
    }

    Like

  26. @Mike: I suspect the vast majority of the time is now taken in the stream/string calls – which will mirror reality, but makes it less useful for benchmarking.

    You could try going back to methods which don’t do anything, but use MethodImplAttribute to prohibit inlining. If I get a chance I’ll have a go at this myself too…

    Like

  27. @skeet: @Kevin’s solution, posted by @Kirill, is very fast in the normal case. There’s a method call and a comparison vs. null.

    The expression tree code only runs when the value is null. In that case, you’re about to throw an exception anyway, and the cost of the expression tree is minimal by comparison.

    The duplication is real, but the Rename refactoring will change both, which is the main mistake I’m concerned with.

    I would still tend to make a mistake where I write it once, copy, paste, and edit in the name of the next parameter, but forget one case. I have made that kind of coding error many times, and I suspect I’m not done.

    Given an arbitrary piece of code, I believe that minimizing repetition is more important than micro-optimization. (Once I have real performance data, I’ll optimize where it tells me to).

    A way to remove the duplication, compromise performance, but still have an easy way to micro-optimize when needed, is to add this overload to Kevin’s solution:

    public static void NotNull(Expression<Func<Func>> arg)
    {
    NotNull(arg.Compile()()(), arg);
    }

    Now I can use:

    Argument.NotNull(() => () => foo);

    by default, and then add the additional ‘foo’ parameter wherever it is a perf issue.

    Like

  28. To be honest, I don’t consider this a “abuse” of anonymous types at all.

    – You’re not exposing object to a “public” API.
    – You’re using anonymous types to package data in a way that provides value (in the form of synchronized names in exception messages), in a case where you don’t have any reason to be able to name the type.

    IMHO people are way too fearful about the newer features.

    That said, I’d like to see integration with code contracts :)

    Like

  29. >> @Steven: That still has the duplication of
    >> the parameter name as a string as far as I can
    >> see. If I don’t care about that, I’ll just use
    >> foo.ThrowIfNull(“foo”).

    Jon, with CuttingEdge.Conditions you don’t have to specify the string name, but in that case Conditions will insert the parameter name “value” for you. Of course this is not ideal. While code snippets can remove the redundant typing, it doesn’t remove the change of getting out of sync, when renaming the argument.

    While your CheckNotNull extension method solves that problem, I feel strongly about adding extension methods to all types (the Framework Design Guidelines have a rule against this). This method will be visible in IntelliSense on all (non value type) types. For instance, what is the expected behavior of lines like: ‘((string)s).CheckNotNull()’ and ‘((Stream)x).CheckNotNull()’?

    In CuttingEdge.Conditions I used extension methods like this. Users were able to write lines like this one: ‘x.Requires().IsNotNull()’. However, the Requires() extension method was showing up everywhere, so I finally decided to remove the extension method behavior. Now users have to write lines like this one: ‘Condition.Requires(x).IsNotNull()’. While less fluent, for a reusable framework as Conditions, I believe this choice was correct. You can read more about the decisions I made here: http://www.cuttingedge.it/blogs/steven/pivot/entry.php?id=57.

    Like

  30. @Steven: The expected behaviour of those lines would be for the cast to be applied first, and then nullity checked. I don’t see what the problem is – what am I missing?

    I would agree that it’s worth being *careful* when applying extension methods globally, but I don’t think it’s really that bad. I’d also suggest that a solution to the CuttingEdge.Conditions issue would be to still include the extension methods, but in a namespace of CuttingEdge.Conditions.Extensions – then no-one has to write their own ones, but to avoid them showing up, you just don’t have a using directive for that namespace.

    Frankly I’m waiting for Code Contracts to be universal so we can just have Contract.Requires(x != null) – or even better, some PostSharp/Code Contracts amalgam allowing attributed non-nullity :)

    Like

  31. >> what am I missing?

    Your CheckNotNull implementation searches for public properties and checks whether they are null. Developers could accidentally use CheckNotNull on normal arguments, such as System.String. However, when checking a string with CheckNotNull a TypeInitializationException is thrown. Or just for the fun of it, let’s assume a developer writes this:
    var ceia = new ComEventInterfaceAttribute(null, null);
    ceia.CheckNotNull();
    Should this call succeed? It throws a ArgumentNullException :-)

    I thought about adding an extra CuttingEdge.Conditions.Extension namespace, but decided not to do this, because I didn’t want developers to add more than one namespace for such a simple API. The extension methods are very simple and I advice developers to add them to their own project (in a root namespace) if they still favor the fluent API.

    I agree with you about Code Contracts. Compile time support is superior. About using AOP: I’ve been thinking about building a code weaving (or AOP) mechanism that would allow translation of Conditions syntax to Code Contracts, so compile time support is possible even with Conditions. However, it hasn’t been much more than thinking about this. While MS CCI or Mono.Cecil can be used to achieve this, it expect this to be quite hard to do this.

    Like

  32. @Steven: Ah, right – I agree about CheckNotNull there. Admittedly it could be changed fairly easily to check that it’s only called on something which looks like an anonymous type, and throw some suitably “hard” exception otherwise, which shouldn’t make it through even the most cursory testing.

    Note that if you *did* add a CuttingEdge.Conditions.Extensions namespace, developers who wanted to use that may well be able to get away with *only* adding that as a using directive, so they’d still only have one… getting each developer to reimplement the same thing, even if it’s simple, is a bit icky IMO.

    Like

  33. Just write a snippet… when the exception gets thrown you can see where the real error is in the stack as opposed to having to look n frames down the stack to see which method was actually passed the null parameter. Nice idea conceptually, but overkill IMO.

    Thanks for the post.

    Like

  34. @mquinn: “Just write a snippet” suggests that you believe I have a problem with the *typing* side. That’s not true at all – the problem is with having a lot of distracting code at the start of the method. A snippet just lets you create that distracting code faster – it doesn’t help the readability.

    Like

  35. I’m probably wrong; but doesn’t the where clause on T ensure that you don’t have to perform the check for IsValueType?

    Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s