Using extension method resolution rules to decorate awaiters

This post is a mixture of three things:

  • Revision of how extension methods are resolved
  • Application of this to task awaiting in async methods
  • A rant about void not being a type

Compared with my last few posts, there’s almost nothing to do with genuine asynchronous behaviour here. It’s to do with how the language supports asynchronous behaviour, and how we can hijack that support :)

Extension methods redux

I’m sure almost all of you could recite the C# 4 spec section 7.6.5.2 off by heart, but for the few readers who can’t (Newton Microkitchen Breakfast Club, I’m looking at you) here’s a quick summary.

The compiler looks for extension methods (the ones that "pretend" to be instance methods on other types, and are declared in non-generic top-level static classes) when it comes across a method invocation expression1 and finds no applicable methods. We’ll assume we’ve got to that point.

The compiler then looks in successive contexts for extension methods. It only considers non-generic static types directly declared in namespaces (as opposed to being nested classes) but it’s the order in which the namespaces are searched which is interesting. Imagine that the compiler is looking at code in a namespace X.Y.Z. That has to be within at least one namespace declaration, and can have up three, like this:

namespace X
{
    namespace Y
    {
        namespace Z
        {
            // Code being compiled
        }
    }
}

The compiler starts with the "innermost" namespace, and works outwards to the global namespace. At each level, it first considers types within that namespace, then types within any using namespace directives within the namespace declaration. So, to give a really full example, consider this:

using UD.U0;

namespace X
{
    using UD.U1;

    namespace Y
    {
       using UD.U2;

        namespace Z
        {
            using UD.U3;

            // Code being compiled
        }
    }
}

The namespaces would be searched in this order:

  • Z
  • UD.U3
  • Y
  • UD.U2
  • X
  • UD.U1
  • "global"
  • UD.U0

Note that UD itself would not be searched. If a namespace declaration contains more than one using namespace directive, they’re considered as a set of directives – the order doesn’t matter, and all types within the referenced namespaces are considered equally.

As soon as an eligible method has been found, this brings the search to a halt – even if a "better" method might be available elsewhere. This allows us to effectively prioritise extension methods within a particular namespace by including a using namespace directive in a more deeply nested namespace declaration than the methods we want to ignore.

Async methods and extensions on Task/Task<T>

So, where am I heading with all of this? Well, I wanted to work out a way of getting the compiler to use my extension methods for Task and Task<T> instead of the ones that come in the CTP library. The GetAwaiter() methods are in a type called AsyncCtpThreadingExtensions, and they both return System.Runtime.CompilerServices.TaskAwaiter instances. You can tell this just by decompiling your own code, and see what it calls when you "await" a task.

Now, we can create our own complete awaiter methods, as shown in my previous post… but it’s potentially more useful just to be able to add diagnosis tools without changing the actual behaviour. For the sake of brevity, here are some extension methods and supporting types just for Task<T> – the full code targets the non-generic Task type as well.

using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

namespace JonSkeet.Diagnostics
{
    public static class DiagnosticTaskExtensions
    {
        /// <summary>
        /// Associates a task with a user-specified name before GetAwaiter is called
        /// </summary>
        public static NamedTask<T> WithName<T>(this Task<T> task, string name)
        {
            return new NamedTask<T>(task, name);
        }

        /// <summary>
        /// Gets a diagnostic awaiter for a task, based only on its ID.
        /// </summary>
        public static NamedAwaiter<T> GetAwaiter<T>(this Task<T> task)
        {
            return new NamedTask<T>(task, "[" + task.Id + "]").GetAwaiter();
        }

        public struct NamedTask<T>
        {
            private readonly Task<T> task;
            private readonly string name;

            public NamedTask(Task<T> task, string name)
            {
                this.task = task;
                this.name = name;
            }

            public NamedAwaiter<T> GetAwaiter()
            {
                Console.WriteLine("GetAwaiter called for task "{0}"", name);
                return new NamedAwaiter<T>(AsyncCtpThreadingExtensions.GetAwaiter(task), name);
            }
        }

        public struct NamedAwaiter<T>
        {
            private readonly TaskAwaiter<T> awaiter;
            private readonly string name;

            public NamedAwaiter(TaskAwaiter<T> awaiter, string name)
            {
                this.awaiter = awaiter;
                this.name = name;
            }

            public bool BeginAwait(Action continuation)
            {
                Console.WriteLine("BeginAwait called for task "{0}"…", name);
                bool ret = awaiter.BeginAwait(continuation);
                Console.WriteLine("… BeginAwait for task "{0}" returning {1}", name, ret);
                return ret;
            }

            public T EndAwait()
            {
                Console.WriteLine("EndAwait called for task "{0}"", name);
                // We could potentially report the result here
                return awaiter.EndAwait();
            }
        }
    }
}

So this lets us give a task a name for clarity (optionally), and logs when the GetAwaiter/BeginAwait/EndAwait methods get called.

The neat bit is how easy this is to use. Consider this code:

using System;
using System.Net;
using System.Threading.Tasks;

namespace Demo
{
    using JonSkeet.Diagnostics;

    class Program
    {
        static void Main(string[] args)
        {
            Task<int> task = SumPageSizes();
            Console.WriteLine("Result: {0}", task.Result);
        }

        static async Task<int> SumPageSizes()
        {
            Task<int> t1 = FetchPageSize("http://www.microsoft.com&quot;);
            Task<int> t2 = FetchPageSize("http://csharpindepth.com&quot;);

            return await t1.WithName("MS web fetch") +
                   await t2.WithName("C# in Depth web fetch");
        }

        static async Task<int> FetchPageSize(string url)
        {
            string page = await new WebClient().DownloadStringTaskAsync(url);
            return page.Length;
        }
    }
}

The JonSkeet.Diagnostics namespace effectively has higher priority when we’re looking for extension methods, so our GetAwaiter is used instead of the ones in the CTP (which we delegate to, of course).

Remove the using namespace directive for JonSkeet.Diagnostics, remove the calls to WithName, and it all compiles and runs as normal. If you don’t want to have to do anything to the code, you could put the using namespace directive within #if DEBUG / #endif and write a small extension method in the System.Threading.Tasks namespace like this:

namespace System.Threading.Tasks
{
    public static class NamedTaskExtensions
    {
        public static Task<T> WithName<T>(this Task<T> task, string name)
        {
            return task;
        }
    }
}

… and bingo, diagnostics only in debug mode. The no-op WithName method will be ignored for the higher-priority version one in debug builds, and will be harmless in a release build.

The diagnostics themselves can be quite enlightening, by the way. For example, here’s the result of the previous program:

GetAwaiter called for task "[1]"
BeginAwait called for task "[1]"…
… BeginAwait for task "[1]" returning True
GetAwaiter called for task "[2]"
BeginAwait called for task "[2]"…
… BeginAwait for task "[2]" returning True
GetAwaiter called for task "MS web fetch"
BeginAwait called for task "MS web fetch"…
… BeginAwait for task "MS web fetch" returning True
EndAwait called for task "[2]"
EndAwait called for task "[1]"
EndAwait called for task "MS web fetch"
GetAwaiter called for task "C# in Depth web fetch"
BeginAwait called for task "C# in Depth web fetch"…
… BeginAwait for task "C# in Depth web fetch" returning False
EndAwait called for task "C# in Depth web fetch"
Result: 6009

This shows us waiting to fetch both web pages, and both of those awaits being asynchronous. (Note that we launched the tasks before any diagnostics were displayed – it’s only awaiting the tasks that causes all of this to kick in.) After both of those "fetch and take the length" tasks have started, we await the result of the first one (for microsoft.com). This corresponds to task 1 – but task 2 (fetching csharpindepth.com) finishes first. When the microsoft.com page has finished fetching, the length is computed and that task completes. Now when we await the result of fetching the length of csharpindepth.com, we see that it’s already finished, and the await completes synchronously.

Obviously this was a small example, I deliberately left two tasks with just task IDs, and there could be a lot more information (such as timestamps and thread IDs, to start with) but I suspect this sort of thing could be invaluable when trying to work out what’s going on in async code.

And finally… a short rant

I’ve written all the diagnostic code twice. Not because it was wrong the first time, but because it only covered Task<T>, not Task. I couldn’t write it just on Task, because then EndAwait would have had the wrong signature… but the code was pretty much a case of "cut, paste, remove <T> everywhere".

I’ve never been terribly bothered by the void type before, and it not being a "proper" type like unit in functional programming languages. Now, I suddenly begin to see the point.

Perhaps the TPL should have introduced the Unit type before the Rx team got in there. With a single Task<T> type, I suspect there’d be significantly less code duplication in the framework (including the async CTP).

Is it enough to make me wish we didn’t have void at all? Maybe. Maybe not. Perhaps with sufficient knowledge in the CLR, there wouldn’t have to be any stack penalty for copying a "pretend" return value onto the stack every time we call a method which would currently return void. I’ll certainly be keeping an eye out for other places where it would make life easier.

Conclusion

I don’t normally advocate language tricks like the extension method "priority boost" described here. I love talking about them, but I think they’re nasty enough to avoid most of the time.

But in this case the diagnostic benefit is potentially huge! I don’t know how it would fit into the full framework – or where it would dump its diagnostics to – but I’d really like to see something like this in the final release, particularly with the ability to associate a name with a task.

Even if you don’t want to actually use this, I hope you’ve enjoyed it as an intellectual exercise and a bit of reinforcement about how GetAwait/BeginAwait/EndAwait works.


1 It has to be a method invocation on an expression, too. So if you’re writing code within an IEnumerable<T> implementation and you want to call the LINQ Count() method, you have to call this.Count() rather than just Count(), for example.

One thought on “Using extension method resolution rules to decorate awaiters”

  1. I don’t think that I have ever had a name collision in my extension methods, but now that I know it is possible i am sure it will happen. This is pretty cool, it seems like you override some of the .NET extensions if you namespace them cleverly? Thanks for the post.

    Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s