Eduasync part 12: Observing all exceptions

(This post covers projects 16 and 17 in the source code.)

Last time we looked at unwrapping an AggregateException when we await a result. While there are potentially other interesting things we could look at with respect to exceptions (particularly around cancellation) I’m just going to touch on one extra twist that the async CTP implements before I move on to some weird ways of using async.

TPL and unobserved exceptions

The Task Parallel Library (TPL) on which the async support is based has some interesting behaviour around exceptions. Just as it’s entirely possible for more than one thing to go wrong with a particular task, it’s also quite easy to miss some errors, if you’re not careful.

Here’s a simple example of an async method in C# 5 where we create two tasks, both of which will throw exceptions:

private static async Task<int> CauseTwoFailures()
{
    Task<int> firstTask = Task<int>.Factory.StartNew(() => {
        throw new InvalidOperationException();
    });
    Task<int> secondTask = Task<int>.Factory.StartNew(() => {
        throw new InvalidOperationException();
    });

    int firstValue = await firstTask;
    int secondValue = await secondTask;

    return firstValue + secondValue;
}

Now the timing of the two tasks is actually irrelevant here. The first task will always throw an exception, which means we’re never going to await the second task. That means there’s never any code which asks for the second task’s result, or adds a continuation to it. It’s alone and unloved in a cruel world, with no-one to observe the exception it throws.

If we call this method from the Eduasync code we’ve got at the moment, and wait for long enough (I’ve got a call to GC.WaitForPendingFinalizers in the same code) the program will abort, with this error:

Unhandled Exception: System.AggregateException: A Task’s exception(s) were not observed either by Waiting on the Task or accessing its Exception property. As a result, the unobserved exception was rethrown by the finalizer thread. ---> System.InvalidOperationException: Operation is not valid due to the current state of the object.

Ouch. The TPL takes a hard line on unobserved exceptions. They indicate failures (presumably) which you’ll never find out about until you start caring about the result of a task. Basically there are various ways of "observing" a task’s failure, whether by performing some act which causes it to be thrown (usually as part of an AggregateException) or just asking for the exception for a task which is known to be faulted. If a faulted task is garbage collected without its exception ever having been observed, the finalizer rethrows that exception wrapped in an AggregateException (as in the message above), usually causing the process to exit.
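To make "observing" a little more concrete, here’s a minimal sketch of the two routes mentioned above: waiting on the task, and asking a completed task for its Exception property. The ObservingDemo class and its method names are purely illustrative; they aren’t part of Eduasync or the TPL.

```csharp
using System;
using System.Threading.Tasks;

static class ObservingDemo
{
    // Starts a task which always fails, just like the ones in CauseTwoFailures.
    public static Task<int> StartFailingTask()
    {
        return Task<int>.Factory.StartNew(() => { throw new InvalidOperationException(); });
    }

    // Route 1: Wait() rethrows the failure wrapped in an AggregateException.
    // Catching it counts as observing the exception.
    public static string ObserveByWaiting(Task task)
    {
        try
        {
            task.Wait();
            return "no failure";
        }
        catch (AggregateException e)
        {
            return e.InnerExceptions[0].GetType().Name;
        }
    }

    // Route 2: read the Exception property once the task has completed.
    // Nothing is thrown here; reading the property marks the exception as observed.
    public static string ObserveByProperty(Task task)
    {
        // Block until the task finishes, without throwing if it faulted.
        ((IAsyncResult)task).AsyncWaitHandle.WaitOne();
        AggregateException e = task.Exception;
        return e == null ? "no failure" : e.InnerExceptions[0].GetType().Name;
    }
}
```

Either route is enough to keep the finalizer quiet for that particular task. The problem in CauseTwoFailures is that neither of them ever happens for the second task.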

That works well in "normal" TPL code, where you’re explicitly managing tasks – but it’s not so handy in async, where perfectly reasonable looking code which starts a few tasks and then awaits them one at a time (possibly doing some processing in between) might hide an unobserved exception.
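If you want to stay within the strict .NET 4 behaviour, one possible approach is to attach a fault-only continuation to each task as you start it, so that its exception is observed even if the code never gets as far as awaiting it. This is only a sketch: ObserveFaults is a hypothetical helper I’ve made up for illustration, not something in the TPL, the CTP or Eduasync.

```csharp
using System;
using System.Threading.Tasks;

static class TaskObservationExtensions
{
    // Hypothetical helper: arranges for onFault to be called with the task's
    // exception if (and only if) the task faults. Reading t.Exception inside
    // the continuation is what marks the exception as observed.
    public static Task<T> ObserveFaults<T>(this Task<T> task, Action<AggregateException> onFault)
    {
        task.ContinueWith(t => onFault(t.Exception), TaskContinuationOptions.OnlyOnFaulted);
        // Return the original task so the caller can still await it as usual.
        return task;
    }
}
```

In CauseTwoFailures this would mean appending something like .ObserveFaults(e => Console.WriteLine(e)) to each StartNew call. It’s exactly the kind of ceremony you’d hope not to need in async code, which is rather the point.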

Observing all exceptions

Fortunately the TPL provides a way for us to opt out of the normal task behaviour. There’s an event, TaskScheduler.UnobservedTaskException, which is fired by the finalizer before it goes bang. Handlers of the event can mark the exception as observed by calling UnobservedTaskExceptionEventArgs.SetObserved, and can check the Observed property to see whether an earlier handler has already done so.

So all we have to do is add a handler for the event and our program doesn’t crash any more:

TaskScheduler.UnobservedTaskException += (sender, e) =>
{
    Console.WriteLine("Saving the day! This exception would have been unobserved: {0}",
                      e.Exception);
    e.SetObserved();
};
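For anyone who wants to watch the event fire without the rest of Eduasync, here’s a rough stand-alone sketch. It starts a failing task, abandons it, and then forces a collection and finalization pass. UnobservedDemo and its members are names I’ve invented for illustration. Whether the handler actually runs depends on the garbage collector, the runtime and the build configuration (a debug build may keep the task alive for longer), so I’m not promising any particular output.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

static class UnobservedDemo
{
    // Kept in its own non-inlined method so that no local variable in Run
    // keeps the task reachable once this method has returned.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static void StartAndAbandonFailingTask()
    {
        Task<int> task = Task<int>.Factory.StartNew(() => { throw new InvalidOperationException(); });
        // Wait for the task to finish without observing its exception.
        ((IAsyncResult)task).AsyncWaitHandle.WaitOne();
    }

    // Returns the number of times the handler was invoked during the run.
    public static int Run()
    {
        int seen = 0;
        EventHandler<UnobservedTaskExceptionEventArgs> handler = (sender, e) =>
        {
            Interlocked.Increment(ref seen);
            e.SetObserved();
        };

        TaskScheduler.UnobservedTaskException += handler;
        try
        {
            StartAndAbandonFailingTask();
            // Encourage the finalizer to run while our handler is still attached.
            GC.Collect();
            GC.WaitForPendingFinalizers();
        }
        finally
        {
            TaskScheduler.UnobservedTaskException -= handler;
        }
        return seen;
    }
}
```

The handler is removed again in the finally block purely to keep the sketch from changing behaviour for the rest of the process, which is the global effect discussed in the next section.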

In Eduasync this is currently only performed explicitly, in project 17. In the async CTP something like this is performed as part of the type initializer for AsyncTaskMethodBuilder<T>, which you can unfortunately tell because that type initializer crashes when running under medium trust. (That issue will be fixed before the final release.)

Global changes

This approach has a very significant effect: it changes the global behaviour of the system. If you have a system which uses the TPL and you want the existing .NET 4 behaviour of the process terminating when you have unobserved exceptions, you basically can’t use async at all – and if you use any code which does, you’ll see the more permissive behaviour.

You could potentially add your own event handler which aborted the application forcibly, but that’s not terribly nice either. You should quite possibly add a handler to at least log these exceptions, so you can find out what’s been going wrong that you haven’t noticed.
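As a sketch of what such a handler might look like, the code below builds one which always logs, and then either marks the exception as observed or hands over to a "terminate" callback. UnobservedPolicy, CreateHandler and both callbacks are hypothetical names for illustration only. I’m assuming you’d pass something like Environment.FailFast as the terminate action if you wanted the forcible abort.

```csharp
using System;
using System.Threading.Tasks;

static class UnobservedPolicy
{
    // Hypothetical helper: creates a handler for TaskScheduler.UnobservedTaskException.
    // log is always called; if terminate is null the exception is marked as observed,
    // otherwise terminate is called and the exception is left unobserved.
    public static EventHandler<UnobservedTaskExceptionEventArgs> CreateHandler(
        Action<string> log, Action<string> terminate)
    {
        return (sender, e) =>
        {
            if (e.Observed)
            {
                // Another handler has already dealt with this exception.
                return;
            }

            string message = "Unobserved task exception: " + e.Exception;
            log(message);

            if (terminate != null)
            {
                terminate(message);
            }
            else
            {
                e.SetObserved();
            }
        };
    }
}
```

A log-only policy would then be installed with TaskScheduler.UnobservedTaskException += UnobservedPolicy.CreateHandler(Console.Error.WriteLine, null), and the strict variant by passing message => Environment.FailFast(message) as the second argument. The order in which multiple handlers run matters here, which is another reason this feels fragile as a global policy.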

Of course, this only affects unobserved exceptions – anything you’re already observing will not be affected. Still, it’s a pretty big change. I wouldn’t be surprised if this aspect of the behaviour of async in C# 5 changed before release; it feels to me like it isn’t quite right yet. Admittedly I’m not sure how I would suggest changing it, but effectively reversing the existing behaviour goes against Microsoft’s past behaviour when it comes to backwards compatibility. Watch this space.

Conclusion

It’s worth pondering this whole issue yourself (and the one from last time), and making your feelings known to the team. I think it’s symptomatic of a wider problem in software engineering: we’re not really very good at handling errors. Java’s approach of checked exceptions didn’t turn out too well in my view, but the "anything goes" approach of C# has problems too… and introducing alternative models like the one in TPL makes things even more complicated. I don’t have any smart answers here – just that it’s something I’d like wiser people than myself to think about further.

Next, we’re going to move a bit further away from the "normal" ways of using async, into the realm of coroutines. This series is going to get increasingly obscure and silly, all in the name of really getting a feeling for the underlying execution model of async, before it possibly returns to more sensible topics such as task composition.

10 thoughts on “Eduasync part 12: Observing all exceptions”

  1. A good detail to know! What does the CTP actually *do* with the unobserved exception? Ignore it, trace it to debug out? console?


  2. Yikes! That is rather scary; an innocent-looking piece of code can bring down the whole process.

    Given that it would be painful to have to wait and observe the second exception, seems like the default behavior should be changed.

    Perhaps the right path is that .NET 5 tasks have a new exception handling policy, where unobserved exceptions are *not* thrown by the finalizer.

    You said we should voice our opinions to the C# team – how do we do that?


  3. @Judah: I believe there’s either a Connect page or a forum linked off the Visual Studio Async page. That’s the place to start, certainly.


  4. “If you have a system which uses the TPL and you want the existing .NET 4 behaviour of the process terminating when you have unobserved exceptions, you basically can’t use async at all…”

    Sorry, I don’t follow this statement, at least in the above absolute phrasing.

    I agree that the behavior described certainly makes things more complicated. But prevents you from using async at all? While not necessarily pretty, it seems to me that there are work-arounds.

    For example, if you’re going to have a method start up multiple tasks, for later awaiting, you could also store references to those tasks elsewhere. The stored references could be used in a variety of ways, including either simply keeping the Task alive long enough to allow the client code to observe the exceptions for un-awaited tasks at a later time, or, perhaps more practically, to control the behavior of an UnobservedTaskException handler (i.e. only call SetObserved() if the exception reported is in the collection of tasks used in an async method, allowing other TPL usages to maintain the previous behavior). Or maybe one could simply dispose the task on completion, so that the finalizer doesn’t get run at all.

    In the above cases, extension methods can help simplify the code that would be required to wrap the async tasks so that the task reference storage is kept up-to-date as tasks complete (and to perform the dispose, in the third option).

    I’ll bet there are other work-arounds, maybe even some much more elegant than the above suggestions. And no, I wouldn’t call any of these suggestions elegant, and yes they would complicate the code. But it seems like the scenarios where they’d be needed are a small intersection of all the possible async scenarios (i.e. use of TPL alongside async methods _and_ async methods awaiting on things that throw exceptions).

    I personally would prefer for the .NET and C# designers to come up with a solution that doesn’t require all of the above hijinks. Perhaps a way to make it convenient to “await” on multiple tasks and yet even if an exception is thrown, does not execute the completion (whether rethrowing or proceeding) until all tasks have completed. Or maybe provide a way to have the tasks awaited in an async method not rethrow exceptions in their finalizers. Either of those would be much better than hacking up one’s own code to work around the situation.

    But I don’t think people are literally prevented completely from using TPL directly at the same time async methods are in use. :)


  5. @pete.d: My point is that existing code will have its behaviour changed. Given that the point of the behaviour is to catch tasks which you’ve accidentally not checked for faulting, saying “You can add code to make sure you *do* observe everything” isn’t really preserving the same behaviour IMO. It’s still saying you can’t use the same behaviour to catch problems.

    I didn’t say that people were literally prevented from using the TPL directly at the same time as using async – I said they couldn’t do so *and get the previous behaviour*. Unless they throw an exception themselves from the UnobservedTaskException handler (and work out some way of not doing that for tasks created in async methods) I think that’s still a fair comment.


  6. I think I understand the intent of your statement, but I still think there are ways to use async methods without affecting the behavior of existing TPL-using code.

    The most obvious work-around is to simply not use awaitables that can throw exceptions, but I also described a variety of other approaches that could work (albeit inconveniently in some cases) without changing the behavior that the TPL-using code sees. And those approaches aren’t even close to being a complete enumeration of the options we have for work-arounds.

    Basically, any mechanism — of which we have lots of choices — that will either prevent or go ahead and observe exceptions for tasks being awaited in an async method will have the effect of allowing any existing TPL-using code to continue to work as before.

    I agree that the situation could be much better, and I do hope that Microsoft figures out a cleaner way to fix things so that both code that uses the TPL directly and async methods behave in intuitive ways without disturbing the observed behavior of each other.

    But I don’t see how it’s impossible to mix and match, even while getting the previous behavior for legacy code. It’s inconvenient, to be sure. But impossible? That seems a stretch. :)


  7. @pete.d: “Basically, any mechanism — of which we have lots of choices — that will either prevent or go ahead and observe exceptions for tasks being awaited in an async method will have the effect of allowing any existing TPL-using code to continue to work as before.” – That’s simply not true.

    It’s not a matter of whether the async method *uses* the ability to ignore unobserved exceptions. *The very act of using async will mean the “catch-all” event handler is added*. That’s what alters the behaviour of other code. Code which would previously abort the application if TPL code forgot to observe an exception will now *not* do so. That would be the case even if you wrote an async method with no awaits in at all.


  8. Ah. I misunderstood and did not realize the event handler was added by the async CTP code itself. I thought you meant we had to add it ourselves to prevent a crash.

    And yes, I agree that’s unfortunate. In general, I feel that unhandled exceptions _should_ take the program down.

    In fact, I am not really all that sure that even in the async method case that I want exceptions that occur asynchronously and which are not eventually observed to be ignored.

    Anyway, thanks for sticking with me until I understood what the CTP code is really doing.


  9. (though, all that said, it seems to me that a work-around given what’s really going on would be to add one’s own event handler to the event and re-throw an exception from the handler. assuming the code raising the event isn’t catching exceptions, then _that_ should create the requisite unhandled exception that would take down the process :) ).

