Category Archives: C# 7

Implementing IXmlSerializable in readonly structs

Background

There are three things you need to know to start with:

Operations on read-only variables which are value types copy the variable value first. I’ve written about this before on this blog. C# 7.2 addresses this by introducing the readonly modifier for structs. See the language proposal for more details. I was touched to see that the proposal references my blog post :)

The ref readonly local variables and in parameter features in C# 7.2 mean that “read-only variables” are likely to be more common in C# than they have been in the past.

Noda Time includes many value types which implement IXmlSerializable. Noda Time implements IXmlSerializable.ReadXml by assigning to this: fundamentally IXmlSerializable assumes a mutable type. I use explicit interface implementation to make this less likely to be used directly on an unboxed variable. With a generic method using an interface constraint you can observe a simple method call mutating a non-read-only variable, but that’s generally harmless.

Adding the readonly modifier to Noda Time structs

I relished news of the readonly modifier for struct declarations. At last, I can remove my ReadWriteForEfficiency attribute! Hmm. Not so much. To be fair, some of the structs (7 out of 18) were fine. But every struct that implements IXmlSerializable gave me this error:

Cannot assign to ‘this’ because it is read-only

That’s reasonable: in members of a readonly struct, this effectively becomes an in parameter instead of a ref parameter. But how can we fix that? Like any sensible developer, I turned to Stack Overflow, which has precisely the question I’m interested in. It even has an answer! Unfortunately, it amounts to a workaround: assigning to this via unsafe code.

Violating readonly with unsafe code

To give a concrete example of the answer from Stack Overflow, here’s my current LocalTime code:

void IXmlSerializable.ReadXml([NotNull] XmlReader reader)
{
    Preconditions.CheckNotNull(reader, nameof(reader));
    var pattern = LocalTimePattern.ExtendedIso;
    string text = reader.ReadElementContentAsString();
    this = pattern.Parse(text).Value;
}

Here’s the code that compiles when LocalTime is marked as readonly, after enabling unsafe blocks:

unsafe void IXmlSerializable.ReadXml([NotNull] XmlReader reader)
{
    Preconditions.CheckNotNull(reader, nameof(reader));
    var pattern = LocalTimePattern.ExtendedIso;
    string text = reader.ReadElementContentAsString();
    fixed (LocalTime* thisAddr = &this)
    {
        *thisAddr = pattern.Parse(text).Value;
    }
}

Essentially, the unsafe code is bypassing the controls around read-only structs. Just for kicks, let’s apply the same change
throughout Noda Time, and think about what would happen…

As it happens, that fix doesn’t work universally: ZonedDateTime is a “managed type” because it contains a reference (to a DateTimeZone) which means you can’t create a pointer to it. That’s a pity, but if we can make everything else readonly, that’s a good start. Now let’s look at the knock-on effects…

Safe code trying to abuse the unsafe code

Let’s try to abuse our “slightly dodgy” implementations. Here’s a class we’ll put in a “friendly” assembly which is trying to be as helpful as possible:

using NodaTime;

public class AccessToken
{
    private readonly Instant expiry;
    public ref readonly Instant Expiry => ref expiry;

    public AccessToken(Instant expiry) => this.expiry = expiry;
}

Great – it lets you get at the expiry time of an access token without even copying the value.

The big test is: can we break this friendly’s code’s assumptions about expiry really not changing its value?

Here’s code I expected to mutate the access token:

static void MutateAccessToken(AccessToken accessToken)
{
    ref readonly Instant expiry = ref accessToken.Expiry;
    string xml = "<evil>2100-01-01T00:00:00Z</evil>";
    EvilReadXml(in expiry, xml);
}

static void EvilReadXml<T>(in T value, string xml) where T : IXmlSerializable
{
    var reader = XmlReader.Create(new StringReader(xml));
    reader.Read();
    value.ReadXml(reader);
}

We have an in parameter in EvilReadXml, so the expiry variable is being passed by reference, and then we’re calling ReadXml on that parameter… so doesn’t that mean we’ll modify the parameter, and thus the underlying expiry field in the object?

Nope. Thinking about it, the compiler doesn’t know when it compiles EvilReadXml that T is a readonly struct – it could be a regular struct. So it has to create a copy of the value before calling ReadXml.

Looking the the spec proposal, there’s one interesting point in the section on ref extension methods:

However any use of an in T parameter will have to be done through an interface member. Since all interface members are considered mutating, any such use would require a copy.

Hooray! That suggests it’s safe – at least for now. I’m still worried though: what if the C# team adds a readonly struct generic constraint in C# 8? Would that allow a small modification of the above code to break? And if interface methods are considered mutating anyway, why doesn’t the language know that when I’m trying to implement the interface?

But hey, now that we know unsafe code is able to work around this in our XML implementation, what’s to stop nasty code from using the same trick directly?

Unsafe code abusing safe code

Imagine we didn’t support XML serialization at all. Could unsafe code mutate AccessToken? It turns out it can, very easily:

static void MutateAccessToken(AccessToken accessToken)
{
    ref readonly Instant expiry = ref accessToken.Expiry;
    unsafe
    {
        fixed (Instant* expiryAddr = &expiry)
        {
            *expiryAddr = Instant.FromUtc(2100, 1, 1, 0, 0);
        }
    }
}

This isn’t too surprising – unsafe code is, after all, unsafe. I readily admit I’m not an expert on .NET security, and I know the
landscape has changed quite a bit over time. These days I believe the idea of a sandbox has been somewhat abandoned – if you care about executing code you don’t really trust, you do it in a container and use that as your security boundary. That’s about the limit of my knowledge though, which could be incorrect anyway.

Where do we go from here?

At this point, I’m stuck making a choice between several unpleasant ones:

  • Leave Noda Time public structs as non-read-only. That prevents users from writing efficient code to use them without copying.
  • Remove XML serialization from Noda Time 3.0. We’re probably going to remove binary serialization anyway on the grounds that Microsoft is somewhat-discouraging it going forward, and it’s generally considered a security problem. However, I don’t think XML serialization is considered to be as bad, so long as you use it carefully, preventing arbitrary type loading. (See this paper for more information.)
  • Implement XML serialization using unsafe code as shown above. Even though my attempt at abusing it failed, that doesn’t provide very much comfort. I don’t know what other ramifications there might be for including unsafe code. Does that limit where the code can be deployed? It also doesn’t help with my concern about a future readonly struct constraint.

Thoughts very welcome. Reassurances from the C# team that “readonly struct” constraints won’t happen particularly welcome… along with alternative ways of implementing XML serialization.

C# 7 feature request #1… extension attributes

Last week I learned that using static is going to be the syntax for importing static members (including extension methods) in C# 6. That fulfils a feature request I made in September 2005 (my fourth ever blog post, as it happens). With a feature request turnaround of 10 years, I figure I should get put everything I could ever want out there now… (Just kidding really – more seriously, I’m really pleased to see this change in C# 6, relative to both C# 5 and the earlier designs of C# 6.)

Last week at CodeMash, Dustin Campbell demonstrated one Roslyn DiagnosticAnalyzer for ArgumentNullException and friends, and I demonstrated one I’d written for the JetBrains InvokerParameterName attribute. The two diagnostics check largely the same thing: some parameters are “special” in that the argument for them should be a parameter in the current context, and should use the C# 6 nameof operator.

The most significant difference between them (beyond quality – it’s not hard to spot which is written by a C# team member…) is that my diagnostic spots these special parameters because they are decorated with a particular attribute, whereas Dustin’s has the relevant information hard-coded within the diagnostic. Mine can’t spot the ArgumentNullException constructor, and Dustin’s can’t spot the Preconditions.CheckNotNull method in Noda Time. It would be really nice if I could pretend that certain members of existing types had particular attributes applied to them. For example, wouldn’t it be nice if I could write something like:

using JetBrains.Annotations;

namespace System
{
    public extern partial class ArgumentNullException
    {
        public ArgumentNullException([InvokerParameterName] string paramName);
    }
}

The rules would be roughly like this:

  • A type declared with public extern partial modifiers would indicate that attributes should be applied to an existing type.
  • Each member listed would be checked for existence, and any attributes present in the declaration would be noted in the IL for the “extending” assembly.
  • A new call on Type, MethodInfo etc would be created to allow all attributes to be fetched for a member, whether “extended” or not. This would find all attributes contributed by all assemblies loaded in the current AppDomain (so you’d need to make sure that you did something to initialize any assembly containing extension attributes).
  • A similar new call would be available within Roslyn – we’d need to think carefully about which assemblies that examined, of course.

By making this an “opt-in” mechanism from the examiner perspective, it would be safe – anything dealing in security attributes would want to use the existing mechanism, for example.

This wouldn’t just be useful for diagnostic purposes, mind you. There are a number of situations where you might want to augment existing types with more metadata, whether those attributes are your own custom ones or existing ones in the framework. Want to apply an attribute-based serialization framework to a different third-party type? Sure, just use those extensions. Want to give system enums localization-oriented attributes for your own framework? No problem.

I’m not actually expecting this feature to go very far – it’s relatively niche, as well as possibly requiring CLR changes (rather than just framework and language changes). Still, having thought of it, I decided it would be odd to keep it to myself. And hey, it’s a good excuse to create the C# 7 category on my blog…