Background
There are three things you need to know to start with:
Operations on read-only variables which are value types copy the variable value first. I’ve written about this before on this blog. C# 7.2 addresses this by introducing the readonly
modifier for structs. See the language proposal for more details. I was touched to see that the proposal references my blog post :)
The ref readonly
local variables and in
parameter features in C# 7.2 mean that “read-only variables” are likely to be more common in C# than they have been in the past.
Noda Time includes many value types which implement IXmlSerializable
. Noda Time implements IXmlSerializable.ReadXml
by assigning to this
: fundamentally IXmlSerializable
assumes a mutable type. I use explicit interface implementation to make this less likely to be used directly on an unboxed variable. With a generic method using an interface constraint you can observe a simple method call mutating a non-read-only variable, but that’s generally harmless.
Adding the readonly modifier to Noda Time structs
I relished news of the readonly
modifier for struct declarations. At last, I can remove my ReadWriteForEfficiency
attribute! Hmm. Not so much. To be fair, some of the structs (7 out of 18) were fine. But every struct that implements IXmlSerializable
gave me this error:
Cannot assign to ‘this’ because it is read-only
That’s reasonable: in members of a readonly struct
, this
effectively becomes an in
parameter instead of a ref
parameter. But how can we fix that? Like any sensible developer, I turned to Stack Overflow, which has precisely the question I’m interested in. It even has an answer! Unfortunately, it amounts to a workaround: assigning to this
via unsafe code.
Violating readonly with unsafe code
To give a concrete example of the answer from Stack Overflow, here’s my current LocalTime
code:
void IXmlSerializable.ReadXml([NotNull] XmlReader reader) { Preconditions.CheckNotNull(reader, nameof(reader)); var pattern = LocalTimePattern.ExtendedIso; string text = reader.ReadElementContentAsString(); this = pattern.Parse(text).Value; }
Here’s the code that compiles when LocalTime
is marked as readonly
, after enabling unsafe blocks:
unsafe void IXmlSerializable.ReadXml([NotNull] XmlReader reader) { Preconditions.CheckNotNull(reader, nameof(reader)); var pattern = LocalTimePattern.ExtendedIso; string text = reader.ReadElementContentAsString(); fixed (LocalTime* thisAddr = &this) { *thisAddr = pattern.Parse(text).Value; } }
Essentially, the unsafe code is bypassing the controls around read-only structs. Just for kicks, let’s apply the same change
throughout Noda Time, and think about what would happen…
As it happens, that fix doesn’t work universally: ZonedDateTime
is a “managed type” because it contains a reference (to a DateTimeZone
) which means you can’t create a pointer to it. That’s a pity, but if we can make everything else readonly, that’s a good start. Now let’s look at the knock-on effects…
Safe code trying to abuse the unsafe code
Let’s try to abuse our “slightly dodgy” implementations. Here’s a class we’ll put in a “friendly” assembly which is trying to be as helpful as possible:
using NodaTime; public class AccessToken { private readonly Instant expiry; public ref readonly Instant Expiry => ref expiry; public AccessToken(Instant expiry) => this.expiry = expiry; }
Great – it lets you get at the expiry time of an access token without even copying the value.
The big test is: can we break this friendly’s code’s assumptions about expiry
really not changing its value?
Here’s code I expected to mutate the access token:
static void MutateAccessToken(AccessToken accessToken) { ref readonly Instant expiry = ref accessToken.Expiry; string xml = "<evil>2100-01-01T00:00:00Z</evil>"; EvilReadXml(in expiry, xml); } static void EvilReadXml<T>(in T value, string xml) where T : IXmlSerializable { var reader = XmlReader.Create(new StringReader(xml)); reader.Read(); value.ReadXml(reader); }
We have an in
parameter in EvilReadXml
, so the expiry
variable is being passed by reference, and then we’re calling ReadXml
on that parameter… so doesn’t that mean we’ll modify the parameter, and thus the underlying expiry
field in the object?
Nope. Thinking about it, the compiler doesn’t know when it compiles EvilReadXml
that T
is a readonly struct
– it could be a regular struct
. So it has to create a copy of the value before calling ReadXml
.
Looking the the spec proposal, there’s one interesting point in the section on ref
extension methods:
However any use of an
in T
parameter will have to be done through an interface member. Since all interface members are considered mutating, any such use would require a copy.
Hooray! That suggests it’s safe – at least for now. I’m still worried though: what if the C# team adds a readonly struct
generic constraint in C# 8? Would that allow a small modification of the above code to break? And if interface methods are considered mutating anyway, why doesn’t the language know that when I’m trying to implement the interface?
But hey, now that we know unsafe code is able to work around this in our XML implementation, what’s to stop nasty code from using the same trick directly?
Unsafe code abusing safe code
Imagine we didn’t support XML serialization at all. Could unsafe code mutate AccessToken
? It turns out it can, very easily:
static void MutateAccessToken(AccessToken accessToken) { ref readonly Instant expiry = ref accessToken.Expiry; unsafe { fixed (Instant* expiryAddr = &expiry) { *expiryAddr = Instant.FromUtc(2100, 1, 1, 0, 0); } } }
This isn’t too surprising – unsafe code is, after all, unsafe. I readily admit I’m not an expert on .NET security, and I know the
landscape has changed quite a bit over time. These days I believe the idea of a sandbox has been somewhat abandoned – if you care about executing code you don’t really trust, you do it in a container and use that as your security boundary. That’s about the limit of my knowledge though, which could be incorrect anyway.
Where do we go from here?
At this point, I’m stuck making a choice between several unpleasant ones:
- Leave Noda Time public structs as non-read-only. That prevents users from writing efficient code to use them without copying.
- Remove XML serialization from Noda Time 3.0. We’re probably going to remove binary serialization anyway on the grounds that Microsoft is somewhat-discouraging it going forward, and it’s generally considered a security problem. However, I don’t think XML serialization is considered to be as bad, so long as you use it carefully, preventing arbitrary type loading. (See this paper for more information.)
- Implement XML serialization using unsafe code as shown above. Even though my attempt at abusing it failed, that doesn’t provide very much comfort. I don’t know what other ramifications there might be for including unsafe code. Does that limit where the code can be deployed? It also doesn’t help with my concern about a future readonly struct constraint.
Thoughts very welcome. Reassurances from the C# team that “readonly struct” constraints won’t happen particularly welcome… along with alternative ways of implementing XML serialization.
The IXmlSerializable interface is clearly an abomination and should be destroyed with cleansing fire.
Maybe when “readonly struct” constraints happen they’ll bring Shapes along for the ride, so there can be an SXmlSerializable shape like
shape SXmlSerializable
{
void Write(XmlWriter writer);
static T Read(XmlReader reader);
}
Shapes on GitHub: https://github.com/dotnet/csharplang/issues/164
LikeLike
Maybe when “readonly struct” constraints happen they’ll bring Shapes along for the ride, so we can have something like
public shape SXmlSerializable
{
void WriteXml(XmlWriter writer);
static T ReadXml(XmlReader reader);
}
LikeLike
I asked this on SO some time ago, VSadov suggested using Unsafe.AsRef.
https://stackoverflow.com/a/49075853/704144
LikeLike
Ideas:
– Implement XML serialization without using IXmlSerializable (e.g. with a factory function for read as shown above).
– Introduce a 2nd type (that may implement IXmlSerializable) and offer from/to options for your struct.
In all the years I’ve been using XML serialization I’ve never needed IXmlSerializable. It’s no loss at all to not support it.
LikeLike
Both of those options seem to make it harder for the users of my library to serialize their data. They expect to have data models with
LocalDate
properties etc, and for those to be serialized transparently for them. How would either of your suggestions help there? I think there’s a huge difference between an application author who wants to serialize their own data models, and a library author who wants application authors to be able to use their components in a natural way.LikeLike
Just FYI, a much easier (and more performant, I think?) way to write to read-only variables than unsafe blocks with fixed pointers is the “Unsafe.AsRef(in T source)” method.
LikeLike
I think I ended up with something like that, yes.
LikeLike