First note: this blog post is very much tongue in cheek. I’m not actually planning on using the idea. But it was too fun not to share.
As anyone following my activity on GitHub may be aware, I’ve been doing quite a lot of work on Protocol Buffers recently – in particular, a mostly-new port for proto3. I’ve recently been looking at JSON support, and thinking about how to implement “overriding” ToString() for a few well-known types. I generate partial classes, so that gives me a hook to provide extra functionality. Indeed, I’m planning on using this to provide conversion methods for Timestamp and Duration, for example. However, you can’t really override anything in partial methods.
Refresher on partial methods
While partial classes were introduced in C# 2, partial methods were introduced in C# 3. The idea is that one source file (usually the generated one) can provide a partial method signature, and another source file (usually the manually-written one) can provide an implementation if it wants to. Any part of the source can call the method, and the call will be removed at compile-time if nothing provides an implementation. The fact that the method may not be there leads to some limitations:
- Partial methods are implicitly private, but you can’t specify an access modifier explicitly
- Partial methods are always void – they can’t return any values
- Partial methods cannot have out parameters
(Interestingly, a partial method implementation can be an async method – but with a return type of void, which is never a nice situation to be in.)
There’s more in the spec, but the last two bullets are the important part.
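As a minimal sketch of the mechanics described above: the class names, the OnChanged hook and the LastChange field below are all made up for illustration. Plain declares and calls the partial method but never implements it, so the call disappears at compile time; Logged has a manual part that does implement it.

```csharp
using System;

// Generated code: declares the hook and calls it, but nothing implements it.
partial class Plain
{
    public string LastChange = "(none)";

    // Declaration only: implicitly private, void, no out parameters.
    partial void OnChanged(string field);

    public void Touch(string field)
    {
        // With no implementation anywhere, the compiler removes this call.
        OnChanged(field);
    }
}

// Generated code: exactly the same shape as Plain.
partial class Logged
{
    public string LastChange = "(none)";

    partial void OnChanged(string field);

    public void Touch(string field)
    {
        OnChanged(field);
    }
}

// Manual code: supplies the implementation for Logged only.
partial class Logged
{
    partial void OnChanged(string field)
    {
        LastChange = field;
    }
}

class PartialMethodDemo
{
    static void Main()
    {
        var plain = new Plain();
        var logged = new Logged();
        plain.Touch("Name");
        logged.Touch("Name");
        Console.WriteLine(plain.LastChange);  // (none)
        Console.WriteLine(logged.LastChange); // Name
    }
}
```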
So, suppose I want to override ToString() in the generated code, but provide a mechanism for that override to be “further overridden” effectively, in the manual code for the same class? How do I get the value from an “extra override”? How do I even detect whether or not it’s there?
Side effects to the rescue!
(Now there’s a phrase you never thought you’d hear from me.)
I mentioned before that if a partial method is called but no implementation is provided, the call is removed. That includes all aspects of the call – including the evaluation of the arguments. So if evaluating the argument has a side-effect… we can spot that side effect.
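To see that claim in isolation, here is a small sketch (class and member names are invented for the purpose). The argument to Hook is a method call that bumps a counter, so the counter tells us whether the argument was ever evaluated.

```csharp
using System;

// Generated code: no implementation of Hook exists for this class.
partial class ProbeWithout
{
    private int evaluations;

    partial void Hook(int ignored);

    private int Next()
    {
        return ++evaluations;
    }

    public int CallHook()
    {
        // The whole call is removed, so Next() is never invoked.
        Hook(Next());
        return evaluations;
    }
}

// Generated code: same shape, but a manual part implements Hook.
partial class ProbeWith
{
    private int evaluations;

    partial void Hook(int ignored);

    private int Next()
    {
        return ++evaluations;
    }

    public int CallHook()
    {
        // The call survives, so Next() runs before Hook is entered.
        Hook(Next());
        return evaluations;
    }
}

// Manual code: an empty body is enough to keep the call alive.
partial class ProbeWith
{
    partial void Hook(int ignored)
    {
    }
}

class SideEffectDemo
{
    static void Main()
    {
        Console.WriteLine(new ProbeWithout().CallHook()); // 0
        Console.WriteLine(new ProbeWith().CallHook());    // 1
    }
}
```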
Next, we have to work out how to get a value back from a method. We can’t use the return value, and we can’t use an out parameter. There are two options here: we could either pass a wrapper (e.g. an array with a single element) and allow the “extra override” to populate the wrapper… or we can use a ref parameter. The latter feels ever-so-slightly cleaner to me.
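For comparison, a rough sketch of both ways of getting a value back out of a void partial method. The ValueBack class and its method names are hypothetical; only the two techniques (single-element array versus ref parameter) come from the text.

```csharp
using System;

// Generated code: two hooks, one per technique.
partial class ValueBack
{
    partial void FillViaArray(string[] holder);
    partial void FillViaRef(ref string value);

    public string ViaArray()
    {
        // Wrapper option: the implementation writes into element 0.
        var holder = new string[1];
        FillViaArray(holder);
        return holder[0];
    }

    public string ViaRef()
    {
        // ref option: the implementation assigns the caller's local directly.
        string value = null;
        FillViaRef(ref value);
        return value;
    }
}

// Manual code: implements both hooks.
partial class ValueBack
{
    partial void FillViaArray(string[] holder)
    {
        holder[0] = "from array";
    }

    partial void FillViaRef(ref string value)
    {
        value = "from ref";
    }
}

class ValueBackDemo
{
    static void Main()
    {
        var v = new ValueBack();
        Console.WriteLine(v.ViaArray()); // from array
        Console.WriteLine(v.ViaRef());   // from ref
    }
}
```

The ref version avoids allocating a throwaway array on every call, which is one reason it might feel slightly cleaner.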
And so the ugly hack is born. The code generator can always generate code like this:
    partial void ToStringOverride(bool ignored, ref string value);

    public override string ToString()
    {
        string value = null;
        bool overridden = false;
        ToStringOverride(overridden = true, ref value);
        return overridden ? value : "Original";
    }
For any partial class where the ToStringOverride method isn’t implemented, overridden will still be false, so we’ll fall back to returning "Original". (I would hope that any decent JIT would remove the overridden and value local variables entirely at that point.) Otherwise, we’ll return whatever the method has changed value to.
Here’s a short but complete example:
    using System;

    // Generated code
    partial class UglyHack1
    {
        partial void ToStringOverride(bool ignored, ref string value);

        public override string ToString()
        {
            string value = null;
            bool overridden = false;
            ToStringOverride(overridden = true, ref value);
            return overridden ? value : "Original";
        }
    }

    // Generated code
    partial class UglyHack2
    {
        partial void ToStringOverride(bool ignored, ref string value);

        public override string ToString()
        {
            string value = null;
            bool overridden = false;
            ToStringOverride(overridden = true, ref value);
            return overridden ? value : "Original";
        }
    }

    // Manual code
    partial class UglyHack2
    {
        partial void ToStringOverride(bool ignored, ref string value)
        {
            value = "Different!";
        }
    }

    class Test
    {
        static void Main()
        {
            var g1 = new UglyHack1();
            var g2 = new UglyHack2();
            Console.WriteLine(g1);
            Console.WriteLine(g2);
        }
    }
Horribly ugly, but it works…
Alternatives?
Obviously this isn’t really pleasant. Some alternatives:
- Derive from the generated class in order to override ToString again. Doesn’t work with sealed classes, and will only work if clients create instances of the derived class.
- Introduce a new interface, and allow manual code to implement it on the partial class. The ToString method can then check this is IMyOtherToString or whatever, and call it appropriately. This introduces another virtual call for no great reason, and exposes the interface to the outside world, which we may not want to do.
- Don’t override ToString in the generated code at all. Not good if you normally want to override it.
- Introduce an abstract base class which the generated class derives from. Override ToString() in that base class, possibly calling an abstract member which is then provided in the generated class – but allowing the manual code to override ToString() again.
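A sketch of the interface alternative, for comparison with the hack. The interface name IMyOtherToString comes from the list above; the OtherToString member and the class names are assumptions made up for this example.

```csharp
using System;

// Hypothetical hook interface; the member name is an assumption.
interface IMyOtherToString
{
    string OtherToString();
}

// Generated code: no manual part, so the interface is not implemented.
partial class Generated1
{
    public override string ToString()
    {
        var custom = this as IMyOtherToString;
        return custom != null ? custom.OtherToString() : "Original";
    }
}

// Generated code: identical to Generated1.
partial class Generated2
{
    public override string ToString()
    {
        var custom = this as IMyOtherToString;
        return custom != null ? custom.OtherToString() : "Original";
    }
}

// Manual code: opts in by adding the interface to the partial class.
partial class Generated2 : IMyOtherToString
{
    public string OtherToString()
    {
        return "Different!";
    }
}

class InterfaceDemo
{
    static void Main()
    {
        Console.WriteLine(new Generated1()); // Original
        Console.WriteLine(new Generated2()); // Different!
    }
}
```

Unlike the partial method version, the check here happens at execution time on every call, and anyone who can see the interface can see that Generated2 implements it.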
Conclusion
Ugly hacks are fun. But it’s much better to keep them where they belong: in a blog post, not in production code.
I’d be more inclined to make the method ToStringOverride(ref bool override, ref string value) and set the bool in the method. Feels less dirty than the side effect.
But that means the “overrider” has more work to do. We don’t want to put them in control of that – we just want to detect the method presence.
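For reference, the two-ref-parameter variant suggested above might look roughly like this. Note that override is a C# keyword, so the parameter is named overridden here; the class names are invented. It also shows the objection in the reply: the manual code has to remember to set the flag itself.

```csharp
using System;

// Generated code: nothing implements the hook for this class.
partial class RefFlag1
{
    partial void ToStringOverride(ref bool overridden, ref string value);

    public override string ToString()
    {
        string value = null;
        bool overridden = false;
        // No side effect in the argument list; the implementation sets the flag.
        ToStringOverride(ref overridden, ref value);
        return overridden ? value : "Original";
    }
}

// Generated code: same shape as RefFlag1.
partial class RefFlag2
{
    partial void ToStringOverride(ref bool overridden, ref string value);

    public override string ToString()
    {
        string value = null;
        bool overridden = false;
        ToStringOverride(ref overridden, ref value);
        return overridden ? value : "Original";
    }
}

// Manual code: must set both the flag and the value.
partial class RefFlag2
{
    partial void ToStringOverride(ref bool overridden, ref string value)
    {
        overridden = true;
        value = "Different!";
    }
}

class RefFlagDemo
{
    static void Main()
    {
        Console.WriteLine(new RefFlag1()); // Original
        Console.WriteLine(new RefFlag2()); // Different!
    }
}
```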
Could you do something like:
Yes, if you don’t need to know whether or not the method was actually implemented… Which admittedly in my example I don’t :)
This looks cleaner IMO:
That means the “overriding” method can’t decide it wants to return null though.
(Admittedly making ToString() return null is evil, but…)
Yes, precisely. So this approach prevents the “overriding” method from doing something evil ;)
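To make the trade-off in this exchange concrete, here is a sketch of a variant that treats a null value as "not overridden". This is an assumption about the kind of code being discussed, not the commenter's actual snippet, and the class names are invented.

```csharp
using System;

// Generated code: no implementation of the hook.
partial class NullCheck1
{
    partial void ToStringOverride(ref string value);

    public override string ToString()
    {
        string value = null;
        ToStringOverride(ref value);
        // null means either "no implementation" or "implementation chose null".
        return value ?? "Original";
    }
}

// Generated code: same shape; the manual part supplies a real value.
partial class NullCheck2
{
    partial void ToStringOverride(ref string value);

    public override string ToString()
    {
        string value = null;
        ToStringOverride(ref value);
        return value ?? "Original";
    }
}

// Generated code: same shape; the manual part tries to return null.
partial class NullCheck3
{
    partial void ToStringOverride(ref string value);

    public override string ToString()
    {
        string value = null;
        ToStringOverride(ref value);
        return value ?? "Original";
    }
}

// Manual code
partial class NullCheck2
{
    partial void ToStringOverride(ref string value)
    {
        value = "Different!";
    }
}

// Manual code: the "evil" case, which this variant silently ignores.
partial class NullCheck3
{
    partial void ToStringOverride(ref string value)
    {
        value = null;
    }
}

class NullCheckDemo
{
    static void Main()
    {
        Console.WriteLine(new NullCheck1()); // Original
        Console.WriteLine(new NullCheck2()); // Different!
        Console.WriteLine(new NullCheck3()); // Original
    }
}
```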
I hope you figure something along these lines out for proto3. Currently I have to use my own json serialization and deserialization implementation for proto2 in c#. I do this because I want to send efficient binary date/time for the proto case but a text string for the json case. The substitution has to happen at the class/message level, not the field level.
I do the same for a couple other cases such as GUIDs.
I’m not anticipating making the conversion customizable – proto3 has a fixed JSON format, with some particular well-known types (such as Timestamp), but I wouldn’t want to allow arbitrary users to change the JSON sent… that wouldn’t interoperate with other proto3 JSON platforms.
The GUID case is an interesting one which I don’t think has been picked up yet – I’ll mention that internally.
If both ends were proto3, wouldn’t it be better to send the binary message? I use the JSON output to interface exclusively with a web front-end. All other messages stay binary.
I have custom JSON conversions for several standard C# type classes.
* DateTime (timezone unknown, UTC only, or don’t care because it’s a date with no time)
* DateTimeOffset (for timezone per RFC 3339)
* Guid
* TimeSpan
* Object or Variant (the .proto is a union of the types and the json is serialized as native as possible: e.g. “true” vs true and “342” vs 342). It can guess in the reverse direction as well.
I also have some custom conversions for union types specific to our system–where having a single field in the JSON with different types matches the database use of Variant and made the web front-end easier.
I see that proto3 has a timestamp.proto with an epoch from the year 0001 and no timezone information. I prefer to use separate fields for each of year, month, day, hour, minute, second, and so on. Each field is optional, so the specificity can be determined by what is set.
In the proto2 code I was able to switch out the JsonFormatReader and JsonFormatWriter easily enough. I never did try Roger’s suggestion of AggregateInputStream (https://github.com/jskeet/protobuf-csharp-port/issues/44). I hope you at least leave one of those options open for proto3. Ideally, allow what I call a list of IJsonConverters to be injected or configured on the reader/writer, so the converters can be polled to see if there is a custom conversion. They each implement this:
If you are open to a pull request, I can put one together.
Thanks.
Yes, if both sides end up being proto3 then binary is preferable – although there are probably cases where JSON would be used. (In particular, imagine a storage layer in between which insists on JSON…) But my point is that I’d rather not end up with people having their own “flavours” of proto3 JSON… it sounds like a road to killing interoperability. I’ll talk with the team about JSON customization, but I don’t expect it to be there from the first release…
Another option could be something like TypeConverter, delegate the string representation to another type, then associate that type on the partial class via an attribute? If the attribute isn’t present use the default implementation, otherwise get the type from the attribute and call the appropriate method for the “overridden” implementation.
Maybe another option is making ToStringOverride a property whose type is a delegate (in this case, Func<string>). If the property is null use the default implementation, otherwise call the delegate for the “overridden” implementation?
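A rough sketch of how the delegate-property idea might look. Everything here is an assumption: the comment doesn't say where the property would be assigned, so this version has the manual part set it in a constructor, which only works if the generated code doesn't declare that constructor itself.

```csharp
using System;

// Generated code: the property is never assigned for this class.
partial class DelegateHook1
{
    // null means "use the default implementation".
    private Func<string> ToStringOverride { get; set; }

    public override string ToString()
    {
        var custom = ToStringOverride;
        return custom != null ? custom() : "Original";
    }
}

// Generated code: same shape as DelegateHook1.
partial class DelegateHook2
{
    private Func<string> ToStringOverride { get; set; }

    public override string ToString()
    {
        var custom = ToStringOverride;
        return custom != null ? custom() : "Original";
    }
}

// Manual code: assumes the generated part declares no parameterless constructor.
partial class DelegateHook2
{
    public DelegateHook2()
    {
        ToStringOverride = () => "Different!";
    }
}

class DelegateHookDemo
{
    static void Main()
    {
        Console.WriteLine(new DelegateHook1()); // Original
        Console.WriteLine(new DelegateHook2()); // Different!
    }
}
```

Compared with the partial method hack, this costs a field per instance and a delegate invocation per call.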