Backwards compatibility is (still) hard

At the moment, I’m spending a fair amount of time thinking about a new version of the C# API and codegen for Protocol Buffers, as well as other APIs for interacting with Google services. While that’s the context for this post, I want to make it very clear that this is still a personal post, and should in no way be taken to be “Google’s opinion” on anything. The underlying issue could apply in many other situations, but it’s easiest to describe a concrete scenario.

Context and current state: the builder pattern

The problem I’ve been trying to address is the relative pain of initializing a protobuf message. Protocol buffer messages are declared in a separate schema file (.proto) and then code is generated. The schema declares fields, each of which has a name, a type and a number associated with it. The generated message types are immutable, with builder classes associated with them. So for example, we might start off with a message like this:

message Person {
  string first_name = 1;
  string last_name = 3;
}

And construct a Person object in C# like this:

var person = new Person.Builder { FirstName = "Jon", LastName = "Skeet" }.Build();
// Now person.FirstName and person.LastName are readonly properties

That’s not awful, but it’s not the cleanest code in the world. We can make it slightly simpler using an implicit conversion from the builder type to the message type:

Person person = new Person.Builder { FirstName = "Jon", LastName = "Skeet" };

It’s still not really clean though. Let’s revisit why the builder pattern is useful:

  • We can specify just the properties we want.
  • By deferring the “build” step until after we’ve specified everything, we get immutability without building a lot of intermediate objects.
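To make the deferral concrete, here is a hand-written sketch of roughly what the generated message and builder pair looks like (heavily simplified; the names and shape are assumptions, the real generated code contains much more):

```csharp
public sealed class Person
{
    public string FirstName { get; private set; }
    public string LastName { get; private set; }

    // Only the builder creates instances.
    private Person() {}

    public sealed class Builder
    {
        public string FirstName { get; set; }
        public string LastName { get; set; }

        // Build is the single point at which the immutable message
        // is created; mutations before it cost nothing extra.
        public Person Build() =>
            new Person { FirstName = FirstName, LastName = LastName };
    }
}
```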

If only there were another language construct allowing that…

Optional parameters to the rescue!

If we provided a constructor with an optional parameter for each property, we can specify just what we want. So something like:

public Person(string firstName = null, string lastName = null)
...
var person = new Person(firstName: "Jon", lastName: "Skeet");

Hooray! That looks much nicer:

  • We can use var (if we want to) because there are no implicit conversions to confuse things.
  • We don’t need to mention a builder at all.
  • Every piece of text in the statement is something we want to express, and we only express it once.

That last point is a lovely place to be in terms of API design – while you still need to worry about naming, ordering and how the syntax fits into bigger expressions, you’ve achieved some sense of “as simple as possible, but no simpler”.

So, that’s all great – except for versioning.

Let’s just add a field at the end…

One of the aims of protocol buffers is to support an evolving schema. (The limitations are different for proto2 and proto3, but that’s a slightly different matter.) So what happens if we add a new field to the message?

message Person {
  string first_name = 1;
  string last_name = 3;
  string title = 4; // Mr, Mrs etc
}

Now we end up with the following constructor:

public Person(string firstName = null, string lastName = null, string title = null)

The code still compiles – but if we try to run our old client code against the new version of the library, it will fail – because the method it refers to no longer exists. So we have source compatibility, but not binary compatibility.

Let’s just add a field in the middle…

You may have noticed that I don’t have a field with tag 2 – this is not an accident. Suppose we now add it, for the obvious middle_name field:

message Person {
  string first_name = 1;
  string middle_name = 2;
  string last_name = 3;
  string title = 4; // Mr, Mrs etc
}

Regenerate the code, and we end up with a constructor with 4 parameters:

public Person(
    string firstName = null,
    string middleName = null,
    string lastName = null,
    string title = null)

Just to be clear, this change is entirely fine in protocol buffers – while normally field numbers are assigned incrementally, it shouldn’t be a breaking change to add a new field “between” existing ones.

Let’s take a look at our client code again:

var person = new Person(firstName: "Jon", lastName: "Skeet");

Yup, that still works – we need to recompile, but we still end up with a Person with the right properties. But that’s not the only code we could have started with. Suppose we’d actually had:

var person = new Person("Jon", "Skeet");

Until this last change that would have been fine – even after we’d added the optional title parameter, the two arguments would still have mapped to firstName and lastName respectively.

Having added the middle_name field, however, the code would still compile with no errors or warnings, but the meaning of the second argument would have changed – it would now map onto the middleName parameter instead of lastName.

Basically, we’d like to stop this code (using positional arguments) from compiling in the first place.
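To make the hazard concrete, here is a minimal reproduction, with a hand-written Person standing in for the regenerated code:

```csharp
public class Person
{
    public string FirstName { get; }
    public string MiddleName { get; }
    public string LastName { get; }
    public string Title { get; }

    public Person(
        string firstName = null,
        string middleName = null,  // the newly inserted parameter
        string lastName = null,
        string title = null)
    {
        FirstName = firstName;
        MiddleName = middleName;
        LastName = lastName;
        Title = title;
    }
}

// Old positional call, recompiled against the new constructor:
//   var person = new Person("Jon", "Skeet");
// "Skeet" now silently binds to middleName, not lastName, with no
// compiler error or warning at all.
```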

Feature requests and a workaround

The two features we really want from C# here are:

  • Some way of asking the generated code to perform dynamic overload resolution at execution time… not based on dynamic values, but on the basis that the code we’re compiling against may have changed since we compiled. This resolution only needs to be performed once, on first execution (or class load, or whatever) as by the time we’re executing, everything is fixed (the parameter names and types, and the argument names and types). It could be efficient.
  • Some way of forcing any call sites to use named arguments for any optional parameters. (Even though in our case all the parameters are optional, I can easily imagine a case where there are a few required parameters and then the optional ones. Using positional arguments for those required parameters is fine.)

It’s hard (without forking Roslyn :) to implement these feature requests ourselves, but for the second one we can at least have a workaround. Consider the following struct:

public struct DoNotCallThisMethodWithPositionalArguments {}

… and now imagine our generated constructor had been:

public Person(
    DoNotCallThisMethodWithPositionalArguments ignoreMe =
        default(DoNotCallThisMethodWithPositionalArguments),
    string firstName = null,
    string middleName = null,
    string lastName = null,
    string title = null)

Now our constructor call using positional arguments will fail, because there’s no conversion from string to the crazily-named struct. The only “nice” way you can call it is to use named arguments, which is what we wanted. You could call it using positional arguments like this:

var person = new Person(
    new DoNotCallThisMethodWithPositionalArguments(),
    "Jon",
    "Skeet");

(or using default(...) like the constructor declaration) – but at this point the code looks broken, so it’s your own fault if you decide to use it.

The reason for making it a struct rather than a class is to avoid null being convertible to it. Annoyingly, it wouldn’t be hard to make a class that you could never create an actual instance of, but you can’t prevent anyone from creating a value of a struct. Basically, what we really want is a type such that there is no valid expression which is convertible to that type – but aside from static classes (which can’t be used as parameter types) I don’t know of any way of doing that. (I don’t know what would happen if you compiled the Person class using a non-static class as the first parameter, then made that class static and recompiled it. Confusion on the part of the C# compiler, I should think.)

Another option (as mentioned in the comments) is to have a “poor man’s” version of the compiler enforcement via a Roslyn Code Diagnostic – add an attribute to any method call where you want “All optional parameters must be specified with named arguments” to apply, and then make the code diagnostic complain if you disobey that. That diagnostic could ship with the Protocol Buffers NuGet package, which would make for a pretty nice experience. Not quite as good as a language feature though :)

Conclusion

Default parameters are a pain in terms of compatibility. For internal company code, it’s often reasonable to only care about source compatibility as you can recompile all calling code before deployment – but for open source projects, binary compatibility within the same major version is the norm.

How useful and common do I think these features would be? Probably not common enough to meet the bar – unless there’s encouragement within comments here, in which case I’m happy to file feature requests on GitHub, of course.

As it happens, I’m currently looking at radical changes to the C# implementation of Protocol Buffers, regretfully losing the immutability aspect due to it raising the barrier to entry. It’s not quite a done deal yet, but assuming that goes ahead, all of this will be mostly irrelevant – for Protocol Buffers. There are plenty of other places where code generation could be more robustly backward-compatible through judicious use of optional-but-please-use-named-arguments parameters though…

52 thoughts on “Backwards compatibility is (still) hard”

  1. A hypothetical “type such that there is no valid expression which is convertible to that type” could have no default argument, since they cause the implicit use of a constant expression at compile time.


  2. Interesting post, and of course I couldn’t resist trying to solve the problem :)

    How about a fluent API? Properties can still be immutable; the actual constructor stays private so the public API can change without breaking anything. The main ugly thing is naming, since the method to set a property obviously can’t have the same name as the property.


    1. That’s the kind of pattern I use in Noda Time, yes. It does end up with a lot of objects being created though – and while normally I don’t like micro-optimizing this sort of thing, setting all the properties on a message with N fields ends up using O(N^2) memory… I know the GC’s good, but it feels a little flamboyant. I suppose with the builder as well, clients could always use that when they’re concerned about the cost. I doubt that we’ll end up going this way, but it’s a good thought.
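For illustration, a toy version of the copy-everything fluent style (hypothetical names): each With method copies every field into a fresh instance, so populating all N fields one at a time does O(N^2) copying and allocates N intermediate objects.

```csharp
public sealed class Message
{
    public string A { get; }
    public string B { get; }
    public string C { get; }

    public Message(string a = null, string b = null, string c = null)
    {
        A = a; B = b; C = c;
    }

    // Each call copies all fields into a brand-new instance.
    public Message WithA(string a) => new Message(a, B, C);
    public Message WithB(string b) => new Message(A, b, C);
    public Message WithC(string c) => new Message(A, B, c);
}

// new Message().WithA("x").WithB("y").WithC("z") creates four
// instances just to end up with one fully-populated message.
```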


      1. A fluent API returning a type that requires a finalizing call or an implicit conversion to the target type could be implemented via a builder to avoid creating an object for every property set. It would unfortunately create two objects when modifying a single property, though.


        1. That’s what we’ve got in the current implementation, other than the implicit conversion. It’s still ugly, and an implicit conversion would also cause confusion when using var I suspect.

          It’s not so bad for more advanced developers who are used to the builder pattern, but it’s quite a burden to enforce on everyone wanting to use Protocol Buffers. It’s a matter of judging the right API for your target audience, and I’m hoping that GRPC will expand the target audience for Protocol Buffers a lot…


      2. Wouldn’t it be best to fix this through a compiler or CLR optimization that reuses the ‘immutable’ object (since it should be able to detect that there are no future references to that object after the dot).


        1. Not sure which “this” you’re referring to, but if you’re basically saying “Trust the GC/CLR to deal with garbage quickly” I’m happy to do so to some extent – but not when there can be hundreds of allocations, each involving copying hundreds of fields.


          1. I mean the O(n^2) memory thing. And I’m not suggesting you just trust the GC/CLR – I’m suggesting that the .NET/C# devs add in an optimization for this specific use case.


  3. As requested via Twitter, a more detailed explanation:

    As far as I understand it, these feature requests are designed to improve backwards compatibility of APIs by improving how constructors handle optional parameters from client code consuming this API.

    However, C# itself is ALSO an API of sorts. If we change how C# handles constructor calls in the way your second request does and no longer allow unnamed arguments for optional parameters, that’s going to break a lot of legacy code, even very recent code.

    Suppose this was implemented in a future version of C# and you (Jon) decide to port the next major version of Nodatime over to it because of other new functionality being critical. Depending on the implementation of the C# change and how many optional parameters are called without a name, you could be looking at a long list of errors on the first attempt at building. You also then need to consider which constructors of the Nodatime API itself should be changed to make use of this new feature.

    In turn, anyone upgrading to the next version of Nodatime later on (because their own product is updating to take advantage of the new features in C#) will also need to change their own API calls that make use of the updated constructors.

    I realize that these changes likely make structural improvements to the readability and functionality of any code using these APIs, but many businesses also need to consider the implications for not just their own projects, but also for their clients and their users. If the necessary changes are too extreme, they might just not bother with this new feature, leading to a net result of C# being more complicated for no gain in code clarity and functionality.


    1. I wasn’t suggesting changing the behaviour of any existing code. I was suggesting something like adding a modifier to a method which would enforce that calls specifying optional parameters do so by name. Adding that modifier to an existing method would indeed be an incompatible change, but then so do many other changes on existing code. That doesn’t mean it wouldn’t be useful for new code. (In the same way, if you have an existing API with many overloads to make up for the lack of optional parameters, converting those into a single method call with optional parameters breaks binary compatibility – but that doesn’t mean that optional parameters are useless, does it?)


      1. I see. I was thinking more along the lines of a project-wide setting, but your solution indeed does make more sense. Something like a “named” modifier would indeed be useful. I also just realized that my example was inconsistent, with Nodatime itself having the project-wide setting configured, but the methods still requiring a flag. A modifier would be much more useful than a combination of the 2.


  4. Isn’t your second feature request already possible (and a nice example for) using a roslyn analyzer?
    The analyzer could warn on or even prohibit using the API without named arguments.


    1. Funnily enough, I was thinking about that just on the train this morning. Yes, it could – it would only be useful for VS2015 users of course, but better than nothing. Will edit to mention that.


  5. I actually like the mechanism StringBuilder uses: its Append-methods return the StringBuilder-instance itself, and the ToString-method creates the immutable string, like the current ProtoBuf builder implementation. I like the following kind of code for that:

    var person = Person
        .CreateBuilder()
        .SetFirstName("Jon")
        .SetLastName("Skeet")
        .Build();

    The advantage of the builder-pattern that the constructor-pattern does not have: it’s easy to build an instance gradually, like this:

    var builder = Person
        .CreateBuilder()
        .SetFirstName("Jon")
        .SetLastName("Skeet");

    if (isDoctor) { builder.SetTitle("Dr"); }
    else if (gender == Gender.Male) { builder.SetTitle("Mr"); }
    else if (gender == Gender.Female)
    {
        if (isMarried) { builder.SetTitle("Mrs"); }
        else { builder.SetTitle("Ms"); }
    }


    1. Yes, it’s definitely useful to still have the builder – it’s just a pain to need it. Look at how much cruft is in your original code vs var person = new Person(firstName: "Jon", lastName: "Skeet"). Unless I am building something gradually, I don’t want to have to care about a builder – I just want to say what I want the end result to be.

      It’s not huge, and not awful to get used to – but it’s a barrier to entry.


      1. I would argue that the two code samples are pretty similar, other than some simple coding preferences…

        I could just as easily write:
        var person = Person.Builder.withFirstName("Jon").withLastName("Skeet").Build();

        sure, the call to Build is an extra step (as mentioned first thing in your article), but I think it’s pretty minimal all said and done… but initializers/parameters on one line vs several is just preference (personally, I often place my named parameters in several lines for readability).

        I would also argue that the example above includes extra logic (dr/mr/mrs/ms) which would be just as messy in your approaches (perhaps cleaned up slightly with ternaries, but still not “ideal”)… though for those cases, I would actually argue in favor of those being calculated (readonly) properties anyway… the need to pass them in a proto buffer would only depend on whether you’d need those values on the other side, or whether they can be re-calculated after deserialization.

        Alternatively, what about a lambda style builder?
        var person = Person.Build(p => p.withFirstName("Jon").withLastName("Skeet"));

        again, not entirely as intuitive as initializers, but addresses the issues.


  6. Marc Gravell had a similar problem (binary incompatibility when using optional arguments) last year. http://blog.marcgravell.com/2014/08/optional-parameters-considered-harmful.html

    If you don’t add optional arguments in the middle, but only at the end, it is possible to remain binary compatible. This fix is only practical when you change the API by hand rather than relying on code generation, though.

    In that case what you do is go from one method:
    Method(string firstName = null, string lastName = null)
    to two methods:
    Method(string firstName, string lastName)
    Method(string firstName = null, string lastName = null, string title = null)

    The first method will make the code binary compatible, the second method will be source compatible.
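    A sketch of the two-overload trick (hypothetical Api class): C#'s overload resolution prefers a candidate that needs no default-value substitution, so a two-argument call binds to the old signature, while new code can take advantage of the optional parameters.

```csharp
public static class Api
{
    // The old signature, kept for binary compatibility: existing
    // assemblies compiled against it keep resolving to it.
    public static string Method(string firstName, string lastName)
        => "old:" + firstName + " " + lastName;

    // The new signature, used when any optional parameter is involved.
    public static string Method(string firstName = null, string lastName = null, string title = null)
        => "new:" + title + " " + firstName + " " + lastName;
}

// Api.Method("Jon", "Skeet") binds to the two-parameter overload,
// because no default value needs to be filled in for it.
```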


  7. I think it is way worse to lose immutability than to improve the construction of objects. There is nothing wrong with the builder pattern; in fact it is used for ImmutableCollections, so it is a known pattern used by Microsoft for key data structures. So please, please do not make objects mutable – in fact, please allow us to specify whether objects can be value types as well. GC pressure is a big thing for apps that process a lot of data.


    1. Optional arguments are a much worse solution to the problem than a builder pattern. It’s just plain awful. It would be good if one could specify whether “null” is allowed for certain ref fields so guarantees can be made about all ref fields having non-null values.


    2. I very much doubt that we’ll add an option to turn protobuf messages into structs instead of classes, especially as that would lead to a lot of boxing if we weren’t really careful.

      While I’m a big fan of immutability in general, and was perfectly happy for protobufs to use the builder approach previously, I think there’s a justifiable concern about it being too “different” for many developers. Yes, it’s far from unheard of – but I suspect the proportion of developers using ImmutableCollections is small too.

      I’ll take your feedback in mind, but I suspect we’ll still be going the mutable route – possibly with a “freeze” operation.
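A hypothetical sketch of what a freeze operation might look like (not the actual protobuf implementation): the message is freely mutable until Freeze() is called, after which every setter throws.

```csharp
public sealed class Person
{
    private bool frozen;
    private string firstName;

    public bool IsFrozen => frozen;

    public string FirstName
    {
        get { return firstName; }
        set
        {
            // Once frozen, the message behaves as if it were immutable.
            if (frozen)
            {
                throw new System.InvalidOperationException("Message is frozen");
            }
            firstName = value;
        }
    }

    public void Freeze() { frozen = true; }
}
```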


  8. Ouch Jon, that freeze() operation really looks ugly to me.

    We solved this problem a few weeks ago by indeed using a fluent builder pattern. So in the end we achieve an immutable object, and the builder gently forces you to fill in the parameters.

    var person = Person.Create().WithFirstname("Dwight").WithLastname("Matthys").Build();

    It doesn’t look so awkward to us, since using a constructor with named parameters also introduces some cruft in your code. The new can be easily associated with Create(), the With methods speak for themselves as opposed to the named parameters and yes, we have one extra thingy at the end. But the benefits are greater than the losses.

    Just my 2 cents of course ;)

    Kind regards,
    Dwight


    1. I think we’ll have to agree to disagree on this. In particular, I don’t see how “the with methods speak for themselves opposed to the named parameters” – while I agree that WithLastName(x) is clear, I’d say that lastName: x is at least as clear. What I like to do is look at a statement or expression and try to identify every identifier or value which is intrinsically part of what I’m trying to express – versus ones which are just formality. Additionally, using Person.Create() instead of new Person.Builder prevents the use of object initializers.


  9. Here’s an idea you could try: make a Fody weaver which would substitute DoNotCallThisMethodWithPositionalArguments with System.Void. It’s perfectly legal in IL but C# won’t let you write default(System.Void) no matter how hard you try :-)



      1. It’s wonderfully nasty, but I don’t think it would be practical – way to confuse callers :) I don’t know whether it would actually work from a CLR perspective, either.


    1. I think you’ve missed the point of my post: a) it’s about limitations of optional parameters around compatibility; b) I’m implementing Protocol Buffers. I don’t think asking the GRPC project to switch to a Microsoft serialization format is going to fly…


  11. Not sure if this is better at all but you could define a wrapper for each field which encodes its field position. This way you always know where the data is stored in old and new messages. Old code will not be disturbed by new fields and new code can use new fields wherever it wants to.

    class Person
    {
        public string First { get; private set; }
        public string Middle { get; private set; }
        public string Last { get; private set; }
    
        public Person(FieldPos1<string> first, FieldPos3<string> last)
        {
            First = first.Value;
            Last = last.Value;
        }
    
        public Person(FieldPos1<string> first, FieldPos2<string> middle, FieldPos3<string> last)
        {
            First = first.Value;
            Middle = middle.Value;
            Last = last.Value;
        }
    }
    
    struct FieldPos1<T>
    {
        public T Value { get; private set; }
        public FieldPos1(T value)
            : this()
        {
            Value = value;
        }
    }
    
    struct FieldPos2<T>
    {
        public T Value { get; private set; }
        public FieldPos2(T value)
            : this()
        {
            Value = value;
        }
    }
    
    struct FieldPos3<T>
    {
        public T Value { get; private set; }
        public FieldPos3(T value)
            : this()
        {
            Value = value;
        }
    }
    
    class Program
    {
        static void Main(string[] args)
        {
            var p = new Person(new FieldPos1<string>("Alois"), new FieldPos3<string>("Kraus"));
            var pV2 = new Person(new FieldPos1<string>("Alois"), new FieldPos2<string>("Christian"), new FieldPos3<string>("Kraus"));
        }
    }
    

    You “only” need to ensure that you generate all optional overloads which leave out one or more fields (because they did not exist yet) to make it truly binary compatible. I am not sure whether this positional wrapper is worth it. It is not as easy as using the Person constructor with the real values, but the concept is quite easy to follow, even for beginners who try to make their code compile by using IntelliSense.


  12. Are protobufs only used with GRPC? And if so, what if you think about it from a procedure POV instead of from the message POV? That’s what the P in RPC stands for, after all. : )


    1. No, protobufs can certainly be used for scenarios other than GRPC. The “P” part is indicated in the service definitions, not in the message definitions…


  13. I think your proposed optional parameter syntax is a bad idea irrespective of the C# limitations you point out.

    Most people interacting with your API won’t be experts; and even experts have better things to do than memorize a large API – and that means you want to avoid many “practical” but slightly different ways of achieving the same thing. The downside to the optional parameter approach is that it composes poorly. If you want to build a message with a few fields, you cannot delegate most of those fields to a helper method and then set the final field yourself – the construction call is indivisible; all fields must be set, or none.

    At the end of the day, what’s wrong with the new Person.Builder { FirstName = ... }.Build() syntax, or, if you prefer, a fluent API?

    If GC pressure is an issue with a fluent interface, you could always consider using value types.


    1. Well, the constructor would probably have been an option rather than the only way of doing it – I’d probably have a builder as well, for those scenarios. I think it’s entirely reasonable for there to be multiple ways of achieving the same thing when they’re useful in different scenarios. As for what’s wrong with new Person.Builder {... }.Build() – the builder-specific parts are ugly, particularly if you’re only specifying a field or two. The language feature I’d really like is to be able to specify a builder type alongside the immutable type, so that the compiler would allow object-initializers to work with the builder and build the immutable type automatically… but that’s another matter. There are various other issues with using the builder pattern here that I don’t want to go into too much detail on – that wasn’t the point of the post, after all.

      As for the fluent interface with just an immutable type – O(n^2) space allocation and assignments (where n is the number of fields) is really nasty – and the places where it’s nasty are places where you don’t want a value type with all those fields, either. This is where the context is important – protocol buffers are defined by other people, and can have many, many fields… and the same codegen applies whether there are 2 fields or 100. In a tightly controlled API like Noda Time, the fluent syntax is absolutely fine, because I can make the right decision about what to do based on the context. For codegen of arbitrary protobufs, I don’t believe it’s the right answer.

      I’m still torn between the mutable API and the builder approach – both are ugly in different ways :(


  14. What about having 2 versions of each message class: mutable and immutable? You could implicitly convert mutable messages to immutable ones, and explicitly convert the other way round (.ToMutable()).
    You can then enable construction only for the mutable messages, which means they could serve as your builder, but could also be used as real messages. You can then nudge people toward preferring the immutable messages by naming the mutable messages MutablePerson and the immutable ones as Person which hint that they should be preferred.
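A sketch of that suggestion (hypothetical names, not actual protobuf codegen): the mutable type doubles as the builder, with an implicit conversion producing an immutable snapshot.

```csharp
public sealed class Person
{
    public string FirstName { get; }
    public Person(string firstName) { FirstName = firstName; }

    // Explicit step to get back a mutable copy.
    public MutablePerson ToMutable() => new MutablePerson { FirstName = FirstName };
}

public sealed class MutablePerson
{
    public string FirstName { get; set; }

    // Implicit conversion takes an immutable snapshot of the current state.
    public static implicit operator Person(MutablePerson m) => new Person(m.FirstName);
}
```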


  15. Have you considered using a Dictionary<String, String> as your one and only parameter? That’s a mapping of parameter names to serialized objects your method needs to use, and then, you can simply just use what you need in your method from whatever the contents of this dictionary are, and/or serialize and de-serialize whatever you wish, manually, using XElement parsing.


      1. For compile-time assistance, you could define your own class that returns a parameter name and serialized object for a particular set of inputs. Your problem revolves around changing parameters anyways, so everything should work okay since you’ll change this construction logic with each build.


  16. I sometimes use the following variant of the fluent pattern which, while more complex in some ways, helps eliminate some of the issues mentioned (e.g. excess copying):

    public class Person
    {
        public Person(Action<Builder> builder)
        {
            if (builder == null) { throw ... }
            var personBuilder = new Builder(this);
            builder(personBuilder);
            // ensures that even if someone captures a reference to the
            // builder in the passed-in action they can't use it to further mutate this object
            personBuilder.End();
        }
    
        // all "immutable" properties allow private set
        public string FirstName { get; private set; }
    
        public class Builder
        {
            private Person person;
    
            // it's important that this isn't publicly accessible or else someone could use it to mutate a Person.
            // If we have our own assembly, internal will do just fine. However, if this source code will
            // go right in the consumer's assembly we're probably better off making our public Builder class
            // abstract and then providing a private implementation to forbid external construction
            internal Builder(Person person)
            {
                this.person = person;
            }
    
            public Builder FirstName(string firstName) 
            {
                this.person.FirstName = firstName;
                return this;
            }
    
            // other builder methods
    
            // comments about internal on the Builder constructor apply here too. If we had a private implementation
            // this could just be Dispose()
            internal void End() { this.person = null; }
        }
    }
    
    // we can then use this as
    var person = new Person(p => p.FirstName("Jon"));
    

    Not as nice as optional parameters, but has the benefit of being fully backwards compatible (both source and binary) and being discoverable (the Person class is constructed through its constructor). As far as performance, we do no copying of fields since we write straight to the person object. On the other hand, we allocate the builder object and (likely) a delegate and closure object.

    Another thought I had when reading this is that I could see the library generating the optional parameters constructor as a nice-to-have on top of another more verbose pattern for any message types that have contiguously numbered fields from 0-N. That still leaves us with the binary backwards compatibility problem, but that can be easily solved by adding N constructor overloads.


  17. I’m not sure why immutability is such a must-have. These generated objects are purely used to serialise data over the wire; it’s not like anyone is using the generated classes as domain objects (if you are… oh dear). What’s the benefit of immutability in this scenario? Sure, it’s a nice-to-have, but not at the expense of additional GC pressure. In fact the uglier these generated objects are the better(!), as it would discourage people from reusing them where they shouldn’t (leaking outside their data layer).

