C# 4: Immutable type initialization

(I’m giving up with the numbering now, unless anyone particularly wants me to keep it up. What was originally going to be a limited series appears to be growing without end…)

As Chris Nahr pointed out in my previous post, my earlier idea about staged initialization was very half-baked. As he’s prompted me to think further about it, I’ve come up with another idea. It’s slightly more baked, although there are lots of different possibilities and open questions.

Let’s take a step back, and look at my motivation: I like immutable types. They’re handy when it comes to thread safety, and they make it a lot easier to reason about the world when you know that nothing can change a certain value after it’s been created. Now, the issues are:

  • We really want to be able to fully construct the object in the constructor. That means we can mark all fields as initonly in the generated IL, potentially giving the CLR more scope for optimisation.
  • When setting more than two or three values (while allowing some to be optional) constructor overloading ends up being a pain.
  • Object initializers in C# 3 only apply to properties and fields, not method/constructor arguments – so we can’t get the clarity of naming.
  • Ideally we want to support validation (or possibly other code) and automatic properties.
  • The CLR won’t allow initonly fields to be set anywhere other than in the constructor – so even if we made sure we didn’t call any setters other than in the constructor, we still couldn’t use them to set the fields.
  • We want to allow simple construction of immutable types from code other than C#. In particular, I care about being able to use projects like Spring.NET and Castle/Windsor (potentially after changes to those projects) to easily create instances of immutable types without resorting to looking up the order of constructor parameters.

The core of the proposal is to be able to mark properties as initonly, and get the compiler to create an extra type which is thoroughly mutable, and contains those properties – as well as a constructor which accepts an instance of the extra type and uses it to populate the immutable instance of the main type before returning.

Extra syntax could then be used to call this constructor – or indeed, given that the properties are actually readonly, thus avoiding any ambiguity, normal object initializers could be used to create instances.

Just as an example, imagine this code:

public class Address
{
    public string Line1 { get; initonly set; }
    public string Line2 { get; initonly set; }
    public string Line3 { get; initonly set; }
    public string County { get; initonly set; }
    public string State { get; initonly set; }
    public string Country { get; initonly set; }
    public string ZipCode { get; initonly set; }
   
    // Business methods as normal
}

// In another class
Address addr = new Address
{
    Line1="10 Fairview Avenue",
    Line3="Makebelieve Town",
    County="Mono County",
    State="California",
    Country="US"
};

That could be transformed into code a bit like this:

// Immutable class

// Let tools (e.g. the compiler!) know how we
// expect to be initialized. Could be specified
// manually to avoid using the default class name
[InitializedWith(typeof(Address.Init))]
public class Address
{
    // Nested mutable class used for initialization
    [CompilerGenerated]
    public class Init
    {
        public string Line1 { get; set; }
        public string Line2 { get; set; }
        public string Line3 { get; set; }
        public string County { get; set; }
        public string State { get; set; }
        public string Country { get; set; }
        public string ZipCode { get; set; }
    }

    // Read-only “real” properties, automatically
    // implemented and backed with initonly fields
    public string Line1 { get; }
    public string Line2 { get; }
    public string Line3 { get; }
    public string County { get; }
    public string State { get; }
    public string Country { get; }
    public string ZipCode { get; }
   
    // Automatically generated constructor, using
    // backing fields directly
    public Address(Address.Init init)
    {
        <>_line1 = init.Line1;
        <>_line2 = init.Line2;
        <>_line3 = init.Line3;
        <>_county = init.County;
        <>_state = init.State;
        <>_country = init.Country;
        <>_zipCode = init.ZipCode;
    }

    // Business methods as normal
}

// In another class
Address addr = new Address(new Address.Init
{
    Line1="10 Fairview Avenue",
    Line3="Makebelieve Town",
    County="Mono County",
    State="California",
    Country="US"
});
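
For comparison, here is a rough hand-written approximation of that generated shape which compiles as ordinary C# today. It is only a sketch of the idea, not what a compiler would actually emit: the backing field names and the AddressExample helper are my own choices for illustration, and the null check on the init argument is an assumption rather than part of the proposal.

```csharp
using System;

public sealed class Address
{
    // Hand-written stand-in for the compiler-generated mutable type
    public sealed class Init
    {
        public string Line1 { get; set; }
        public string Line2 { get; set; }
        public string Line3 { get; set; }
        public string County { get; set; }
        public string State { get; set; }
        public string Country { get; set; }
        public string ZipCode { get; set; }
    }

    // readonly in C# becomes initonly in the IL
    private readonly string line1;
    private readonly string line2;
    private readonly string line3;
    private readonly string county;
    private readonly string state;
    private readonly string country;
    private readonly string zipCode;

    public string Line1 { get { return line1; } }
    public string Line2 { get { return line2; } }
    public string Line3 { get { return line3; } }
    public string County { get { return county; } }
    public string State { get { return state; } }
    public string Country { get { return country; } }
    public string ZipCode { get { return zipCode; } }

    // Copies every value out of the mutable instance, so later
    // changes to "init" cannot affect the constructed Address
    public Address(Init init)
    {
        if (init == null) throw new ArgumentNullException("init");
        line1 = init.Line1;
        line2 = init.Line2;
        line3 = init.Line3;
        county = init.County;
        state = init.State;
        country = init.Country;
        zipCode = init.ZipCode;
    }
}

// Hypothetical caller, mirroring the "In another class" snippet
public static class AddressExample
{
    public static Address Create()
    {
        return new Address(new Address.Init
        {
            Line1 = "10 Fairview Avenue",
            Line3 = "Makebelieve Town",
            County = "Mono County",
            State = "California",
            Country = "US"
        });
    }
}
```

The amount of boilerplate here is really the point: every property is written out three times, which is exactly the work I'd like the compiler to do.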

That’s the simple case, of course. Issues:

  • Unlike other compiler-generated types (anonymous types, types for iterator blocks, types for anonymous functions) we do want this to be public, and have a name which can be used elsewhere. We need to find some way of making sure it doesn’t clash with other names. In the example above, I’ve used an attribute to indicate which type is used for initialization – I could imagine some way of doing this in the “pre-transform” code to say what the auto-generated type should be called.
  • What happens if you put code in the setter, instead of making it automatically implemented? I suspect that code should be moved into the setter of the initialization class – but at that point it won’t have access to the rest of the state of the class (beyond the other properties in the initialization class). It’s somewhat messy.
  • What if you want to add code to the generated constructor? (Possible solution: allow constructors to be marked somehow in a way that means “add on the initialization class as a parameter at the end, and copy all the values as a first step”.)
  • How can you indicate that some parameters are mandatory, and some are optional? (The mandatory parameters could just be marked as readonly properties rather than initonly, and then the initialization class specified as an extra parameter for a constructor which takes all the mandatory ones. Doesn’t feel elegant though, and leaves you with two different types of initialization code being mixed in the client – some named, some positional.)
  • How do you specify default values? (They probably end up being the default values of the automatically generated properties of the initialization class, but there needs to be some syntax to specify them.)
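
To make the last few issues a bit more concrete, here is one way they might look if written by hand today. This is a sketch under my own assumptions rather than a proposed design: PostalAddress is a hypothetical, cut-down type, Line1 is treated as the one mandatory value and passed positionally, the default for Country lives in the initialization class's constructor (so it is chosen by the callee), and validation runs in the main constructor after the values are copied.

```csharp
using System;

public sealed class PostalAddress
{
    // Optional values only; the default is set by the callee
    public sealed class Init
    {
        public Init()
        {
            Country = "US";
        }

        public string Line2 { get; set; }
        public string Country { get; set; }
        public string ZipCode { get; set; }
    }

    private readonly string line1;
    private readonly string line2;
    private readonly string country;
    private readonly string zipCode;

    public string Line1 { get { return line1; } }
    public string Line2 { get { return line2; } }
    public string Country { get { return country; } }
    public string ZipCode { get { return zipCode; } }

    // Mandatory value first (positional), then the optional ones (named)
    public PostalAddress(string line1, Init init)
    {
        if (line1 == null) throw new ArgumentNullException("line1");
        if (init == null) throw new ArgumentNullException("init");

        this.line1 = line1;
        this.line2 = init.Line2;
        this.country = init.Country;
        this.zipCode = init.ZipCode;

        // Validation sees the whole object, not just one property
        if (string.IsNullOrEmpty(this.country))
        {
            throw new ArgumentException("Country must not be empty", "init");
        }
    }
}
```

A caller would then write something like new PostalAddress("10 Fairview Avenue", new PostalAddress.Init { ZipCode = "93546" }), which shows the inelegance mentioned above: one argument is positional and the rest are named.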

I suspect there are more issues too – but I think the benefits would be great. I know the C# team has been thinking about immutability, but I’ve no idea what kind of support they’re currently envisioning. Unlike my previous ideas, which were indeed unpalatable for various reasons, I think this one has real potential. Mind you, given that I’ve come up with it after only mulling this over in “spare” time, I highly doubt that it will be a new concept to the team…

17 thoughts on “C# 4: Immutable type initialization”

  1. Certainly cleaner than many of the other options. Interesting…
    Although I’m not sure “initonly” is a good term here – it might map to the CLR flag, but as a language keyword I suspect there are better options. But in typical style I can’t think of any at the moment (perhaps just “init”?). I guess it does the job for discussion purposes ;-p

  2. I don’t think that a two-type implementation is going to be very effective, because of the issues you raise, as well as others. I think true immutability support is only going to come when it is built into the type system. That of course would require changes to the CLR, and I have no idea if immutability is even on the radar for the next CLR release (or even when it is going to be).

  3. @Marc: Yes, initonly was just an initial “for discussion purposes” idea :)

    @David: I don’t think any of the issues I’ve raised is insurmountable. Having custom setter code is probably the biggest issue, and I’d rather have the ability just for automatic properties than not to have it at all. We’ll see what happens though.

  4. Another thought: one of the outstanding issues was default values. If that can be solved, what is the difference between a compiler-generated init class and a compiler-generated constructor? i.e. those members marked “initonly” actually become an additional (single) ctor, with the defaults being provided by the compiler. For every standard public/protected ctor(foo,bar), there might be an additional ctor(foo,bar,[the various initonly members])…

    At the language level, these additional ctors could perhaps be attributed and available by member-name via initializers – but from other (unaware) languages could be used as the “lots of parameters” ctor.

  5. Marc, I was thinking about that. The difference would basically be in terms of named vs positional arguments. The defaults issue goes back to “provided by the caller” or “provided by the callee” issue – the same reason Anders didn’t want optional parameters in the first place. If the value could be moved to the callee somehow, with the caller indicating which values they’re really providing, that would be an alternative approach.

  6. Re defaults… perhaps simply insist that *all* initonly values must be provided in the initializer? A half-formed immutable object isn’t that much use… I think this would address the 90% case… for the remaining case when there are scenarios to use different sets of options, then this “initonly” isn’t suitable, and the author must write ctors for each valid combination.

    But a simple “you can set all, by name, w/o writing a ctor” would go a long way…
    In fact, in this case the *final* ctors probably *replace* the existing ones – i.e. Foo(string) becomes Foo(string,[…]), but there is no Foo(string).

    Of course, at this point all we’ve done is provide “pass by name” to ctors, and an auto-generated ctor… not sure…

  7. Why not use optional parameters and require the defaults to be null/0? That way there is no default to speak of.

    You could even go one step further and require all optional types to be Reference or Nullable.

  8. Restricting the defaults that far would severely limit the usefulness of the feature, IMO. I’m sure it wouldn’t be that hard to include defaults in a useful manner.

  9. It’s pretty rare that I have an immutable object with a large number of attributes. If you have any class with a lot, you probably need to revisit the design.

    The cases I have had them are configuration, where I use setters to avoid the ugliness of a long constructor. Example – a distributed master/worker framework, where configuration is set before the task’s execution. In these cases, I’ve cloned the object (deep copy) so that the client can’t modify it. I’ll also add a validation method before execution starts.

    If your concern is objects like a user’s profile, as in your example, it really isn’t a big deal once you have an xml-style type system and supportive framework. It’s all generated code, such that the mess of large constructors isn’t seen.

    Anyway, the point is that good design resolves all these issues. Everyone seems to want to put their ‘touch’ on their favorite language and add something new. If you focus on a clean design even the most complex tasks really become simple and elegant. At that point, you don’t need the compiler or language to do anything but get faster.

  10. Ben: “The cases I have had them are configuration, where I use setters to avoid the ugliness of a long constructor.”

    Exactly – that’s precisely why I’ve mentioned IoC containers a couple of times. I don’t like deep cloning and then explicit validation in terms of design – it would be nicer for the validation to happen automatically on construction, and for there to be no need to do cloning at all.

    I don’t see explicit validation and cloning as good design – I see it as a way of getting round the lack of simple initialization of immutable types.

    Not sure what you mean about the xml-style type system and supportive framework, but even if there’s a lot of generated code for production stuff, I think it’s important to be able to easily construct objects manually for unit tests and the like.

  11. Sorry, ignore the type system aspect. I had a very specific design in mind to cover the general case and I shouldn’t have brought it up so vaguely on a blog.

    I agree that what you’re asking is the ideal, but my point was you don’t sacrifice much by working with what you have smartly. And from the Java world, everyone I meet seems to have their own language extension proposal, so perhaps I’m a bit overly cautious!

    Extensive deep cloning is horrible, but very careful usage is fine. In both cases you’ll get validation and a runtime error in the same execution flow. If the framework that digests it automatically validates the configuration, it’s equivalent to the IoC approach with huge constructors, as the bean is instantiated when it’s specifically required. So that’s my reasoning! :)

    Your idea would be a nice asset, but having seen the horrific abuses proposed for cramming garbage into Java’s type system I’m overly cautious of any type system proposals.

  12. Oh it’s definitely worth being cautious of *any* new feature in a language – and I can certainly see how some of Java’s features haven’t been as carefully thought out as they might have been.

    The good thing about just blogging on these matters is that I know my blog is read by much smarter people than myself (or at the very least people with much more experience of language design than I have). I can make straw man proposals which may have *some* good points along with distinct downsides. I’m relying on the professionals to polish the good stuff while avoiding the problems :)

  13. Obviously, this has had a lot of thought on many fronts. I’d like to see readonly properties in general, not just autoprops. This is a little more complex, since it potentially means that readonly fields *could* be referred to outside the constructor, but only in other readonly blocks.

    With these, and getter/setter args instead of (or in addition to) property-scoped fields, I’d be happy to do away with (non-property-scoped) fields altogether.

    public readonly int IntValue
    {
        get;
        set(field)
        {
            // Validate the incoming value before storing it
            if (value < 0) throw new ArgumentOutOfRangeException();

            field = value;
        }
    }

  14. Hi!

    I’ve posted this same question to a couple of other places but was unable to find the answer:

    I’ve been looking around for a simple example of the “right way” to implement immutable object xml serialization/deserialization.

    Since IXmlSerializable.ReadXml(System.Xml.XmlReader reader) requires the object inner state to be changed, I have to make private fields writable, which I would like to avoid. I’ve tried googling it but was unable to find the answer I was looking for.

    Thanks a lot,

    Veki

  15. @Veki: I’m afraid I don’t know. I don’t even know if it’s possible. I’m pretty ignorant about XML serialization, I’m afraid :(

    Jon

  16. @Jon: “I’m pretty ignorant about XML serialization” — I find this hard to believe :)

    But anyway I found a way to do it at the end, so I will just post a quick answer if anyone comes across the same problem:

    I ended up using Reflection to change the state of the object inside the implementation of IXmlSerializable.ReadXml, while leaving the private fields readonly.

    I made a new ImmutableObject class as a base class for other immutable classes which I would like to make xml-serializable, so all the messing with Reflection is confined to this base class only.

    Thanks anyway – you actually answered it, I came across one of your posts: http://bytes.com/forum/post2178883-17.html

    Best regards,

    Veki
