C# 4, part 5: Other bits and bobs which probably don’t merit inclusion

Okay, I know I said that part 4 would be the last part in this series… but since then I’ve not only thought about iterator block parameter checking, but a few other things. Some of these I simply forgot about before, and some I hadn’t thought of yet. I’m not sure any of these are actually worthy of inclusion, but they may provoke further thought.

Tuple returns

I’ve been reading Programming Erlang and I suspect that being able to return tuples (i.e. multiple values, strongly typed but without an overall predefined type) would be a good thing. For instance, in a tuple-returning world, int.TryParse could be redesigned to return both the true/false and the parsed value. It could have a signature like this:

public static (int, bool) TryParse(string text)

… and then be called like this:

int value;
bool parsed;

(value, parsed) = int.TryParse("Foo");

Now, a few things to work out:

How do we ignore values we’re not interested in?

Part of the problem with out parameters is that sometimes you don’t actually care about the value – but you still have to declare and pass in a parameter. Suppose we could use ? as a placeholder for “I don’t care”. (This is _ in Erlang pattern matching, IIRC. Same kind of business.)
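For comparison, later versions of C# (7.0 onwards) did gain tuple returns, deconstruction into existing variables, and a discard, spelled _ rather than ?. The sketch below assumes a C# 7 or later compiler; TupleParsing and its TryParse wrapper are hypothetical names made up for illustration, not framework APIs.

```csharp
using System;

public static class TupleParsing
{
    // Hypothetical wrapper: returns the parsed value and the success flag together.
    public static (int Value, bool Parsed) TryParse(string text)
    {
        bool parsed = int.TryParse(text, out int value);
        return (value, parsed);
    }

    public static void Demo()
    {
        int value;
        bool parsed;

        // Deconstruct the returned tuple into two existing variables.
        (value, parsed) = TryParse("42");
        Console.WriteLine($"{value}, {parsed}");

        // "_" explicitly discards the element we're not interested in.
        (_, parsed) = TryParse("Foo");
        Console.WriteLine(parsed);
    }
}
```

Calling TupleParsing.Demo() from a Main method exercises both forms. Note that this only covers matching at the point of the call; it says nothing about which element, if any, should become the value of the overall expression.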

What could you do with a tuple?

We could potentially make tuples first class citizens, so that you could declare variables of that type, a bit like anonymous types, but with anonymous property names as well, used just for matching later. Or we could force matching at the point of method call, which would restrict the use a bit further but leave fewer other rules to be worked out.

Either way, I’d hope to be able to set either fields or properties by parameter matching.

What’s the value of the overall expression?

This really depends on the answer to the previous question. If tuples are first class types, then the result of the expression would normally be the tuple itself. However, I wonder whether there’s more that can be done. For instance, thinking about our TryParse example, it’s useful to be able to write (currently):

if (int.TryParse("Foo", out value))
{
   …
}

Suppose we were able to designate one of the matched elements of the tuple to be the expression result, e.g. using _ to be slightly Perl-like:

if ((value, _) = int.TryParse("Foo"))
{
   …
}

Would that be worth doing?

More information required…

I suspect that people who know more about the use of tuples in other languages would be able to say more about this. Some overlap with anonymous types is clearly relevant too, and would need to be carefully considered. I’m not wedded to any of the syntax shown above, of course – I’m just interested in how/where it could be useful.

Named method/constructor arguments

One of the features I like about F# is that you can specify the names of arguments, without worrying about the order. This means that it becomes even more important to name methods appropriately, but it would make method calls with many parameters simpler to read. Currently it’s common practice to use one parameter per line and a comment to indicate the use, e.g.

foo.Complicated(10,        // Number of elements to return
                "bar",     // Name of collection
                x => x+1,  // Step for element
                3.5        // Load factor
               );

In fact, this example is relatively simple because all the parameter types are different – look at the more complicated overloads of Enumerable.GroupBy for rather more hellish examples. It’s incredibly ugly, and the compiler isn’t able to check anything. Now suppose we could instead write:

foo.Complicated(maxElements = 10,
                collectionName = "bar",
                step = x => x+1,
                load = 3.5);

Personally I think that’s clearer and less error-prone. The arguments could be reordered with few issues, and the compiler could check that we really were using the right parameter names. One potential issue is in terms of side-effects, where evaluating one argument had a side-effect which affected the evaluation of another argument. At that point reordering is a breaking change. I suspect the compiler would need to stick to the specified textual order, and then rework things on the stack as required to get the appropriate order for the method call. A bit nasty.
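As it turned out, C# 4 did ship named arguments, using a colon rather than =. The sketch below assumes a hypothetical Complicated method matching the call above, plus a hypothetical Note helper with a visible side-effect. It illustrates the ordering point: the argument expressions are evaluated in the order they are written at the call site, whatever order the parameters are declared in.

```csharp
using System;
using System.Collections.Generic;

public static class NamedArgumentsDemo
{
    // Records the order in which argument expressions were evaluated.
    public static readonly List<string> Log = new List<string>();

    // Hypothetical method, mirroring the call shown above.
    public static string Complicated(int maxElements, string collectionName,
                                     Func<int, int> step, double load)
    {
        return collectionName + ":" + maxElements + ":" + step(1) + ":" + load;
    }

    // Helper with a side-effect, so that evaluation order is observable.
    private static T Note<T>(string name, T value)
    {
        Log.Add(name);
        return value;
    }

    public static string Run()
    {
        Log.Clear();
        // Named arguments, deliberately not in declaration order.
        return Complicated(load: Note("load", 3.5),
                           collectionName: Note("collectionName", "bar"),
                           step: x => x + 1,
                           maxElements: Note("maxElements", 10));
    }
}
```

After Run() returns, Log holds load, collectionName, maxElements in that order: the textual order of the call, not the declared parameter order.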

Event handler subscription in object initializers

I only thought of this one today, when coming up with an example for a screencast on object initializers. I suspect most uses of object initializers will be with custom classes (although I recently used them for XmlWriterSettings to great effect) which would make the screencast harder to understand. I was wondering what common framework classes had lots of writable properties, and I hit on the idea of building a UI. It shouldn’t surprise me that this works quite nicely, but you can build up a hierarchical UI quite pleasantly. For example:

Form form = new Form
{
    Size = new Size(300, 300),           
    Controls =
    {
        new Button
        {
            Location = new Point(10, 10),
            Text = "Hello",
        },
        new ListBox
        {
            Location = new Point(10, 50),
            Items =
            {
                "First",
                "Second",
                "Third"
            }
        }
    }
};
Application.Run(form);

This is somewhat reminiscent of Groovy builders (and no doubt many other things, of course). However, one thing you can’t currently do is attach an event handler in an object initializer. The obvious syntax would be something like:

new Button
{
    Location = new Point(10, 10),
    Text = "Hello",
    Click += (sender, args) => Save()
}

where I happen to have used a lambda expression, but didn’t need to – a normal method group conversion or any other way of constructing a delegate would have done just as well.

I mailed the C# team about this, and although it’s been considered before, it’s really not useful in many situations. However, the syntax has been left open – there’s no other use of += within object initializers, so it could always be revisited if someone comes up with a killer pattern.
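In the meantime, the subscription has to happen in a separate statement after the initializer. Here is a minimal sketch of that, using a hypothetical FakeButton class in place of the Windows Forms Button so that it stands alone; Save is likewise just a placeholder.

```csharp
using System;

// Hypothetical stand-in for a UI control; not the Windows Forms Button.
public class FakeButton
{
    public string Text { get; set; }
    public event EventHandler Click;

    // Simulates the user clicking the button.
    public void PerformClick()
    {
        EventHandler handler = Click;
        if (handler != null)
        {
            handler(this, EventArgs.Empty);
        }
    }
}

public static class EventInitializerDemo
{
    public static int SaveCount;

    // Placeholder for whatever the handler should really do.
    private static void Save()
    {
        SaveCount++;
    }

    public static FakeButton CreateButton()
    {
        // Properties can be set in the object initializer...
        FakeButton button = new FakeButton { Text = "Hello" };
        // ...but the event handler has to be attached afterwards.
        button.Click += (sender, args) => Save();
        return button;
    }
}
```

The cost is that the button can no longer be created inline as part of a larger nested initializer, such as the Controls collection of the form above, without first being pulled out into a local variable or helper method.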

Immutable object initialization

I’ve been thinking about this partly as a result of object initialization in general, and the previous point about named arguments. As has been noted before, C# doesn’t really help you to build immutable objects – either from the point of view of building the type, or then instantiating it. Basically you’ve got the constructor call, and that’s it. A static method could set private properties and then return the object for popsicle immutability, but it still feels slightly grim.

Someone (possibly Marc Gravell – not sure) suggested to me that there ought to be some way of indicating when an object initializer had finished. At the time I think I rejected the idea, but now I like it. There’s already the ISupportInitialize interface, but that feels slightly too heavy to me – in particular, it has two methods rather than just one. What I think could be nice would be:

  • A new interface with a single CompleteInitialization method.
  • Readonly automatic properties which would either make the property only writable during a constructor call if the new interface weren’t implemented or would insert an execution-time check that CompleteInitialization hadn’t been called already.
  • I’d anticipate the C# compiler implementing the new interface itself automatically in some way which supported inheritance reasonably, unless specifically implemented by the developer.
  • Members other than constructors couldn’t set readonly automatic properties on this, to avoid accidents.
  • The CLR should have some interaction so it knew which fields it could treat as being readonly after initialization had been completed.
  • Object initializers would call CompleteInitialization automatically at the end of the block.

It’s a bit messy, and I’m sure I haven’t thought of everything – but I suspect something along these lines would be a good idea at some point. It’s reminiscent of an earlier wacky idea I had which went further, but this would be specifically to support immutability. Without it, complex immutable types end up with nightmarish constructor calls.
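To make the shape of the idea concrete, here is roughly what a developer would have to write by hand today, with no help from the compiler or the CLR. ICompleteInitialization, Address and AddressFactory are hypothetical names invented for this sketch. Since an object initializer can’t call CompleteInitialization automatically, the caller does it explicitly.

```csharp
using System;

// Hypothetical interface: the single-method alternative to ISupportInitialize.
public interface ICompleteInitialization
{
    void CompleteInitialization();
}

public sealed class Address : ICompleteInitialization
{
    private bool frozen;
    private string town;
    private string country;

    public string Town
    {
        get { return town; }
        set { CheckNotFrozen(); town = value; }
    }

    public string Country
    {
        get { return country; }
        set { CheckNotFrozen(); country = value; }
    }

    public void CompleteInitialization()
    {
        frozen = true;
    }

    // The execution-time check which the compiler would insert in the proposal.
    private void CheckNotFrozen()
    {
        if (frozen)
        {
            throw new InvalidOperationException("Initialization is already complete.");
        }
    }
}

public static class AddressFactory
{
    public static Address CreateSample()
    {
        Address address = new Address { Town = "Reading", Country = "UK" };
        // In the proposal, the object initializer would make this call itself.
        address.CompleteInitialization();
        return address;
    }
}
```

Any attempt to set Town or Country on the object returned by CreateSample throws InvalidOperationException. The boilerplate in each setter is exactly what readonly automatic properties would remove.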

Conclusion

So there we have it – some relatively half-baked ideas which will hopefully provoke a bit more thought – both from readers and myself. It’s interesting to note that aside from event subscription, they all have a fair number of questions and complexity around them, which is off-putting to start with. I would feel more comfortable about event subscription being added than any of the others, because it’s relatively simple and independent. The others feel like more dangerous features – even if they’re more useful too.

22 thoughts on “C# 4, part 5: Other bits and bobs which probably don’t merit inclusion”

  1. I would really like to see a tuple like structure get deep support into both the VB/C# compiler and the CLR itself. I find them to be incredibly useful tools for passing data around between internal functions.

    Exposing tuples externally may be a bigger issue. Tuples will likely have a generic naming convention (A, B, C, etc.). It’s nice and concise in your code, but if you’re going to export data in a public interface it may be nicer to do it as a more fleshed-out class.

    I ended up writing a basic tuple implementation in C#. It’s similar to anonymous types except in a few ways:

    1) Separates mutable and immutable (default) tuples
    2) Types are describable so you can use them in metadata
    3) Ugly names. First property is A, second B, etc …

    http://blogs.msdn.com/jaredpar/archive/2008/01/27/tuples-part-8-finishing-up.aspx

  2. Do you envision tuples to be used like C/C++ unions? E.g.

    int i = TryParse("Skeet") would automatically
    get the integer value from the tuple?

    Seems much like an anonymous type (without field names). I’m not clear on the value added.

    I don’t think named parameters will make it in unless it’s required for another feature.

    In what context(s) do you think knowing when object initialization is complete would be helpful?

  3. @Peter:

    1) I wouldn’t expect to be able to use that syntax for the tuple, no. I’d always expect a tuple comprehension syntax of some kind. You’re right that it’s similar to anonymous types – but just try *returning* an anonymous type in a useful way! Perhaps that’s actually the answer…

    I suspect I’d be able to give more examples of them being really useful if I had more experience with F#/Erlang/etc – ask someone who uses one of those!

    2) Knowing about object initialization being complete means that I can write a type which is immutable after initialization (and can therefore make more assumptions about itself, and others can make more assumptions about it – like that it’s a reasonable hash key) but still initialize it in a friendly fashion, whether through object initializers or other mechanisms like IoC containers.

    Jon

    PS You guys are quick…

  4. Re named arguments — Stroustrup’s “The Evolution of C++” has a section (6.5.1, page 183) on “Keyword arguments”. While certain differences between C++ and C# (e.g. default values for args) make this somewhat of an apples and oranges comparison, one reason C++ didn’t include named arguments on calls is that once you publish names for arguments, you can’t change them (e.g. to give a parameter a more descriptive name, or even just to correct a typo) without breaking user code.

    Re tuples: again, an apples and oranges comparison between C++ and C#. C++ has the binary “comma” operator (,). It evaluates its LHS, but returns the value of its RHS. So (but I admit I haven’t tried it) int i; i = (1, 2); should work, with i being 2; without the parentheses, assignment binds more tightly than the comma, so i = 1, 2; leaves i as 1. But is there a difference between i = 2; i = (2); /* redundant but allowable parentheses */ and i = (1, 2);? IOW, it would be tricky to work tuples into C++, at least with the (more or less) standard syntax. But C# doesn’t have the comma operator. Maybe they left it out from day one, perhaps with tuples in mind?

  5. Larry: parameter names are already published, and shouldn’t change without breaking user code. Note that F# already has named parameters, so this idea is already out in the wild.

    As for the comma operator – I suspect it wasn’t included in C# because it didn’t provide enough value to merit the complexity. (And yes, the same could probably be said for all of these ideas too!)

  6. Tuples (or an easy way to construct anonymous types), named arguments, and initializer expressions for event handlers all sound good to me. Particularly tuples: C# 3.0 already lets you define anonymous types so it shouldn’t be hard to add a public naming convention for them.

    I’m not sure what’s the big deal about multiple initialization phases for immutable objects, though. Maybe I don’t quite understand the situations you’re thinking of, but it sounds as if you’re trying to squeeze immutable types into an imperative programming style that expects mutable types — first create a big object, then initialize its components.

    That’s certainly going to be problematic. But immutable types are really more suited to a functional style of programming where the big object is not created until all its components are ready. Wouldn’t that eliminate these initialization problems?

  7. No, it’s not about the order of initialization – it’s about the mechanism for initialization.

    Suppose I have an Address type which I’d like to be immutable. It may need to have 3 fields for the main lines, then Town, City, State/Province, Country, PostCode/ZipCode.

    Now, to create a genuinely readonly type with readonly fields, all of those would have to be passed to a constructor. Either you demand all of them, or you have to make choices about which combinations you’re going to allow. Readability is decreased unless you have argument names, because a lot of these may be strings.

    An alternative would be:

    o Create new Address()
    o Set properties
    o Say I’m done

    If we could make it so that you could do nothing with the object *other* than set its properties before it was done, that would be useful.

    Alternatively – and this is really going out on a limb – how about revisiting the whole nature of constructors? Basically what I’m trying to achieve is a useful way of being able to specify some arbitrary combination of properties at construction time, specifying the names as well as values.

    If a constructor could effectively take an anonymous type with all the desired properties set, then copy the values, we’d be done.

    I’m typing as I think here, which is never ideal, but hopefully it’s a bit clearer in terms of the problem I would like to address. I’m really not that bothered *how* it’s addressed.

    Jon

  8. I would love to see language support for Tuples.

    As I don’t have my own blog I’d like to throw these ideas out for consideration:

    – Improve generic constraints on constructors e.g. new(string)
    – Type inference on constructors – at the moment you have to specify the generic types or create a static method to create the type and then the types can be inferred.
    e.g. new Tuple<string, string>("A", "B") could just be new Tuple("A", "B").
    – Type classes (or constraints based on operators) (e.g. some type of number I can add, subtract etc)
    – Strong typing against methods (Good for proving reflection code still works after refactoring)
    – Pattern matching rather than switch à la F# – Pattern matching is a much more powerful idea.
    – Another from F# is statement assignment (though this might go against the purity of C# a little) e.g.
    int a = if (x > 0) -1 else 0;
    The logic could be more complex though assures your code assigns the variable from all paths.

  9. I like the tuple idea. Here’s an idea for the implementation in C# 4. First, a set of classes are required:

    using System;

    // This could be "Tuple" instead.
    // Also, in this example, having a base class adds no value.
    // (It's actually more hurtful than helpful, but I see potential.)
    public abstract class Group
    {
        protected object[] Properties { get; private set; }

        protected Group(params object[] properties)
        {
            if (properties == null)
                throw new ArgumentNullException("properties");

            Properties = properties;
        }
    }

    public sealed class Group<T1, T2> : Group
    {
        public T1 First { get { return (T1) Properties[0]; } }
        public T2 Second { get { return (T2) Properties[1]; } }

        public Group(T1 first, T2 second) : base(first, second) {}
    }

    public sealed class Group<T1, T2, T3> : Group
    {
        public T1 First { get { return (T1) Properties[0]; } }
        public T2 Second { get { return (T2) Properties[1]; } }
        public T3 Third { get { return (T3) Properties[2]; } }

        public Group(T1 first, T2 second, T3 third) : base(first, second, third) {}
    }

    // more predefined groups (up to 5 type args?)

    Technically, we can use this now in C# 3.0 like this:

    class Program
    {
        static void Main(string[] args)
        {
            Program program = new Program();
            program.TestConversion("Foo");
            program.TestConversion("54");
            Console.ReadLine();
        }

        public void TestConversion(string value)
        {
            var result = TryParseInt32(value);

            if (result.First)
                Console.WriteLine("Success: {0}", result.Second);
            else
                Console.WriteLine(value + " is not an integer!");
        }

        public static Group<bool, int> TryParseInt32(string text)
        {
            int i;
            bool success = int.TryParse(text, out i);
            // Success flag first, parsed value second, to match First/Second above.
            return new Group<bool, int>(success, i);
        }
    }

    However, the goal for a named tuple would be to use result.Success instead of result.First, correct? So, theoretically, with a bit of syntactic C# sugar (and possibly borrowing something from the anonymous type implementation) we could use it like this instead:

    class Program
    {
        static void Main(string[] args)
        {
            Program program = new Program();
            program.TestConversion("Foo");
            program.TestConversion("54");
            Console.ReadLine();
        }

        public void TestConversion(string value)
        {
            var result = TryParseInt32(value);

            if (result.Success)
                Console.WriteLine("Success: {0}", result.Value);
            else
                Console.WriteLine(value + " is not an integer!");
        }

        public static bool,int TryParseInt32(string text)
        {
            int i;
            return (Success = int.TryParse(text, out i), Value = i);
        }
    }

    Does that seem reasonable?

  10. @Jon: given the preference I’d rather have the ability to declare an immutable type (i.e. initialization is “done” when the constructor has returned). I think wanting to know when initialization has completed stems from not having immutable types.

    My question about tuples was an attempt to elicit the implicitly typed non-local variable concept. I believe that’s what you’re asking for; do you think expanding var to return values and members would be inescapable if this type of tuple support were added? Do you think expanding var like that is a good or a bad thing?

  11. Peter: Yes, it would be better for everything to be “done” when the constructor returns. I’ll think about alternatives for that a bit further…

    I’m not really asking for implicitly typed non-local variables – I’m asking for the ability to effectively return multiple values, and have them assigned to independent variables/properties in a single assignment statement.

    I wouldn’t expect any implicit typing to be involved here – the return value should still be explicitly typed, but with a tuple. It’s more like anonymous typing than implicit typing.

    I don’t have much of a view on whether expanding var for other things is a good idea or not – although if pushed I’d probably lean on the conservative side.

  12. @Jon: It’s an interesting concept, tuples. Yes, in terms of how it could be implemented it would be akin to anonymous types (I can’t see a list or array being suitable)–from a variable declaration/assignment standpoint it would have to be implicitly typed as well (assuming you assign to one variable instead of x).

    I’m a proponent of being explicit, if I have a concept in my application that pairs an int with a bool I’ll have a name for that–in which case I’ll declare a type. In this particular example (TryParse) pairing the two values doesn’t make sense outside the call to TryParse.

    I think the concept is interesting and given we have anonymous types and implicitly typed locals it could be reasonably easily done. But, I think it would introduce/promote a concept of weaker typing, ambiguity, and less compile-time checking abilities which I don’t think would make the language stronger.

    With outs/refs it makes it more clear what’s going on based upon the interface of the method:

    bool TryParse(String input, out int result)
    …is clear that “result” is important in this operation and the operation can’t do without “result”.

    (bool, int)TryParse(String input)
    …isn’t clear and actually would let the programmer completely ignore the real result of the operation:

    bool success;
    (success, _) = TryParse("10");
    …defeating the point of the method, introducing a bug the compiler can’t warn about.

  13. Peter: Do you make all “important” return values out parameters, so that callers won’t ignore them? I’m guessing you don’t – why is that any different from TryParse?

    Personally, I find it an irritation that I can’t ignore the value, rather than a blessing. If I write “_” in the tuple pattern matching, that’s *explicitly* saying I want to ignore the value, but making it easy to do so.

    If I want to see whether a string is numeric or not, but I don’t actually care about its parsed value (i.e. it’s just for validation purposes) then I have to declare a variable, just for it to be populated. Ick.

    I don’t think it makes the typing any weaker – just more loosely coupled. Sometimes that can be a bad thing, sometimes good. I don’t think there’d be any less compile-time checking involved, or much more ambiguity.

    Again, I don’t think implicitly typed local variables are actually relevant here – there’s no implicit typing going on here, and I can easily imagine a parallel universe where C# had tuples but not implicit typing.

    I suspect the use would follow the guidelines of anonymous types – is this naturally a reusable type in itself, in which case it should be encapsulated as such, or does it only make sense in this situation, in which case using a tuple would be fine.

    However, I should really use F#/Erlang more to get more experience with tuples :)

    Jon

  14. TryParse, at least, can be implemented with Nullable, because the value is only useful for one state of the boolean (Parse succeeded).

    Are you requiring a tuple to be unpacked immediately? Then your syntax really isn’t offering anything beyond the ability to discard out parameters, which could be added to the language directly.

    if (double.TryParse("21.43", discard out)) { … }

    Only if you allow storing the tuple in a variable is there any new functionality, and then you need a way to reference the type. Either “returnof(double.TryParse)” or “double.TryParse.returntype” or something like that. But it seems that this is more general than return values. You should be able to use the tuple syntax for e.g. KeyValuePair:

    foreach ((string key, int value) in dictionary)

    It doesn’t make much sense to replace KeyValuePair entirely with a tuple type, because how would you implement IList<KeyValuePair>, ICollection, IEnumerable, etc for an anonymous tuple?

    So I think rather than “adding tuples”, what’s needed is a syntax for splitting simple structures, possibly declaring new variables in the process à la Perl

    my ($p1, $p2, $p3) = frozzle(@_);

    Of course TryParse isn’t that great an example, because it probably should be converted to Nullable anyway.

  15. Jon: Re your reply to me: “Larry: parameter names are already published…”

    So what? If I have a third-party library, with a method, CircleArea(double r), and in the next release they change it to CircleArea(double radius), how is that going to affect my code that simply writes CircleArea(3.17)? Answer: not at all. (OK, not 100%, but those using reflection take their life in their hands anyway.)

    Sure, F# (among other languages, even, IIRC, VB6) allows symbolic parameter names, but just realize that this extends the “contract” between the (possibly library) class and the user. Not only is the name of the method cast in stone, and the return type, but now also the names of the parameters. But as long as you’re comfortable extending the concept of the “contract”…

    So now we sit back and wait. It shouldn’t take long. Wait for what? Why, for some clown to come along and propose method overloading based on the name of the parameter being or . :-(((

  16. Larry, you wrote:

    one reason C++ didn’t include named arguments on calls is that once you publish names for arguments, you can’t change them (e.g. to give a parameter a more descriptive name, or even just to correct a typo) without breaking user code

    My point is that that’s already true (due to F#) so introducing it into C# wouldn’t tie anyone’s hands more than they’re already tied… unless you’re saying that third party library designers should ignore F#.

  17. Totally pointless comment about += … there is only one logical operation (event-add), and *strictly* speaking you might want to add multiple events… I’m just wondering if this isn’t close-enough to the existing (IEnumerable) Add initializer support to use the same syntax? i.e.

    Click = {(sender, args) => Save()}
    or for multiple delegates:
    Click = {(sender, args) => Save(), btn_Click}

    Just a thought…

  18. Re Named Args: Sure, F# has done it. But do Don Syme et al realize that they’ve placed limitations on the entire .Net framework team (and other library writers)? They can no longer release an update to the library with what previously was a purely cosmetic change (namely renaming arguments).

    And perhaps more importantly, do BradA and Krzysztof know that they’re so constrained, so they can make their teams aware (and also update the Framework Design Guidelines)?

  19. I suspect Don Syme understands it – he’s a smart cookie.

    Wouldn’t like to guess about the Framework Design Guidelines though.

    I’m not completely dismissing your argument though (or even renaming it!). I understand that it’s a significant restriction. I can just see potential benefits too. Probably best to err on the side of caution unless it creates a *big* win.

  20. Re the named arguments debate; how about a new “named” keyword (presumably backed by an attribute, same as “this” is) – i.e.

    public int Foo(named string someArg) {…}

    which *allows* me to use someArg as a named argument. That way it is very explicit that changing the name is a breaking change. Without the “named” keyword (or more specifically, the backing attribute) the compiler won’t accept pass-by-name calling conventions.
