The final part of this little series is the one where I suggest my own ideas for C# 4, beyond those I’ve already indicated my approval for in earlier posts. Before I talk about individual features, however, I’d like to put forward a manifesto which could perhaps help the decision-making process. I hasten to add that I haven’t run all the previous parts through this manifesto to make sure that I’ve been consistent, but all of these thoughts have been running around in my head for a while so I hope I haven’t been wildly out.
Manifesto for C# 4
I would welcome the following goals:
Remember it’s C#
Many suggestions have been trying to turn C# into either Ruby, LISP, or other languages. I welcome diversity in languages, and I believe in using the right tool for the job – but that means languages should stick to their core principles, too. Now, I know that sounds like I might be bashing C# 3, given how much that has borrowed from elsewhere for lambda expressions and the like, and I don’t know exactly how I square that circle internally – but I don’t want C# to become a giant toolbox that every useful feature from every language in existence is dumped into.
There are useful ideas to think about from all kinds of areas – not just existing languages – but I’d be tempted to reject them if they just don’t fit into C# neatly without redefining the whole thing.
Consider how people will learn it
I’ve mentioned this before, but I am truly worried about people learning C# 3 from scratch. One of the reasons I didn’t attempt to write about C# from first principles, instead assuming knowledge of C# 1, is that I’m not sure people can sensibly learn it that way. Now, I don’t think I can sensibly get inside the head of someone who doesn’t know anything about C#, but I suspect that I’d want to cover query expressions right at the very end, preferably after quite a while of experience in C# without them.
That might not go for every new feature – it’s probably worth knowing about automatic properties right from the start, for instance, and introducing lambda expressions at the same time as anonymous methods (if C# 3 is the definite goal) but expression trees would be pretty late in my list of topics.
I learned C# 1 from a background of Java, and it didn’t take long to understand the basics of the syntax. Many of the subtleties took a lot longer of course (and it was a very long time before I really understood the differences between events and delegates, I’m sad to say) but it wasn’t a hard move. For a long time C# 2 just meant generics as far as I was concerned – with occasional nullable types, and some of the simpler features such as differing access for getters and setters. Anonymous methods and iterator blocks didn’t really hit me in terms of usefulness until much later – possibly due to using closures in Groovy and more iterators in LINQ. I suspect for many C# 2 developers this is still the case.
My method of learning C# 3 (from the spec, often in rather more detail than normal for the sake of accuracy in writing) is sufficiently far off the beaten track as to make it irrelevant for general purposes, but I wonder how people will learn it in reality. How will it be taught in universities (assuming there are any that teach C#)? How easily will developers moving to C# from Java cope? How about from other languages?
Interestingly, the move from VB 9 to C# 3 is now probably easier than the move from Java 6 to C# 3. Even with the differences in generics between Java and C#, that probably wasn’t true with C# 2.
To get back to C# 4, I’d like the improvements to somehow blend in so that learning C# 4 from scratch isn’t a significantly different experience to learning C# 3 from scratch. It’s bound to be slightly longer as there will certainly be new features to learn – but if they can be learned alongside existing features, I think that’s a valuable asset. It’s also worth considering how much investment will be required to learn C# 4 from a position of understanding C# 3. Going from C# 2 to C# 3 is a significant task – but it’s one which involves a paradigm shift, and of course the payoffs are massive. I’d be very surprised (and disappointed) to see the same level of change in C# 4, or indeed the same level of payoff. Conservative though this is, I’m after “quick wins” for developers – even if in some cases such as covariance/contravariance the win is far from quick from the C# design/implementation team’s perspective.
Just to put things into perspective: think about how many new technologies developers have been asked to learn in the last few years – WCF, WPF, Workflow Foundation, Cardspace, Silverlight, AJAX, LINQ, ClickOnce and no doubt other things I’ve forgotten. I feel a bit like Joseph II complaining that Mozart had written “too many notes” – and if you asked me to start exorcising any of these technologies from history I’d be horrified at the prospect. That doesn’t actually make it any easier for us though. I doubt that the pace of change is likely to slow down any time soon in terms of technologies – let’s not make developers feel like complete foreigners in a language they were happy with.
Keep the language spec understandable
I know there aren’t many people who look at the language spec, but the C# spec has historically been very, very well written. It’s usually clear (even if it can be hard to find the right section at times) and leaves rigidly defined areas of doubt and uncertainty in appropriate places, while stamping out ambiguity in others. The new unified spec is a great improvement over the “C# 1 … and then the new bits in C# 2” approach from before. However, it’s growing at a somewhat alarming rate. As it grows, I expect it to become harder to read as a natural consequence – unless particular effort is put into countering that potential problem.
Stay technology-neutral…
Okay, this one is aimed fairly squarely at VB9. I know various people love XML literals, but I’m not a fan. It just feels wrong to have such close ties with a particular technology in the actual language, even one so widely used as XML. (Of course, there’s already a link between XML and C# in terms of documentation comments – but that doesn’t feel quite as problematic.)
My first reaction to LINQ (before I understood it) was that C# was being invaded by SQL. Now that I’ve got a much better grasp of query expressions, I have no concern in that area. Perhaps it would be possible to introduce a new hierarchy construct which LINQ to XML understands with ease – or adapt the existing object/collection initializers slightly for this purpose. With some work, it may be possible to do this without restricting it to XML… I’m really just blue-skying though (and this isn’t a feature on my wishlist.)
… but bear concurrency in mind
While I don’t like the idea of tying C# to any particular technology, I think the general theme of concurrency is going to be increasingly important. That’s far from insightful – just about every technology commentator in the world is predicting a massively parallel computing landscape in the future. Developers won’t be able to get away with blissful ignorance of concurrency, even if not everyone will need to know the nuts and bolts.
Make it easier to do the right thing
This is effectively encouraging developers to “fall into the pit of success”. Often best practices are ignored as being inconvenient or impractical at times, and I’m certainly guilty of that myself. C# has a good history of enabling developers to do the right thing more easily as time progresses: automatic properties, iterator blocks and allowing different getter/setter access spring to mind as examples.
In some ways this is the biggest goal in this manifesto. It’s certainly guided me in terms of encouraging mixin and immutability support, ways of simplifying parameter/invariant checking, and potentially IDisposable implementation. I like features which don’t require me to learn whole new ways of approaching problems, but let me do what I already knew I should do, just more easily.
Wishlist of C# 4 features
With all that out of the way, what would I like to see in C# 4? Hopefully from the above you won’t be expecting anything earth-shattering – which is a good job, as all of these are reasonably minor modifications. Perhaps we could call it C# 3.5, shipping with .NET 4.0. That would really make life interesting, as people are already referring to C# 3 as C# 3.5 (and C# 2008)…
Readonly automatic properties
I’ve mentioned this before, but I’ll give more details here. I’d like to be able to specify readonly instead of protected/internal/private for the setter access, which would:
- Mark the autogenerated backing variable as initonly in IL
- Prevent code outside the constructor from setting the property
So, for example:
    class ReadOnlyDemo
    {
        public string Name { get; readonly set; }

        public ReadOnlyDemo(string name)
        {
            Name = name;
        }

        public void TryToSetName(string newName)
        {
            // Invalid
            Name = newName;
        }
    }
This would make it easier to write genuinely (and verifiably, as per Joe’s post) immutable classes, or just immutable parts of classes. As mentioned in previous comments, there could be interesting challenges around serialization and immutability, but frankly they really need to be addressed anyway – immutability is going to be one part of the toolkit for concurrency, whether it has language support or not. In the generated IL the property would only have a getter – calls to the setter in the constructor would be translated into direct variable sets.
This shouldn’t require a CLR change.
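For comparison, here’s a minimal sketch of the closest hand-written equivalent in C# 3, assuming the same ReadOnlyDemo shape as above. A readonly field (which is what ends up as initonly in IL) is exposed through a getter-only property; the proposal would just let the compiler generate this for me.

```csharp
public sealed class ReadOnlyDemo
{
    // readonly in C# is emitted as initonly in IL:
    // the field can only be assigned during construction.
    private readonly string name;

    // Getter only, so there is no setter for other code to call.
    public string Name
    {
        get { return name; }
    }

    public ReadOnlyDemo(string name)
    {
        this.name = name;
    }
}
```

It works, but it’s a field, a property body and an assignment where a single line ought to do.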
Property-scoped variables
I’ve been suggesting this (occasionally) for a long time, but it’s worth reiterating. Every so often, you really want to make sure that no code messes around with a variable other than through a property. This can be solved with discipline, of course – but historically we don’t have a good record on sticking to discipline. Why not get the compiler to enforce the discipline? I would consider code like this:
    public class Person
    {
        public int Age
        {
            int age;

            get
            {
                return age;
            }
            set
            {
                if (value < 0 || value > SomeMaxAgeConstant)
                {
                    throw new ArgumentOutOfRangeException();
                }
                age = value;
            }
        }

        public void SetAgeNicely(int value)
        {
            Age = value;
        }

        public void SetAgeSneakily(int value)
        {
            // Invalid: age is scoped to the Age property
            age = value;
        }
    }
Just in case Eric’s reading this: yes, having Age as a property of a person is a generally bad idea. Specifying a date of birth and calculating the age is a better idea. Really, don’t use this code as a model for a Person type. However, treat it as a dumb example of a reasonable idea. I need to find myself a better type to use as my first port of call when finding an example…
The variable name would still have to be unique – it would still be the name generated in the IL, for instance. Multiple variables could be declared if required. The generated code could be exactly the same as that of existing code which happened to only use the property to access the variable.
A couple of potential options:
- The variables could be directly accessible during the constructor, potentially. This would help with things like serialization.
- Likewise, potentially an attribute could be applied to other members which needed access to the variables. Bear in mind that we’re only trying to save developers from themselves (and their colleagues). We’re not trying to cope with intruders in a security sense. An active “I know I’m violating my own rules” declaration should cause enough discomfort to avoid the accidental issues we’re trying to avoid.
This shouldn’t require a CLR change.
Extension properties
This has been broadly talked about, particularly in view of fluent interfaces. It feels to me that there are two very different reasons for extension properties:
- Making fluent interfaces prettier, e.g. 19.June(1976) + 8.Hours + 20.Minutes instead of 19.June(1976) + 8.Hours() + 20.Minutes()
- Genuine properties, which of course couldn’t add new state to the extended type, but could access it in a different way.
Point 1 feels like a bit of a hack, I must admit. It’s using properties not because the result is “property-like” but because we want to miss the brackets off. It’s been pointed out to me that VB already allows this, and that by allowing brackets to be missed out for parameterless methods we could achieve the same effect – but that just feels wrong. Arguably fluent interfaces already mess around with the normal conventions of what methods do and how they’re named, so using properties like this probably isn’t too bad.
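To show where the brackets come from, here’s a rough sketch of the extension-method version that already compiles in C# 3. The class names and the choice of DateTime/TimeSpan as result types are my assumptions for illustration, not part of any real library.

```csharp
using System;

// Hypothetical fluent helpers, for illustration only.
public static class FluentDateExtensions
{
    public static DateTime June(this int day, int year)
    {
        return new DateTime(year, 6, day);
    }

    public static TimeSpan Hours(this int hours)
    {
        return TimeSpan.FromHours(hours);
    }

    public static TimeSpan Minutes(this int minutes)
    {
        return TimeSpan.FromMinutes(minutes);
    }
}

public static class FluentDemo
{
    public static DateTime Build()
    {
        // Extension methods need the brackets; with extension
        // properties, 8.Hours and 20.Minutes would be enough.
        return 19.June(1976) + 8.Hours() + 20.Minutes();
    }
}
```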
Point 2 is a more natural reason for extension properties. As an example, consider a type which exposes a Size property, but not Width or Height. Changing either dimension individually requires setting the Size to a new one with the same value for the other dimension – this is often much harder to read than judicious use of Height/Width. I suspect that extension properties would actually be used for this reason less often than for fluent interfaces, but there may be any number of interesting uses I haven’t thought of.
This shouldn’t require a CLR change, but framework changes may be required.
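As a sketch of point 2 with what we have today, here’s the Size example written with extension methods. Size and Panel are hypothetical types I’ve made up for the purpose; the interesting part is how GetWidth/SetWidth stand in for what would ideally be a Width property.

```csharp
// Hypothetical types, purely for illustration.
public struct Size
{
    private readonly int width;
    private readonly int height;

    public Size(int width, int height)
    {
        this.width = width;
        this.height = height;
    }

    public int Width { get { return width; } }
    public int Height { get { return height; } }
}

public class Panel
{
    // The only thing the type exposes: no Width or Height of its own.
    public Size Size { get; set; }
}

public static class PanelExtensions
{
    public static int GetWidth(this Panel panel)
    {
        return panel.Size.Width;
    }

    // No new state: just a different way of accessing the existing Size.
    public static void SetWidth(this Panel panel, int width)
    {
        panel.Size = new Size(width, panel.Size.Height);
    }
}
```

With extension properties, panel.SetWidth(100) could become panel.Width = 100, which reads rather more naturally than rebuilding the whole Size at every call site.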
Extension method discovery improvements
I’ve made it clear before now that the way extension methods are discovered (i.e. with using directives which import all the extension methods of all the types within the specified namespace) leaves much to be desired. I don’t like trying to reverse bad decisions – it’s pretty hard to do it well – but I really feel strongly about this one. (Interestingly, although I’ve heard many people criticising this choice, I don’t actually remember hearing the C# team defending it. Given that reservations were raised back in 2005, when there was still plenty of time to change stuff, I suspect there are reasons no-one’s thought of. I’d love to hear them some time.)
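To make the current behaviour concrete, here’s a small sketch using made-up namespaces and types (Acme.Extensions and friends are hypothetical). A single using directive brings in the extension methods of every static class in the namespace, whether or not the file wanted all of them.

```csharp
namespace Acme.Extensions
{
    public static class StringExtensions
    {
        public static string Shout(this string text)
        {
            return text.ToUpperInvariant() + "!";
        }
    }

    public static class Int32Extensions
    {
        public static bool IsEven(this int value)
        {
            return value % 2 == 0;
        }
    }
}

namespace Acme.Demo
{
    // One directive: both StringExtensions and Int32Extensions
    // are now in scope, with no way to pick just one of them.
    using Acme.Extensions;

    public static class Consumer
    {
        public static string Describe(int value)
        {
            return (value.IsEven() ? "even" : "odd").Shout();
        }
    }
}
```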
The goal would be to change from discovering extensions at a namespace level to discovering extensions at a type level. (By which I mean at a “type containing extension methods” level – e.g. System.Linq.Enumerable or System.Linq.Queryable. Admittedly discovery on a basis which explicitly specifies the type to extend would also be interesting.) I don’t mind exactly how the syntax works, but the usual ideas are ones such as:
static using System.Linq.Enumerable;
using static System.Linq.Enumerable;
using class System.Linq.Enumerable;
That’s the easy part – the harder part would be working out the best way to phase out the “old” syntax. I would suggest a warning if extension methods are found and used without being explicitly mentioned by type. In C# 4 this could be a warning which was disabled by default (but could be enabled with pragmas or command line switches), then enabled by default in C# 5 (but with the same options available, this time to disable it). By C# 6 we could perhaps remove the ability to discover extension methods by namespace altogether, so the methods just wouldn’t be found any more.
The C# team could be more aggressive than this, perhaps skipping the first step and making it an enabled warning from C# 4 – but I’m happy to leave that kind of thing to them, without paying it much more attention. I know how seriously they take breaking changes.
No CLR changes required as far as I can see.
Implicit “select” at end of query expressions
I can’t say I’ve used VB9’s LINQ support, but I’ve heard about one aspect which has some appeal. In C# 3, every query expression ends with either “select” or “group ... by”. The compiler is actually smart enough to ignore a redundant select clause (except for degenerate query expressions) and indeed the language spec makes this clear. So why require it in the query expression in the first place? As a concrete example of before/after:
    var query = from user in db.Users
                where user.Age > 18
                orderby user.Name
                select user;

    var query = from user in db.Users
                where user.Age > 18
                orderby user.Name;
This isn’t a massive deal, but it would be quite neat. I worry slightly that there could be significant costs in terms of the specification complexity, however.
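To show why the trailing select is redundant here, this sketch puts the C# 3 query expression next to the method calls it’s translated into. It assumes a simple hypothetical User type and LINQ to Objects rather than a real db.Users; the point is that no Select call appears in the translated form.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical type standing in for whatever db.Users contains.
public class User
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public static class QueryDemo
{
    public static IEnumerable<User> Adults(IEnumerable<User> users)
    {
        return from user in users
               where user.Age > 18
               orderby user.Name
               select user;
    }

    // What the compiler makes of the query above: the
    // "select user" simply disappears from the call chain.
    public static IEnumerable<User> AdultsTranslated(IEnumerable<User> users)
    {
        return users.Where(user => user.Age > 18)
                    .OrderBy(user => user.Name);
    }
}
```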
Internal members on internal interfaces
Interfaces currently only ever have public members, even if the interfaces themselves are internal. This means that implementing an internal interface in an internal class still means making the implementing method public, or using explicit interface implementation (which imposes other restrictions, particularly in terms of overriding). It would be nice to be able to make members internal when the interface itself is internal – either explicitly or implicitly. Implementing such members publicly would still be allowed, but you could choose to keep the implementation internal if desired.
This may require a CLR change – not sure.
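Here’s a small sketch of the two options available today, using a hypothetical IResettable interface. Neither lets the implementing method simply be internal.

```csharp
internal interface IResettable
{
    void Reset();
}

internal class Counter : IResettable
{
    internal int Value = 5;

    // Implicit implementation: has to be public, even though
    // IResettable itself is invisible outside the assembly.
    public virtual void Reset()
    {
        Value = 0;
    }
}

internal class ExplicitCounter : IResettable
{
    internal int Value = 5;

    // Explicit implementation: not public, but it can't be virtual
    // and can only be called through an IResettable reference.
    void IResettable.Reset()
    {
        Value = 0;
    }
}
```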
“Namespace+assembly” access restriction
It’s not an uncommon request on the C# newsgroup for the equivalent of C++’s “friend” feature – where two classes have a special relationship. In many ways InternalsVisibleTo is an assembly-wide version of this feature, but I can certainly see how it would be nice to have a slightly finer grained version. Sometimes two classes are naturally tightly coupled, even though they have distinct responsibilities. Although loose coupling is generally accepted to be a good thing, it’s not always practical. At the same time, giving extra access to all the types within the same assembly can be a little bit much.
Instead of specifying particular types to share members with, I’d propose a new access level, which would make appropriately decorated members available to other types which are both within the same assembly and within the same namespace. This would be similar to Java’s “package level” access (the default, for some reason) except without the implicit availability to derived types. (Java’s access levels and defaults are odd to say the least.)
(Of course, this wouldn’t help in assemblies which consisted of types within a single namespace.)
This would almost certainly require a CLR change.
InternalsVisibleTo simplification for strongly named assemblies
This one’s just a little niggle. In order to use the InternalsVisibleToAttribute to refer to a strongly named assembly (which you almost always have to do if the declaring assembly is strongly named), you have to specify the public key. Not the public key token as the documentation claims, but the whole public key. Not only that, but you can’t have any whitespace in it – so you can’t use a verbatim string literal to easily put it in a block. Instead, you either have to have the whole thing on one line, or use compile-time string concatenation to make sure the key is still unbroken.
It’s not often you need to look at the assembly attributes, so it’s far from a major issue – but it’s a mild annoyance which could be fixed with very few downsides.
This may require a CLR change – not sure.
Is that all?
I suspect that soon after posting this, I’ll think of other ideas. Some may be daft, some may be more significant than these, but either way I’ll do a new post for new ideas, rather than adding to this one. I’ll update this one for typos, further explanations etc. I suspect if I don’t post this now I’ll keep tweaking it for hours – which is relatively pointless as I’m really trying to provoke discussion rather than presenting polished specification proposals.