I’ve had a couple of bug reports about my Protocol Buffers port – both nicely detailed, and one including a patch to fix it. (It’s only due to my lack of timeliness in actually submitting the change that the second bug report occurred. Oops.)
The bug was in text formatting (although it also affected parsing). I was using the default ToString behaviour for numbers, which meant that floats and doubles were being formatted as "50,15" in Germany instead of "50.15". The unit tests caught this, but only if you ran them on a machine with an appropriate default culture.
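The port itself is .NET code, but the same failure is easy to reproduce in Java, so here is a minimal sketch of it. The class and method names are mine, purely for illustration; the point is only that passing the locale explicitly makes both formatting and parsing independent of the machine running the code.

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

// Hypothetical demo class; not part of the Protocol Buffers port.
final class LocaleSensitiveNumbers {
    // Formats with an explicit locale, so the result never depends on the system default.
    static String format(double value, Locale locale) {
        return String.format(locale, "%.2f", value);
    }

    // Parses with an explicit locale; the same text can mean different numbers.
    static double parse(String text, Locale locale) {
        try {
            return NumberFormat.getNumberInstance(locale).parse(text).doubleValue();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a number: " + text, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(format(50.15, Locale.GERMANY));  // 50,15
        System.out.println(format(50.15, Locale.ROOT));     // 50.15
        System.out.println(parse("50,15", Locale.GERMANY)); // 50.15
        System.out.println(parse("50,15", Locale.US));      // 5015.0 (comma read as a grouping separator)
    }
}
```

The last line is the nasty one: text written under one culture and read under another does not fail, it quietly produces a different number.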
Aaargh. I’ve been struggling with a similar problem in a library I can’t change, which uses the system default time zone for various calculations in Java. When you’re running server code, the default time zone is almost never the one you want to use, and it certainly isn’t in my case.
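To show why the default zone matters, here is a small sketch using java.time (which postdates this post, and does nothing for a library you can't change). The class name and the sample instant are made up for illustration. The same instant falls on different calendar dates depending on the zone, so any calculation that silently uses the system default gives machine-dependent answers.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZoneOffset;

// Hypothetical demo class: the zone is always an explicit argument.
final class ExplicitZoneDemo {
    // The calendar date of an instant depends on the zone it is observed in.
    static LocalDate dateIn(Instant instant, ZoneId zone) {
        return instant.atZone(zone).toLocalDate();
    }

    public static void main(String[] args) {
        Instant instant = Instant.parse("2009-03-01T23:30:00Z");
        System.out.println(dateIn(instant, ZoneOffset.UTC));             // 2009-03-01
        System.out.println(dateIn(instant, ZoneId.of("Europe/Berlin"))); // 2009-03-02
        // The next result depends entirely on where the server happens to run.
        System.out.println(dateIn(instant, ZoneId.systemDefault()));
    }
}
```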
A similar problem is Java’s decision to use the system default encoding in all kinds of bizarre places – FileReader doesn’t even let you specify the encoding, which makes it almost entirely useless in my view.
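The usual workaround is to wrap the raw stream in an InputStreamReader and name the charset yourself. The sketch below does that for any InputStream; for a file you would pass a FileInputStream. The class and method names are illustrative only. (Java 11 and later did add FileReader constructors that take a Charset.)

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: decoding always takes an explicit charset.
final class ExplicitCharsetDemo {
    // Reads the whole stream as text, decoding with the given charset.
    static String readAll(InputStream in, Charset charset) {
        try (Reader reader = new InputStreamReader(in, charset)) {
            StringBuilder text = new StringBuilder();
            char[] buffer = new char[1024];
            int read;
            while ((read = reader.read(buffer)) != -1) {
                text.append(buffer, 0, read);
            }
            return text.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "caf" followed by e-acute, encoded as UTF-8 (the accent takes two bytes).
        byte[] utf8 = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        String right = readAll(new ByteArrayInputStream(utf8), StandardCharsets.UTF_8);
        String wrong = readAll(new ByteArrayInputStream(utf8), StandardCharsets.ISO_8859_1);
        System.out.println(right.equals("caf\u00e9"));       // true
        System.out.println(wrong.equals("caf\u00c3\u00a9")); // true: each byte became its own character
    }
}
```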
So I’ve been wondering how we could fix this and problems like it. One option is to completely remove the defaults, so that you would always have to pass in a Charset (or the equivalent culture or time zone) when you call any method which might be culturally sensitive.
Making life easier (in .NET)
It strikes me that .NET has a useful abstraction here: the assembly as the unit of deployment. (Java’s closest equivalent is probably a jar file, which probably gets messier.)
Within one assembly, I suspect in many cases you always want to make the same decision. For example, in protocol buffers I would like to use the invariant culture all the time. It would be nice if I could say that, and then get the right behaviour by default. Here are the options I’d like to be able to apply (for each of culture, time zone and character encoding – there may be others):
- Use a culture-neutral default (the invariant culture, UTF-8, UTC)
- Use a specific set of values (e.g. en-GB, Windows-1252, "Europe/London")
- Use the system default values
- Use whatever the calling assembly is using
Of course you should still have the option of specifying overrides on a per call basis, but I think this might be a way forward.
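To make the four options concrete, here is a rough sketch for the culture case only, written in Java with Locale standing in for a culture. Nothing like this exists in either platform: every name below is invented, and since there is no real way to ask "what is my caller using?", the caller's locale is simply passed in.

```java
import java.util.Locale;

// Hypothetical per-module default policy; purely a sketch of the idea.
final class CultureDefaults {
    enum Policy { NEUTRAL, SPECIFIC, SYSTEM, CALLER }

    private final Policy policy;
    private final Locale specific; // only meaningful for Policy.SPECIFIC

    private CultureDefaults(Policy policy, Locale specific) {
        this.policy = policy;
        this.specific = specific;
    }

    static CultureDefaults neutral() { return new CultureDefaults(Policy.NEUTRAL, null); }
    static CultureDefaults specific(Locale locale) { return new CultureDefaults(Policy.SPECIFIC, locale); }
    static CultureDefaults system() { return new CultureDefaults(Policy.SYSTEM, null); }
    static CultureDefaults caller() { return new CultureDefaults(Policy.CALLER, null); }

    // Picks the locale a call should use when no per-call override is given.
    Locale resolve(Locale callerLocale) {
        switch (policy) {
            case NEUTRAL:  return Locale.ROOT;
            case SPECIFIC: return specific;
            case SYSTEM:   return Locale.getDefault();
            case CALLER:   return callerLocale;
            default:       throw new AssertionError(policy);
        }
    }

    public static void main(String[] args) {
        // A serialization library would declare the neutral policy once...
        CultureDefaults defaults = CultureDefaults.neutral();
        // ...and then get the same output whatever the caller or machine uses.
        System.out.println(String.format(defaults.resolve(Locale.GERMANY), "%.2f", 50.15)); // 50.15
    }
}
```

A per-call override would just bypass resolve and pass a Locale directly.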
Thoughts? I realise it’s almost certainly too late for this to actually be implemented now, but would it have been a good idea? Or is it just an alternative source of confusion?
8 thoughts on “A different approach to inappropriate defaults”
The ideal solution could be an attribute that could be applied to a method, class or whole assembly telling the default culture when not specified in a call.
It could actually be done, I think, by using something like Cecil to create a post-compilation tool for that, much like the binary rewriter that Microsoft has in “Code Contracts for .NET” ( http://blogs.msdn.com/somasegar/archive/2009/02/23/devlabs-code-contracts-for-net.aspx )
As an alternative to having to pass cultureinfo/locale, timezoneinfo/timezone, and encoding/charset for *each* call, what about marking up a method call with an attribute that specifies these?
The method being called could use system default values, but if it’s called with this attribute, those defaults would return the values specified in the attribute.
It’s also probably much too late for this to be implemented, but it might strike a decent compromise/balance among the options you listed.
You can, of course, use the Thread.CurrentCulture to set a default cultureinfo/locale.
I live in Belgium, so (luckily) I have to deal with this kind of stuff all the time.
I’ve grown used to always using the overloads that take a CultureInfo. I actually think that Microsoft has changed the defaults to InvariantCulture in .NET 4.0.
A good way to find places in your code where you forgot this is to use Static Analysis/FxCop. That will catch it.
@Jeff: The point is that not all assemblies are created equal. When I’m writing code for the UI, I don’t *want* the invariant culture – but I do when I’m serializing data in some form. “Thread” isn’t a sufficiently fine-grained discriminator here, IMO.
@Tommy: Good point about FxCop, although it’s a shame it’s required to avoid this. I think MS is changing *some* defaults, but I’m not sure it’s a ubiquitous thing. (I think they’re changing things like String.StartsWith, but I don’t know about the general ToString() behaviour.)
Brings back memories of “date to string to date” problems in ASP…
Even though I would probably hate it, I think the ‘no default’ option is the way to go. I’m all for abstracting away gruesome details, but not if they come back to haunt you. In production.
I quite like the idea of being able to choose [Neutral/System/Calling Assembly/Specific].
The fun part of this is that ‘the rest of the world’ – meaning non-American/English people – knows this!
OK, it’s a bit exaggerated.
Actually people know the problem but not the solution.
Since the decimal separator in France is ‘,’ I often see this:
How ugly!
The rule of thumb is still simple: when dealing with strings, ALWAYS provide a culture.
– If the string is meant to be read by people, use CurrentCulture (or appropriate one).
– If the string is intended to be used by another system, use InvariantCulture
I think you could make Thread.CurrentCulture useful by combining it with some scope control. So something like this:
using (new CultureScope("nl-BE"))
{
    // …
}
This code would first save the current thread culture, then set the new (nl-BE) culture. After that, the code (…) is executed and finally the saved culture is made current again.
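For comparison, the same save/set/restore pattern can be sketched in Java with try-with-resources. Java has no per-thread equivalent of Thread.CurrentCulture, so the ThreadLocal below is a made-up stand-in, and LocaleScope is a hypothetical name: code would have to read LocaleScope.current() explicitly for this to have any effect.

```java
import java.util.Locale;

// Hypothetical Java analogue of the CultureScope idea above.
final class LocaleScope implements AutoCloseable {
    // Stand-in for a per-thread "current culture"; starts as the JVM default.
    private static final ThreadLocal<Locale> CURRENT =
            ThreadLocal.withInitial(Locale::getDefault);

    private final Locale saved;

    // Saves the current per-thread locale, then installs the new one.
    LocaleScope(String languageTag) {
        saved = CURRENT.get();
        CURRENT.set(Locale.forLanguageTag(languageTag));
    }

    static Locale current() {
        return CURRENT.get();
    }

    // Restores the saved locale, even if the scoped code threw.
    @Override
    public void close() {
        CURRENT.set(saved);
    }

    public static void main(String[] args) {
        try (LocaleScope scope = new LocaleScope("nl-BE")) {
            System.out.println(String.format(current(), "%.2f", 50.15)); // 50,15
        }
        System.out.println(current().equals(Locale.getDefault())); // true
    }
}
```

This still has the granularity problem raised earlier in the thread: the scope follows the thread, not the code that actually wants a particular culture.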