Groovy 1.0 released

Groovy 1.0 has finally been released, and is available for download from the Groovy home page.

For those of you who didn’t read my last blog post on it, Groovy is a dynamic language which runs on the Java platform and integrates very nicely with Java itself.

What I haven’t advertised on this blog until now is that a month or so after starting with Groovy, I became involved in the Groovy in Action book. I won’t claim that I wrote a lot of it – more polishing and adding some introductory material, really. I’m still extremely proud of it, however, and obviously I hope it’ll sell bucket-loads.

So, if you haven’t played with Groovy yet, now’s a very good time to start: download it, have a bit of a play, and then buy the book!

Everything old is new again

I feel I’m too young to be making this kind of statement, but the sense of deja vu I get when reading about the layouts in WPF makes me nearly laugh out loud. Of all the things I can remember about Java 1.0 (this was before any number of things we take for granted now) I know that LayoutManagers existed – including the insanely hard to use GridBagLayout. Fortunately I learned over time that it wasn’t always the most appropriate manager (far from it) but it was powerful.

Ever since I started using .NET I’ve been annoyed at WinForms’ lack of layout management facilities. I’m pretty sure that when I complained about this to Chris Anderson once he said that it’s all there really, but just not exposed very well. Hmm. I’ve tried and lost patience a few times – I suspect there are ways of sorting it all out, but it’s not exactly straightforward.

Fortunately it all changes with WPF, which not only has various layout managers (not sure whether there’s anything like Java 1.4’s SpringLayout, which seemed useful last time I looked at it) but also makes them relatively easy to use with XAML. No more GridBagConstraints messes! Hurrah!

Anyway, it’s good to see that .NET 3.0 has caught up with Java 1.0 :) (And no, I’m not under the impression that Java invented the concept of layout managers. It just happens to be the first environment I used them in.)


Updated 7th August 2006 – It looks like closures aren’t meant to require K&R bracing after all. Hoorah! Examples changed appropriately.

One of my tasks at work is to investigate new languages and technologies and report back what
use we might make of them, where they fit in with what we’re doing, and generally what I think
of them. Obviously some of this will be specific to Clearswift,
but I’d like to make as much “insensitive” information available as possible. This post is my first
“report” as such, on Groovy.

What is Groovy?

From the Groovy home page:

Groovy is an agile dynamic language for the Java Platform with many features that are inspired by languages like Python, Ruby and Smalltalk, making them available to Java developers using a Java-like syntax.

That doesn’t help much if you don’t know Python,
Ruby or Smalltalk.
However, the key words (for me at least) in the above are Java and dynamic.
The Java bit is important to me because I know Java pretty well – both in terms of
the language and the standard library. It’s always nice not to have to learn yet
another way of doing the same things. (There are extra things to learn
in Groovy, but they are small in comparison with learning a platform from scratch.)
The dynamic bit is important because it’s what differentiates Groovy from Java in the first place.

Compared with, say, C and C++, Java is already pretty dynamic. It’s very easy to load classes
on the fly (it’s pretty easy to generate them, even) and reflection allows you to examine classes
at runtime. This allows for frameworks like Spring,
Hibernate and JUnit.
However, Groovy allows “dynamic typing” (an oft-contended phrase, but more later) and various
bits of what are effectively syntactic sugar to make the code terser. Most importantly from my
point of view, it offers closures – the equivalent of C# 2.0’s
anonymous methods.
(This removes the need for most inner classes in Java.) There are various other handy features too,
which generally make Groovy simpler to work with. Most of this post is effectively just a list of
features with examples and discussion.
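
To make the "already pretty dynamic" point concrete, here is a minimal sketch in plain Java of loading a class by name and calling its methods through reflection. The class name ReflectionSketch and its method are made up for illustration; only the java.lang.reflect calls are standard.

```java
import java.lang.reflect.Method;

class ReflectionSketch {
    // Loads a class by name at runtime, creates an instance and calls
    // add(Object) and size() on it without compile-time knowledge of the type.
    static int sizeAfterReflectiveAdd(String className) {
        try {
            Class<?> type = Class.forName(className);
            Object instance = type.getDeclaredConstructor().newInstance();
            Method add = type.getMethod("add", Object.class);
            add.invoke(instance, "hello");
            Method size = type.getMethod("size");
            return (Integer) size.invoke(instance);
        } catch (ReflectiveOperationException e) {
            // Typos in names only show up here, at runtime
            throw new IllegalStateException(e);
        }
    }
}
```

Calling sizeAfterReflectiveAdd("java.util.ArrayList") goes through the same kind of machinery that frameworks rely on. Note how much ceremony is involved compared with what Groovy offers.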

Compiled, but scripty

Groovy is compiled to Java byte-code, but can be written as a script as well. Normally, the
whole script is compiled at start-up (as far as I can tell), although a lot of decisions are
left to run-time, so typos etc can sometimes only show up when a line is executed, even though
in a more “static” language they would have been caught at compile-time. Groovy scripts are
(commonly) executed using the groovy tool. There are also tools for running Groovy
as an interactive shell (groovysh) and a similar tool wrapped up in a GUI
(somewhat confusingly called groovyconsole). The groovyc tool is
provided to compile Groovy into bytecode to be used later rather than just run immediately.
The input to the compiler doesn’t have to be a fully-fledged class as such – it can just be a normal
Groovy script, in which case a class with an appropriate main method is created.

It’s customary at this point to have a “Hello World!” program. As you can use Groovy like a scripting
language, it’s particularly simple:

println "Hello World!"

Saving the above to a file (e.g. test.groovy) and invoking with groovy test.groovy
gives the expected result. Things to note:

  • No class declaration, import statements etc. It’s just a script.
  • println is used instead of System.out.println. I believe this is a
    call to the println method which has been “added” to java.lang.Object.
  • No brackets and no semi-colon. You can use them – you can make Groovy look very much like Java
    for the most part, but you don’t have to. I tend to use brackets but often omit semi-colons. You
    don’t even have to use brackets when there are multiple parameters.

As programs like the above are so convenient, I’m likely to use the features listed there in the
samples below. Other than that, I’ll attempt to only use one new feature at a time where possible,
so it’s obvious what I’m demonstrating.

Closures


In my limited experience with Groovy, closures form the single most useful feature of Groovy. They
allow you to specify some code (which may take parameters and return values) and then encapsulate that
code as an object – so you can pass it as a parameter to a method, for instance. The method could then
call the encapsulated code, and so forth. C# 2.0 provides this feature in the form of anonymous methods
(as delegate implementations) but in normal Java one would typically use an anonymous inner class, which
can end up being very ugly due to all the extra “gubbins” of specifying the superclass and then overriding
a particular method. Here’s possibly the simplest example of a closure:

Closure c = { println ("Hello closure!"); }
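
For comparison, this is roughly what the same idea looks like in plain Java with an anonymous inner class. It is only a sketch: the names HelloAnonymous and makeGreeter are invented for illustration, and the text is appended to a StringBuilder rather than printed so that the result is easy to check.

```java
class HelloAnonymous {
    // Wraps a piece of code up as an object. The anonymous inner class has to
    // name its supertype (Runnable) and override run() just to hold one statement.
    static Runnable makeGreeter(final StringBuilder out) {
        return new Runnable() {
            public void run() {
                out.append("Hello closure!");
            }
        };
    }
}
```

The captured variable has to be final here, which is the Java restriction on local variables mentioned in the text that follows.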

Giving all the details of what closures can and can’t do would take pages and pages, so I’ll just mention
a few broad points. Local variables are captured as in C#’s anonymous methods (so are writable, unlike
local variables being used in anonymous inner classes in Java), and access to private members
of the enclosing class is also permitted. Closures taking a single parameter can use the implicit parameter
name of it:

Closure printDouble = { println (it*2) }


Closures taking more than one parameter can specify their names in a sort of “introductory section”:

Closure printProduct = { x, y -> println (x*y) }

printProduct (2, 3)
printProduct (4, 5)

Finally, a very common idiom in Groovy is to make the last parameter of a method a
closure. In this case, you can call the method specifying all the other parameters normally, and then specifying
the closure parameter as code which appears to be after the method call. This takes a little while
to get used to, but is really, really handy. Here’s an example:

// Declare the method we're going to call
void executeWithProduct (int x, int y, Closure c) {
    // Pass the product to the closure
    c (x*y)
}

// Call it with a closure that prints out the result
executeWithProduct (3, 4) {
    println (it);
}

Groovy uses closures extensively, so they will come up out of necessity in a lot of the following examples.

“Loose typing”

Groovy doesn’t require you to specify the types of variables very often. Lots of magic happens to convert things
at the right time. Indeed, method overloading appears to be performed at run-time rather than compile-time. The
exact nature of how loose the types are is currently a mystery to me, and the specification is somewhat inadequate
in this regard. However, it’s worth looking at a few examples:

Simple hello world using loose typing (the differences when you use def are beyond the scope of this introductory article):

a = "Hello"
def b = " World!"
println (a+b)

Dynamic method overloading:

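void show(String x) {
    println ("string: "+x)
}

void show(int x) {
    println ("int: "+x)
}

void show(x) {
    println ("???: "+x)
}

// The overload is chosen from the runtime type of y
y = "Hello"
show (y)
y = 2
show (y)
y = 2.5
show (y)

Output:
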
string: Hello
int: 2
???: 2.5

String interpolation

Groovy uses the GString class (I kid you not) for string interpolation. Double-quoted
strings are compiled into instances of String or GString depending on whether
they contain any apparent interpolations, and single-quoted strings are always normal strings. (If you
need a character literal, it looks like you need to cast.) Any Groovy expression can be part of
the interpolation, which is enclosed in ${...} (like Ant properties). The braces appear
to be optional for simple expressions (the definition of which I’m not prepared to guess).

x = 10
y = "Jon"

println ('x is $x') // No interpolation with single quotes
println ("x is $x") // Simple interpolation
println ("y is ${y.toUpperCase()}") // Method call

Output:

x is $x
x is 10
y is JON

Collections: syntactic sugar and extra methods

Groovy makes working with collections easier, by providing syntax for lists and maps
within the language itself, and by using closures to make life easier. List and
map initializers both go in square brackets, with maps using a colon between a name and
a value. Also, number ranges are available as start..end. Note that
a number of common Java packages are imported by default, which is why the following
code doesn’t have to specify java.util anywhere.

List list = [0, 1, 4, 9]
Map map = ["Hello" : "There", "a" : "b"]
List range = 0..3 // Equivalent to [0, 1, 2, 3]

Indexers are provided (just like in C#) so using the above, map["Hello"] would give "There"
and list[2] would give 4. The collections also have a number of
extra methods added to them, many of them
involving closures. For instance:

list = 1..7

// Execute the closure for each element
// Output: 2, 4, 6, 8, 10, 12, 14 (on separate lines)
list.each {
    println (it*2)
}

// Find the first element where the returned value is true
// Output: 6
println list.find {
    return it > 5
}

// Find all elements where the returned value is true
// Output: [6, 7]
println list.findAll {
    return it > 5
}

// Transform each element, creating a new list
// Output: [1, 4, 9, 16, 25, 36, 49]
println list.collect {
    return it*it
}

There are more – see the link above.


Another aspect of the JDK to be given the closure treatment is IO. Groovy makes it
really easy to read each line of a file and execute some code on the line, for example.
Here’s a program which (assuming it’s in a file called test.groovy) prints
itself out with the line numbers:

int line=1
new File ("test.groovy").eachLine {
    println "${line}: ${it}"
    line++ // Move on to the next line number
}

Enhanced switch statements

In Groovy, switch statements can have cases which are collections (including ranges; the case matches
if the switch value is in the collection), types (the case matches if the value is an instance of the type),
or regular expressions, falling back to equality otherwise. In fact, you can add your own type of case testing
by implementing an isCase method, making switch/case very flexible indeed. I haven’t tested it, but
I doubt this is nearly as efficient as the normal Java switch/case – but Groovy is about simplicity of
expression more than ultra-efficiency.

Categories – aka C# 3.0 extension methods

Groovy allows you to pretend that a class has a method you wish it had. It’s all pleasantly scoped so
you won’t do it accidentally. Here’s an example:

// Define the extra method we want
class IntegerCategory {
    static boolean isEven(Integer value) {
         return (value&1) == 0
    }
}

// Use it - in a cleaner looking way than
// explicitly calling the static method.
use (IntegerCategory.class) {
    println 2.isEven()
    println 3.isEven()
}

Groovy Markup

There are many times when you need to build a hierarchical structure of some kind. Groovy introduces
the idea of “builders” which help. For instance, for XML, there’s the DOMBuilder class
(along with SAXBuilder and NodeBuilder, the latter of which allows
easy XPath-like navigation). Using DOM to build XML in Java is a complete nightmare, and while
dom4j and JDOM are definite
improvements, they still don’t make it quite as easy as this. Suppose you have a map of names to ages,
and you want to build an XML document representing that information. Here’s a sample script in Groovy to
demonstrate how easy it is (using MarkupBuilder, which writes the generated XML out for you).
Elements are added just by calling a method of the same name (Groovy responds to the method call as if the
method were available normally, even though obviously it doesn’t know in advance what your element names
will be), and attributes are specified using a map in the method call. Child elements are specified
within closures.

import groovy.xml.*;

Map nameAgeMap = ["Jon": 29, "Holly": 30, "Dave": 32]

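builder = MarkupBuilder.newInstance()
// The root element name is missing from this listing; "people" is an assumed placeholder
builder.people {
    nameAgeMap.each { entry ->
        person ("name": entry.key, "age": entry.value)
    }
}

Output:

<people>
    <person name='Holly' age='30' />
    <person name='Dave' age='32' />
    <person name='Jon' age='29' />
</people>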
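
To show what the builder saves you from, here is a sketch of building the same kind of document with the standard DOM API in plain Java. The class name DomPeople and the root element name "people" are assumptions made for illustration; the attribute order and exact formatting of the serialized output depend on the XML implementation in use.

```java
import java.io.StringWriter;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class DomPeople {
    // Builds <people><person name=... age=.../>...</people> from a map of names to ages
    static String toXml(Map<String, Integer> nameAgeMap) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                                                 .newDocumentBuilder()
                                                 .newDocument();
            Element root = doc.createElement("people"); // assumed root name
            doc.appendChild(root);
            for (Map.Entry<String, Integer> entry : nameAgeMap.entrySet()) {
                Element person = doc.createElement("person");
                person.setAttribute("name", entry.getKey());
                person.setAttribute("age", String.valueOf(entry.getValue()));
                root.appendChild(person);
            }
            // Serializing the tree needs a separate API (an identity transform)
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

That is several times the length of the Groovy script for the same result, and most of it is plumbing rather than a description of the document.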

Ant integration

I’m a big fan of Ant, but
every so often it just doesn’t let me do everything I want easily. Sometimes I want to be able to execute
some code, but I don’t want to go through the hassle of having to make sure I’ve compiled something which
is only actually going to be used by the build procedure anyway. Groovy to the rescue! You can embed
Groovy code “in-line” or call out to a Groovy script. Note that Ant allows many scripting languages (anything
supported by BSF, for starters) to be used. Groovy may be
more familiar-looking to developers who are familiar with Java but don’t know any scripting languages.
Groovy supports Ant directly in terms of providing access to the current project and properties, and the
AntBuilder class works in a similar way to the builders mentioned above, allowing Ant tasks
to be dynamically created and executed. Here’s a sample Ant file (which assumes that groovy-all-1.0-jsr-05.jar
is in the same directory):

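<?xml version="1.0" ?>
<project name="groovy-test" default="test" >

  <taskdef name="groovy"
           classname="org.codehaus.groovy.ant.Groovy"
           classpath="groovy-all-1.0-jsr-05.jar" />

  <target name="test">
    <groovy>
      println "Running in Groovy"
      fs = ant.fileset (dir: ".", casesensitive: "no") {
          include (name: "*.groovy")
          include (name: "*.java")
          exclude (name: "Test.*;test.*")
      }
      // The iteration line is missing from this listing; this is an assumed reconstruction
      fs.getDirectoryScanner(project).includedFiles.each {
          ant.echo (it)
      }
    </groovy>
  </target>
</project>

Output:
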
Buildfile: build.xml

   [groovy] Running in Groovy
     [echo] test2.groovy
   [groovy] statements executed successfully

Total time: 1 second

Other bits and bobs

There’s a lot more to Groovy than what’s presented above. It has operator overloading, syntax to
make regular expressions and multi-line strings easier, simple property definition and access, built-in
JUnit integration, an XPath-like expression language and much more besides. Read the home page for some of these – but be warned that some features are
pretty well hidden.

So, what’s wrong with it?

In general, I like Groovy. I’m not convinced that the productivity gains from it are worth the
downsides for major apps, but it’s really handy for getting something small working quickly. It could
be really great for prototyping. I may well eventually be convinced that “dynamic typing” isn’t
that dangerous really, and doesn’t have a detrimental impact on the usability of libraries, etc. Only
time will tell.

In the meantime, however, Groovy does suffer majorly from a lack of polish. There are plenty of bugs
to be found, and the documentation is terrible. (The members of the mailing list are more than happy to help, and a major documentation update is under way, however.) There are aspects of the syntax which seem to be
overkill, creating complexity without a huge benefit, and there are bits of normal Java which are
just “missing”. (Normal for loops aren’t available in the version I’m using, although
I believe they will be in the next available release. You can use a loop such as for (i in 0..9),
but not for (int i=0; i < 9; i++).) Things like this should really be fixed to make
as much of normal Java as possible available within Groovy.

I don’t mind the fact that Groovy isn’t finished – my worry is that it may never really be finished.
I really hope that I’m wrong, and that it will be all done and dusted (for v1) in the summer.
There’s no lack of activity – the community is very lively – but activity doesn’t necessarily indicate
actual progress towards a goal. Since originally posting this blog entry, I have been assured that
real progress is being made, so I’m keeping my fingers crossed.


  • Groovy home page
  • “Groovy JDK” – the extra methods added to various classes
  • Grails – Groovy/Spring/Hibernate-based web application development

Inheritance Tax


There aren’t many technical issues that my technical lead (Stuart) and I disagree on.
However, one of them is inheritance and making things virtual. Stuart tends to favour
making things virtual on the grounds that you never know when you might need to inherit from
a class and override something. My argument is that unless a class is explicitly designed
for inheritance in the first place, you can get into a big mess very quickly. Designing a
class for inheritance is not a simple matter, and in particular it ties your
implementation down significantly. Composition/aggregation usually works better in
my view. This is not to say that inheritance isn’t useful – like regular expressions,
inheritance of implementation is incredibly powerful and I certainly wouldn’t dream of
being without it. However, I find it’s best used sparingly. (Inheritance of interface is a
different matter – I happily use interfaces all the time, and they don’t suffer from the
same problems.) I suspect that much of my wariness is due to a bad experience I had with
java.util.Properties – so I’ll take that as a worked example.

Note: I’ll use the terms “derived type” and “subclass” (along with their related
equivalents) interchangeably. This post is aimed at both C# and Java developers, and I can’t
get the terminology right for both at the same time. I’ve tended to go with whatever sounds
most natural at the time.

For those of you who aren’t Java programmers, a bit of background about the class.
Properties represents a “string to string” map, with strongly typed methods
(getProperty and setProperty) along with methods to save and
load the map. So far, so good.

Something we can all agree on…

The very first problem with Properties itself is that it extends
Hashtable, which is an object to object map. Is a string to string map
actually an object to object map? This is actually a question which has come up a lot
recently with respect to generics. In both C# and Java, List<String>
is not viewed as a subtype of List<Object>, for instance. This can
be a pain, but is logical when it comes to writable lists – you can add any object
to a list of objects, but you can only add a string to a list of strings. Co-variance
of type parameters would work for a read-only list, but isn’t currently available in C#.
Contravariance would work for a write-only list (you could view a list of objects as a list
of strings if you’re only writing to it), although that situation is less common, not to
mention less intuitive. I believe the CLR itself supports non-variance, covariance and
contravariance, but it’s not available in C# yet. Arguably generics is a complicated
enough topic already, without bringing in further difficulties just yet – we’ll have to
live with the restrictions for the moment. (Java supports both types of variance to
some extent with the ? extends T and ? super T syntax. Java’s
generics are very different to those in .NET, however.)
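
As a small sketch of the Java side of this (the class and method names are invented for illustration), the wildcard forms look like the following. The commented-out line shows the assignment that is rejected because generic types are invariant.

```java
import java.util.ArrayList;
import java.util.List;

class VarianceSketch {
    // Covariant use: we only read from the list, so any list of Number subtypes is fine
    static double sum(List<? extends Number> numbers) {
        double total = 0;
        for (Number n : numbers) {
            total += n.doubleValue();
        }
        return total;
    }

    // Contravariant use: we only write strings, so a List<String> or List<Object> will do
    static void addGreeting(List<? super String> sink) {
        sink.add("hello");
    }

    static List<Object> demo() {
        List<String> strings = new ArrayList<String>();
        // List<Object> objects = strings; // does not compile: List<String> is not a List<Object>
        List<Object> objects = new ArrayList<Object>();
        addGreeting(strings);
        addGreeting(objects);
        objects.add(Integer.valueOf(42)); // fine for a list of objects, illegal for a list of strings
        return objects;
    }
}
```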

Anyway, java.util.Properties existed long before generics were a twinkle
in anyone’s eye. The typical “is-a” question which is usually taught for
determining whether or not to derive from another class wasn’t asked carefully enough in
this case. I believe it’s important to ask the question with Liskov’s Substitution Principle
in mind – is the specialization you’re going to make entirely compatible with
the more general contract? Can/should an instance of the derived type be used as if it were
just an instance of the base type?

The answer to the “can/should” question is “no” in the case of Properties, but
in two potentially different ways. If Properties overrides put (the
method in Hashtable used to add/change entries in the map) to prevent non-string
keys and values from being added, then it can’t be used as a general purpose Hashtable
– it’s breaking the general contract. If it doesn’t override put then a
Properties instance merely shouldn’t be used as a general purpose
Hashtable – in particular, you could get surprises if one piece of code added
a string key with a non-string value, treating it just as a Hashtable, and then
another piece of code used getProperty to try to retrieve the value of that key.
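
Here is a short sketch of that surprise using the real java.util.Properties class. The class name PropertiesSurprise and the "timeout" key are made up for the example.

```java
import java.util.Hashtable;
import java.util.Properties;

class PropertiesSurprise {
    // Returns what get() and getProperty() report for the same key
    static String[] demo() {
        Properties props = new Properties();

        // One piece of code treats the object as a plain Hashtable...
        Hashtable<Object, Object> table = props;
        table.put("timeout", Integer.valueOf(30)); // string key, non-string value

        // ...and another piece of code treats it as Properties
        Object raw = props.get("timeout");                    // the Integer is there
        String viaGetProperty = props.getProperty("timeout"); // but this reports nothing

        return new String[] { String.valueOf(raw), String.valueOf(viaGetProperty) };
    }
}
```

The entry is present as far as Hashtable is concerned, but getProperty returns null because the value isn't a String.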

Furthermore, what happens if Hashtable changes? Suppose another method is added which
modifies the internal structure. It wouldn’t be unreasonable to create an add method
which adds a new key/value pair to the map only if the key isn’t already present. Now, if
Properties overrides put, it should really override add as
well – but the cost of checking for new methods which should potentially be overridden every time a
new version comes out is very high.

The fact that Properties derived from Hashtable
also means that its threading mechanisms are forever tied to those of Hashtable.
There’s no way of making it use a HashMap internally and managing the thread
safety within the class itself, as might be desirable. The public interface of
Properties shouldn’t be tied to the fact that it’s implemented using
Hashtable, but the fact that that implementation was achieved using
inheritance means it’s out in the open, and can’t be changed later (without abandoning
making use of the published inheritance).

So, hopefully we can all agree that in the case of java.util.Hashtable and
java.util.Properties at least, the choice to use inheritance instead of aggregation
was a mistake. So far, I believe Stuart would agree.
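
For contrast, a minimal sketch of the aggregation approach might look like this. StringProperties is a hypothetical class, not anything in the JDK, and it leaves out loading and saving.

```java
import java.util.HashMap;
import java.util.Map;

class StringProperties {
    // The map is an implementation detail: it could be swapped for another
    // structure later without affecting callers.
    private final Map<String, String> values = new HashMap<String, String>();

    // Thread safety is managed here rather than inherited from Hashtable
    public synchronized void setProperty(String key, String value) {
        if (key == null || value == null) {
            throw new NullPointerException("key and value must not be null");
        }
        values.put(key, value);
    }

    public synchronized String getProperty(String key) {
        return values.get(key);
    }

    public synchronized String getProperty(String key, String defaultValue) {
        String value = values.get(key);
        return value == null ? defaultValue : value;
    }
}
```

Nothing but strings can get in, because the only way in is through setProperty, and no caller can depend on which map type sits underneath.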

Attempting to specialize

Now for the tricky bit. I believe that if you’re going to allow a method to be overridden
(and methods are virtual by default in Java – fortunately not so in C#) then you need to document
not only what the current implementation does, but where it’s called from within the rest of
the class. A good example to demonstrate this comes from Properties again.

A long time ago, I wrote a subclass of Properties which had a sort of hierarchy.
If you had keys "X", "foo.bar" and
"foo.baz" you could ask an instance of this hierarchical properties type for a
submap (which would be another instance of the same type) for "foo". The returned
map would have keys "bar" and "baz". We used this kind of hierarchy
for configuration. If you’re thinking that XML would have been a better fit, you’re right.
(XML didn’t actually exist at the time, and I don’t know if there were any SGML libraries around
for Java.) Either way, this was a reasonably simple way of organising configuration.
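
To illustrate the submap idea, here is a sketch written as a standalone helper rather than as a subclass. This is not the original code: PropertyPrefixes and submap are invented names, and stringPropertyNames needs a much later JDK than the one used at the time.

```java
import java.util.Properties;

class PropertyPrefixes {
    // Returns the entries whose keys start with "prefix.", with that prefix stripped
    static Properties submap(Properties source, String prefix) {
        String lead = prefix + ".";
        Properties result = new Properties();
        for (String key : source.stringPropertyNames()) {
            if (key.startsWith(lead)) {
                result.setProperty(key.substring(lead.length()), source.getProperty(key));
            }
        }
        return result;
    }
}
```

Given keys "X", "foo.bar" and "foo.baz", calling submap with "foo" yields a map with keys "bar" and "baz".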

Now the question of whether or not I should have been deriving from Properties
in the first place is an interesting one. I don’t think there’s any reason anyone couldn’t or
shouldn’t use an instance of the PeramonProperties (as it was unfortunately called)
class as a normal Properties object, and it certainly helped when it came to other
APIs which wanted to use a parameter of type Properties. As it happens, I believe
we did run into a versioning problem, in terms of wanting to override a method of
Properties which only appeared in Java version 1.2, but only when compiling against
1.2. It’s certainly not crystal clear to me now whether we did the right thing or not – there
were definite advantages, and it wasn’t as obviously wrong as the inheritance from Hashtable
to Properties, but it wasn’t plain sailing either.

I needed to override getProperty – but I wanted to do it in the simplest possible way.
There are two overloads for getProperty, one of which takes a default value and one
of which just assumes a default value of null. (The default is returned if the key isn’t
present in the map.) Now, consider three possible implementations of getProperty in
Properties (get is a method in Hashtable which returns
the associated value or null. I’m leaving aside the issue of what to do if a non-string
value has been put in the map.)

First version: non-defaulting method delegates to defaulting

public String getProperty (String key) {
    return getProperty (key, null);
}
public String getProperty (String key, String defaultValue) {
    String value = (String) get(key);
    return (value == null ? defaultValue : value);
}

Second version: defaulting method delegates to non-defaulting

public String getProperty (String key) {
    return (String) get(key);
}
public String getProperty (String key, String defaultValue) {
    String value = getProperty (key);
    return (value == null ? defaultValue : value);
}

Third version: just calling base methods

public String getProperty (String key) {
    return (String) get(key);
}

public String getProperty (String key, String defaultValue) {
    String value = (String) get(key);
    return (value == null ? defaultValue : value);
}

Now, when overriding getProperty myself, it matters a great deal what the implementation
is – because I’m likely to want to call one of the base overloads, and if that in turn calls
my overridden getProperty, we’ve just blown up the stack. An alternative is to override
get instead, but can I absolutely rely on Properties calling get?
What if in a future version of Java, Hashtable adds an overload for get which
takes a default value, and Properties gets updated to use that instead of the signature
of get that I’ve overridden?
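
Here is a sketch of the stack problem, using a small stand-in base class laid out like the second version above. BaseProps, TrimmingProps and RecursionDemo are all hypothetical names; the real Properties class is not involved.

```java
import java.util.HashMap;
import java.util.Map;

class BaseProps {
    private final Map<String, String> values = new HashMap<String, String>();

    public void set(String key, String value) {
        values.put(key, value);
    }

    public String getProperty(String key) {
        return values.get(key);
    }

    // "Second version" layout: the defaulting overload delegates to the other one
    public String getProperty(String key, String defaultValue) {
        String value = getProperty(key); // virtual call: may land in a subclass
        return value == null ? defaultValue : value;
    }
}

class TrimmingProps extends BaseProps {
    // Looks harmless, but the base overload calls straight back into this method
    @Override
    public String getProperty(String key) {
        String value = super.getProperty(key, null);
        return value == null ? null : value.trim();
    }
}

class RecursionDemo {
    // Reports whether the innocent-looking override recurses until the stack runs out
    static boolean overflows() {
        try {
            new TrimmingProps().getProperty("x");
            return false;
        } catch (StackOverflowError e) {
            return true;
        }
    }
}
```

Against a base class laid out like the first or third version, exactly the same subclass would work, which is why the override depends on an implementation detail rather than on the documented contract.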

There’s a pattern in all of the worrying above – it involves needing to know the implementation
of the class in order to override anything sensibly. That should make two parties nervous – the
ones relying on the implementation, and the ones providing the implementation. The ones
relying on it first have to find out what the implementation currently is. This is hard enough
sometimes even when you’ve got the source – Properties is a pretty straightforward
class, but if you’ve got a deep inheritance hierarchy with a lot of interaction going on it can
be a pain to work out what eventually calls what. (Try doing it without the source and you’re in
real trouble.) The ones providing the implementation should be nervous because they’ve now effectively
exposed something which they may want to change later. In the example of Hashtable providing
get with an overload taking a default value, it wouldn’t be unreasonable for the
authors of Properties to want to make use of that – but because they can’t change the
implementation of the class without potentially breaking other classes which have overridden
get, they’re stuck with their current implementation.

Of course, that’s assuming that both parties involved are aware of the risks. If the author
of the base class doesn’t understand the perils of inheritance, they could easily change the
implementation to still fulfill the interface contract, but break existing subclasses. They
could have all the unit tests required to prove that the implementation was, in itself, correct –
but that wouldn’t help the poor subclass which was relying on a particular implementation.
If the author of the subclass doesn’t understand the potential problems – particularly if
the way they first overrode methods just happened to work, so they weren’t as aware as they
might be that they were relying on a specific implementation – then they may not
do quite as much checking as they should when a new version of the base class comes out.

Does this kill inheritance?

Having proclaimed doom and gloom so far, I’d like to emphasise that I’m not trying
to say that inheritance should never be used. There are many times when it’s fabulously
useful – although in most of those cases an interface would be just as useful from a client’s
point of view, possibly with a base class providing a “default implementation” for use where
appropriate without making life difficult for radically different implementations (such as
mocks :)

So, how can inheritance be used safely? Here are a few suggestions – they’re not absolute
rules, and if you’re careful I’m sure it’s possible to have a working system even if you
break all of them. I’d just be a bit nervous when trying to change things in that state…

  • Don’t make methods virtual unless you really need to. Unless you can think of a reason
    why someone would want to override the behaviour, don’t let them. The downside of this
    is that it makes it harder to provide mock objects deriving from your type – but interfaces
    are generally a better answer here.
  • If you have several methods doing a similar thing and you want to make them virtual,
    consider making one method virtual (possibly a protected method) and making all
    the others call the virtual method. That gives a single point of access for derived classes.
  • When you’ve decided to make a method virtual, document all other paths that will call
    that method. (For instance, in the case above, you would document that all the similar
    methods call the virtual one.) In some cases it may be reasonable to not document the
    details of when the method won’t be called (for instance, if a particular
    parameter value will always result in the same return value for one overload of a method,
    you may not need to call anything else). Likewise it may be reasonable to only document
    the callers on the virtual method itself, rather than on each method that calls it.
    However, both of these can affect an implementation. This documentation becomes
    part of the interface of your class – once you’ve stated that one method will call
    another (and implicitly that other methods won’t call the virtual method) any
    change to that is a breaking change in the same way that changing the acceptable parameters
    or the return value is. You should also consider documenting what the base implementation
    of the method does (and in particular what other methods it calls within the same class) –
    quite often, an override will want to call the base implementation, but it can be difficult
    to know how safe this is to do or at what point to call it unless you know what the
    implementation really does.
  • When overriding a method, be very careful which other methods in the base class you
    call – check the documentation to make sure you won’t be causing an infinitely
    recursive loop. If you’re deriving from one of your own types and the documentation
    isn’t explicit enough, now would be a very good time to improve it. You might also
    want to make a note in the base class that you’re overriding the method in the specific
    class so that you can refer to the overriding method if you want to change the base class.
  • If you make any assumptions when overriding a method, consider writing unit tests to document
    those assumptions. For instance, if you assume that calling method X will result in a call to
    your overridden method Y, consider testing that path as well as the path where method Y is
    called directly. This will help to give you more confidence if the base type is upgraded to
    a newer version. (This shouldn’t be considered a replacement for careful checking when
    the base type is upgraded to a new version though – indeed, you may want to add extra tests
    due to an expanding API etc.)
  • Take great care when adding a new virtual method in Java, as any existing derived class which
    happens to have a method of the same name will automatically override it, usually
    with unintended consequences. If you’re using Java 1.5/5.0, you can use the @Override
    annotation to specify that you intend to override a method. Some IDEs (such as Eclipse) have
    options to make any override which doesn’t have the @Override annotation result
    in a compile-time error or warning. This gives a similar degree of safety to C#’s requirement
    to use the override modifier – although there’s still no way of providing a “new”
    method which has the same signature as a base type method but without overriding it.
  • If you upgrade the version of a type you’re using as a base type, check for any changes in
    the documentation, particularly any methods you’ve overridden. Look at any new methods which
    you’d expect to call your overridden method – and any you’d expect not to!
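
The accidental-override hazard described in the Java bullet above can be sketched roughly as follows. Every name here (Widget, FancyWidget, CarefulWidget, initialise, reset) is hypothetical and invented purely for illustration. The assumption is that Widget is "version 2" of a library class which has just gained a reset() method, while FancyWidget was written against "version 1" and happened to have its own unrelated reset().

```java
public class AccidentalOverrideDemo {
    // Hypothetical "version 2" of a library base class: reset() is newly
    // added, and initialise() now calls it.
    static class Widget {
        final StringBuilder log = new StringBuilder();

        void initialise() {
            reset();                       // new call path in "version 2"
            log.append("initialised;");
        }

        void reset() {                     // new virtual method in "version 2"
            log.append("base reset;");
        }
    }

    // Written against "version 1", when Widget had no reset() method.
    // The method was meant as an unrelated helper, but now overrides silently.
    static class FancyWidget extends Widget {
        void reset() {                     // no @Override: none was intended
            log.append("fancy reset (wipes user data);");
        }
    }

    // A deliberate override. With @Override the compiler rejects the method
    // if the base class has nothing with a matching signature.
    static class CarefulWidget extends Widget {
        @Override
        void reset() {
            super.reset();
            log.append("careful reset;");
        }
    }

    static String run(Widget w) {
        w.initialise();
        return w.log.toString();
    }

    public static void main(String[] args) {
        System.out.println(run(new FancyWidget()));
        System.out.println(run(new CarefulWidget()));
    }
}
```

In this sketch, initialising a FancyWidget runs the derived reset() even though nobody asked for that. An IDE warning on overrides which lack @Override would flag FancyWidget.reset() as soon as the base type was upgraded.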

Many of these considerations have different effects depending on the consumer of the type.
If you’re writing a class library for use outside your development team or organisation,
life is harder than in a situation where you can easily find out all the uses of a particular
type or method. You’ll need to think harder about what might genuinely be useful to override
up-front rather than waiting until you have a need before making a method virtual (and then
checking all existing uses to ensure you won’t break anything). You may also want to give more
guidance – perhaps even a sample subclass – on how you envisage a method being overridden.


You should be very aware of the consequences of making a method virtual. C# (fortunately in my view)
makes methods non-virtual by default. In an interview
Anders Hejlsberg explained the reasons for that decision, some of which are along the same lines as
those described here. Java treats methods as virtual by default, using HotSpot to get round the performance
implications and largely ignoring the problems described here (with the @Override annotation
coming late in the day as a partial safety net). Like many powerful tools, inheritance of implementation
should be used with care.

Bringing Subversion and FitNesse together

I’ve recently started working with Subversion (a version control system) and FitNesse (the Fit acceptance testing framework based in a wiki). FitNesse has a primitive version control system built into it, where it builds zip files of previous versions of pages. It’s all a bit messy though, and it’s not likely to be the version control system used by the rest of your source code. Why wouldn’t you want your acceptance tests in the same repository you use for the rest of your source and tests?

So, arming myself with JavaSVN (a pure Java Subversion client library) I went looking
at the FitNesse source code. I’m sorry to say it’s not everything I’d hoped for – lots of methods declared to just throw Exception, using streams with no try/finally blocks and (I suspect) a rather gaping potential for things to go seriously wrong if someone commits a page at the same time as someone else deletes it. However, life goes on – fortunately I was able to find the entry point I needed fairly quickly.

In this case, it was the class which dealt with both the writing of the “plain” contents/metadata files and the versioning. It was only a matter of a few hours to refactor that to allow the versioning piece to be pluggable. Adding Subversion support took another few hours, and the result works reasonably well. A few things to note:

  • I could possibly have used an existing plugin point instead of creating a versioning system off FileSystemPage.
    I didn’t know that at the time, and I’m not sure how much it would have helped me. I’m not sure whether JavaSVN would have
    let me get away with making changes to the repository without having a working copy at all, but if so that would have been
    quite a nice solution. There’s no real need for a directory hierarchy – just a file per page, and Subversion properties to
    store the FitNesse metadata. With the sort of load I’m expecting the server at work to have, performance wouldn’t have been
    an issue, and it would quite possibly have simplified things a bit. On the other hand, what I’ve got works and was probably
    a bit simpler to implement. Then again, it means changing FitNesse :(
  • I’ve currently implemented the new interface in the namespace, and I build it within the same
    Eclipse project as the rest of the FitNesse code. It should really be in its own separate jar file, but that seemed overkill for what I was doing at the moment (especially as it’s only one source file).
  • I’ve only done manual testing on this. I don’t know enough about either FitNesse or JavaSVN to sanely test what I’ve done.
    I’m sure it’s possible, and I hope that if others find this hack useful, they could help me to test it properly. I’m somewhat ashamed of this situation, given my firm belief in TDD – it’s due to a lack of understanding of where to go, not a belief that I’ll have magically got the code right. On the plus side, all the built-in FitNesse tests still pass, so I’m reasonably confident that if you run it without actually using the Subversion code, it’ll still work.
  • I’m really worried about threading. It’s unlikely to be a problem unless you happen to get two users doing things to the same pages at the same time, but the level of locking present in FileSystemPage doesn’t really cut it. That level is too low to be particularly useful, as one responder may need to change several pages on disk, and that should be done reasonably atomically. (I don’t even try to do it in an atomic way in terms of Subversion, but stopping other disk activity from interfering would be helpful.) Of course, you should never end up with a hosed Subversion repository (I really hope the server just wouldn’t let you do that) but it may be possible to get into a situation where you need to either do some delicate work updating the working copy manually, or just check the whole tree out again.
  • Currently, files (the ones under the files directory) aren’t versioned. I’m not sure how easy that will be to fix, but it’s
    obviously something which is needed before it’s really production-ready. Hooks are needed for upload, delete and rename. Creating a directory probably doesn’t need to be versioned, so long as the directory is put under version control when the first file is created.

The whole change (a total of seven files – it’s a relatively small code change, all things considered) is available along with installation instructions on my main web site. It’s pretty basic at the moment, but if it all takes off, who knows what could happen?

Visual Studio vs Eclipse

I often see people in newsgroups saying how wonderful Visual Studio is, and they often claim it’s the “best IDE in the world”. Strangely enough, most go silent when I ask how many other IDEs they’ve used for a significant amount of time. I’m not going to make any claims as to which IDE is “the best” – I haven’t used all the IDEs available, and I know full well that one (IDEA) is often regarded as superior to Eclipse. However, here are a few reasons I prefer Eclipse to Visual Studio (even bearing in mind VS 2005, which is a great improvement). Visual Studio has much more of a focus on designers (which I don’t
tend to use, for reasons given elsewhere) and much less of a focus on making actual coding as easy as possible.

Note that this isn’t a comparison of Java and C# (although those are the languages I use in Eclipse and VS respectively). For the most part, I believe C# is an improvement on Java, and the .NET framework is an improvement on the Java standard library. It’s just a shame the tools aren’t as good. For reference, I’m comparing VS2005 and Eclipse 3.1.1. There are new features being introduced to Eclipse all the time (as I write, 3.2M4 is out, with some nice looking things) and obviously MS is working on improving VS as well. So, without further ado (and in no particular order):

Open Type/Resource

When I hit Ctrl-Shift-T in Eclipse, an “Open Type” dialog comes up. I can then type in the name of any type (whether it’s my code, 3rd party library code, or the Java standard library code) and the type is opened. If the source is available (which it generally is – I’ve used very few closed source 3rd party Java components, and the source for the Java standard library is available) the source opens up; otherwise a list of members is displayed.

In large solutions, this is an enormous productivity gain. I regularly work with solutions with thousands of classes – remembering where each one is in VS is a bit of a nightmare. Non-Java resources can also be opened in the same way in Eclipse, using Ctrl-Shift-R instead. One neat feature is that Eclipse knows the Java naming conventions, and lets you type just the initial letters instead of the type name itself. (You only ever need to type as much as you want in order to find the type you’re after anyway, of course.) So for example, if I type “NPE”, I’m offered NullPointerException and NoPermissionException.

Note that this isn’t the same as the “Find Symbol” search offered by VS 2005. Instead, it’s a live updating search – as you type, the list is updated. This is very handy if you can’t remember whether it’s ArgumentNullException or NullArgumentException and the like – it’s very fast to experiment with.

There’s good news here: Visual Studio users have a saviour in the form of a free add-in called DPack, by USysWare. This offers dialogs
for opening types, members (like the Outline dialog, Ctrl-O, in Eclipse), and files. I’ve only just heard about it, and haven’t tried it on a large solution yet, but I have high hopes for it.

Sensible overload intellisense

(I’m using the word intellisense for what Eclipse calls Code Assist – I’m sure you know what I mean.) For some reason, although Visual Studio is perfectly capable of displaying the choice of multiple methods within a drop-down list, when it comes to overloads it prefers a spinner. Here’s what you get if you type sb.Append( into Visual Studio, where sb is a StringBuilder:

Here’s what happens if you do the equivalent in Eclipse:

Look ma, I can see more than one option at once!

Organise imports

For those of you who aren’t Java programmers, import statements are the equivalent of using directives in C# – they basically import a type or namespace so that it can be used without the namespace being specified. In Visual Studio, you either have to manually type the using directives in (which can be a distraction, as you have to go to the top of the file and then back to where you were) or (with 2005) you can hit Shift-Alt-F10 after typing the name of the type, and it will give you the option of adding a using directive, or filling in the namespace for you. Now, as far as I’m aware, you have to do that manually for each type. With Eclipse, I can write a load of code which won’t currently compile, then hit Ctrl-Shift-O and the imports are added. I’m only prompted if there are multiple types available from different namespaces with the same name. Not only that, but I can get intellisense for the type name while I’m typing it even before I’ve added the import – and picking the type adds the import automatically. In addition, organise imports removes import statements which aren’t needed – so if you’ve added something but then gone back and removed it, you don’t have misleading/distracting lines at the top of your file. A feature which isn’t relevant to C# anyway but which is quite neat is that Eclipse allows you to specify how many individual type imports you want before it imports the whole package (e.g. import java.util.*). This allows people to code in whatever style they want, and still get plenty of assistance from Eclipse.

Great JUnit integration

I confess I’ve barely tried the unit testing available in Team System, but it seems to be a bit of a pain in the neck to use. In Eclipse, having written a test class, I can launch it with a simple (okay, a slightly complicated – you learn to be a bit of a spider) key combination. Similarly I can select a package or a whole source directory and run all the unit tests within it. Oh, and it’s got a red/green bar, unlike Team System (from what I’ve seen). It may sound like a trivial thing, but having a big red/green bar in your face is a great motivator in test driven development. Numbers take more time to process – and really, the most important thing you need to know is whether all the tests have passed or not. Now, Jamie Cansdale has done a great job with TestDriven.NET, and I’m hoping that he’ll integrate it with VS2005 even better, but Eclipse is still in the lead at this point for me. Of course, it helps that it just comes with all this stuff, without extra downloads (although there are plenty of plugins available). Oh, and just in case anyone at Microsoft thinks I’ve forgotten: no, unit testing still doesn’t belong in just Team System. It should be in the Express editions, in my view…

Better refactoring

MS has made no secret of the fact that it doesn’t have many refactorings available out of the box. Apparently they’re hoping 3rd parties will add their own – and I’m sure they will, at a cost. It’s a shame that you have to buy two products in 2005 before you can get the same level of refactoring that has been available in Eclipse (and other IDEs) for years. (I know I was using Eclipse in 2001, and possibly earlier.)

Not only does Eclipse have rather more refactorings available, but they’re smarter, too. Here’s some sample code in C#:

public void DoSomething()
{
    string x = "Hello";
    byte[] b = Encoding.UTF8.GetBytes(x);
    byte[] firstHalf = new byte[b.Length / 2];
    Array.Copy(b, firstHalf, firstHalf.Length);
}

public void DoSomethingElse()
{
    string x = "Hello there";
    byte[] b = Encoding.UTF8.GetBytes(x);
    byte[] firstHalf = new byte[b.Length / 2];
    Array.Copy(b, firstHalf, firstHalf.Length);
}

If I select the last three lines of the first method, and use the Extract Method refactoring, here’s what I get:

public void DoSomething()
{
    string x = "Hello";
    byte[] firstHalf = GetFirstHalf(x);
}

private static byte[] GetFirstHalf(string x)
{
    byte[] b = Encoding.UTF8.GetBytes(x);
    byte[] firstHalf = new byte[b.Length / 2];
    Array.Copy(b, firstHalf, firstHalf.Length);
    return firstHalf;
}

public void DoSomethingElse()
{
    string x = "Hello there";
    byte[] b = Encoding.UTF8.GetBytes(x);
    byte[] firstHalf = new byte[b.Length / 2];
    Array.Copy(b, firstHalf, firstHalf.Length);
}

Note that the second method is left entirely alone. In Eclipse, if I have some similar Java code:

public void doSomething() throws UnsupportedEncodingException
{
    String x = "hello";
    byte[] b = x.getBytes("UTF-8");
    byte[] firstHalf = new byte[b.length/2];
    System.arraycopy(b, 0, firstHalf, 0, firstHalf.length);
    System.out.println (firstHalf[0]);
}

public void doSomethingElse() throws UnsupportedEncodingException
{
    String y = "hello there";
    byte[] bytes = y.getBytes("UTF-8");
    byte[] firstHalfOfArray = new byte[bytes.length/2];
    System.arraycopy(bytes, 0, firstHalfOfArray, 0, firstHalfOfArray.length);
    System.out.println (firstHalfOfArray[0]);
}

and again select Extract Method, then the dialog not only gives me rather more options, but one of them is whether to replace the duplicate code snippet elsewhere (along with a preview). Here’s the result:

public void doSomething() throws UnsupportedEncodingException
{
    String x = "hello";
    byte[] firstHalf = getFirstHalf(x);
    System.out.println (firstHalf[0]);
}

private byte[] getFirstHalf(String x) throws UnsupportedEncodingException
{
    byte[] b = x.getBytes("UTF-8");
    byte[] firstHalf = new byte[b.length/2];
    System.arraycopy(b, 0, firstHalf, 0, firstHalf.length);
    return firstHalf;
}

public void doSomethingElse() throws UnsupportedEncodingException
{
    String y = "hello there";
    byte[] firstHalfOfArray = getFirstHalf(y);
    System.out.println (firstHalfOfArray[0]);
}

Note the change to doSomethingElse. I’d even tried to be nasty to Eclipse, making the variable names different in the second method. It still does the business.

Navigational Hyperlinks

If I hold down Ctrl and hover over something in Eclipse (e.g. a variable, method or type name), it becomes a hyperlink. Click on the link, and it takes you to the declaration. Much simpler than right-clicking and hunting for “Go to definition”. Mind you, even that much
isn’t necessary in Eclipse with the Declaration view. If you leave your cursor in a variable, method or type name for a second, the Declaration view shows the appropriate code – the line
declaring the variable, the code for the method, or the code for the whole type. Very handy if you just want to check something quickly, without even changing which editor you’re using. (For those of you who haven’t used Eclipse, a view is a window like the Output window in VS.NET. Pretty much any window which isn’t an editor or a dialog is a view.)

Update! VS 2005 has these features too!
F12 is used to go to a definition (there may be a shortcut key in Eclipse as well to avoid having to use the mouse – I’m not sure).
VS 2005 also has the Code Definition window which is pretty much identical to the Declaration view. (Thanks for the tips, guys :)

Better SourceSafe integration

The source control integration in Eclipse is generally pretty well thought through, but what often amuses me is that it’s easier to use Visual SourceSafe (if you really have to – if you have a choice, avoid it) through Eclipse (using the free plug-in) than through Visual Studio. The whole binding business is much more easily set up. It’s a bit more
manual, but much harder to get wrong.

Structural differences

IDEs understand code – so why do most of them not allow you to see differences in code terms? Eclipse does. I can ask it to compare two files, or compare my workspace version with the previous (or any other) version in source control, and it shows me not just the textual
differences but the differences in terms of code – which methods have been changed, which have been added, which have been removed. Also, when going through the differences, it shows blocks at a time and then what’s changed within the block – i.e. down to individual words, not just lines. This is very handy when comparing resources in foreign languages!

Compile on save

The incremental Java compiler in Eclipse is fast. Very, very fast. And it compiles in the background now, too – but even when it didn’t, it rarely caused any bother. That’s why it’s perfectly acceptable for it to compile (by default – you can change it of course) whenever you save. C# compiles a lot faster than C/C++, but I still have to wait a little while for a build to finish, which means that I don’t do it as often as I save in Eclipse. That in turn means I see some problems later than I would otherwise.

Combined file and class browser

The package explorer in Eclipse is aware that Java files contain classes. So it makes sense to allow you to expand a file to see the types within it:

That’s it – for now…

There are plenty of other features I’d like to mention, but I’ll leave it there just for now. Expect this blog entry to grow over time…

New (to me) threading paradigms

In the last couple of days, I’ve been reading up on CSPs (Communicating Sequential Processes) and the Microsoft Research project CCR (Concurrency and Coordination Runtime). I suspect that the latter is really a new look at the former, but I don’t have enough experience with either of them to tell. Now, I know a number of my readers are smart folks who have probably lived and breathed these things for a while – so, is my hunch right, or are they fundamentally different models?

System.Random (and java.util.Random)

This is as much a “before you forget about it” post as anything else.

Both Java and .NET have Random classes, which allow you to get random numbers within a certain range etc. Unless you specify otherwise, both are seeded with the current time. Neither class claims to be thread-safe. This presents a problem – typically you effectively just want one source of random numbers, but you want to be able to access it from multiple threads. You certainly don’t want to create a new instance of the Random class every time you want a random number – due to the granularity of the system clock, that commonly gives repeated sequences of random numbers (i.e. multiple instances are created with the same seed, so each instance gives the same sequence).

I suggest that very few people really need to specify seeds – which means they don’t really need instance methods in the first place. A class with static methods (matching the interface of Random) would be perfectly adequate. This could be implemented using a single Random instance and locking to get thread safety, but a slightly more elegant (or at least more interesting) way would be to use thread-local static variables. In other words, each thread gets its own instance of Random. Now, that introduces the problem of which seed to use for each of these instances. That’s pretty easily solved though – either you take some combination of the thread ID and the current time, or you create a new instance of Random the first time any thread accesses the class, and use that Random as the source of seeds for future Randoms. The only time locking is needed is to access that Random, which occurs once per thread.
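
As a rough sketch of the second seeding option above (one shared Random used purely as a source of seeds, plus one Random per thread), something like the following would do in Java. The class name StaticRandom and its two methods are hypothetical, chosen only for illustration, and only a small part of the Random interface is mirrored.

```java
import java.util.Random;

public final class StaticRandom {
    // Shared generator, used only to hand out seeds; guarded by its own lock.
    private static final Random seedSource = new Random();

    // One Random per thread, created lazily the first time a thread needs it.
    private static final ThreadLocal<Random> perThread = new ThreadLocal<Random>() {
        @Override
        protected Random initialValue() {
            synchronized (seedSource) {    // the only lock, taken once per thread
                return new Random(seedSource.nextLong());
            }
        }
    };

    private StaticRandom() {
    }

    // Static equivalents of the instance methods on Random.
    public static int nextInt(int bound) {
        return perThread.get().nextInt(bound);
    }

    public static double nextDouble() {
        return perThread.get().nextDouble();
    }

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> System.out.println(
                Thread.currentThread().getName() + ": " + nextInt(100));
        Thread a = new Thread(task, "a");
        Thread b = new Thread(task, "b");
        a.start();
        b.start();
        a.join();
        b.join();
    }
}
```

Callers just use StaticRandom.nextInt(6) from any thread. Later JDKs ship java.util.concurrent.ThreadLocalRandom (Java 7 onwards), which covers much the same ground, so a hand-rolled class like this is mostly of interest on older platforms or as an illustration of the idea.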

This does, of course, break my “keep it as simple as possible” rule – the simplest solution would certainly be to lock each time. In this case though, as this will end up as a library class which may well be used by many threads simultaneously in situations where a lot of random numbers are being generated, I think it’s worth the extra effort. I’ll probably write this up as an article when I’ve written the tests and implementation…

Book idea

Having just glanced at the clock, now is the ideal time to post about an idea I had a little while ago – a book (or blog, or something) about C# (or maybe C# and Java) which I’d only write between midnight and one in the morning.

It would contain only those things which seemed like really good ideas at the time – but which might seem insane at other times. Most of these ideas are probably useless, but may contain a germ of interest. While I don’t always have those ideas between midnight and one, that’s the time of night when they seem most potent, and when I’d be most likely to be ready to write enthusiastically about them. The coding equivalent of “beer goggles” if you will.

A couple of ideas I’ve had which would probably qualify:

Extension interfaces

If C# 3.0 is going to allow us to pretend to add methods to classes, why shouldn’t it allow us to pretend that classes implement interfaces they don’t? My original reason for wanting this is to get rid of some of the ugliness in the suggested new XML APIs: there’s a method which takes an array of objects, even though only a handful of types are catered for. Unfortunately, those types don’t have an interface in common, so all the checking has to be done at runtime. If you could pretend that they all implement the same interface, just for the purposes of the API, you could declare the method as taking an array of the interface type. Of course, this is much less straightforward than converting what looks like an instance method call into a static method call…

Conditional returns

This came up when implementing Equals for several types in quick succession. All of them followed a very similar pattern, and there were similar things needed at the start of each implementation – simple checks for nullity, reference identity etc. It would be interesting to have a sort of “nullable return” for methods which had a non-nullable value type return type – I could write return? expression; where the expression was a nullable form of the return type, and it would only return if the expression was non-null. There are bits of this which appeal, and bits which seem horrible – but the main problem I have with it is that I suspect I would rarely use it outside Equals implementations. (If this isn’t a clear enough description, I’m happy to write an example – just not right now.)

Corner cases in Java and C#

Every language has a few interesting corner cases – bits of surprising behaviour
which can catch you out if you’re unlucky. I’m not talking about the kind of thing
that all developers should really be aware of – the inefficiencies of repeatedly
concatenating strings, etc. I’m talking about things which you would never suspect
until you bump into them. Both C#/.NET and Java have some oddities in this respect,
and as most are understandable even to a developer who is used to the other, I thought
I’d lump them together.

Interned boxing – Java 1.5

Java 1.5 introduced autoboxing of primitive types – something .NET has had
from the start. In Java, however, there’s a slight difference – the boxed
types have been available for a long time, and are proper named reference
types just as you’d write elsewhere. In this example, we’ll look at
int boxing to java.lang.Integer. What would you
expect the results of the following operation to be?

Object x = 5;
Object y = 5;
boolean equality = (x==y);

Personally, I’d expect the answer to be false. We’re testing for reference equality
here, after all – and when you box two values, they’ll end up in different boxes,
even if the values are the same, right? Wrong. Java 1.5 (or rather, Sun’s current
implementation of Java 1.5) has a sort of cache of interned values between -128 and
127 inclusive. The language specification explicitly states that programmers shouldn’t
rely on two boxed values of the same original value being different (or being the
same, of course). Goodness only knows whether or not this actually yields performance
improvements in real life, but it can certainly cause confusion. I only ran into it
when I had a unit test which incorrectly asserted reference equality rather than
value equality between two boxed values. The tests worked for ages, until I added
something which took the value I needed to test against above 127.
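
A minimal way to see the 127/128 boundary for yourself is sketched below. The class and method names are made up for the example. The result for 128 is deliberately left unstated in the comments, because it depends on how the JVM's boxing cache is configured.

```java
public class BoxingCacheDemo {
    // Boxes the same int twice and compares the two references.
    static boolean sameBox(int value) {
        Integer x = value;   // autoboxing
        Integer y = value;
        return x == y;       // reference comparison, not value comparison
    }

    // Boxes the same int twice and compares the values.
    static boolean sameValue(int value) {
        Integer x = value;
        Integer y = value;
        return x.equals(y);
    }

    public static void main(String[] args) {
        System.out.println(sameBox(127));    // inside the cached range: true
        System.out.println(sameBox(128));    // outside it: implementation-dependent
        System.out.println(sameValue(128));  // value equality: true
    }
}
```

The practical upshot for unit tests is the same as in the story above: assert with equals (or on the unboxed values), never with == on boxed values.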

Lazy initialisation and the static constructor – C#

One of the things which is sometimes important about the pattern I usually use when
implementing a singleton
is that it’s only initialised when it’s first used – or is it? After a newsgroup
question asked why the supposedly lazy pattern wasn’t working, I investigated a little,
finding out that there’s a big difference between using an initialiser directly on
the static field declaration, and creating a static constructor which assigns the value.
Full details are on my beforefieldinit page.

The old new object – .NET

I always believed that using new with a reference type would give me
a reference to a brand new object. Not quite so – the overload for the String
constructor which takes a char[] as its single parameter will return
String.Empty if you pass it an empty array. Strange but true.

When is == not reflexive? – .NET

Floating point numbers have been
the cause of many headaches over the years. It’s relatively well known that “not a number” is not equal
to itself (i.e. if x=double.NaN, then x==x is false).
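
The NaN rule comes from IEEE 754 rather than from .NET, so the same thing can be shown in Java. This is only a small sketch with made-up names. Note the extra twist that the boxed Double.equals method treats NaN as equal to itself, unlike the == operator.

```java
public class NaNDemo {
    // True for every double value except NaN.
    static boolean equalsItself(double x) {
        return x == x;
    }

    public static void main(String[] args) {
        double nan = Double.NaN;
        System.out.println(equalsItself(nan));    // false
        System.out.println(equalsItself(1.5));    // true
        System.out.println(Double.isNaN(nan));    // true
        // Boxed equality is defined differently from ==:
        System.out.println(Double.valueOf(nan).equals(Double.valueOf(nan)));  // true
    }
}
```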

It’s slightly more surprising when two values which look like they really, really should be equal just
aren’t. Here are a couple of sample programs:

using System;
public class Oddity1
{
    public static void Main()
    {
        double two = double.Parse("2");
        double a = double.Epsilon/two;
        double b = 0;
        // Assumed: the equality test which the text below refers to
        Console.WriteLine(a==b);
        Console.WriteLine(Math.Abs(b-a) < double.Epsilon);
    }
}

On my computer, the above (compiled and run from the command line) prints out True twice.
If you comment out the last line, however, it prints False – but only under .NET 1.1.
Here’s another:

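using System;

class Oddity2
{
    static float member;

    static void Main()
    {
        member = Calc();
        float local = Calc();
        // Assumed: the equality test which the text below refers to
        Console.WriteLine(local==member);
        member = local;
    }

    static float Calc()
    {
        float f1 = 2.82323f;
        float f2 = 2.3f;
        return f1*f2;
    }
}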

This time it prints out True until you comment out the last
line, which changes the result to False. This occurs on both .NET 1.1 and 2.0.

The reason for these problems is really the same – it’s a case of when the JIT decides to
truncate the result down to the right number of bits. Most CPUs work on 80-bit floating point
values natively, and provide ways of converting to and from 32 and 64 bit values. Now, if you
compare a value which has been calculated in 80 bits without truncation with a value which has
been calculated in 80 bits, truncated to 32 or 64, and then expanded to 80 again, you can run
into problems. The act of commenting or uncommenting the extra lines in the above changes what
the JIT is allowed to do at what point, hence the change in behaviour. Hopefully this will
persuade you that comparing floating point values directly isn’t a good idea, even in cases
which look safe.
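
The usual alternative to direct comparison is to compare within a tolerance. The sketch below shows the general idea in Java. It does not reproduce the JIT truncation effect described above. Instead it uses the better-known 0.1 + 0.2 rounding case. The helper name nearlyEqual and the tolerance of 1e-9 are arbitrary assumptions, and the right tolerance always depends on the calculation involved.

```java
public class ApproxEqualsDemo {
    // Compares two doubles using a tolerance that scales with their
    // magnitude (and is absolute for values smaller than 1).
    static boolean nearlyEqual(double a, double b, double tolerance) {
        double scale = Math.max(1.0, Math.max(Math.abs(a), Math.abs(b)));
        return Math.abs(a - b) <= tolerance * scale;
    }

    public static void main(String[] args) {
        double sum = 0.1 + 0.2;
        System.out.println(sum == 0.3);                    // false
        System.out.println(nearlyEqual(sum, 0.3, 1e-9));   // true
        System.out.println(nearlyEqual(1.0, 1.1, 1e-9));   // false
    }
}
```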

That’s all I can think of for the moment, but I’ll blog some more examples as and when I see/remember
them. If you enjoy this kind of thing, you’d probably like
Java Puzzlers
– whether or not you use Java itself. (A lot of the puzzles there map directly to C#, and even those which
don’t are worth looking at just for getting into the mindset which spots that kind of thing.)