Category Archives: General

Build and config friendliness counts

Yesterday, I bought a Toppy PVR when my Tivo died. The details of what it does are irrelevant (although quite fun). The important thing is that it’s very hackable – so there are lots of extensions and access programs available. While the Windows ones are typically in binary form, the Linux ones aren’t. The Toppy gives access via a USB port, and programs either access that directly or use FTP to transfer files to it via an intermediate server which basically converts FTP requests into whatever protocol the USB connection uses.

Now, I have a Linkstation with the OpenLink firmware installed on it – a hard disk running a very cut down Linux on a fairly pitiful processor. I had a few bits and bobs on it already, notably TwonkyMedia and Subversion. While TwonkyMedia was a binary installation, Subversion was built from scratch, which took a little bit of doing, mostly because the configure script required sort, which wasn’t provided in the tools for the Linkstation. Doesn’t sound too bad – you just need to download the right package to build and install sort, right? Guess what the configure script for sort requires? Yup – sort. Fortunately a friend helped me out, battling with makefiles until we had a version of sort which worked well enough to rebuild it properly.

All of this is partly background, but partly the whole point of this blog – build annoyances.

Anyway, back to the Toppy. Over the course of the last 24 hours or so, I’ve fetched/built the following packages on the Linkstation:

  • puppy – a “direct connection” client to fetch/store files
  • ftpd-topfield – an FTP/USB proxying server
  • toppy-web – a web application allowing (limited) remote control of the Toppy
  • lighttpd – a light-weight web server to host toppy-web
  • php – PHP, required as a fastcgi module for lighttpd to run toppy-web
  • libxml2 – an XML library required for some PHP features
  • byacc – Berkeley’s yacc implementation; libxml2 needs yacc
  • ncftp – a Linux CLI FTP client

(Apologies if any of the dependencies in here are wrong. It was getting pretty late by the time I’d got a php-enabled webserver…)

The build/install procedure of all of these varied immensely, and the impression I gained of the quality of the software reflects this. For those of you who don’t do Linux builds regularly, the normal procedure is to run ./configure, then make, then make install. Sounds simple, right? Well…

puppy was straightforward, although it didn’t have a make install – it just built a binary. Still, it was simple enough, and worked first time.
Running it without specifying any arguments gave a useful help message, and it was easy to get to grips with.

ftpd-topfield wasn’t bad either. This time there was a make install, and even a make test. On the other hand, running it without any arguments just returned back to the console with no hint of what was going on. Using --help produced a reasonable help message, but it’s still not clear to me what it does when you don’t specify -D for standalone mode. It could be for running from inetd, but I don’t know. Anyway, it worked pretty well, and I don’t remember any build problems, so it can’t have been that bad. I had to write my own rc.init script, but a simple one is sufficing for the moment.

toppy-web was where the problems began. Now, it doesn’t need to be built itself, but it requires a PHP-capable web server. It’s not quite “onto the next item” though, because this is the one thing I still haven’t got running – due to the configuration aspect. It comes with a bunch of sample configuration files, but you have to hunt around the web for documentation, which still seems to be flaky at best. Now, I can entirely sympathise with the developers, but the point of this blog post is the comparison of build/configure procedures. I’ll get it going at some point, I’m sure (given the effort I’ve put into the rest) but I’m not sure I’ve got the energy just yet. This is a web application – why can’t it be configured with a web page?

lighttpd – the web server itself. This wasn’t too bad, if I remember rightly. The docs have fairly good descriptions of how to configure it appropriately, including what’s required from a php build. Which brings us to…

php – oh dear. It’s never a good sign when the configure script complains about syntax errors to start with. After googling these errors and finding that other people who had received them were told that basically they could be ignored, I let the script continue. After quite a few minutes, it decided that I needed libxml2. At first, I tried to disable this – at which point (well, after starting the script again and letting it run for a few minutes) it complained that without libxml2 it wouldn’t be able to have any DOM support. I don’t know whether toppy-web requires DOM support, but it seemed like an important thing to be without. So, I decided to download libxml2 and build it. In retrospect, I should probably have looked through the toppy-web source to see whether I really needed it…

libxml2 fairly quickly announced that it needed yacc. That’s not particularly unreasonable, and I was slightly surprised not to have it. However, it was yet another step. Fortunately, byacc built and installed easily enough to not deserve its own paragraph here. Hurrah – finally I could configure and build libxml2. You wouldn’t expect that to take too long, would you? XML isn’t that hard. Hmm. It wasn’t actually difficult but I felt very sorry for the Linkstation afterwards. Most of the C files had to be compiled twice for some reason (I didn’t look at the depths of why – something to do with how the libraries are deployed, I believe) and some of them are large. Very large. xmlschemas.c is over 28,000 lines long, and the (fortunately auto-generated) testapi.c is over 50,000 lines long. C compilers are slow beasts at the best of times, and this is on a box which was really only meant to run samba. It took ages. Not only that, but the first time I had to link php with it, it failed. No idea whose fault that was (mine, php’s or libxml2’s) but it was frustrating. Anyway, back to php.

With libxml2 built, I set php configuring and building – with the three features enabled which I knew I had to have (thanks to the lighttpd docs) and which all sound like they should be enabled by default. It got there in the end, so by the time I went to bed (very late) I had a working php-enabled web server.

Tonight, I decided to find a command-line FTP client for Linux. After finding a very useful Linux FTP client comparison page I decided to plump for ncftp. After the previous night’s frustrations, I was somewhat nervous. Fortunately, my fears were unwarranted. More than that – the authors of ncftp deserve awards.

The readme file is suitably undaunting. The configure script runs reasonably quickly. So far, so unremarkable… but the make file. Oooh… instead of showing you the exact command it’s running at any time (which in some cases for the previous packages was over a screenful for a single link, and frequently 5 or 6 lines per compile), it just tells you what it’s doing (e.g. Compiling c_utime.c...) and has a colour-coded [OK] (in the same way that most Linux distributions do for starting and stopping services) for each build step. Green means a clean build, yellow means there were warnings (which are printed). At the end of the build, it shows you the files it’s built. At the end of make install it shows you the files you’ve installed. The difference in professionalism and the impression you’re left with is marked. I’ve no idea whether the ncftp guys wrote the makefile themselves or whether it’s a framework which is generally available – but I’ve never seen anything as clean when it comes to make. Indeed, it leaves me wishing that Ant builds were logged as cleanly (when they’re successful).

So, the moral of this post (which is rather longer than I’d anticipated)? Builds matter. Configuration matters. Documentation matters. There’s more to a build being good than just it working first time: giving feedback and a warm fuzzy feeling are important too. Making it look simple even if there are complex things going on behind the scenes makes life feel smoother. (I’m sure the ncftp makefile has options for seeing the commands themselves if you really need to.) I’ve understood the importance of a good build system before, but usually in terms of a developer having all the options they need. In the open source world, particularly on Linux where in many cases the end user will have to build the package in order to run it on their custom devices, the build is just as much a part of “ease of use” as the product itself – and if a user falls at the first hurdle, they’ll never see your pretty UI.

New blog, new project

I’ve started a new blog, which I’ll be sharing with a couple of colleagues. In brief, the idea is to try to do a “hobby” project as well as possible, the whole purpose being the learning experience.

Rather than waffle on about it here, I’ll just refer you to the blog of the
Quest for the Perfect Project. Hope it’s of some interest to some of you.

Oh, and this blog will carry on as normal.
 

Wii and MythTV: The future of my living room?

My Tivo has started playing up. A few months ago, it stopped padding programmes automatically like it used to (I haven’t investigated why – getting a console session on it is pretty tricky; it may well be a software update that removed the patch I’d put on) and now it glitches sometimes in a way that suggests the disk may be packing up.

Likewise, my cheap DVD player is pretty rubbish, and needs replacing, and my Freeview receiver is far from perfect.

I had been considering buying a spiffy DVD/HD recorder, and was probably going to start looking sometime in the next couple of months. However, using my Wii over Christmas has given me a different idea: using the Wii remote to control a MythTV box.

The Wiimote, for those of you who haven’t used one, is a small remote control that has relatively few buttons for a gaming controller, but which is able to detect when you point it at the TV, and which has accelerometers to determine tilting and motion. It’s wireless, using Bluetooth to talk to the Wii.

Now, imagine this:

  1. Point at the TV and press “home” to get to the main menu; select “Live TV” or “Recorded Programmes” by pointing at the relevant menu items and pressing the main button (A).
  2. Programme not at the right point? Pause it (with A), hold down the trigger on the back and tilt to the left or right (like a volume knob) to get to the right place quickly and accurately.
  3. Want to fast forward or reverse just a bit? Don’t bother pausing – just hold down the trigger and tilt to change the speed (and direction) of playback.
  4. Watching Live TV and want to change channel? Press “1” for favourite channels (chances are the channel will be on the first page) or “2” for all channels (in channel order, probably on more than one screen: either click on page up/down, or hold the trigger and “drag” up or down).
  5. Volume control? Why, that’s what the -/+ are for, naturally.

I’ve already got a Roku Soundbridge for playing my MP3s (from a network disk) but I suspect that choosing an album/song with a whole screen would be somewhat simpler than with a single line. (Don’t get me wrong – I love my Soundbridge. It’s stylish and does the job really well.)

Unfortunately, I have no experience whatsoever with MythTV, and very little time to hack around with it. The good news is that I’m not the only one to have come up with the idea (I’d be shocked if no-one else had thought of it) although it sounds like they’ve only done the buttons so far. I expect more full functionality will come along in time, assuming the underlying MythTV platform supports it.

There are two downsides I can think of at the moment:

  1. I suspect the Wii’s “sensor bar” (a misnomer as it doesn’t sense anything – it’s just an array of IR transmitters) is turned off when the Wii is on standby. I’d need to be able to give it power without the Wii being on, ideally.
  2. How can I tell the Wii that I’m controlling MythTV and vice versa? I’d rather not need an extra controller just for the sake of MythTV – although if it comes to that, they’re not that expensive.

I only had the idea of the whole package (tilting etc) earlier today, and haven’t stopped being excited about it. If I were Hauppauge or a similar company, I think I’d already be contacting Nintendo to try to license the technology – I suspect that the first company with a mainstream product which supports a “point/click/tilt” (rather than “find the button”) PVR UI could make a lot of money.

The irony is that the Wii is the only one of the three next-gen consoles not to have designs on becoming a media centre – and yet it’s the one that I suspect I’m most likely to use for that very purpose (albeit just the controller).

Unit tests rock, I suck, news at 11

I’ve just started looking at my Miscellaneous Utility Library again, after quite a while. I’m currently running Vista on my laptop, which means I can’t run Visual Studio 2003 – so it’s about time I updated the library to use generics and all that goodness. I’ll keep the .NET 1.1 version available on the web site, but from now on any new code will be 2.0 only.

In the process of updating RandomAccessQueue to implement the generic collection interfaces, I decided to do the implementation test-first, as is now my habit. It clearly wasn’t habit back when I originally wrote the code (the same day Peramon laid everyone off, incidentally – I remember as I was at home, ill). The new methods use some of the old methods – and unfortunately that’s now exposed some long-standing bugs.

Looking back, I find it hard to understand why I had so much faith in this code: it’s the kind of code which is bound to suffer from off-by-one errors and the like. It’s not terribly hard to test, fortunately (unlike the threadpool stuff, for example). Oh how I wish I’d been using NUnit back then.

This happened the last time I looked at a MiscUtil class, too. It will take a while to add unit tests giving a decent level of coverage to the code – it’s not like I have a lot of spare time – but it’s clearly got to be done.

I wonder how much other code I’ve written over the years is riddled with bugs? To be fair, the MiscUtil stuff was run considerably less than most of the code I wrote professionally at Peramon… but I bet there were quite a few nasty little gotchas waiting there too. And now?
No, I don’t write perfect code, even with unit tests. Even when the code does what I intend it to do, I have to revisit whether the intention was right in the first place. Even with unit tests there can easily be problems which slip through the cracks – but I don’t think I produce nearly as much code which is basically broken as I used to.

An alternative CV strategy

This is my second attempt at writing this. Memo to self: after hitting the Post button, make sure the post has actually been published before navigating away from the page…

I’ve been reading a fair number of CVs recently, and I’ve been struck by just how much experience everyone seems to have. At least, everyone claims to have a breadth of experience that I just can’t match. I haven’t counted, but I suspect most of the CVs I’ve been looking at have listed over 100 technologies. In light of this, I’ve been considering how I’ll market myself when I’m next interested in getting a job.

There are a few things in my favour which most candidates don’t have, mostly in terms of community – MVP awards, book reviewing, web articles, this blog, newsgroup posts, open source contributions etc – but I don’t know how much attention prospective employers really pay to that kind of thing. What I find frustrating is the way that traditional CVs don’t really convey any of what I find important – either as a potential employee or as someone involved (to whatever extent) in the hiring process. I have begun to wonder whether a list of values would do me any favours:

  • I prefer working code over perfect UML
  • I prefer whiteboards over Visio
  • I prefer code which can easily be read over code which runs 5% faster but no-one else understands
  • I prefer code reviews which force me to change my design over reviews which stroke my ego
  • I prefer being laughed at due to my trousers over being disrespected for being sloppy
  • I prefer going home at 5 to sleep on a problem over staying at the office until midnight and then being useless the next day
  • I prefer carrots over sticks
  • I prefer progress over process
  • I prefer keen developers with much to learn over experienced developers who feel they have nothing to learn
  • I prefer close collaboration over the heroic coder mentality
  • I prefer solving problems people are having in the real world over providing marketing with a new toy to show off

Maybe that doesn’t go far enough towards selling me though. How about some more direct statements?

  • I write clean code in a timely manner
  • I test my work and refactor mercilessly
  • I don’t assume my code is perfect
  • I love to learn new techniques and technologies
  • I love to teach, and can explain things clearly
  • I pick up new things quickly
  • I have an affinity for code which lets me solve issues quickly
  • I bring passion to whatever I do

If someone presented me with a CV based on the above lists, I’d be interested. Yes, I’d probably check that the candidate had worked in some sort of similar area before, but frankly if you take a bright person and ask them to learn Java or C#, it’s not going to take them that long to do it. Learning design principles takes longer (I’ll let you know if I ever think I’ve finished!) but with good mentoring, it’s not a problem.

CVs can’t be trusted. People can write pretty much anything on them. However, they’re making a choice about what image to present to the world – and that choice itself makes a statement. I want to work with smart people who love what they do. I want to see a spark in their eyes when they tell me what they’ve been up to. At an interview, I want them to be so busy getting me enthusiastic about what they’ve been looking at that I don’t have time for the standard questions.

You may well consider the lists above to be unprofessional to an extent. I agree – but I’m not sure whether it’s a problem. I enjoy my work immensely – so much so that I hardly think of it as work for a lot of the time. That’s not to say it’s not important to do a professional job – but there’s often not much of a gap between what I’m interested in for fun and what I earn money doing.

I suspect if I gave an unconventional CV to an agency they’d either demand a rewrite or they’d change it themselves. Maybe they’d be right to do so – maybe managers aren’t really keen on this sort of thing. What do you think? Comments are always welcome on my blog, but I’m particularly keen on feedback this time, as it could have a real bearing on what I do when I’m next in the job market.

PowerShell in Action

Every so often, I review books for publishers at various times before they hit the streets (anything from initial proposal to final review). The book I’ve been reviewing most recently is PowerShell in Action. (For those of you who didn’t see the news, PowerShell is the new name for Monad, the new object-oriented shell for Windows.)

Now, I’m excited about PowerShell as a product, but I’m even more excited about the book. I’m a pretty harsh reviewer, and I can only think of about three books which I’ve reviewed and been really positive about throughout most of the review. This is the best of them. I’ve not seen the whole book yet, but from what I’ve seen it’s going to be both readable and informative, which is frankly a rare combination in technical books. The author (Bruce Payette) is on the PowerShell team, so we get the information straight from the horse’s mouth (no disrespect meant) along with reasons for design decisions. Anyway, go and have a look at the home page for the book (linked above), read the first (unedited) chapter, and sign up for updates. It’s going to be fab.

Groovy

Updated 7th August 2006 – It looks like closures aren’t meant to require K&R bracing after all. Hoorah! Examples changed appropriately.

One of my tasks at work is to investigate new languages and technologies and report back what
use we might make of them, where they fit in with what we’re doing, and generally what I think
of them. Obviously some of this will be specific to Clearswift,
but I’d like to make as much “insensitive” information available as possible. This post is my first
“report” as such, on Groovy.

What is Groovy?

From the Groovy home page:

Groovy is an agile dynamic language for the Java Platform with many features that are inspired by languages like Python, Ruby and Smalltalk, making them available to Java developers using a Java-like syntax.

That doesn’t help much if you don’t know Python,
Ruby or Smalltalk.
However, the key words (for me at least) in the above are Java and dynamic.
The Java bit is important to me because I know Java pretty well – both in terms of
the language and the standard library. It’s always nice not to have to learn yet
another way of doing the same things. (There are extra things to learn
in Groovy, but they are small in comparison with learning a platform from scratch.)
The dynamic bit is important because it’s what differentiates Groovy from Java in the first place.

Compared with, say, C and C++, Java is already pretty dynamic. It’s very easy to load classes
on the fly (it’s pretty easy to generate them, even) and reflection allows you to examine classes
at runtime. This allows for frameworks like Spring,
Hibernate and JUnit.
However, Groovy allows “dynamic typing” (an oft-contended phrase, but more later) and various
bits of what are effectively syntactic sugar to make the code terser. Most importantly from my
point of view, it offers closures – the equivalent of C# 2.0’s
anonymous methods.
(This removes the need for most inner classes in Java.) There are various other handy features too,
which generally make Groovy simpler to work with. Most of this post is effectively just a list of
features with examples and discussion.

Compiled, but scripty

Groovy is compiled to Java byte-code, but can be written as a script as well. Normally, the
whole script is compiled at start-up (as far as I can tell), although a lot of decisions are
left to run-time, so typos etc can sometimes only show up when a line is executed, even though
in a more “static” language they would have been caught at compile-time. Groovy scripts are
(commonly) executed using the groovy tool. There are also tools for running Groovy
as an interactive shell (groovysh) and a similar tool wrapped up in a GUI
(somewhat confusingly called groovyconsole). The groovyc tool is
provided to compile Groovy into bytecode to be used later rather than just run immediately.
The input to the compiler doesn’t have to be a fully-fledged class as such – it can just be a normal
Groovy script, in which case a class with an appropriate main method is created.

It’s customary at this point to have a “Hello World!” program. As you can use Groovy like a scripting
language, it’s particularly simple:

println "Hello World!"

Saving the above to a file (e.g. test.groovy) and invoking with groovy test.groovy
gives the expected result. Things to note:

  • No class declaration, import statements etc. It’s just a script.
  • println is used instead of System.out.println. I believe this is a
    call to the println method which has been “added” to java.lang.Object.
  • No brackets and no semi-colon. You can use them – you can make Groovy look very much like Java
    for the most part, but you don’t have to. I tend to use brackets but often omit semi-colons. You
    don’t even have to use brackets when there are multiple parameters.

As programs like the above are so convenient, I’m likely to use the features listed there in the
samples below. Other than that, I’ll attempt to only use one new feature at a time where possible,
so it’s obvious what I’m demonstrating.

Closures

In my limited experience, closures form the single most useful feature of Groovy. They
allow you to specify some code (which may take parameters and return values) and then encapsulate that
code as an object – so you can pass it as a parameter to a method, for instance. The method could then
call the encapsulated code, and so forth. C# 2.0 provides this feature in the form of anonymous methods
(as delegate implementations) but in normal Java one would typically use an anonymous inner class, which
can end up being very ugly due to all the extra “gubbins” of specifying the superclass and then overriding
a particular method. Here’s possibly the simplest example of a closure:

Closure c = { println ("Hello closure!"); }
c();

Giving all the details of what closures can and can’t do would take pages and pages, so I’ll just mention
a few broad points. Local variables are captured as in C#’s anonymous methods (so are writable, unlike
local variables being used in anonymous inner classes in Java), and access to private members
of the enclosing class is also permitted. Closures taking a single parameter can use the implicit parameter
name of it:

Closure printDouble = { println (it*2) }

printDouble(5)
printDouble(10)
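
As a minimal sketch of the point about captured local variables being writable (the names total and addToTotal are made up purely for this illustration), a closure can update a local variable declared outside it:

```groovy
// A local variable declared outside the closure...
int total = 0

// ...is captured by the closure, which is allowed to modify it
Closure addToTotal = { total += it }

addToTotal(5)
addToTotal(10)

println (total) // Prints 15
```

The change made inside the closure is visible to the surrounding code afterwards, which is exactly what an anonymous inner class in Java can't do with a local variable.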

Closures taking more than one parameter can specify their names in a sort of “introductory section”:

Closure printProduct = { x, y -> println (x*y) }

printProduct (2, 3)
printProduct (4, 5)

Finally, a very common idiom in Groovy is to make the last parameter of a method a
closure. In this case, you can call the method specifying all the other parameters normally, and then specifying
the closure parameter as code which appears to be after the method call. This takes a little while
to get used to, but is really, really handy. Here’s an example:

// Declare the method we're going to call
void executeWithProduct (int x, int y, Closure c)
{
    c(x*y);
}

// Call it with a closure that prints out the result
executeWithProduct (3, 4)
{
    println (it);
}

Groovy uses closures extensively, so they will come up out of necessity in a lot of the following examples.

“Loose typing”

Groovy doesn’t require you to specify the types of variables very often. Lots of magic happens to convert things
at the right time. Indeed, method overloading appears to be performed at run-time rather than compile-time. The
exact nature of how loose the types are is currently a mystery to me, and the specification is somewhat inadequate
in this regard. However, it’s worth looking at a few examples:

Simple hello world using loose typing (the differences when you use def are beyond the scope of this introductory article):

a = "Hello"
def b = " World!"
println (a+b)

Dynamic method overloading:

void show(String x)
{
    println ("string: "+x)
}

void show(int x)
{
    println ("int: "+x)
}

void show(x)
{
    println ("???: "+x)
}

y = "Hello"
show(y)
y = 2
show(y)
y = 2.5
show(y)

Results:

string: Hello
int: 2
???: 2.5

String interpolation

Groovy uses the GString class (I kid you not) for string interpolation. Double-quoted
strings are compiled into instances of String or GString depending on whether
they contain any apparent interpolations, and single-quoted strings are always normal strings. (If you
need a character literal, it looks like you need to cast.) Any Groovy expression can be part of
the interpolation, which is enclosed in ${...} (like Ant properties). The braces appear
to be optional for simple expressions (the definition of which I’m not prepared to guess).

x = 10
y = "Jon"

println ('x is $x') // No interpolation with single quotes
println ("x is $x") // Simple interpolation
println ("y is ${y.toUpperCase()}") // Method call

Results:

x is $x
x is 10
y is JON
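
The String/GString split described above can be checked directly. This is only a sketch: it leans on Groovy's built-in assert statement, and assumes (as I believe is the case) that GString is one of the classes available without an import.

```groovy
x = 10

// Single quotes: always a plain java.lang.String
assert 'x is $x' instanceof String

// Double quotes with nothing to interpolate: still a plain String
assert "x is ten" instanceof String

// Double quotes with an interpolation: a GString
assert "x is $x" instanceof GString

// Converting the GString gives the String you'd expect
assert "x is $x".toString() == 'x is 10'

println ("All assertions passed")
```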

Collections: syntactic sugar and extra methods

Groovy makes working with collections easier, by providing syntax for lists and maps
within the language itself, and by using closures to make life easier. List and
map initializers both go in square brackets, with maps using a colon between a name and
a value. Also, number ranges are available as start..end. Note that
a number of common Java packages are imported by default, which is why the following
code doesn’t have to specify java.util anywhere.

List list = [0, 1, 4, 9]
Map map = ["Hello" : "There", "a" : "b"]
List range = 0..3 // Equivalent to [0, 1, 2, 3]

Indexers are provided (just like in C#) so using the above, map["Hello"] would give "There"
and list[2] would give 4. The collections also have a number of
extra methods added to them, many of them
involving closures. For instance:

list = 1..7

// Execute the closure for each element
// Output: 2, 4, 6, 8, 10, 12, 14 (on separate lines)
list.each
{
    println (it*2)
}

// Find the first element where the returned value is true
// Output: 6
println list.find 
{
    return it > 5
}

// Find all elements where the returned value is true
// Output: [6, 7]
println list.findAll
{                    
    return it > 5
}

// Transform each element, creating a new list
// Output: [1, 4, 9, 16, 25, 36, 49]
println list.collect
{
    return it*it
}

There are more – see the link above.
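
Going back to the indexers and ranges mentioned earlier, here's a small sketch which checks the values quoted in the text. It reuses the same declarations as the first snippet in this section, and relies on Groovy's built-in assert statement.

```groovy
List list = [0, 1, 4, 9]
Map map = ["Hello" : "There", "a" : "b"]
List range = 0..3

// Indexers work on both lists and maps
assert list[2] == 4
assert map["Hello"] == "There"

// A range behaves like a list of its values
assert range.size() == 4
assert range.toList() == [0, 1, 2, 3]

println ("All assertions passed")
```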

IO

Another aspect of the JDK to be given the closure treatment is IO. Groovy makes it
really easy to read each line of a file and execute some code on the line, for example.
Here’s a program which (assuming it’s in a file called test.groovy) prints
itself out with the line numbers:

int line=1
new File ("test.groovy").eachLine 
{
    println "${line}: ${it}"
    line++
}

Enhanced switch statements

In Groovy, switch statements can have cases which are collections (including ranges; the case matches
if the switch value is in the collection), types (the case matches if the value is an instance of the type),
or regular expressions; any other case value falls back to equality. In fact, you can add your own type of case testing
by implementing an isCase method, making switch/case very flexible indeed. I haven’t tested it, but
I doubt this is nearly as efficient as the normal Java switch/case – but Groovy is about simplicity of
expression more than ultra-efficiency.
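
Here's a sketch of each kind of case in one switch statement. The describe method and the EvenNumber class are hypothetical names invented for this example; only switch/case and the isCase convention come from Groovy itself.

```groovy
// Hypothetical custom case type: any object with an isCase method can be a case
class EvenNumber {
    boolean isCase(value) {
        return value instanceof Integer && value % 2 == 0
    }
}

// Hypothetical helper which reports the first case that matches
String describe(value) {
    switch (value) {
        case 1..5:             // Range: matches if the value is in the range
            return "in the range 1..5"
        case [10, 20, 30]:     // Collection: matches if the value is in the list
            return "in the list"
        case new EvenNumber(): // Custom isCase method
            return "some other even integer"
        case Integer:          // Type: matches if the value is an instance
            return "some other integer"
        case ~/[a-z]+/:        // Regular expression: must match the whole string
            return "lower-case word"
        case "Hello":          // Anything else: plain equality
            return "exactly Hello"
        default:
            return "no match"
    }
}

println (describe(3))        // in the range 1..5
println (describe(20))       // in the list
println (describe(8))        // some other even integer
println (describe(99))       // some other integer
println (describe("groovy")) // lower-case word
println (describe("Hello"))  // exactly Hello
println (describe(2.5))      // no match
```

The cases are tried from top to bottom, so the order matters: 20 is an even integer too, but the list case gets there first.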

Categories – aka C# 3.0 extension methods

Groovy allows you to pretend that a class has a method you wish it had. It’s all pleasantly scoped so
you won’t do it accidentally. Here’s an example:

// Define the extra method we want
class IntegerCategory
{
    static boolean isEven(Integer value)
    {
        return (value&1) == 0
    }
}

// Use it - in a cleaner looking way than
// explicitly calling the static method.
use (IntegerCategory.class)
{
    println 2.isEven()
    println 3.isEven()
}

Groovy Markup

There are many times when you need to build a hierarchical structure of some kind. Groovy introduces
the idea of “builders” which help. For instance, for XML, there’s the DOMBuilder class
(along with SAXBuilder and NodeBuilder, the latter of which allows
easy XPath-like navigation). Using DOM to build XML in Java is a complete nightmare, and while
dom4j and JDOM are definite
improvements, they still don’t make it quite as easy as this. Suppose you have a map of names to ages,
and you want to build an XML document representing that information. Here’s a sample script in Groovy to
demonstrate how easy it is (using MarkupBuilder, which writes the generated XML out for you).
Elements are added just by calling a method of the same name (Groovy responds to the method call as if the
method were available normally, even though obviously it doesn’t know in advance what your element names
will be), and attributes are specified using a map in the method call. Child elements are specified
within closures.

import groovy.xml.*;

Map nameAgeMap = ["Jon": 29, "Holly": 30, "Dave": 32]

builder = MarkupBuilder.newInstance()
builder.rootElement {
    names {
        nameAgeMap.each { entry ->
            person ("name": entry.key, "age": entry.value)
        }
    }
}

Result:

<rootElement>
  <names>
    <person name='Holly' age='30' />
    <person name='Dave' age='32' />
    <person name='Jon' age='29' />
  </names>
</rootElement>
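
For comparison, here is roughly what building the same document looks like with the plain DOM API in Java. This is a minimal sketch rather than a recommended approach: the class and method names are made up for illustration, and the exact whitespace and attribute order of the output depend on the XML serializer in your JDK.

```java
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class DomNameAges {

    /** Builds rootElement/names/person elements from a map of names to ages. */
    static String buildXml(Map<String, Integer> nameAgeMap) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();

            // Every element has to be created and attached by hand
            Element root = doc.createElement("rootElement");
            doc.appendChild(root);
            Element names = doc.createElement("names");
            root.appendChild(names);

            for (Map.Entry<String, Integer> entry : nameAgeMap.entrySet()) {
                Element person = doc.createElement("person");
                person.setAttribute("name", entry.getKey());
                person.setAttribute("age", String.valueOf(entry.getValue()));
                names.appendChild(person);
            }

            // Serializing the tree is a separate step again
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            // Checked exceptions are wrapped to keep the sketch short
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> nameAgeMap = new LinkedHashMap<>();
        nameAgeMap.put("Jon", 29);
        nameAgeMap.put("Holly", 30);
        nameAgeMap.put("Dave", 32);
        System.out.println(buildXml(nameAgeMap));
    }
}
```

The nesting of the document is invisible in the Java version, whereas in the Groovy script the shape of the code mirrors the shape of the XML.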

Ant integration

I’m a big fan of Ant, but
every so often it just doesn’t let me do everything I want easily. Sometimes I want to be able to execute
some code, but I don’t want to go through the hassle of having to make sure I’ve compiled something which
is only actually going to be used by the build procedure anyway. Groovy to the rescue! You can embed
Groovy code “in-line” or call out to a Groovy script. Note that Ant allows many scripting languages (anything
supported by BSF, for starters) to be used. Groovy may simply
look more familiar to developers who know Java but don’t know any scripting languages.
Groovy supports Ant directly in terms of providing access to the current project and properties, and the
AntBuilder class works in a similar way to the builders mentioned above, allowing Ant tasks
to be dynamically created and executed. Here’s a sample Ant file (which assumes that groovy-all-1.0-jsr-05.jar
is in the same directory):

<?xml version="1.0" ?>   
<project name="groovy-test" default="test" >

  <taskdef name="groovy" 
         classname="org.codehaus.groovy.ant.Groovy"
         classpath="groovy-all-1.0-jsr-05.jar"/>
  
  <target name="test">
    <groovy>
      println "Running in Groovy"
      
      fs = ant.fileset (dir: ".", casesensitive: "no") {
          include (name: "*.groovy")
          include (name: "*.java")
          exclude (name: "Test.*;test.*")
      }
      
      fs.toString().split(';').each {
          ant.echo (it)
      }
    </groovy>
  </target>                               
  
</project>

Results:

Buildfile: build.xml

test:
   [groovy] Running in Groovy
     [echo] Benchmark.java
     [echo] CommandTest.java
     [echo] Handler.java
     [echo] Main.java
     [echo] MyEnum.java
     [echo] test2.groovy
   [groovy] statements executed successfully

BUILD SUCCESSFUL
Total time: 1 second

Other bits and bobs

There’s a lot more to Groovy than what’s presented above. It has operator overloading, syntax to
make regular expressions and multi-line strings easier, simple property definition and access, built-in
JUnit integration, an XPath-like expression language and much more besides. Read the home page for some of these – but be warned that some features are
pretty well hidden.

So, what’s wrong with it?

In general, I like Groovy. I’m not convinced that the productivity gains from it are worth the
downsides for major apps, but it’s really handy for getting something small working quickly. It could
be really great for prototyping. I may well eventually be convinced that “dynamic typing” isn’t
that dangerous really, and doesn’t have a detrimental impact on the usability of libraries, etc. Only
time will tell.

In the meantime, however, Groovy does suffer majorly from a lack of polish. There are plenty of bugs
to be found, and the documentation is terrible. (That said, the members of the mailing list are more than happy to help, and a major documentation update is under way.) There are aspects of the syntax which seem to be
overkill, creating complexity without a huge benefit, and there are bits of normal Java which are
just “missing”. (Normal for loops aren’t available in the version I’m using, although
I believe they will be in the next available release. You can use a loop such as for (i in 0..9),
but not for (int i=0; i < 10; i++).) Things like this should really be fixed to make
as much of normal Java as possible available within Groovy.

I don’t mind the fact that Groovy isn’t finished – my worry is that it may never really be finished.
I really hope that I’m wrong, and that it will be all done and dusted (for v1) in the summer.
There’s no lack of activity – the community is very lively – but activity doesn’t necessarily indicate
actual progress towards a goal. Since originally posting this blog entry, I have been assured that
real progress is being made, so I’m keeping my fingers crossed.

Links

  • Groovy home page
  • “Groovy JDK” – the extra methods added to various classes
  • Grails – Groovy/Spring/Hibernate-based web application development

MSDN Product Feedback Center – use it!

The Open Source community has known for ages that making it easy for users to file bugs and feature requests is a great way of making sure that not only are more bugs noticed, but that the bugs which actually annoy users take priority over those which never crop up in real life. For a little while now, MS has been doing the same – although the world can be forgiven for barely noticing. The MSDN Product Feedback Center isn’t exactly hidden in a locked filing cabinet stuck in a disused lavatory with a sign on the door saying “Beware of the Leopard”, but it’s not far off. It’s on the MSDN lab which has a label at the top saying “MSDN Lab projects are experimental and may be removed without notice”. That’s not exactly encouraging when it comes to taking the time to file a bug report.

However, a few MS managers have now emphasised that this way of reporting bugs is really important to them. The entries end up in the internal bugs database which developers use – and importantly, if a bug is found and regarded as important by a customer during (say) beta test, that’s much more likely to be able to get through the red tape required for a late change than one which is found internally. If you’re an MVP, there’s an added bonus that your bugs are automatically regarded as “valid” and so they’re even more likely to be fixed.

I should point out that only bugs/requests with respect to certain products can be entered at the moment – but the list is likely to grow, I suspect. Anyway, the important thing is that it’s there, it’s pretty easy to use, and it makes a difference – so use it!

Bringing Subversion and FitNesse together

I’ve recently started working with Subversion (a version control system) and FitNesse (the Fit acceptance testing framework based in a wiki). FitNesse has a primitive version control system built into it, where it builds zip files of previous versions of pages. It’s all a bit messy though, and it’s not likely to be the version control system used by the rest of your source code. Why wouldn’t you want your acceptance tests in the same repository you use for the rest of your source and tests?

So, arming myself with JavaSVN (a pure Java Subversion client library) I went looking at the FitNesse source code. I’m sorry to say it’s not everything I’d hoped for – lots of methods declared to just throw Exception, using streams with no try/finally blocks and (I suspect) a rather gaping potential for things to go seriously wrong if someone commits a page at the same time as someone else deletes it. However, life goes on – fortunately I was able to find the entry point I needed fairly quickly.

In this case, it was fitnesse.wiki.FileSystemPage, which dealt with both the writing of the “plain” contents/metadata files, along with the versioning. It was only a matter of a few hours to refactor that to allow the versioning piece to be pluggable. Adding Subversion support took another few hours, and the result works reasonably well. A few things to note:

  • I could possibly have used an existing plugin point instead of creating a versioning system off FileSystemPage.
    I didn’t know that at the time, and I’m not sure how much it would have helped me. I’m not sure whether JavaSVN would have
    let me get away with making changes to the repository without having a working copy at all, but if so that would have been
    quite a nice solution. There’s no real need for a directory hierarchy – just a file per page, and Subversion properties to
    store the FitNesse metadata. With the sort of load I’m expecting the server at work to have, performance wouldn’t have been
    an issue, and it would quite possibly have simplified things a bit. On the other hand, what I’ve got works and was probably
    a bit simpler to implement. Then again, it does mean changing FitNesse :(
  • I’ve currently implemented the new interface in the fitnesse.wiki package, and I build it within the same
    Eclipse project as the rest of the FitNesse code. It should really be in its own separate jar file, but that seemed overkill for what I was doing at the moment (especially as it’s only one source file).
  • I’ve only done manual testing on this. I don’t know enough about either FitNesse or JavaSVN to sanely test what I’ve done.
    I’m sure it’s possible, and I hope that if others find this hack useful, they could help me to test it properly. I’m somewhat ashamed of this situation, given my firm belief in TDD – it’s due to a lack of understanding of where to go, not a belief that I’ll have magically got the code right. On the plus side, all the built-in FitNesse tests still pass, so I’m reasonably confident that if you run it without actually using the Subversion code, it’ll still work.
  • I’m really worried about threading. It’s unlikely to be a problem unless you happen to get two users doing things to the same pages at the same time, but the level of locking present in FileSystemPage doesn’t really cut it. That level is too low to be particularly useful, as one responder may need to change several pages on disk, and that should be done reasonably atomically. (I don’t even try to do it in an atomic way in terms of Subversion, but stopping other disk activity from interfering would be helpful.) Of course, you should never end up with a hosed Subversion repository (I really hope the server just wouldn’t let you do that) but it may be possible to get into a situation where you need to either do some delicate work updating the working copy manually, or just check the whole tree out again.
  • Currently, files (the ones under the files directory) aren’t versioned. I’m not sure how easy that will be to fix, but it’s
    obviously something which is needed before it’s really production-ready. Hooks are needed for upload, delete and rename. Creating a directory probably doesn’t need to be versioned, so long as the directory is put under version control when the first file is created.
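
The threading concern in the list above comes down to holding one coarse lock for the whole of a multi-page change, rather than locking each page separately. A minimal sketch of that idea follows; the PageUpdateLock class and its methods are hypothetical and are not part of FitNesse or of the change described here.

```java
import java.util.concurrent.locks.ReentrantLock;

public class PageUpdateLock {

    // One coarse lock guarding all disk changes made on behalf of a single request
    private static final ReentrantLock DISK_LOCK = new ReentrantLock();

    /** Runs a multi-page update while holding the lock, releasing it even on failure. */
    static void runExclusively(Runnable update) {
        DISK_LOCK.lock();
        try {
            update.run();
        } finally {
            DISK_LOCK.unlock();
        }
    }

    /** Exposed only so the sketch can show when the lock is held. */
    static boolean isHeldByCurrentThread() {
        return DISK_LOCK.isHeldByCurrentThread();
    }

    public static void main(String[] args) {
        runExclusively(() -> {
            // A responder would rename, delete or rewrite several pages here
            System.out.println("Lock held during update: " + isHeldByCurrentThread());
        });
        System.out.println("Lock held afterwards: " + isHeldByCurrentThread());
    }
}
```

This only stops other activity in the same process from interfering; it does nothing to make the corresponding Subversion operations atomic.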

The whole change (a total of seven files – it’s a relatively small code change, all things considered) is available along with installation instructions on my main web site. It’s pretty basic at the moment, but if it all takes off, who knows what could happen?
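
To give a flavour of what "pluggable versioning" means here, the following is a rough Java sketch of the general shape. It is not the code from the download: PageVersioner, InMemoryVersioner and VersionedPageStore are names invented for illustration, and the in-memory back end merely stands in for one that would talk to Subversion.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class VersionedPageStore {

    /** Hypothetical plug-in point: what the page store asks of a versioning back end. */
    public interface PageVersioner {
        void pageSaved(String pageName, String content);
        void pageDeleted(String pageName);
    }

    /** Stand-in back end that records revisions in memory; a Subversion one would commit instead. */
    public static class InMemoryVersioner implements PageVersioner {
        private final Map<String, List<String>> revisions = new HashMap<>();

        @Override
        public void pageSaved(String pageName, String content) {
            revisions.computeIfAbsent(pageName, k -> new ArrayList<>()).add(content);
        }

        @Override
        public void pageDeleted(String pageName) {
            revisions.computeIfAbsent(pageName, k -> new ArrayList<>()).add("(deleted)");
        }

        public List<String> history(String pageName) {
            return revisions.getOrDefault(pageName, new ArrayList<>());
        }
    }

    private final Map<String, String> pages = new HashMap<>();
    private final PageVersioner versioner;

    public VersionedPageStore(PageVersioner versioner) {
        this.versioner = versioner;
    }

    /** Writes the "plain" page content, then tells the versioning back end about it. */
    public void save(String pageName, String content) {
        pages.put(pageName, content);
        versioner.pageSaved(pageName, content);
    }

    public void delete(String pageName) {
        pages.remove(pageName);
        versioner.pageDeleted(pageName);
    }

    public static void main(String[] args) {
        InMemoryVersioner versioner = new InMemoryVersioner();
        VersionedPageStore store = new VersionedPageStore(versioner);
        store.save("FrontPage", "first draft");
        store.save("FrontPage", "second draft");
        store.delete("FrontPage");
        System.out.println(versioner.history("FrontPage"));
    }
}
```

The page store never needs to know which back end it has been given, which is what allows a zip-file scheme and a Subversion scheme to be swapped without touching the rest of the wiki.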

Pedantry – how much is too much?

I’m a pedant, there’s no doubt about it. I’m particularly pedantic when it comes to terminology in computing discussions – at least where I see value in being precise about what is meant. So, when discussing static constructors in a mailing list thread recently, I’ve been very carefully distinguishing between a static constructor (which is a C# term) and a type initializer (which is a CLI term). This hasn’t been met terribly favourably by those who wish to use the term “static constructor” to mean both the .cctor member in a (compiled) type and the C# static constructor, despite them being slightly different in semantics and belonging to different domains. Now, I don’t wish to spill that discussion over onto my blog, but it has made me think about the general issue of pedantry when it comes to terminology.

Pedantry is rarely popular, but I believe it does bring value to a discussion, especially when some subtleties are involved. I generally assume a specification to be the authoritative source of information on terms related to the topic covered by the specification, as it’s a piece of common ground on which to base discussions. (The exception to this is if the spec is generally agreed to be incorrect in a particular regard.) If I talk about something being a variable and you understand “variable” in a completely different way to me, it’s a potential source of great confusion. I’m not pedantic to gain a feeling of superiority – I’m pedantic to try to make sure everyone’s effectively speaking the same language.

Of course, you don’t need to be absolutely precise all the time. If I were discussing an ASP.NET problem, for instance, I probably wouldn’t feel too bad about a sentence such as “x is now a string of length 5”. However, if I were discussing variables, reference types etc, I’d probably try to be more precise: “The value of x is now a reference to a string of length 5.” Writing (or reading) the second style for prolonged periods gets quite tedious, but I believe it’s important to be able to move into that mode when the need arises.

So, the question is: am I the only one who feels this way? I would expect most of the readers of this blog to be people who’ve read either my newsgroup posts, mailing list posts, or C# articles, so you probably have a fair idea of what I’m like. Do I go over the top, or do you find it useful? Is there a way of bringing precision to a discussion without irritating people (as I tend to, unfortunately)? Just to possibly remind you of things I’m often pedantic about, here’s a brief list of “pet peeves” which tend to involve people cutting fast and loose with terminology:

  • Value types “always being on the stack”
  • “Objects are passed by reference by default”
  • “C# supports two floating-point types: float and double.” (That one’s in the C# spec, unfortunately – decimal is also a floating-point type.)
  • “I’m having trouble with ASCII characters above 127…” (along with its side-kick “I’m using extended ASCII”)
  • Volatility and atomicity being mixed up
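
Two of these peeves are easy to demonstrate in a few lines. The sketch below is in Java rather than C#, on the basis that both languages pass object references by value by default and both define ASCII as covering only code points 0 to 127; the class and method names are invented for illustration.

```java
import java.nio.charset.StandardCharsets;

public class PetPeeves {

    /** Reassigns the parameter only; the caller's variable is unaffected. */
    static void reassign(StringBuilder sb) {
        sb = new StringBuilder("replaced");
    }

    /** Mutates the object the parameter refers to; the caller sees the change. */
    static void mutate(StringBuilder sb) {
        sb.append(" world");
    }

    /** Encodes text as US-ASCII; characters outside 0..127 cannot be represented. */
    static byte[] toAscii(String text) {
        return text.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("hello");

        // If the variable itself were passed by reference, this would replace it
        reassign(sb);
        System.out.println(sb);

        // The reference is passed by value, so the shared object can still be changed
        mutate(sb);
        System.out.println(sb);

        // U+00E9 (e with an acute accent) has no ASCII encoding, so a replacement byte is used
        byte[] bytes = toAscii("\u00e9");
        System.out.println(bytes.length + " byte with value " + bytes[0]);
    }
}
```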