Category Archives: Book reviews

Book Review: Async in C# 5.0

May 20, 2013 jonskeet 3 Comments

Resources:

Amazon, Barnes and Noble, Play Books
The book’s web site (O’Reilly) – downloads, errata etc

A while ago I was attending one of the Developer, Developer, Developer conferences in Reading, and I heard Alex Davies give a talk about actors and async. He mentioned that he was in the process of writing a short book for O’Reilly about async in C# 5, and I offered to review it for him. Many months later (sorry Alex!) I’m finally getting round to it.

Disclaimer: The review copy was given to me for free, and equally the book is arguably a competitor of the upcoming 3rd edition of C# in Depth from the view of readers who already own the 2nd edition… so you could say I’m biased in both directions. Hopefully they cancel out.

This is a book purely on async. It’s not a general C# book, and it doesn’t even cover the tiny non-async features in C# 5. It’s all about asynchrony. As you’d expect, it’s therefore pretty short (92 pages) and can comfortably be consumed in a single session. Alex’s writing style is informal and easy to read. Of course the topic of the book is anything but simple, so even though you may read the whole book in one go first time, that doesn’t mean you’re likely to fully internalize it straight away. The book is divided into 15 short chapters, so you can revisit specific areas as and when you need to.

Aside

I’ve been writing and speaking about async for about two and a half years now. I’ve tried various ways of explaining it, and I’m pretty sure it’s one of those awkward concepts which really just needs to click eventually. I’ve had some mails from people for whom my explanation was the one to do the trick… and other mails from folks who only "got it" after seeing another perspective. I’d encourage anyone learning about async to read a variety of books, articles, blog posts and so on. I don’t even think it’s a matter of finding the single "right" explanation for you – it’s a matter of letting them all percolate.

The book covers all the topics you’d expect it to:

Why asynchrony is important
Drawbacks of library-only approaches
How async/await behaves in general
Threading and synchronization contexts
Exceptions
Different code contexts (ASP.NET, WinRT, regular UI apps)
How async code is compiled

Additionally there are brief sections on unit testing, parallelism and actors. Personally I’d have preferred the actors part to be omitted, with more discussion on the testing side – particularly in terms of how to write deterministic asynchronous tests. However, I know that Alex is a big fan of actors, so I can forgive a little self-indulgence on that front.

There’s one area where I’m not sure I agree with the advice in the book: exceptions. Alex repeatedly gives the advice that you shouldn’t let exceptions go unobserved. I used to go along with that almost without thinking – but now I’m not so sure. There are definitely cases where that advice holds, but I’m not as comfortable with it as a global rule as I used to be. I’ll try to put my thoughts in order on this front and blog about this separately at a later date.
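For readers new to the term, here is a minimal sketch of what "observing" an exception looks like in async code. It isn't taken from the book; the names ObservedExceptionDemo, FailAsync and ObserveAsync are made up for illustration. The idea is simply that awaiting a faulted task rethrows its exception, so the caller gets a chance to handle it.

```csharp
using System;
using System.Threading.Tasks;

public static class ObservedExceptionDemo
{
    // Hypothetical operation which always fails asynchronously.
    private static async Task FailAsync()
    {
        await Task.Yield();
        throw new InvalidOperationException("boom");
    }

    public static async Task<string> ObserveAsync()
    {
        // The exception is captured in the task when FailAsync faults...
        Task task = FailAsync();
        try
        {
            // ...and rethrown here, at the point where the task is awaited.
            await task;
            return "no exception";
        }
        catch (InvalidOperationException e)
        {
            return "observed: " + e.Message;
        }
    }
}
```

If the task were never awaited (and its Exception property never examined), the failure would go unobserved, which is the situation the book's advice is aimed at.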

That aside, this is a good, pragmatic book. To be honest, I suspect no book on async is going to go into quite as many details as the PFX team blog, and that’s probably a good thing. But "Async in C# 5.0" is a very good starting point for anyone wanting to get to grips with async, and I in no way begrudge any potential C# in Depth 3rd edition sales I may lose by saying so ;)

Book reviews, C#

Book Review: Fluent C# (Rebecca Riordan, Sams)

December 5, 2011 jonskeet 30 Comments

(As usual, I will be sending the publisher a copy of this review to give them and the author a chance to reply to it before I publish it to the blog. Other than including their comments and correcting any factual mistakes they may point out, I don’t intend to change the review itself.)

Resources:

Publisher page (includes source code download)
Amazon / Barnes and Noble
My unofficial errata and notes
A more positive series of review blog posts (just for balance)

Introduction and disclaimers

In late October, Sams (the publisher) approached me to ask if I’d be interested in reviewing their newest introductory book on C#. Despite my burgeoning review stack, I said I was interested – I’m always on the lookout for good books to recommend. So, the first disclaimer is that this was a review copy – I didn’t have to pay for it. I don’t believe that has biased this review though.

Second disclaimer: obviously as C# in Depth is also "a book about C#" you might be wondering whether the two books are competitors. I don’t believe this is the case: Fluent C# explicitly talks about its target audience, which is primarily complete newcomers to programming. C# in Depth pretty much requires you to know at least C# 1, or perhaps be very comfortable with a similar language such as Java. I find it hard to imagine someone for whom both books would be suitable.

Obviously that puts me firmly out of the target audience. As I’ve written before, if you think the two most important questions to answer in a technical book review are "Is it accurate?" and "How good is it at teaching its topic?" then any one person will find it hard to answer both questions. Although I’m far from an expert in some of the areas of the book – notably WPF – I’m sure I don’t have the same approach as a true newcomer. In particular, I find myself asking the questions I’d need the answers to in order to develop software professionally: how do I test it? How does the deployment model work? How does the data flow? These aren’t the same concerns as someone who is coming to programming for the first time. This review should be read with that context in mind: that my approach to the subject matter won’t be the same as a regular reader’s.

Physical format and style

Fluent C# is very reminiscent of Head-First C# in its approach, even down to the introductory "why this book is great at teaching you" blurb. It’s all very informal, with lots of pictures, diagrams and reader exercises. It’s a chunky book, at nearly 900 pages including the index – which I’d expect to be pretty daunting to a newcomer. However, that isn’t the main impression you come away with. Instead…

It’s brown. Everywhere. The diagrams, the text, the pictures – they’re all printed in brown, on off-white paper.

Combined with using multiple fonts including cursive ones, this makes for a pointlessly irritating reading experience right from the outset, however good or bad the actual content is. Now it’s possible that this is actually deliberate: I was speaking to someone recently who mentioned some research that shows if you use a hard-to-read font in presentations, people tend to end up reading it several times, so you end up with better memories of the content than if it had been "clean". I don’t know if that’s what Sams intended with this book, but I frequently found myself longing for simple black ink on clean white paper.

Leaving that to one side, I’m not sure I’ll ever really be a fan of the general tone of books like this, but I can certainly see that it’s popular and therefore presumably helpful to many people. It’s not clear to me whether it’s possible to create a book which retains the valuable elements of this style while casting off the aspects which rub me up the wrong way. It’s something about the enforced jollity which just doesn’t quite sit right, but it wouldn’t surprise me if that were more a peculiarity of my personality than anything about the book. Again, I’ve tried to set this to one side when reviewing the book, but it may come through nonetheless.

Structure

The book is broken up into the following sections, with several chapters per section:

Getting started (122 pages – finding your way around Visual Studio, debugging, deployment)
The Language (100 pages – introduction to C#)
The .NET Framework Library (162 pages – text, date/time APIs, collections – and actually more about C# as a language)
Best practice (116 pages – inheritance, some principles, design patterns)
WPF (341 pages)

I’ve included the page count for each section to show just how much is devoted to WPF. The book goes into much more detail about WPF than it does about the C# language itself (for example, drop shadow effects are included, but the "using" statement and nullable value types aren’t). If you want to write any kind of application other than a WPF one, a large part of the book won’t be useful to you. That’s not to say it’s useless per se – and in fact from my point of view, the WPF section was the most useful. The section on brushes is probably the best written in the whole book, for example. At times it feels to me like the author really wanted to write a book about WPF, but was asked to make it one about C# instead. That may well not be the case at all – it was just an impression.

Even though the best practice section talks briefly about MVC, MVP and MVVM, it doesn’t really go into enough detail to build anything like a real application – and in fact there’s no coverage of persistence of any form. No files, no XML, no database – nothing below the presentation layer, really. As such, although the book claims it’s enough to get you started with application development, it actually only provides a veneer. Even though I didn’t like the first edition of Head-First C# back in 2008, it did at least take the reader end-to-end – the exercises led to complete applications. The best practice section isn’t entirely about architecture and design patterns, however – it’s at this point that inheritance is properly introduced. While I wouldn’t personally count that as a "best practice" as such, it does at least come at the start of the section, before the genuine patterns/architecture areas which would have been harder to understand without that background.

One aspect which concerned me was the emphasis on the debugger and interactive diagnostics. The author states that developers should expect to spend a large part of their time in the debugger, and says that she prefers using MessageBox.Show for diagnostics over Console.WriteLine information appearing in the output window. While I’m all for something more sophisticated than Console.WriteLine, there are solutions which are a lot less invasive than popping up a dialog, and which can be left in the code (possibly under an execution-time configuration) to allow diagnostics to be produced for real systems.
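As one possible illustration of diagnostics which can stay in the code, here is a rough sketch using System.Diagnostics.TraceSource. It is only one of several options (logging libraries would do the same job), and the names DiagnosticsDemo, Run and "OrderProcessing" are invented for the example. The level is passed as a parameter here purely to keep the sample self-contained; in a real system it would normally come from configuration.

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class DiagnosticsDemo
{
    // Returns whatever the trace source emitted at the given level.
    // Note: TraceSource calls are only compiled in when the TRACE symbol
    // is defined, which is the default in standard project templates.
    public static string Run(SourceLevels level)
    {
        var source = new TraceSource("OrderProcessing", level);
        var output = new StringWriter();

        // Replace the default listener with one writing to our StringWriter.
        source.Listeners.Clear();
        source.Listeners.Add(new TextWriterTraceListener(output));

        // Both calls stay in the code; the level decides which are emitted.
        source.TraceEvent(TraceEventType.Verbose, 0, "entering order processing");
        source.TraceEvent(TraceEventType.Warning, 0, "stock low");

        source.Flush();
        return output.ToString();
    }
}
```

Calling Run(SourceLevels.Warning) emits only the warning, while Run(SourceLevels.Verbose) emits both messages, with no dialog boxes and no code changes between the two.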

The "testing and deployment" chapter says nothing about automated tests – it’s as if the author believes that "testing" only involves "running the app in the debugger and seeing if it breaks". I hope that’s not actually the case, and I can understand why newcomers ought to at least know about the debugger – but I’d have welcomed at least a few pages introducing unit testing as a way of recording and checking expectations of how your code behaves. My own preference is to spend as little time in the debugger as possible; I know that’s not always practical, particularly for UI work, but I think it’s a reasonable aim.

Accuracy

Anyone following me on Twitter or Google+ knows where I’m going with this section. After reading through the book, pen in hand (as I always do, even for the books I like), I decided that it was more important to get out some form of errata quickly than this review. As such, I started a Google document which is publicly available to read and add comments to. The result is over 60 pages of notes and errata, and that’s excluding the introduction and table of contents. To be fair to the book, some of those notes are matters of disagreement which are more personal opinion than incontrovertible fact – but there are plenty of simple matters of inaccuracy. Some of the worst are:

Claims that String is a value type. (It’s a reference type.)
Inconsistency between whether arrays are value types or reference types – but consistently claiming that arrays are immutable, with the exception of the size which can be changed (slowly) using Array.Resize. (Array types are always reference types, and they’re always mutable except the size, which is fixed after creation. Array.Resize creates a new array, it doesn’t change the size of the existing one.)
Incorrect syntax for chaining from one constructor to another.
The claim that all reference types are mutable. (Some aren’t, and indeed I often aim for immutability. The canonical example of an immutable reference type is String.)

There are plenty more – including a huge number of samples which simply won’t compile. Whole double page spreads where every class declaration is missing the "class" keyword. Pieces of code using VB syntax… the list goes on. (The VB syntax errors are probably explained by the author’s other book published at the same time: "Fluent Visual Basic". I suspect there was a certain amount of copy/paste, and the editing process didn’t catch all the changes which were needed to reflect the differences between the languages.)
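For reference, the corrections to the four claims listed above can be checked directly. The following is a small sketch rather than anything from the book or the errata; GridPoint and TypeFactsDemo are hypothetical names chosen for the example.

```csharp
using System;

public sealed class GridPoint
{
    public int X { get; private set; }
    public int Y { get; private set; }

    // Correct constructor chaining syntax: ": this(...)" before the body.
    public GridPoint() : this(0, 0)
    {
    }

    public GridPoint(int x, int y)
    {
        X = x;
        Y = y;
    }
}

public static class TypeFactsDemo
{
    // String is a reference type (and an immutable one).
    public static bool StringIsValueType()
    {
        return typeof(string).IsValueType;
    }

    // Array types are always reference types.
    public static bool ArrayIsValueType()
    {
        return typeof(int[]).IsValueType;
    }

    public static bool ResizeCreatesNewArray()
    {
        int[] original = { 1, 2, 3 };
        int[] alias = original;

        // Arrays are mutable: the change is visible through both references.
        original[0] = 42;

        // Array.Resize allocates a new array and reassigns the variable;
        // the existing array object keeps its original length.
        Array.Resize(ref original, 5);

        return !ReferenceEquals(original, alias)
            && alias.Length == 3
            && original.Length == 5
            && alias[0] == 42;
    }
}
```

The first two methods return false and the third returns true, matching the parenthesised corrections in the list.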

Beyond the factually incorrect statements, there’s the matter of terminology. Now I’m well aware that I care more about terminology than most people – but there’s simply no reason to start making up terminology or misusing the perfectly good terminology from the specification. The book has a whole section on "commands" in C#, including things like for statements, switch statements, try/catch/finally statements. Additionally, it mislabels class and namespace declarations as "statements", and even mislabels using directives as statements – although it later goes back on the latter point. The word "object" is used at various times to mean any of variable, type, class and object, with no sense of consistency that I could fathom. For example, at one point it’s used in two different senses within the same sentence: "As we’ll see, you can define several different kinds of objects (called TYPES) in C#, but the one you’ll probably work with most often is the OBJECT."

Both accuracy and staying consistent with accepted terminology (primarily the specification) are particularly important for newcomers. If there’s a typo in a relatively advanced book – or in one which is about a particular technology (e.g. MVC) rather than an introductory text on a language, the reader is fairly likely to be able to guess what should really be there based on their existing experience. If a beginner comes across the same problem, they’re likely to assume it’s their fault that the code won’t compile. Likewise if they learn the wrong terminology to start with, they’ll be severely hampered in communicating effectively with other developers – as well as when reading other books.

I don’t want to make it sound like I expect perfection in a book – just yesterday someone mailed me a correction to C# in Depth, and I’d be foolish to try to hold other authors to standards I couldn’t meet myself. Nor am I suggesting it’s easy to be both accessible and accurate – so often an author may have an accurate picture of a complex topic, but have to simplify it in their writing, particularly for an introductory book like Fluent C#. But there are limits – and in my view this book goes well past the level of error that I’m willing to put up with.

Conclusion

I really don’t like ranting. I don’t like sounding mean – and I wanted to like this book. While I like C# 4.0 in a Nutshell and Essential C# 4.0, I’m still looking for a book which I can recommend to readers who want a more "lively" kind of book. Unfortunately I really can’t recommend Fluent C# to anyone – it is simply too inaccurate, and I believe it will cause confusion and instil bad habits in its readers.

So, what next? I’m hoping that the publisher and author will take my errata on board for the next printing, and revise it thoroughly. At that point I still don’t think I’d actually like the book due to its structure and WPF focus (and the colour scheme, which I don’t expect to change), but it would at least be more a matter of taste then.

I have some reason to be hopeful – because my review of Head-First C# was somewhat like this one, and one of the authors of that book (Andrew Stellman) was incredibly good about the whole thing, and as a result the second edition of Head-First C# is a much better book than the first edition. Again, it’s not quite my preferred style, but for readers who like that sort of thing, it’s a much better option than Fluent C# at the moment, and one I’m happy to recommend (with the express caveat of getting the second edition).

At the same time, reading Fluent C# (and particularly thinking about its debugger-first approach) has set me something of a challenge. You see, I’ve mostly avoided writing for new programmers so far – but I feel it’s really important to get folks off on the right foot, and I’d like to have a stab at it. In particular, I would like to see if it’s possible to write an introductory text which teaches C# using unit tests wherever possible… but without being dry. Can we have a "fun" but accurate book, which tries to teach C# from scratch without giving the impression that user interfaces are the be-all and end-all of programming? Can I write in a way which is more personal but doesn’t feel artificial? I can’t see myself starting such a project any time in the next year, but maybe some time in 2013… Watch this space. In the meantime, I’ll keep an eye out for any more introductory books which might be more promising than Fluent C#.

Book reviews, Books, C#

Book Review: Effective C# (2nd edition) by Bill Wagner

September 25, 2010 jonskeet 10 Comments

Disclaimer

Just in case you’re unaware, I’m the author of another C# book, C# in Depth. Although Effective C# is somewhat different to my book, they certainly share a target audience. To that extent, Bill and I are competitors. I try hard to stay unbiased in reviews, but it’s probably impossible. Bear this in mind while reading. I should also note that I didn’t buy my copy of Effective C#; it was kindly sent to me by Pearson, for the purpose of reviewing.

Content and target audience

Effective C# is a style guide for C# developers – but not at the low level of "put your braces here, use PascalCase for method names;" instead, it’s at the design level. As far as I can tell, the aim isn’t to be complete, just to cover the most important aspects of style. (Hey, otherwise there wouldn’t be any need for More Effective C#, right?) There are 50 mostly-self-contained items, totalling about 300 pages to digest – which is a nice size of book, in my opinion. It’s not daunting, and the items can definitely be bitten off one at a time.

Looking down the table of contents, the items are divided into six categories: "C# language idioms", ".NET resource management", "Expressing Designs in C#", "Working with the Framework", "Dynamic Programming in C#", and "Miscellaneous". Broadly speaking these contain the sorts of thing you’d expect – although it’s worth pointing out that a significant chunk of "Working with the Framework" is given over to Parallel Extensions, which may not be obvious from the title. (It’s a really good source of information on PFX, by the way.)

This is not a tutorial on C#. If you don’t know C# reasonably well already (including generics, lambda expressions and so on) you should read another book first, and then come back to Effective C# in order to get the most out of it.

Comment from Bill: generics and lambda expressions (and LINQ) are covered in some detail in More Effective C#. It’s a bit strange that as of the 2nd edition, Effective C# covers a newer version of the language than More Effective C#. I tried hard to make sure neither book expects a reader to have read the other, but the organization of both books as a whole does show the hazards of hitting a moving target.

That’s not to say that there’s no explanation of C# – for example, Bill goes into a few details about the "dynamic" type from C# 4, as well as overloading and how optional parameters work. But these are meant to just cover some poorly-understood (or very new) aspects of the language, rather than teaching you from the beginning. The balance here feels just right to me – I believe most professional C# developers will learn details of C# they weren’t aware of before, but won’t be confused by the basics that Bill left out.

Accuracy, opinion and explanation

My copy of Effective C# has plenty of ink annotations now. They broadly fall into five categories:

"Ooh, I’d never thought of that" – aspects of C# which were genuinely new to me
"Hell, yes!" – things I agree with 100%, and which will help developers a lot
"Um, I disagree" – points where Bill and I would probably go different routes, presumably due to different experiences and having worked in different contexts. (It’s possible that when put in the same context, we’d do the same thing, of course.)
"No, that’s technically incorrect" – a few areas which are outright wrong, or were correct for previous versions of the framework/CLR, but aren’t correct now
"That’s not what that term means" (or "that’s not the right term for the concept you’re trying to get across") – it should come as no surprise to regular readers that I’m a bit of a pedant when it comes to terminology

The majority of my annotations are of the third category – disagreements. That’s not because I disagree with most of the book; it’s just that the second category is reserved for vehement agreement. I haven’t bothered to note every sentence that I’m just fine with.

The good news is that in areas where we disagree, Bill does an admirable job of stating his case. I disagree with some of his arguments – or I can give counter-examples, or merely place different value on some of the pros and cons – but the important thing is that the reasoning is there. If it doesn’t apply to your context, evaluate the advice accordingly.

It’s entirely reasonable for there to be quite a bit of disagreement, as much of the book is opinion. It’s obviously founded in a great deal of experience (and I should note that Bill has spent a lot more time as a professional C# developer than I have), but it’s still opinion. I rather wish that the book was a wiki, so that these items could be debated, amended etc, as per my dream book – I think that would make it even more valuable.

There are relatively few absolutely incorrect statements, and even on the terminology front it’s usually two things which have bugged me repeatedly. Bill uses "implicit properties" for "automatically implemented properties"; I’ve usually heard developers use the abbreviated form "automatic properties" but "implicit" is new to me. Likewise the book talks about "overriding ==" instead of "overloading ==" reasonably frequently. It’s a shame I was too busy with my own book to participate in the technical review for Effective C#, as I suspect that on at least some of these points, Bill would have been happy to amend the text accordingly. I shall, of course, transcribe my comments and send them to him.
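To show why the overriding/overloading distinction is more than pedantry, here is a short sketch. Money and EqualityDemo are hypothetical types written for this example, not code from Effective C#. Equals is overridden, so the implementation is picked at execution time; == is overloaded, so the implementation is picked at compile time from the static types of the operands.

```csharp
using System;

public sealed class Money : IEquatable<Money>
{
    // An automatically implemented property: the compiler supplies the field.
    public decimal Amount { get; private set; }

    public Money(decimal amount)
    {
        Amount = amount;
    }

    // Overriding: replaces the virtual method inherited from object.
    public override bool Equals(object obj)
    {
        return Equals(obj as Money);
    }

    public bool Equals(Money other)
    {
        return !ReferenceEquals(other, null) && Amount == other.Amount;
    }

    public override int GetHashCode()
    {
        return Amount.GetHashCode();
    }

    // Overloading: operators are static, so they cannot be overridden.
    public static bool operator ==(Money left, Money right)
    {
        if (ReferenceEquals(left, null))
        {
            return ReferenceEquals(right, null);
        }
        return left.Equals(right);
    }

    public static bool operator !=(Money left, Money right)
    {
        return !(left == right);
    }
}

public static class EqualityDemo
{
    // Static types are Money, so the overloaded operator is used.
    public static bool OverloadUsed()
    {
        Money a = new Money(5m);
        Money b = new Money(5m);
        return a == b;
    }

    // Static types are object, so this is a plain reference comparison.
    public static bool OverloadBypassed()
    {
        object a = new Money(5m);
        object b = new Money(5m);
        return a == b;
    }

    // The overridden Equals is still found via virtual dispatch.
    public static bool OverrideUsed()
    {
        object a = new Money(5m);
        object b = new Money(5m);
        return a.Equals(b);
    }
}
```

OverloadUsed and OverrideUsed return true, while OverloadBypassed returns false: exactly the kind of difference that the wrong term obscures.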

Comment from Bill: I’ll make those corrections for the subsequent printings.

What’s missing?

There are some areas which I wish Bill had touched on or emphasized more. Topics such as numbers, text and chronological values could have been given some space as they frequently confuse folks (and are full of pitfalls; see Humanity: Epic Fail for more of my thoughts on this). I would personally have placed more importance on the mantra of "value types should be immutable" – it’s certainly talked about, but in the context of "preferring" atomic, immutable value types – and preferring value types over reference types in rather more situations than I’d personally use. In terms of importance in avoiding shooting yourself in the foot, making sure all structs are immutable comes near the top of the list in my view.
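As a rough illustration of the foot-shooting that mutable structs invite, consider the following sketch. MutableCounter and MutableStructDemo are invented for the example. In both methods the mutation compiles without complaint, but is applied to a copy of the value and then silently lost.

```csharp
using System;
using System.Collections.Generic;

// A deliberately mutable struct, to show the problem.
public struct MutableCounter
{
    public int Count;

    public void Increment()
    {
        Count++;
    }
}

public static class MutableStructDemo
{
    private static readonly MutableCounter ReadOnlyCounter = new MutableCounter();

    public static int IncrementReadOnlyField()
    {
        // Calling a method on a readonly field of a value type
        // operates on a temporary copy of the value.
        ReadOnlyCounter.Increment();
        return ReadOnlyCounter.Count;
    }

    public static int IncrementListElement()
    {
        var list = new List<MutableCounter> { new MutableCounter() };

        // The indexer returns a copy, so the element in the list is untouched.
        list[0].Increment();
        return list[0].Count;
    }
}
```

Both methods return 0 rather than 1. With an immutable struct the question never arises, because any "change" has to produce a new value which the caller must explicitly store.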

"More Effective C#" doesn’t cover those areas as far as I can tell from the table of contents, but it does go into details about generics and various aspects of C# 3 and LINQ, which are clearly part of any modern C# developer’s toolkit. I certainly intend to get hold of the book to see what else I have to learn from Bill.

I think it might have been nice to have a few sections at an even higher level than the specific items in Effective C#. Topics such as:

Don’t trust me: I don’t know your context. Even the smartest folks can only give advice in fairly general terms in books. Don’t apply this advice blindly; weigh up the arguments presented and work out how they apply to your actual code.
The importance of testing. It’s possible that this was mentioned, but I don’t recall it. Perhaps it’s a little on the opinionated side (see previous point…) but for significant code bases, testing should be deeply ingrained in whatever process you’re using. Note that although it’s worth trying to keep high standards in test code, it often has a very different look and feel to production code, and different "best practices" may apply.
Encouraging "working with the language" – if you find yourself fighting the language, you may discover you can keep winning battles but losing the war. Typically changing your design to represent more idiomatic C# will make life more pleasant for everyone.
Performance: how you might decide when and how much to care.

Very few of these would be C#-specific, of course – which may be why Bill left them out. You could easily fill a whole book like that, and it would probably be horrible to read – full of platitudes rather than concrete advice. I personally think there’s room for some discussion of this kind though.

Comment from Bill: The ultimate goal was to have the book be that ‘nice size’ you mention above. I agree that all of those concepts are important. I felt that many of these features were not C# specific (or even .NET specific) that I felt better covered elsewhere. However, that ‘working with the language’ was one area where I feel that I do cover. There are only a small number of negative titles (e.g "avoid" something or "do not" do something). In those cases, I tried to recommend alternatives where you would find yourself "working with the language".

Conclusion

I like Effective C# a lot. I like the fact that I’ve disagreed with a number of the points raised, and in disagreeing I’ve found myself thinking about why I disagree and in what situations each point of view may be appropriate. I worry a little about inexperienced readers who may be tempted to treat this (or any other book) as an ultimate truth to be quoted out of context and used to beat other developers into inappropriate solutions… but hopefully most readers won’t be like that.

Comment from Bill: I also hope that most readers avoid that. Thank you for pointing out that I’ve tried very hard to explain when my advice applies, and when it doesn’t. That is critical.

It’s definitely encouraged me to try to write a somewhat similar book at some point… possibly not with the same organization, and probably dealing with some very different topics – but it’s good to see that it can work. Whether I can pull it off as well as Bill remains to be seen, of course.

I’ll look forward to reading More Effective C# – although my pile of books to review is groaning somewhat, and I should probably go through another of those first :)

Book reviews, Speaking engagements

Book review: “Confessions of a public speaker” by Scott Berkun

May 9, 2010 jonskeet 11 Comments

Introduction

A couple of weeks ago I was giving a presentation on Reactive Extensions at the VBUG 4Thought spring conference, and there was an O’Reilly stand. I picked up CLR via C# 3rd edition (I now have all three editions, which is kinda crazy) and I happened to spot this book too.

I’ve been doing a reasonable amount of public speaking recently, with more to come in the near future (and local preaching roughly once a month), so I figured it would probably be a good idea to find out how to actually do it properly instead of bumbling along in the way I’ve been doing so far. This looked as good a starting point as any.

It’s been a while since I’ve had a lot of time for reading, unfortunately – C# in Depth is still sucking my time – but this is a quick read, and I finished it on the plane today. I should point out that I’m currently flying to Seattle for meetings in the Google Kirkland office. The book itself is in the overhead locker, so obviously I could reach it down – but I’d rather not. Surely a book like this should at least largely be judged by the impression it makes on the reader; if I couldn’t find enough to talk about when I only finished it a few hours ago, that would be a bad sign. It does mean that I’m not going to be as concrete in my notes as I would usually be – but that’s probably reasonably appropriate for a non-technical book anyway.

Content

The book covers various different topics, from preparation to delivery and evaluation. The book is clearly divided into chapters – but a lot of the time the topics seem to leak into chapters other than the ones you might expect them to crop up in. If this were a technical book, I would view that as a bad thing – but here, it just worked. In some ways the book mirrors an actual presentation in terms of fluidity, narration and imagery. Sometimes this is explicitly mentioned, but often in retrospect and never in a self-aggrandising manner.

Although steps for designing your overall talk are examined, there’s little guidance on how to design slides themselves: that’s delegated to other books. (I’m reasonably happy with my slide style at the moment, and as it’s somewhat uncommon, it may well not benefit much from “conventional wisdom” – there are plenty of bigger picture things I want to concentrate on improving first, anyway.)

There are suggestions for audience activity – from moving people to the front to make an underpopulated venue feel more friendly, to trying to make the audience actively use what they’ve been told for a better learning experience. While I’d like to think I’m a pretty friendly speaker, I could definitely improve here.

While there are some mentions of financial matters, there’s no discussion of getting an agent involved, or what kind of events are likely to be the most lucrative and so on. There is the recommendation that you either need to be famous or an expert to make money – which sounds like very good advice to me. I have no particular desire to go into this for money (and I think I have to speak for free under my current contract with Google) so this was fine by me.

Anecdotes abound: they’re part of the coverage of pretty much every topic. At the end there’s a whole section of gaffes made by Scott and other speakers, as a sort of “you think you’ve had it bad?” form of encouragement. There’s never a sense of the stories being inserted with a crowbar, fortunately – that’s a trait which usually annoys me intensely in sermons.

Evaluation

As you can probably tell already, I liked the book a lot. Scott is a good writer, and I strongly suspect he’s a great presenter too – I’ll be looking out for his name in conferences I’m going to, with the hope of hearing him first hand.

The real trick is actually applying the book to my own speaking though. It would be hard to miss one central point where I fail badly: practising. I simply don’t go through the whole “talking in the living room” thing. For a couple of talks I’ve gone through a dry-run at Google first as a small-scale tech talk, but usually I just put the slides (and code) together, make sure I know roughly what I’m going to say on each topic, and wing it. Assuming the book is accurate, this puts me firmly in the same camp as most speakers – which is somewhat reassuring, but doesn’t actually make my talks any better.

So, I’m going to try it. I’m expecting it to be hugely painful, but I’ll give it a go. I feel I somehow owe Scott that much – even though he makes it very clear that he expects most readers not to bother. Possibly putting it as a sort of challenge to exceed expectations is a deliberate ploy. More seriously, he convincingly makes the point that I owe the audience that much. We’ll see how it goes.

There are plenty of other aspects of the book which I expect to put to good use – particularly in terms of approaching the relevant topic to start with, writing down a big list of possible points, and whittling it down. I’m not going to promise to write a follow-up post trying to work out what’s helped and what hasn’t… I know perfectly well that I’d be unlikely to get round to writing it.

Conclusion

If you speak in public (which includes “internal” talks at work) I can heartily recommend this book as an entertaining and potentially incredibly helpful read.

We’ll see what happens next…

Book reviews

Non-review: The Data Access Handbook by John Goodson and Robert A. Steward

July 10, 2009 jonskeet Leave a comment

A while ago I agreed to write a review of this book (which the publisher sent me a free copy of) but I haven’t had time to read it fully yet. I’ve been skimming through the first couple of chapters though, and it’s pretty interesting. I’ll post a full review when I have more time (along with reviews of CLR via C# and a bunch of other books) but I thought it would be at least worth mentioning the book in advance.

It’s really a performance book – as far as I can tell that’s its sole purpose (and I’m lumping scalability in with performance) which is fine. It covers some generalities and then splits by client technology (ODBC, JDBC, .NET) for the middle section. The final chapters are general-purpose again.

I’m loath to say much more about it yet, having only read a small amount – but I’ll definitely be reading the rest. It’s unlikely to be particularly useful in my current job, but you never know – one day I may be talking to a regular SQL database again :)

Book reviews, C#, Parallelization

Book Review: C# 2008 and 2005 Threaded Programming: Beginner’s Guide

March 16, 2009 jonskeet 27 Comments

Note: The author of this book has requested that I remove their name from this blog post. I have done so in accordance with their wishes, editing comments as well.

Update (19th March 2009)

Debate around this review is getting heated. I stand by all the points I make about the text, but I’d like to clarify a few things:

If there are any ad hominem comments in the review against the author, please ignore them. I’m going to try to weed out any that I find, but if you spot one, please let me know and then ignore it. I feel very strongly that a review should be about the text of a book, not about its author. The text is what will inform the reader, not the author’s other work. I’m aware that the author has written many other books, and is generally well-regarded (as far as I can tell, anyway). That neither helps nor hinders the text. The same goes for me and the review, of course. Whether you know me or not, whether you’ve read anything else I’ve written or not, the review should stand on its own merits. This is not a popularity contest – it’s a discussion about a technical book.
The impression I’ve given in the review is almost entirely negative. This is because that’s the impression I received as a reader interested in accuracy and best practices. That does not mean that the book is entirely inaccurate – far from it. There are plenty of aspects where I have no particular issues with the accuracy. (The code style is more uniformly disagreeable to me, but that’s a subjective matter.) However, there is enough inaccuracy (and bad practice, in my view – somewhat subjective, but less so than the code style) to make that the dominant impression left with me, alongside my surprise that there’s no proper discussion of locking. As an analogy, imagine you go to a choral concert. Suppose the sopranos, altos and tenors are all perfectly in tune, but the basses are out of key the whole time. In some senses the concert would be 75% accurate – but the 25% inaccuracy would be enough to ruin it. So it is with technical books (not just this one) – it only takes a relatively small degree of inaccuracy to make the difference between a good book and a bad one. The bottom line is: even I don’t think everything or even most of what’s written in the book is wrong; there are enough problems to make me dislike it though.
I’ve made a few minor edits to the review just now, to address a few comments made so far. If some of the comments appear to be odd, that may be why!

Resources

Publisher’s page (Packt) – this is the cheapest way to buy the book as far as I can see
Sample code (49MB download! Mostly because it contains bin/obj directories for all solutions…)
Amazon or Barnes and Noble links if you don’t want to buy it directly
John Mueller’s review – a much more positive review than this one, which may prove an interesting counterbalance for readers. (Thanks to Erik for pointing out John’s review in the comments.)

Disclaimer

This book doesn’t really compete with C# in Depth, but obviously the very fact that it’s another book about C# at all means I’m probably not entirely unbiased. Arguably it also “competes” with my own (somewhat out of date now) threading article, although that’s not a monetary venture for me. I should also point out that my copy was sent to me for free, specifically for review, by Packt Publishing.

Audience and content

The book claims that “Where you are a beginner to working with threads or an old hand who is looking for a reference, this book should be on your desk.” In practice, I don’t think it’s really suitable as a reference. The kind of information you really want as a reference is hard to find amidst the bulk of the book, which is on-going examples. For the rest of this review I’ll regard the intended audience as just beginners.

The first chapter (out of 12; at 388 pages one of the nice things about this book is that it’s relatively slim) is introductory material about threads and processes, and why concurrency is important in the first place. After this one code-free chapter, the rest of the book is all example-based. The pattern goes something like this:

Give rough idea of what we’re building
Create first version of the code
Explain what it does and why it may not be ideal
Improve it
Explain how the improvements work
Move to next example or add new major feature

That sounds all very well, but I’ll get to my issues with it in a minute. Although the examples are constantly evolving, they essentially break down into these applications:

“Code cracking” (brute-forcing a 4 character code)
“Encrypting” SMS messages (not real encryption – no key – but a general CPU-intensive transformation)
Image processing to find and highlight “old stars” in NASA images
“Encrypting” several files
More image processing – adjusting the brightness of a large image and thumbnailing it

These are all Windows Forms applications, and are frankly pretty similar, all basically dealing with simple, embarrassingly parallel tasks. That’s not to say the author doesn’t get a fair set of different techniques and lessons out of them:

Keeping the UI thread free (and seeing what happens when you don’t)
Tips for debugging multi-threaded apps in Visual Studio
Showing the performance for individual processes using Task Manager and Process Explorer
Using BackgroundWorker to update the UI
Queuing tasks in the system thread pool
Creating new threads explicitly
Using Control.Invoke/BeginInvoke to update the UI (although this comes very late in the book – chapter 10 out of 12)
Keeping tasks independent
Noting that sharing data between threads is difficult – but coming to the wrong conclusions (more later)
Using the Timer component (just the WinForms timer; not System.Threading.Timer, System.Web.UI.Timer or System.Timers.Timer) – although later on he uses a BackgroundWorker for a task much more suitable for a Timer.
A bit of OO design, although in a pretty botched way – the idea of having a general-purpose “parallel algorithm” class and a “parallel algorithm piece” class is reasonable, but it isn’t handled nearly as well as it might be
Fairly disastrous advice (IMO) about both I/O and the GC
Exception “handling” (where “swallowing exceptions and just reporting them with Debug.Print” counts as “handling” apparently)
Parallel Extensions from .NET 4.0, with both PLINQ and TPL

Unfortunately, this misses out some of the most important concepts in parallelism on .NET. The author frequently mentions locking, but only ever in a “we’re avoiding doing it” way. I find it absolutely incredible that a book on multi-threading in C# doesn’t even mention the “lock” keyword. Okay, it’s nice to be able to split tasks up completely independently where possible, but in the real world you sometimes have to use shared mutable state (or at least, it’s often the simplest approach).

When I first got the book, I looked up several entries in the index to see how they’d be handled. I was shocked to find that none of these have an index entry:

lock
volatile
memory model
Monitor
Wait or Pulse
BeginInvoke or Invoke
double-checked locking
mutable or immutable

The concept of accessing state from multiple threads is glossed over for the entirety of the book. Basically whenever multiple threads want to make their results available, they put them in different elements of an array or list. There’s an assumption that if you read from that array/list in a different thread, it’s all okay. Likewise there’s an assumption that it’s appropriate to read integer variables written to in one thread from another thread without any locking, volatility or use of the Interlocked class. I’ll come back to this topic when I tackle accuracy later on.
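
To make the concern concrete, here is a minimal sketch (not from the book; the class and member names are invented for illustration) of two conventional ways to update shared state from several threads safely: the lock statement and the Interlocked class. It assumes a runtime where Volatile.Read is available (.NET 4.5 or later).

```csharp
using System;
using System.Threading;

// Hypothetical example: names are illustrative, not taken from the book.
public static class SharedCounterDemo
{
    private static readonly object Padlock = new object();
    private static int lockedTotal;
    private static int interlockedTotal;

    // Runs threadCount threads, each incrementing both counters
    // "iterations" times.
    public static void Run(int threadCount, int iterations)
    {
        lockedTotal = 0;
        interlockedTotal = 0;
        var threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++)
        {
            threads[i] = new Thread(() =>
            {
                for (int j = 0; j < iterations; j++)
                {
                    // Option 1: mutual exclusion via the lock statement.
                    lock (Padlock)
                    {
                        lockedTotal++;
                    }
                    // Option 2: an atomic read-modify-write.
                    Interlocked.Increment(ref interlockedTotal);
                }
            });
            threads[i].Start();
        }
        foreach (Thread thread in threads)
        {
            thread.Join();
        }
    }

    // Reads go through the same lock (or a volatile read) as the writes.
    public static int LockedTotal
    {
        get { lock (Padlock) { return lockedTotal; } }
    }

    public static int InterlockedTotal
    {
        get { return Volatile.Read(ref interlockedTotal); }
    }

    public static void Main()
    {
        Run(4, 100000);
        Console.WriteLine(LockedTotal);
        Console.WriteLine(InterlockedTotal);
    }
}
```

With either approach both totals should come out as threadCount multiplied by iterations; a bare `total++` shared between threads gives no such guarantee.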

Style

This is a very informal book: something I have no problem with. English clearly isn’t the author’s first language, and although I don’t blame him for some of the clumsy wording in the book (e.g. “We will not leave behind the necessary pragmatism in order to improve performance within a reasonable developing time”) I do wish the book’s editorial team had done a better job in that respect. It’s tricky with technical books: non-technical editors have good reason to be wary of going too far, as small changes in wording can make a large difference semantically, but it does make a big difference to a book’s readability when the language is clear and idiomatic. (As a side note, I feel incredibly fortunate to have English as my native tongue. I’m not fluent in any foreign languages, and I’m often amazed at how well others manage.)

There are other elements of the style of the book which I have much more of a problem with. The first is the way that the examples are handled. A very large proportion of the book is just lists of instructions: “add some using directives: <code>; add these variables: <code>; add this procedure: <code>; add another procedure <code>; add an event handler <code>” with just a sentence or two of explanation for each one as you go. There’s much more explanation after all the code has been added, but the way that the code is given makes it very hard to see what’s going on. We almost never get to see a whole class in one listing – it’s always broken up into using directives, variables, individual methods etc. This may not be too bad if you’re following along with the book at every single point, but it makes it very hard to just read. As a friend has commented, this content might work a lot better as a screencast, rather than as a book.

One detailed gripe: nearly every time a property is introduced, the author uses the phrase “we want to create a compact and reliable class” as a justification. There’s no explanation, and quite often the properties are mutable for no good reason (when a genuinely reliable class in a multi-threaded setting would be immutable). After a while it made me want to grind my teeth every time I saw it.

The feeling is very much that of a Head First book, but one which doesn’t work. For all my misgivings about Head First C# (which I believe is very much better now that a large number of errors have been removed) the general style was very well handled. It’s not my preferred style to start with (particularly focusing on large GUIs instead of short, complete console apps) but I rarely felt particularly lost in the listings – there was usually enough context to hold onto. Here, I feel there’s very little context at all. If you accidentally miss out a step, you’ll have a really hard time working out which one it is or what’s wrong.

On top of this, there’s the bizarre storyline “explaining” all the listings. Apparently you (the reader) originally started out cracking a code, then got hired by some other crackers, then the FBI, then NASA. We are told of FBI agents getting us cappuccinos, the NASA CIO wanting you to use the Parallel Extensions CTP so that they can get free licences for Visual Studio 2010 and all kinds of other oddities. We are constantly bombarded with plaudits about our threading capabilities – by the last chapter we’re regularly being called “experts” and “threading gurus” despite the fact that we wouldn’t have a clue what was going on if someone presented us with some code using a “lock” statement. This is all patronising in the extreme – and again, Head First C# (and I suspect the rest of the Head First series) handles the “keep it informal but drive the topic forward” aspect a lot more successfully.

Finally, on the topic of style, I’d like to rant a bit about the coding style. It’s awful. Really awful. I realise that coding standards are to some extent a personal thing, but I object to code like this:

Pseudo-Hungarian (the type which uses “o” as a prefix for almost any object; not the type Peter likes) and the nature of every variable (local, parameter or instance variable) makes for horrendous variable names such as “prloOutputCharLabels”. It’s not even consistent – variables added by the designer only get a type designation prefix (lbl, but, pic) but no nature prefix. Aargh.
Methods are frequently camel-cased instead of Pascal-cased, e.g. “showFishes” and “checkCodeChar”. It’s possible that this is only true for private methods – a very quick flick through doesn’t reveal any public ones like this – but if so it’s inconsistently applied as there are certainly Pascal-cased private methods too. Some public properties combine both annoyances so far, with names such as “poThread” and “piBegin”.
Most (but not all) of the time the author declares all of a method’s variables at the top, even if they’re not used for a long time. This includes declarations of variables for use in loops. This took me right back to the 80s, writing ANSI C again. I believe that the ability to declare variables at the point of first use gives a significant improvement in readability. It’s easier to see where a variable will be used if its scope is limited, for example.
Using directives aren’t applied nearly thoroughly enough, leaving lots of explicit use of System.Diagnostics, System.Drawing, System.ComponentModel etc. Given the line length limitations in printed books, this is a real killer in terms of providing compact, readable code.
Speaking of line length limitations, it would be really useful to actually acknowledge them – if a comment is going to span two printed lines, starting just the first one with “//” and leaving the second indented but not really a comment isn’t a good idea.

So, we’ve got code broken up into chunks which breaks the flow of the code, and I don’t even like the style of the code. Still, I could live with that if it’s good quality code…

Accuracy and best practices

I’ve already indicated one of the significant problems I have with the book in terms of content: its complete absence of discussion about shared data and locking. Yes, this is a beginner’s book, and I wasn’t expecting the level of detail on the memory model which is present in Joe Duffy’s book (which I promise I’ll review soon) – but I’d certainly prefer to err on the side of safety. The book regularly just accesses data on one thread having written it on another, with no locking, volatility or use of Interlocked. This isn’t the sole bad practice, however, and it’s not limited to stylistic choices either. In the course of the book, we are told all of the items below (and more). Italics indicate what the book claims; regular type indicates my response. These aren’t verbatim quotes, but paraphrase:

Forcing garbage collection before starting a multi-threaded operation is a good idea. This is given as a sort of response to a screenshot of Process Explorer showing ugly memory usage. In fact, I can’t reproduce the kind of nasty graph that’s shown in the book, even with the code downloaded from the web site, but if I did see that, there are definitely better ways of addressing it than forcing garbage collection. Disposing of Bitmaps appropriately would be a good start… as it is, each bitmap is going to hang around for at least one garbage collection cycle longer than it should, because we’ve got to wait for its finalizer to be executed. Making sure you dispose of objects appropriately is always a good idea – explicitly forcing the garbage collector is almost always a bad one. (Not absolutely always, but usually.)
WaitHandle.WaitAll has to run on an MTA thread – so let’s just change the [STAThread] line above the Main method to [MTAThread], with no warning that it’s a really bad idea to do that for Windows Forms. (Side-note: when trying to check that there really isn’t a warning, I had to spend a long time finding the section. The index doesn’t contain entries for MTAThread, STAThread, apartment, WaitHandle or WaitAll. In general the index could do with a lot of work. I’m painfully aware that indexing is a horrible task, but it’s important.)
Application.DoEvents() is a way of letting the UI process events. This is true – it’s also another really bad idea unless you absolutely have to use it. Re-entrancy is hard to debug – and not mentioned at all in the book, as far as I can tell.
Data streaming is wasteful, because two threads might both want to do I/O at the same time – it’s a better idea for each thread to load all the data it needs to and then start processing it. This is stated in a context where streaming is ideal – each thread just needs to process every line in a file. (Each thread is asked to process a different file.) There’s no dependency between the lines of the file. It’s an absolute gift – the buffering and pre-fetch techniques of Windows would guess we needed the next block of data before we ask for it, so the disk would be seeking while we’re encrypting, on each thread. At least, I strongly suspect it would – and I would profile the thing instead of just claiming that we’ve managed to avoid an I/O bottleneck by loading files in their entirety up-front. No mention is made of the fact that as soon as a bunch of big files are queued for encryption, you’ll have a bunch of threads all trying to load everything before they bother starting to do anything. Avoiding I/O contention is a tricky topic, and it deserves better than a couple of misleading paragraphs with no attempt at explaining what the benefits of streaming the data would be.
The thread pool is used to queue threads with work to do. If there are already lots of threads busy, the new threads will wait until the old ones have finished. Note the use of “threads” here – not tasks to run on a pool of existing threads, but threads. This would make the thread pool pointless – what is never explained in the book is that creating threads is a relatively expensive business, and you don’t want to do it repeatedly for short-lived tasks when you could instead create a pool of threads and reuse them to run several tasks. Once this purpose is clear, the notion of queuing threads becomes obviously wrong.
We can pass some state into the delegate used for work item queuing (or a ParameterizedThreadStart) and use that to give us some context. We need to cast that state to the relevant type before we can use it, because it’s just typed as System.Object. So far so good – except most of the time, the author ends up passing into the work item the same reference which would be available as just “this” within the method itself. So we have code such as:

loPiece.poThread = new Thread(new ParameterizedThreadStart(loPiece.ThreadMethod));
…
loPiece.poThread.Start(loPiece);

The ThreadMethod method then duly casts its parameter to its own type and uses it. All of this is pointless, as the method doesn’t need any parameters – it can just use “this” inside the method.
It’s very important to initialize lists with the right capacity. Again, this isn’t too bad as far as it goes – except that this micro-optimisation goes awry when he reads the TextBox.Lines property twice: once to work out the appropriate capacity and once to fill the list with initial data. Unfortunately the TextBox.Lines property has to take the existing text in the TextBox, split it (creating a bunch of substrings) and get the result into an array. This in turn means doing all the normal shenanigans associated with creating buffers which are bigger than you need, filling them, copying to a new buffer etc – exactly what we’re trying to avoid! This “optimization” will usually cost time instead of saving it. It could be easily fixed by just fetching the array in one statement, then using the same array for both the count and the list population. In fact, if you just pass the array into the List<T> constructor, it will perform the optimization for you – it can detect that it’s an ICollection<T> and use the Count property directly. Writing the simplest code actually ends up being optimal.
The above bullet point isn’t going to dominate the performance of that example though – there’s a potentially far worse effect due to the way the resulting “encrypted” string is broken up each time: using string concatenation in a loop. I guess we’d better hope there are no really long lines. If an author is going to give optimisation “tips” they need to be a lot more rigorous than this. Using string concatenation in a loop is probably the single best-known performance no-no in .NET. I was really shocked to see this in a book which is supposedly about making your code perform better. Now, it could be that string concatenation was used deliberately to slow things down – but in that case, why not highlight it? Drawing attention to intended optimizations gives the impression that the rest of the code is either optimized or has at least been written reasonably. If “bad” code is to be used for a specific purpose, that should be called out so that the reader won’t go on to use the same kind of code in their own production apps (which really shouldn’t be deliberately slowed down).
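
The last three points can be sketched together. This is an illustration only, not the book’s code: the Piece class, its members and the upper-casing “transformation” are all made up for the example. It shows a thread started on an instance method with no state parameter, a list built directly from an array, and a StringBuilder in place of concatenation in a loop.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading;

// Hypothetical sketch: class and method names are invented for illustration.
public sealed class Piece
{
    private readonly List<string> lines;

    public string Result { get; private set; }

    public Piece(string[] sourceLines)
    {
        // Passing the array straight in lets List<T> size itself from the
        // collection's Count, so no separate capacity calculation is needed.
        lines = new List<string>(sourceLines);
    }

    // An instance method already has access to "this", so no state
    // parameter (and no cast from object) is required.
    public void Process()
    {
        var builder = new StringBuilder();
        foreach (string line in lines)
        {
            // StringBuilder avoids copying the whole accumulated string on
            // each append, unlike repeated string concatenation.
            builder.Append(line.ToUpperInvariant()).Append('\n');
        }
        Result = builder.ToString();
    }

    public static void Main()
    {
        // In the book's scenario this array would be fetched once from
        // TextBox.Lines and then reused.
        string[] text = { "first", "second" };
        var piece = new Piece(text);
        var thread = new Thread(piece.Process); // plain ThreadStart, no parameter
        thread.Start();
        thread.Join(); // Result is only read after the worker has finished
        Console.Write(piece.Result);
    }
}
```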

These aren’t the only issues I have with the code. Unicode is abused by “encrypting” text with no discussion of whether the strings he produces are valid or not (as opposed to the normal practice of only encrypting data after first converting it into binary; the encrypted binary might then be converted to text using base64 if you need to transmit the encrypted data as text). We could easily end up with strings containing surrogate high or low code points without the corresponding half in the appropriate place. When analyzing a bitmap he uses GetPixel and SetPixel for each pixel, rather than calling LockBits once and then accessing the image data in a much faster manner. (The code given does scale, but it’s not as fast as it could be. Using LockBits it would still scale, but the “per thread” work would be faster.) There are other, similar issues lurking in the text, but I’m sure you get the gist of the problem.
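
As a rough sketch of the “convert to binary first” approach described above: the text is encoded as UTF-8 bytes, the bytes are transformed, and the result is turned back into text with base64. The XOR step here is purely a stand-in for a real cipher and offers no security; the class and method names are hypothetical.

```csharp
using System;
using System.Text;

// Hypothetical sketch. The XOR "transform" stands in for a real cipher and
// provides no security; it only illustrates the shape of the pipeline.
public static class BinaryRoundTrip
{
    public static string Protect(string plainText, byte key)
    {
        byte[] data = Encoding.UTF8.GetBytes(plainText); // text -> binary
        Transform(data, key);                            // binary -> binary
        return Convert.ToBase64String(data);             // binary -> safe text
    }

    public static string Unprotect(string base64, byte key)
    {
        byte[] data = Convert.FromBase64String(base64);
        Transform(data, key); // XOR with the same key is its own inverse
        return Encoding.UTF8.GetString(data);
    }

    private static void Transform(byte[] data, byte key)
    {
        for (int i = 0; i < data.Length; i++)
        {
            data[i] ^= key;
        }
    }

    public static void Main()
    {
        // The sample includes a character outside the BMP, which is stored
        // as a surrogate pair in a .NET string.
        string original = "caf\u00e9 \U0001F600";
        string encoded = Protect(original, 0x5A);
        Console.WriteLine(Unprotect(encoded, 0x5A) == original);
    }
}
```

Because the transformation only ever touches bytes, the surrogate pair survives the round trip intact, and the base64 output contains nothing but ASCII characters.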

Conclusion

Believe it or not, there are things about this book that I actually like. It’s relatively thin, which has very tangible advantages when you’re carrying it around a lot. The sections explaining how to use Process Explorer and Task Manager to their best are useful, and the ideas of the examples are good – even though they basically cover the same ground several times. Unfortunately the bad points outweigh the good far too heavily. To summarise them:

What I consider some of the absolute core elements of .NET multithreading (locking and monitors in particular) aren’t covered at all
Only the simple situation of an embarrassingly parallel algorithm is covered. In the real world developers will have to face real challenges where tasks don’t always split themselves up nicely into totally independent chunks. A reader who finishes this book assured that they are now threading gurus will face a nasty shock.
Server-side threading isn’t given much coverage at all, despite this being arguably the most likely environment for developers to encounter multithreading
The “story” element of the prose style is childish and patronising
The coding style, while a personal choice, makes me wince – and is particularly verbose for a book, where space is important
Many bad practices are encouraged, and there are plenty of important misunderstandings to trip up readers
The index has failed me (even when I’ve known that the subject is in the book) more times than it’s helped me

It’s a real pity. I was hoping this would be a book I could recommend to people as a precursor to reading Joe Duffy’s excellent Concurrent Programming on Windows. Instead, my current best advice is to read Joe Albahari’s threading tutorial. (I previously had a link to my own threading tutorial as well, but apparently this made people think I was fishing for more readers of that.) I’m sure there are good introductory threading books out there, but I’m afraid this isn’t one of them.

Book reviews, Books, C#

Book Review: Programming C# 3.0 by Jesse Liberty and Donald Xie

September 27, 2008 jonskeet 14 Comments

Resources

The O’Reilly page (errata etc)
Jesse Liberty’s page for his various books
Buy it from Amazon or Barnes and Noble

Disclaimer

One reader commented that a previous book review was too full of “this is only my personal opinion” and other such disclaimers. I think it’s still important to declare the situation, but I can see how it can get annoying if done throughout the review. So instead, I’ve lumped everything together here. Please bear these points in mind while reading the whole review:

Obviously this book competes with C# in Depth, although probably not very much.
I was somewhat prejudiced against the book by seeing that the sole 5-star review for it on Amazon was by Jesse Liberty himself. Yes, he wanted to explain why he wrote the book and why he’s proud of it, but giving yourself a review isn’t the right way to go about it.
I’ve seen a previous edition of the book (for C# 2.0) and been unimpressed at the coverage of some of the new features.
I’m a nut for technical accuracy, particularly when it comes to terminology. More on this later, but if you don’t mind reading (and then presumably using) incorrect terminology, you’re likely to have a lot better time with this book than I did.
I suspect I have higher expectations for established, prolific authors such as Jesse Liberty than for newcomers to the world of writing.
I’m really not the target market for this book.

Okay, with all that out of the way, let’s get cracking.

Contents and target audience

According to the preface, Programming C# 3.0 (PC# from now on) is for people learning C# for the first time, or brushing up on it. There’s an expectation that you probably already know another language – it wouldn’t be impossible to learn C# from the book without any prior development experience, but the preface explicitly acknowledges that it would be reasonably tough. That’s a fair comment – probably fair for any book, in fact. I have yet to read anything which made me think it would be a wonderful way to teach someone to program from absolute scratch. Likewise the preface recommends C# 3.0 in a Nutshell for a more detailed look at the language, for more expert readers. Again, that’s reasonable – it’s clearly not aiming to go into the same level of depth as Accelerated C# 2008 or C# in Depth.

The book is split into 4 parts:

The C# language: pretty much what you’d expect, except that not all of the language coverage is in this part (most of the new features of C# 3.0 are in the second part) and some non-language coverage is included (regular expressions and collections) – about 270 pages
C# and Data: LINQ, XML (the DOM API and a bit of LINQ to XML), database access (ADO.NET and LINQ to SQL) – about 100 pages
Programming with C#: introductions to ASP.NET, WPF and Windows Forms – about 85 pages
The CLR and the .NET Framework: attributes, reflection, threading, I/O and interop – about 110 pages

As you can tell, the bulk of it is in the language part, which is fine by me and reflects the title accurately. I’ll focus on that part of the book in this review, and the first chapter of part 2, which deals with the LINQ parts of C# 3.0. To be honest, I don’t think the rest of the book actually adds much value, simply because it skims over the surface of its topics so lightly. Part 3 would make a reasonable series of videos – and indeed that’s how it’s written, basically in the style of “Open Visual Studio, start a new WinForms project, now drag a control over here” etc. I’ve never been fond of that style for a book, although it works well in screencasts.

The non-LINQ database and XML chapters in part 2 seemed relatively pointless too – I got the feeling that they’d been present in older editions and so had just stayed in by default. With the extra space available from cutting these, a much better job could have been done on LINQ to SQL and LINQ to XML. The latter gets particularly short-changed in PC#, with a mere 4 pages devoted to it! (C# in Depth is much less of a “libraries” book but I still found over 6 pages to devote to it. Not a lot, I’ll grant you.)

Part 4 has potential, and is more useful than the previous parts – reflection, threading, IO and interop are all important topics (although I’d probably drop interop in favour of internationalization or something similar) – but they’re just not handled terribly well. The threading chapter talks about using lock or Monitor, but never states that lock is just shorthand for try/finally blocks which use Monitor; no mention is made of the memory model or volatility; aborting threads is demonstrated but not warned about; the examples always lock on this without explaining that it’s generally thought to be a bad idea. The IO chapter uses TextReader (usually via StreamReader) but never mentions the crucial topic of character encodings (it uses Encoding.ASCII but without really explaining it) – and most damning of all, as far as I can tell there’s not a single using statement in the entire chapter. There are calls to Close() at the end of each example, and there’s a very brief mention saying that you should always explicitly close streams – but without saying that you should use a using statement or try/finally for this purpose.
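
A short sketch of what those omissions look like in code (the names are invented for illustration, and none of this is from the book). The Monitor expansion shown is roughly what C# 4 and later compilers generate for a lock statement; compilers of the C# 3.0 era called Monitor.Enter just before the try block instead. The reader example uses UTF-8 purely as an example of stating the encoding explicitly.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading;

// Hypothetical sketch; names are invented for illustration.
public sealed class Counter
{
    // A private lock object rather than "this", so that no outside code
    // can take the same monitor.
    private readonly object padlock = new object();
    private int count;

    public void IncrementWithLock()
    {
        lock (padlock)
        {
            count++;
        }
    }

    // Roughly the try/finally form that the lock statement is shorthand for.
    public void IncrementWithMonitor()
    {
        bool lockTaken = false;
        try
        {
            Monitor.Enter(padlock, ref lockTaken);
            count++;
        }
        finally
        {
            if (lockTaken)
            {
                Monitor.Exit(padlock);
            }
        }
    }

    public int Count
    {
        get { lock (padlock) { return count; } }
    }
}

public static class ReaderDemo
{
    // Reads the first line of a file with an explicit encoding. The using
    // statement disposes the reader even if ReadLine throws.
    public static string ReadFirstLine(string path)
    {
        using (var reader = new StreamReader(path, Encoding.UTF8))
        {
            return reader.ReadLine();
        }
    }

    public static void Main()
    {
        string path = Path.GetTempFileName();
        File.WriteAllText(path, "hello\nworld", Encoding.UTF8);
        Console.WriteLine(ReadFirstLine(path));
        File.Delete(path);
    }
}
```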

Okay, enough on those non-language topics – let’s look at the bulk of the book, which is about the language.

Language coverage

PC# starts from scratch, so it’s got the whole language to cover in about 300 pages. It would be unreasonable to expect it to provide as much attention to detail as C# in Depth, which (for the most part) only looks at the new features of C# 2.0 and 3.0. (On the other hand, if the remaining 260 pages had been given to the language as well, a lot more ground could have been covered.) It’s also worth bearing in mind that the book is not aimed at confident/competent C# developers – it’s written for newcomers, and delving into tricky issues like generic variance would be plain mean. However, I’m still not impressed with what’s been left out:

There’s no mention of nullable types as far as I can tell – indeed, the list of operators omits the null-coalescing operator (??).
Generics are really only talked about in the context of collections – despite the fact that to understand any LINQ documentation, you really will need to understand generic delegates. Generic constraints are likewise only mentioned in the context of collections, and only what I call a “derivation type constraint” (e.g. T : IComparable<T>) (as far as I can tell the spec doesn’t give this a name). There’s no coverage of default(T) – although the “default value of a type” is mentioned elsewhere, with an incorrect explanation.
Collection initializers aren’t explained as far as I can tell, although I seem to recall seeing one in an example. They’re not mentioned in the index.
Iterator blocks (and the yield contextual keyword) are likewise absent from the index, although there’s definitely one example of yield return when IEnumerable<T> is covered. The coverage given is minimal, with no mention of the completely different way that this executes compared with normal methods.
Query expression coverage is limited: although from, where, select, orderby, join and group are covered, there’s no mention of let, the difference between join and join ... into, explicitly typed range variables, or query continuations. The translation process isn’t really explained clearly, and the text pretty much states that it will always use extension methods.
Expression trees aren’t referenced to my knowledge; there’s one piece of text which attempts to mention them but just calls them “expressions” – which are of course entirely different. We’ll come onto terminology in a minute.
Only the simplest (and admittedly most common by a huge margin) form of using directives is shown – no extern aliases, no namespace aliases, not even using Foo = System.Console;
Partial methods aren’t mentioned.
Implicitly typed arrays aren’t covered.
Static classes may be mentioned in passing (not sure) but not really explained.
Object initializers are shown in one form only, ditto anonymous object initializer expressions.
Only field-like events are shown. The authors spend several pages on an example of bad code which just has a public delegate variable, and then try to blame delegates for the problem (which is really having a public variable). The solution is (of course) to use an event, but there’s little to no explanation of the nature of events as pairs of methods, a bit like properties but with subscribe/unsubscribe behaviour instead of data fetch/mutate.
Anonymous methods and lambda expressions are covered, but with very little text about the closure aspect of them. This is about it: “[…] and the anonymous method has access to the variables in the scope in which they are defined:” (followed by an example which doesn’t demonstrate the use of such variables at all).
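
A few of the items in the list above are easier to appreciate in code. The following is a small sketch of my own rather than anything from the book; OmittedFeaturesSketch, CountTo, DefaultOf and Log are hypothetical names. It touches on the null-coalescing operator, default(T), the deferred execution of iterator blocks and variable capture in lambda expressions.

```csharp
using System;
using System.Collections.Generic;

public static class OmittedFeaturesSketch
{
    public static readonly List<string> Log = new List<string>();

    // Iterator block: none of this body runs until the caller starts iterating.
    public static IEnumerable<int> CountTo(int limit)
    {
        Log.Add("iterator started");
        for (int i = 1; i <= limit; i++)
        {
            yield return i;
        }
    }

    // default(T) is 0 for int, null for reference types, and so on.
    public static T DefaultOf<T>()
    {
        return default(T);
    }

    public static void Main()
    {
        // Nullable value type and the null-coalescing operator.
        int? missing = null;
        int value = missing ?? 10;
        Console.WriteLine(value);                        // 10

        Console.WriteLine(DefaultOf<int>());             // 0
        Console.WriteLine(DefaultOf<string>() == null);  // True

        // Calling the iterator method does not execute its body...
        IEnumerable<int> sequence = CountTo(3);
        Console.WriteLine(Log.Count);                    // 0
        // ...iterating over the result does.
        foreach (int item in sequence)
        {
        }
        Console.WriteLine(Log.Count);                    // 1

        // The lambda captures the variable itself, not a copy of its value.
        int counter = 0;
        Action increment = () => counter++;
        increment();
        increment();
        Console.WriteLine(counter);                      // 2
    }
}
```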

I suspect there’s more, but you get the general gist. I’m not saying that all of these should have been covered and in great detail, but really – no mention of nullable types at all? Is it really more appropriate in a supposed language book to spend several pages building an asynchronous file server than to actually list all the operators accurately?

Okay, I’m clearly beginning to rant by now. The limited coverage is annoying, but it’s not that bad. Yes, I think the poor/missing coverage of generics and nullable types is a real problem, but it’s not enough to get me really cross. It’s the massive abuse of terminology which winds me up.

Accuracy

I’ll say this for PC# – if you ignore the terminology abuse, it’s mostly accurate. There are definitely “saying something incorrect” issues (e.g. an implication that ref/out can only be used with value type parameters; the statement that reference types in an array aren’t initialized to their default value (they are – the default value is null); the claim that extension methods can only access public members of target types (they have the same access as normal – so if the extension method is in the same assembly as the target type, for instance, it could access internal members)) but the biggest problem is that of terminology – along with sloppy code, including its formatting.
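
The three factual points in that paragraph can each be checked in a few lines. This is only an illustrative sketch; Account, AccountExtensions, GetBalance and AccuracySketch are hypothetical names, not taken from the book.

```csharp
using System;

internal class Account
{
    internal int Balance = 42;
}

internal static class AccountExtensions
{
    // An extension method has the same access as any other code in its
    // position: here it is in the same assembly, so internal members are visible.
    internal static int GetBalance(this Account account)
    {
        return account.Balance;
    }
}

public static class AccuracySketch
{
    // ref works with reference type parameters too: the caller's
    // variable is made to refer to a different string.
    public static void Replace(ref string text)
    {
        text = "replaced";
    }

    public static int BalanceViaExtension()
    {
        return new Account().GetBalance();
    }

    public static void Main()
    {
        string message = "original";
        Replace(ref message);
        Console.WriteLine(message);              // replaced

        // Elements of a reference type array start at their default value: null.
        string[] names = new string[3];
        Console.WriteLine(names[0] == null);     // True

        Console.WriteLine(BalanceViaExtension()); // 42
    }
}
```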

The authors confuse objects, values, variables, expressions, parameters, arguments and all kinds of other things. These have well-defined meanings, and they’re there for a reason. They do have footnotes explaining that they’re deliberately using the wrong terminology – but that doesn’t make it any better. Here are the three footnotes, and my responses to them:

The terms argument and parameter are often used interchangably, though some programmers insist on differentiating between the parameter declaration and the arguments passed in when the method is invoked.

Just because others abuse terms doesn’t mean it’s right for a book to do so. It’s not that programmers insist on differentiating between the two – the specification does. Now, to lighten things up a bit I’ll acknowledge that this one isn’t always easy to deal with. There are plenty of times where I’ve tried really hard to use the right term and just not ended up with a satisfactory bit of wording. However, at least I’ve tried – and where it’s easy, I’ve done the right thing. I wish the authors had the same attitude. (They do the same with the conditional operator, calling it “the ternary operator”. It’s a ternary operator. Having three operands is part of its nature – it’s not a description of its behaviour. Again, lots of other people get this wrong. Perhaps if all books got it right, more developers would too.) Next up:

Throughout this book, I use the term object to refer to reference and value types. There is some debate in the fact that Microsoft has implemented the value types as though they inherited from the root class Object (and thus, you may call all of Object’s methods on any value type, including the built-in types such as int.)

To me, this pretty much reads as “I’m being sloppy, but I’ve got half an excuse.” It’s true that the C# specification isn’t clear on this point – although the CLI spec is crystal clear. Personally, it just feels wrong to talk about the value 5 as an object. It’s an object when it’s boxed, of course (and if you call any Object methods on a value type which haven’t been overridden by that type, it gets boxed at that point) but otherwise I really don’t think of it as an object. An instance of the type, yes – but not an object. So yes, I’ll acknowledge that there’s a little wiggle room here – but I believe it’s going to confuse readers more than it helps them.
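
The boxing point can be seen directly. In this sketch (BoxingSketch is a made-up name) each conversion of the int to object creates a separate box, and the box holds a copy of the value as it was at the time of boxing.

```csharp
using System;

public static class BoxingSketch
{
    // Boxing the same value twice gives two distinct objects.
    public static bool BoxesAreSameObject(int value)
    {
        object first = value;
        object second = value;
        return object.ReferenceEquals(first, second);
    }

    public static void Main()
    {
        Console.WriteLine(BoxesAreSameObject(5));   // False

        int number = 5;
        object boxed = number;      // boxing: an object is created here
        number = 6;                 // changes the variable, not the box
        Console.WriteLine((int) boxed);             // 5
        Console.WriteLine(boxed.Equals(5));         // True
    }
}
```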

It’s the “confusing readers more than it helps them” part which is important. I’m not above a little bit of shortcutting myself – in C# in Depth, I refer to automatically implemented properties as “automatic properties” (after explicitly saying what I’m doing) and I refer to the versions of C# as 1, 2 and 3 instead of 1.0, 1.2, 2.0 and 3.0. In both these cases, I believe it adds to the readability of the book without giving any room for confusion. That’s very different from what’s going on in PC#, in my view. I’ve saved the most galling example of this for last:

As noted earlier, btnUpdate and btnDelete are actually variables that refer to the unnamed instances on the heap. For simplicity, we’ll refer to these as the names of the objects, keeping in mind that this is just short-hand for “the name of the variables that refer to the unnamed instances on the heap.”

This one’s the killer. It sounds relatively innocuous until you see the results. Things like this (from P63):

ListBox myListBox; // instantiate a ListBox object

No, that code doesn’t instantiate anything. It declares a variable – and that’s all. The comment isn’t non-sensical – the idea of some code which does instantiate a ListBox object clearly makes sense – but it’s not what’s happening in this code (in C# – it would in C++, which makes it even more confusing). That’s just one example – the same awful sloppiness (which implies something completely incorrect) permeates the whole book. Time and time again we’re told about instances being created when they’re not. From P261:

The Clock class must then create an instance of this delegate, which it does on the following line:

public SecondChangeHandler SecondChanged;
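
To show the difference between declaring and instantiating, here is a sketch along the lines of the two quoted snippets. The delegate signature is an assumption on my part (the book's may well differ), and StringBuilder stands in for ListBox purely to avoid a Windows Forms dependency.

```csharp
using System;
using System.Text;

// Hypothetical signature, for illustration only.
public delegate void SecondChangeHandler(int second);

public class Clock
{
    // Declares a field. No delegate instance is created; the value is null.
    public SecondChangeHandler SecondChanged;
}

public static class DeclarationSketch
{
    public static int LastSecond = -1;

    private static void Record(int second)
    {
        LastSecond = second;
    }

    public static void Main()
    {
        Clock clock = new Clock();
        Console.WriteLine(clock.SecondChanged == null);   // True

        // This is the line which actually creates a delegate instance.
        clock.SecondChanged = new SecondChangeHandler(Record);
        clock.SecondChanged(30);
        Console.WriteLine(LastSecond);                    // 30

        StringBuilder builder;            // declares a variable; no object exists yet
        builder = new StringBuilder();    // instantiates a StringBuilder
        builder.Append("now there is an object");
        Console.WriteLine(builder);
    }
}
```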

Why do I care about this so much? Because I see the results of it on the newsgroups, constantly. How can I blame developers for failing to communicate properly about the problems they’re having if their source of learning is so sloppy and inaccurate? How can they get an accurate mental model of the language if they’re being told that objects are being instantiated when they’re not? Communication and a clear mental model are very important to me. They’re why I get riled up when people perpetuate myths about where structs “live” or how parameters are passed. PC# had me clenching my fists on a regular basis.

These are examples where the authors apparently knew they were abusing the terminology. There are other examples where I believe it’s a genuine mistake – calling anonymous methods “anonymous delegates” or “statements that evaluate to a value are called expressions” (statements are made up of expressions, and expressions don’t have to return a value). I can certainly sympathise with this. Quite where they got the idea that HTML was derived from “Structured Query Markup Language” I don’t know – the word “Query” should have been a red flag – but these things happen.

In other places the authors are just being sloppy without either declaring that they’re going to be, or just appearing to make typos. In particular, they’re bad at distinguishing between language, framework and runtime. For instance:

“C# combines the power and complexity of regular expression syntax […]” – no, C# itself neither knows nor cares about regular expressions. They’re in the framework.
(When talking about iterator blocks) “All the bookkeeping for keeping track of which element is next, resetting the iterator, and so forth is provided for you by the Framework.” – No, this time it is the C# compiler which is doing all the work. (It doesn’t support reset though.)
“Strings can also be created using verbatim string literals, which start the at (@) symbol. This tells the String constructor that the string should be used verbatim […]” – No, the String constructor doesn’t know about verbatim string literals. They’re handled by the C# compiler.
“The .NET CLR provides isolated storage to allow the application developer to store data on a per-user basis.” I very much doubt that the CLR code has any idea about this. I expect it to be in the framework libraries.
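
Two of these are easy to demonstrate. In the sketch below (the names are mine, not the book's), the verbatim and regular literals produce equal strings, which is what you would expect if the compiler handles both forms before any String code is involved, and calling Reset on an iterator generated from an iterator block throws NotSupportedException.

```csharp
using System;
using System.Collections.Generic;

public static class LanguageVersusFrameworkSketch
{
    // The compiler builds the state machine for this iterator block.
    public static IEnumerable<int> Numbers()
    {
        yield return 1;
        yield return 2;
    }

    public static bool ResetIsSupported()
    {
        using (IEnumerator<int> iterator = Numbers().GetEnumerator())
        {
            iterator.MoveNext();
            try
            {
                iterator.Reset();
                return true;
            }
            catch (NotSupportedException)
            {
                return false;
            }
        }
    }

    public static void Main()
    {
        // Two ways of writing the same string in source code.
        Console.WriteLine(@"C:\temp" == "C:\\temp");   // True
        Console.WriteLine(ResetIsSupported());          // False
    }
}
```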

Again, if books don’t get this right, how do we expect developers to distinguish between the three? Admittedly sometimes it can be tricky to decide where responsibility lies – but there are plenty of clearcut cases where PC# is just wrong. I doubt that the authors really don’t know the difference – they just don’t seem to think it’s important to get it right.

Code

I’m mostly going to point out the shortcomings of the code, but on the plus side I believe almost all of it will basically work. There’s one point at which the authors have both a method and a variable with the same name (which is already in the unconfirmed errata) and a few other niggles, but they’re relatively rare. However:

The code frequently ignores naming conventions. Method and class names sometimes start with lower case, and there’s frequent use of horrible names beginning with “my” or “the”.
The authors often present several pages of code together, and then take them apart section by section. This isn’t the only book to do this by a long chalk, but I wonder – does anyone really benefit from having the whole thing in a big chunk? Isn’t it better to present small, self-contained examples?
As mentioned before, the uses of using statements are few and far between.
The whitespace is all over the place. The indentation level changes all the time, and sometimes there are outdents in the middle of blocks. Occasionally newlines have actually been missed out, and in other cases (particularly at the start of class bodies) there are two blank lines for no reason at all. (The latter is very odd in a book, where vertical whitespace is seen as extremely valuable.) Sometimes there’s excessive (to my mind) spacing. Just as an example (which is explicitly labelled as non-compiling code, so I’m not faulting it at all for that):
using System.Console;
class Hello
{
    static void Main()
    {
    WriteLine(“Hello World”);
   }
}

I promise you that’s exactly how it appears in the book. Now this may have started out as a fault of the type-setter, but the authors should have picked it up before publication, IMO. I could understand there being a few issues like this (proof-reading code really is hard) but not nearly as many as there are.
There are examples of mutable structs (or rather, there’s at least one example), and no warning at all that mutable value types are a really, really bad idea.
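
For anyone wondering why mutable value types deserve a warning, here is one of the classic surprises as a minimal sketch. MutableCounter and Holder are hypothetical types, not the ones in the book.

```csharp
using System;

public struct MutableCounter
{
    public int Value;

    public void Increment()
    {
        Value++;
    }
}

public class Holder
{
    public readonly MutableCounter ReadOnlyCounter = new MutableCounter();
    public MutableCounter WritableCounter = new MutableCounter();
}

public static class MutableStructSketch
{
    public static void Main()
    {
        Holder holder = new Holder();

        // Calling a mutating method via a readonly field operates on a
        // temporary copy, so the change is silently lost.
        holder.ReadOnlyCounter.Increment();
        // The same call via a writable field changes the field itself.
        holder.WritableCounter.Increment();

        Console.WriteLine(holder.ReadOnlyCounter.Value);   // 0
        Console.WriteLine(holder.WritableCounter.Value);   // 1
    }
}
```

Both calls compile without complaint and look identical, which is what makes this kind of bug hard to spot.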

Again, I don’t want to give the impression I’m an absolute perfectionist when it comes to code in books. For the sake of keeping things simple, sometimes authors don’t seal types where they should, or make them immutable etc. I’m not really looking for production-ready code, and indeed I made this very point in one of the notes for C# in Depth. However, I draw the line at using statements, which are important and easy to get right without distracting the reader. Likewise giving variables good names – counter rather than ctr, and avoiding those the and my prefixes – makes a competent reader more comfortable and can transfer good habits to the novice via osmosis.

Writing style and content ordering

Time for some good news – when you look beyond the terminology, this is a really easy book to read. I don’t mean that everything in it is simplistic, but the style rarely gets in the way. It’s not dry, and some of the real-world analogies are very good. This may well be Jesse Liberty’s experience as a long-standing author making itself apparent.

In common with many O’Reilly books, there are two icons which usually signify something worth paying special attention to: a set of paw prints indicating a hint or tip, and a mantrap indicating a commonly encountered issue to be aware of. Given the rest of the review, I suspect you’d be surprised if I agreed with all of the points made in these extra notes – and indeed there are some issues – but most of them are good.

Likewise there are also notes for the sake of existing Java and C++ developers, which make sense and are useful.

I don’t agree with some of the choices made in terms of how and when to present some concepts. I found the way of explaining query expressions confusing, as it interleaved “here’s a new part of query expressions” with “here’s a new feature (e.g. anonymous types, extension methods).” It will come as no surprise to anyone who’s read C# in Depth that I prefer the approach of presenting all the building blocks first, and then showing how query expressions use all those features. There’s a note explaining why the authors have done what they’ve done, but I don’t buy it. One important thing with the “building blocks first” approach is to present a preliminary example or two, to give an idea of where we’re headed. I’ve forgotten to do that in the past (in a talk) and regretted it – but I don’t regret the overall way of tackling the topic.

On a slightly different note, I would have presented some of the earlier topics in a different order too. For instance, I regard structs and interfaces as more commonly used and fundamental topics than operator overloading. (While C# developers tend not to create their own structs often, they use them all the time. When was the last time you wrote a program without an int in it?) This is a minor nit – and one which readers may remember I also mentioned for Accelerated C# 2008.

There’s one final point I’d like to make, but which doesn’t really fit anywhere else – it’s about Jesse Liberty’s dedication. Most people dedicate books to friends, colleagues etc. Here’s Jesse’s:

This book is dedicated to those who come out, loud, and in your face and in the most inappropriate places. We will look back at this time and shake our heads in wonder. In 49 states, same-sex couples are denied the right to marry, though incarcerated felons are not. In 36 states, you can legally be denied housing just for being q-u-e-e-r. In more than half the states, there is no law protecting LGBT children from harassment in school, and the suicide rate among q-u-e-e-r teens is 400 percent higher than among straight kids. And, we are still kicking gay heroes out of the military despite the fact that the Israelis and our own NSA, CIA, and FBI are all successfully integrated. So yes, this dedication is to those of us who are out, full-time.

(I’ve had to spell out q-u-e-e-r as otherwise the blog software replaces it with asterisks. Grr.) I’m straight, but I support Jesse’s sentiment 100%. I can’t remember when I first started taking proper notice of the homophobia in the world, but it was probably at university. This dedication does nothing to help or hinder the reader with C#, but to my mind it still makes it a better book.

Conclusion

In short, I’m afraid I wouldn’t recommend Programming C# 3.0 to potential readers. There are much better books out there: ones which won’t make it harder for the reader to talk about their code with others, in particular. It’s not all bad by any means, but the mixture of sloppy use of terminology and poor printed code is enough of a problem to make me give a general thumbs down.

Next up will be CLR via C#, by Jeffrey Richter.

Response from Jesse Liberty

As normal, I mailed the author (in this case just Jesse Liberty – I confess I didn’t look for Donald Xie’s email address) and very promptly received a nice response. He asked me to add the following as his reaction:

I believe the book is very good for most real-world programmers and the publisher and I are dedicated to making the next revision a far better book, by correcting some of the problems you point out, and by beefing up the coverage of the newer features of the language.

Also as normal, I’ll be emailing Jesse with a list of the errors I found, so hopefully they can be corrected for the next edition.


Book review: Pro LINQ – Language Integrated Query in C# 2008, by Joe Rattz

September 21, 2008 jonskeet 16 Comments

I’m trying something slightly different this time. Joe (the author) has reacted to specific points of my review, and I think it makes sense to show those reactions. I’d originally hoped to present them so that you could toggle them on or off, but this blog server apparently wants to strip out scripts etc, so the comments are now permanently visible.

Resources

Buy from Amazon or Barnes and Noble
Author’s web site
Apress page (errata submissions etc)

Introduction and disclaimer

As usual, I first need to give the disclaimer that as the author of a somewhat-competing book, I may be biased and certainly will have different criteria to most people. In this case the competition aspect is less direct than normal – this book is “LINQ with the C# 3 bits as necessary” whereas my book is “C# 2 and 3 with LINQ API where necessary”. However, it’s still perfectly possible that a potential reader may try to choose between the two books and buy just one. If you’re in that camp, I suggest you try to find an impartial opinion instead of trusting my review.

A second disclaimer is needed this time: I didn’t buy my copy of this book; it was sent to me by Apress at the request of Joe Rattz, specifically for review (and because Joe’s a nice guy). I hope readers of my other reviews will be confident that this won’t change the honest nature of the review; where there are mistakes or possible improvements, I’m happy to point them out.

Content, audience and overall approach

This book is simply aimed at existing C# developers who want to learn LINQ. There’s an assumption that you’re already reasonably confident in C# 2 – knowledge of generics is taken as read, for example – but there is brief coverage of using iterator blocks to return sequences. No prior experience of LINQ is required, but the LINQ to XML and LINQ to SQL sections assume (not unreasonably) that you already know XML and SQL.

The book is divided into five parts:

Introduction and C# 3.0 features (50 pages)
LINQ to Objects (130 pages)
LINQ to XML (152 pages)
LINQ to DataSet (42 pages)
LINQ to SQL (204 pages)

The approach to the subject matter changes somewhat through the book. Sometimes it’s a concept-by-concept “tutorial style” approach, but for most of the book (particularly the LINQ to Objects and LINQ to XML parts) it reads more like an API reference. Joe recommends that readers tackle the book from cover to cover, but that falls down a bit in the more reference-oriented sections.

[Joe] Early in the development of my book, a friend asked me if it was going to be a tutorial-style book or a reference book. I initially found the question odd because I never really viewed books as being exclusively one or the other. Perhaps I am different than most readers, but when I buy a programming book, I usually read a bit, start coding, and then refer to the book as a reference when needed. This is how I envision my book being used by readers and the type of book I would like for it to be. I see it as both a tutorial and a reference. I want it to be a book that gets used repeatedly, not read once and shelved. Some books work better for this than others. I rarely read a programming book cover to cover because I just don’t have time for that. I think ultimately, most authors write the book they would want to read, and that is what I did. I hope that if someone buys my book, in two years it will be tattered and worn from use as a reference, as well as read cover to cover.

I would disagree that the majority of the book reads like an API reference. Certainly, chapters 4 and 5 (deferred and nondeferred operators) work better as a reference because there isn’t a lot of connective context between the approximately 50 different standard query operators. At best it would be an eclectic tutorial with little continuity. So I decided to make those two chapters (the ones covering the standard query operators) function more like a reference. I knew that I (and hopefully my readers) would refer to it time and time again for information about the operators, and based on most of the reviews I have seen, this appears to have been a good choice. I know I refer to it myself quite frequently. I would not consider the chapters on LINQ to XML to be reference oriented although I could see why someone might feel they are. My discussion of LINQ to XML is tutorial based as I approach the different tasks a developer would need to accomplish when working with XML, such as how to construct XML, how to output XML, how to input XML, how to traverse XML, etc. However, within a task, like traversing XML, I do list the API calls and discuss them, so this is probably why it feels reference-like to some readers, and will function pretty well as a reference.

For example, take the ordering operators in LINQ – OrderBy, ThenBy, OrderByDescending and ThenByDescending. (Interestingly, one of the Amazon reviews picks up on the same example. I already had it in mind before reading that review.) These four LINQ to Objects operators take 15 pages to cover because every method overload is used, but a lot of it is effectively repeated between different examples. I think more depth could have been achieved in a shorter space by talking about the group as a whole – we only really need to see what happens when a custom comparison is used once, not four times – whereas every example of ThenBy/ThenByDescending used an identity projection, instead of showing how you can make the secondary ordering use some completely different projection (without necessarily using a custom comparer). Likewise I don’t remember seeing anything about tertiary orderings, or what the descending orderings tend to do with nulls, or emphasis on the fact that descending orderings aren’t just reversed ascending orderings (due to the stability of the sort – the stability was mentioned, but not this important corollary). Having an example for each overload is useful for a reference work, but not for a “read through from start to finish” book.
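
The corollary about stability, and a secondary ordering with a different projection, fit in a handful of lines. This is a sketch with made-up data (Person, People and the method names are all hypothetical), not an example from the book.

```csharp
using System;
using System.Linq;

public class Person
{
    public string Name;
    public int Age;

    public Person(string name, int age)
    {
        Name = name;
        Age = age;
    }
}

public static class OrderingSketch
{
    private static readonly Person[] People =
    {
        new Person("Alice", 30),
        new Person("Bob", 25),
        new Person("Carol", 30)
    };

    private static string Names(IOrderedEnumerable<Person> ordered)
    {
        return string.Join(", ", ordered.Select(p => p.Name).ToArray());
    }

    // Stable: Alice stays ahead of Carol, as in the source.
    public static string DescendingByAge()
    {
        return Names(People.OrderByDescending(p => p.Age));
    }

    // Reversing an ascending ordering also reverses the ties.
    public static string ReversedAscendingByAge()
    {
        return string.Join(", ",
            People.OrderBy(p => p.Age).Reverse().Select(p => p.Name).ToArray());
    }

    // Secondary ordering using a different projection, no custom comparer.
    public static string ByAgeThenNameDescending()
    {
        return Names(People.OrderBy(p => p.Age).ThenByDescending(p => p.Name));
    }

    public static void Main()
    {
        Console.WriteLine(DescendingByAge());          // Alice, Carol, Bob
        Console.WriteLine(ReversedAscendingByAge());   // Carol, Alice, Bob
        Console.WriteLine(ByAgeThenNameDescending());  // Bob, Carol, Alice
    }
}
```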

The set operators (Distinct, Except, Intersect, Union and SequenceEqual) as applied to DataSets suffer a similar problem – the five descriptions of why custom comparers are needed are all basically the same, and could be dealt with once. In particular, one paragraph is repeated verbatim for each operator. Again, that’s fine for a reference – but cutting and pasting like this makes for an irritating read when you see the exact same text several times in one reading session.

[Joe] A few readers have complained about some of the redundancies that you have pointed out, but I think most of the readers have appreciated my attempt to provide material for each operator/method. I think one of the words you will see most often in the Amazon reviews is “thorough”.

Now, it’s important that I don’t give the wrong impression here. This is certainly not just a reference book, and there’s enough introduction to topics to help readers along. If I’d been coming to C# 3 and LINQ without any other information, I think I’d have followed things, for the most part. (I’m not a fan of the way Joe presented the query expression translations, but I’m enormously pleased that he did it at all. I think I might have got lost at that point, which was unfortunately early in the book. It might have been better as just an appendix.) Anyone reading the book thoroughly should come away with a competent knowledge of LINQ and the ability to use it profitably. They may well be less comfortable with the new features of C# 3, as they’re only covered briefly – but that’s entirely appropriate given the title and target of the book. (To be blunt and selfish, I’m entirely in favour of books which leave room for more depth at a language level – that should be a good thing for sales of my own book!)

[Joe] Jon, if you only knew how difficult it was getting those query expression translations into the book. ;-) You can read in my acknowledgments where I specifically thank Katie Stence and her team for them. They were a very painful effort and in hindsight, I probably would not include them if I were to start the book from scratch. I agree with you that the translations are complex, as the book states. Perhaps the most important part of that section is when I state “Allow me to provide a word of warning. The soon to be described translation steps are quite complicated. Do not allow this to discourage you. You no more need to fully understand the translation steps to write LINQ queries than you need to know how the compiler translates the foreach statement to use it. They are here to provide additional translation information should you need it, which should be rarely, or never.”

However, I would personally have preferred to see a more conceptual approach which spent more time focused on getting the ideas through at a deep level and less time making sure that every overload was covered. After all, MSDN does a reasonable job as a reference – and the book’s web site could have contained an example for every overload if necessary without everything making it into print. The kind of thing I’d have liked to see explored more fully is the buffering vs streaming nature of data flow in LINQ. Some operators – Select and Where, for example – stream all their data through. They never look at more than one item of data at a time. Others (Reverse and OrderBy, for example) have to buffer up all the data in the sequence before yielding any of it. Still others use two sequences, and may buffer one sequence and stream the other – Join and Intersect work that way at the moment, although as we saw in my last blog post Intersect can be implemented in a way which streams both sequences (but still needs to keep a buffer of data it’s already seen). When you’re working with an infinite (or perhaps just very large – much bigger than memory) sequence you really need to be aware of this distinction, but it isn’t covered in Pro LINQ as far as I remember. In the interests of balance, I should point out that the difference between immediate and deferred execution is explained, repeatedly and clearly – including the semi-immediate execution which can occur sometimes in LINQ to SQL.
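
One way to see the streaming/buffering distinction is to count how many source items an operator reads before it yields its first result. The sketch below does that for LINQ to Objects; Logged, ReadsBeforeFirstResult and NaturalNumbers are hypothetical helpers written for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StreamingVersusBufferingSketch
{
    // Passes items through unchanged, recording each read in the log.
    private static IEnumerable<int> Logged(IEnumerable<int> source, List<string> log)
    {
        foreach (int item in source)
        {
            log.Add("read " + item);
            yield return item;
        }
    }

    // How many source items does the operator read before its first result?
    public static int ReadsBeforeFirstResult(Func<IEnumerable<int>, IEnumerable<int>> op)
    {
        List<string> log = new List<string>();
        IEnumerable<int> source = Logged(new[] { 3, 1, 2 }, log);
        using (IEnumerator<int> iterator = op(source).GetEnumerator())
        {
            iterator.MoveNext();
            return log.Count;
        }
    }

    // An endless sequence: 0, 1, 2, ...
    public static IEnumerable<int> NaturalNumbers()
    {
        int i = 0;
        while (true)
        {
            yield return i++;
        }
    }

    public static void Main()
    {
        Console.WriteLine(ReadsBeforeFirstResult(s => s.Where(x => x > 0)));   // 1
        Console.WriteLine(ReadsBeforeFirstResult(s => s.Select(x => x * 2)));  // 1
        Console.WriteLine(ReadsBeforeFirstResult(s => s.OrderBy(x => x)));     // 3
        Console.WriteLine(ReadsBeforeFirstResult(s => s.Reverse()));           // 3

        // Streaming operators cope with an endless source...
        foreach (int square in NaturalNumbers().Where(n => n % 2 == 0)
                                               .Select(n => n * n)
                                               .Take(3))
        {
            Console.WriteLine(square);   // 0, then 4, then 16
        }
        // ...whereas NaturalNumbers().OrderBy(n => n).Take(3) would never
        // yield anything, because OrderBy has to read its whole input first.
    }
}
```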

[Joe] I wanted my book to cover each overload because I can’t read MSDN in the bathroom, or when at the beach without an internet connection, or when curled up in a chair by the fireplace. I also wanted to provide examples for every method and overload because I find it frustrating when a book shows the simplest one and I have to figure out the one I need. Granted, depth could be added too, but you have to draw the line somewhere. Apress (at the time, not sure if this is still the plan) has the concept of three levels of book; Foundations, Pro, and Expert. I considered some information beyond the scope of the Pro level that my book is aimed at. The buffering versus streaming issue is an interesting one and would make an excellent additional column in Table 3-1, if I can get it to fit.

I’m unable to really judge the depth to which LINQ to SQL was explored, given that a lot of it was beyond my own initial knowledge (which is a good thing!). I’m slightly perturbed by the idea that it can be comprehensively tackled in a couple of hundred pages, whereas books on other ORMs are often much bigger and tackle topics such as session lifetimes and caching in much more depth. I suspect this is more due to the technologies than the writing here – LINQ to SQL is a relatively feature-poor ORM compared with, say, Hibernate – but a bit more attention to “here are options to consider when writing an application” would have been welcome.

Accuracy and code style

Most of Pro LINQ is pretty accurate. Joe is occasionally a bit off in terms of terminology, but that probably bothers most readers less than it bothers me. There are a few things which changed between the beta version of VS2008 against which the book was clearly developed and the release version, which affect the new features of C# 3. For instance, automatically implemented properties aren’t mentioned at all (and would have been much nicer to see in examples than public fields) and collection initializers are described with the old restrictions (the collection type has to implement ICollection<T>) rather than the new ones (the collection type has to implement IEnumerable and have appropriate Add methods). Other errors include trusting the documentation too much (witness the behaviour of Intersect) and an inconsistency (stating correctly that OrderBy is stable on one page, then incorrectly warning that it’s unstable on another). In my normal fashion, I’ll give Joe an exhaustive list of everything I’ve found and leave it up to him to see which he’d like to fix for the next printing, but overall Pro LINQ does pretty well. I suspect this may be partly due to covering a great deal of area but with relatively little depth and some repetition – Accelerated C# had a higher error rate, but was delving into more treacherous waters, for example.
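
As a quick illustration of the release-version rule for collection initializers, the hypothetical Basket type below implements only the nongeneric IEnumerable (not ICollection<T>) and has two Add overloads. It also uses an automatically implemented property. None of this is from the book.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public class Basket : IEnumerable
{
    private readonly List<string> items = new List<string>();

    // Automatically implemented property.
    public string Owner { get; set; }

    public int Count
    {
        get { return items.Count; }
    }

    public void Add(string item)
    {
        items.Add(item);
    }

    // Multi-argument Add methods can be used from initializers too.
    public void Add(string item, int quantity)
    {
        for (int i = 0; i < quantity; i++)
        {
            items.Add(item);
        }
    }

    public IEnumerator GetEnumerator()
    {
        return items.GetEnumerator();
    }
}

public static class CollectionInitializerSketch
{
    public static Basket CreateBasket()
    {
        // Compiles to calls to Add("apple") and Add("pear", 2).
        return new Basket { "apple", { "pear", 2 } };
    }

    public static void Main()
    {
        Console.WriteLine(CreateBasket().Count);   // 3
    }
}
```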

[Joe] Since my book is not meant to be a C# 3.0 book, but rather a LINQ book, I only cover the new C# 3.0 features which were added to support LINQ. Since automatic properties were not one of those features, I do not cover them. You may notice that my chapter dedicated to the new C# 3.0 features is titled C# 3.0 Language Enhancements For LINQ. Just for your reader’s knowledge, the ordering is now specified to be stable. Initially it was unstable, and was later changed to be stable but I was told it would be specified to be unstable, but apparently at some point, the specification was changed to be stable. My book was updated but apparently I missed a spot.

Most of the advice given throughout the book is reasonable, although I take issue with one significant area. Joe recommends using the OfType operator instead of the Cast operator, because when a nongeneric collection contains the “wrong type of object,” OfType will silently skip it whereas Cast will throw an exception. I recommend using Cast for exactly the same reason! If I’ve got an object of an unexpected type in my collection, I want to know about it as soon as possible. Throwing an exception tells me what’s going on immediately, instead of hiding the problem. It’s usually the better behaviour, unless you explicitly have reason to believe that you will legitimately have objects of different types in the collection and you really want to only find objects of the specified type.
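
The difference in behaviour is easy to see with a nongeneric collection holding one unexpected element. A minimal sketch, with made-up data and names:

```csharp
using System;
using System.Collections;
using System.Linq;

public static class OfTypeVersusCastSketch
{
    private static ArrayList CreateMixedList()
    {
        // The int is the "wrong type of object" in a list of strings.
        return new ArrayList { "one", 2, "three" };
    }

    // OfType silently skips the int.
    public static int CountWithOfType()
    {
        return CreateMixedList().OfType<string>().Count();
    }

    // Cast throws when iteration reaches the int.
    public static bool CastThrows()
    {
        try
        {
            CreateMixedList().Cast<string>().Count();
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(CountWithOfType());   // 2
        Console.WriteLine(CastThrows());        // True
    }
}
```

Note that Cast is deferred like most operators: the exception is thrown when the sequence is iterated, not when Cast is called.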

[Joe] Yes, I should have known better than to provide that advice (prefer OfType to Cast) without more explanation, more disclaimers, and more caveats. My preference would be to use Cast in development and debug built code for the exact reasons you mention, but to use OfType in production code. I would prefer my applications to handle unexpected data more gracefully in production than I would in development.
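
The difference between the two operators is easy to see with a small sketch. The list contents and the method names below are hypothetical, chosen only to put one element of the wrong type into a nongeneric collection.

```csharp
using System;
using System.Collections;
using System.Linq;

static class CastVersusOfType
{
    // Hypothetical data: a nongeneric list that is meant to hold only
    // strings, but has an int in it by mistake.
    public static ArrayList MakeList()
    {
        return new ArrayList { "Adams", 42, "Washington" };
    }

    public static int CountWithOfType(ArrayList list)
    {
        // OfType silently skips the element of the wrong type.
        return list.OfType<string>().Count();
    }

    public static bool CastThrows(ArrayList list)
    {
        try
        {
            // Cast is lazy, so the exception only appears on enumeration.
            list.Cast<string>().ToList();
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    static void Main()
    {
        ArrayList list = MakeList();
        Console.WriteLine(CountWithOfType(list)); // Prints: 2
        Console.WriteLine(CastThrows(list));      // Prints: True
    }
}
```

With OfType the stray 42 simply vanishes from the results; with Cast the problem surfaces as soon as the sequence is enumerated.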

As well as “headline” pieces of advice which are advertised right up to the table of contents, there are many hints and tips along the way, most of which really do add value. I believe they’d actually add more value if they weren’t sometimes buried within reference-like material – but as we’ve already seen, my personal preference is for a more narrative style of book anyway.

The code examples are in “snippet” form (i.e. without using directives, Main method declarations etc) but are complete aside from that. At the start of each chapter there’s a detailed list of which namespaces and references are involved, so there’s no guesswork required. In fact, I’d expect most of them to work in Snippy given an appropriate environment. Some examples are a bit longwinded – we only really need to see the 7 lines showing the list of presidents once or twice, not over and over again – but that’s a minor issue. Another niggle is Joe’s choices when it comes to a few bits of coding convention. There are various areas where we differ, but a few repeatedly bothered me: overuse (to my mind) of parentheses, “old-style” delegate creation (i.e. something.Click += new EventHandler(Foo) instead of just something.Click += Foo) and the explicit specification of type parameters on LINQ operators which don’t need them. Here’s one example which demonstrates the first and the last of these issues – as well as introducing an unnecessary cast:

// This is the code in the book (in listing 7-30)
XElement outOfPrintParticipant = xDocument
    .Element("BookParticipants")
    .Elements("BookParticipant")
    .Where(e => ((string)((XElement)e).Element("FirstName")) == "Joe"
        && ((string)((XElement)e).Element("LastName")) == "Rattz")
    .Single<XElement>();

// This is what I’d have preferred
XElement outOfPrintParticipant = xDocument
    .Element("BookParticipants")
    .Elements("BookParticipant")
    .Where(e => (string)e.Element("FirstName") == "Joe"
        && (string)e.Element("LastName") == "Rattz")
    .Single();

Check out the penultimate line of the original – a whopping 5 opening brackets and 6 closing ones. This issue looks even worse to me when it’s used to make return and throw look like method calls:

// From GetStringFromDb (P388)
throw (new Exception(
    String.Format("Unexpected exception executing query [{0}].", sqlQuery)));

// (Insert more code here) – same listing

return (result);

These just look odd and wrong. Of course they’re perfectly valid, but not pleasant to read in my view. On a more minor matter, Joe tends to close SQL connections, commands etc with an explicit try/finally block instead of the more idiomatic (to my mind) using statement, but again that probably bothers me more than others.
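
To show why I regard the using statement as the more idiomatic form, here is a rough sketch of the two styles side by side. FakeResource is a made-up stand-in for a connection or command (nothing here comes from the book's listings); it just records what happens to it.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a connection or command: it logs its lifecycle.
sealed class FakeResource : IDisposable
{
    private readonly List<string> log;

    public FakeResource(List<string> log)
    {
        this.log = log;
        log.Add("open");
    }

    public void Use() { log.Add("use"); }

    public void Dispose() { log.Add("dispose"); }
}

static class UsingDemo
{
    // Explicit try/finally, in the style the book tends to use.
    public static List<string> WithTryFinally()
    {
        var log = new List<string>();
        FakeResource resource = new FakeResource(log);
        try
        {
            resource.Use();
        }
        finally
        {
            resource.Dispose();
        }
        return log;
    }

    // The using statement: same cleanup guarantee, less ceremony.
    public static List<string> WithUsing()
    {
        var log = new List<string>();
        using (FakeResource resource = new FakeResource(log))
        {
            resource.Use();
        }
        return log;
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", WithTryFinally())); // Prints: open, use, dispose
        Console.WriteLine(string.Join(", ", WithUsing()));      // Prints: open, use, dispose
    }
}
```

Both methods dispose the resource however the block is exited; the using version just leaves less room to forget the finally block.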

The source code is all available on the web site, and it’s easy to find each listing. (The zip file is about 10 times larger than it needs to be because it contains all the bin/obj directories with all the compiled code in rather than just the source, but that’s a tiny niggle.)

Writing style

Joe’s writing style is very informal – or at least, while most of the text is in “normal” formal prose, there are plenty of informal pieces of writing there too. As readers of my book will know, I’m much the same – I try to keep things from getting too dry, despite that being the natural state for technical teaching. I have no idea how well I succeed for most readers, but Joe certainly manages. He occasionally takes it a little too far for my personal taste, usually around listing outputs. They’re often introduced as if Joe didn’t really know what the output would be, with a kind of “wow, it worked, who’d have thought?” comment afterwards. I suspect I’ve got some of this in my book too, but Joe lays it on a little too thickly for my liking. I don’t know whether it would be fairer to present a “medium-level” example of this rather than one which really grated, but this is the one (from page 257) which made such an impression that I remembered it over 300 pages later:

This should output the language attribute. Let’s see:
language="English"
Groovy! I have never actually written the word groovy before. I had to let the spelling checker spell it for me.

Now, I really want to stress that that’s a “worst case” rather than the average case, and indeed many listings don’t have anything “cutesy” about them. I just wanted to give an example of the kind of thing that didn’t work for me.

[Joe] Let me see if I get this straight. So you are saying you got to learn something about LINQ and how to spell groovy, and it stuck for over 300 pages and you are upset? Man, you know how to spell groovy now, what’s the problem? 8-D Would it annoy you less if I told you that is a reference to Austin Powers? My book is riddled with references to movies and TV shows, and that one is for Austin Powers. Maybe you didn’t catch that, or maybe you don’t like Austin Powers, or maybe you just still don’t like it. One reader was irritated when I said “Dude, Sweet” because he didn’t recognize that as a reference to Dude, Where’s My Car. I have references to Office Space, Arrested Development, Bottle Rocket, Seinfeld, The Matrix, Wargames, Tron, etc. In fact, on page 455, I actually use the word “moo” instead of “moot” in reference to Friends. My copy editor actually corrected that for me, but once I explained it, she let me have it back. So if you see something goofy, like “groovy” just know it is a reference to something and begin your investigation in your spare time. And if you see an error, it is intentional to make sure you are paying attention. ;-) As you have already pointed out, technical writing can be dry. I made an effort to inject humor into the book in the form of references to pop culture, most specifically movies and television. Sometimes the reference is in a comment like “groovy”, and sometimes it’s in the sample data like a character’s name. Like any comedian, every joke or reference can’t be a hit with everyone. I will say though that I have heard more from those that recognized the references and appreciated them (which helps carry a reader through the lesser interesting parts) than I have from those that found them annoying.

What really did work was including hints and tips which explicitly said where Joe had received unexpected results with slightly different code. If anything is unexpected to the author, it may well be unexpected to readers too, so I really appreciated reading that sort of thing. (It would be wearing if Joe were stupid and expected all kinds of silly results, but that’s not the case at all.)

Conclusion

Pro LINQ is a good book. It has enough niggles to keep me from using superlatives about it, but it’s good nonetheless. It’s Joe’s first book (just like C# in Depth is the first one I can truly call “mine”) and I hope he writes more. Having read it from cover to cover, I think it’ll be more useful as a reference for individual methods (when MSDN doesn’t quite cut it) than to reread whole chapters, but that’s not a problem. My slight complaints above certainly don’t stop it from being a book I’m pleased to own.

[Joe] I’ll take it as a compliment that you think my book would be useful for those times that MSDN isn’t good enough!

This is the first LINQ book I’ve reviewed – I already have LINQ in Action, which is also on the list to review at some point. (I’ve read large chunks of it in soft copy, but I haven’t been through the finished hard copy yet.) It will be interesting to see how the two compare. Next up will probably be “Programming C# 3.0” by Jesse Liberty, however.

Book reviews – what do you look for?

September 14, 2008 jonskeet 3 Comments

I’ve just started writing the book review for “Pro LINQ – Language Integrated Query in C# 2008” and I wondered what people look for in a review. I’ve talked before about who is in the best position to write a review – but this is slightly different. In particular, what sort of balance do you want between totally factual aspects (what’s covered, the kinds of mistakes I found) and pretty subjective aspects (the writing style, quality of advice given)? Is a long and detailed review useful, or are you likely to just skip to the conclusion anyway?

I guess it’s worth answering my own question, partly in the hope that someone will write this kind of review for C# in Depth. (There are plenty of reviews, but not many in significant detail.) Here’s what I like to see:

A mixture of subjective opinions and objective facts
An example or two of the kind of technical errors found, and a rough idea of how often such errors occur
Who the book is aimed at, and more subjectively who it wouldn’t be useful for
A brief summary of what’s covered – and what’s not covered, if that’s relevant
A feeling of how well structured/ordered the book is – does it lead the reader through the technology, or jump around?
An idea of the author’s style – formal or informal, reference or tutorial, etc
Which aspects of that style irked the reader, and which worked well
Examples of all of this! It’s one thing to say that a style annoys you – it’s another to give an example which will let the review’s reader judge for themselves.
How the author could improve, and their existing strengths
A final gut feeling of how much you like the book, despite/because of the above

Not all of these are suitable for all books, and I wouldn’t like to say that my own reviews have included all of them so far, but I think that’s what I’d appreciate reading. That suggests a fairly comprehensive review, of course – which is just what I’m after when making a reading decision.

I’d love to know what you think – it won’t be in time to affect the review I’m writing now, of course, but I’ll try to take comments into account for future reviews.

Book review: Accelerated C# 2008 by Trey Nash

August 1, 2008 jonskeet 20 Comments

Time for another book review, and this time it’s due to a recommendation from a reader who has this one, C# in Depth and Head First C#.

Resources

Introduction and disclaimer

My normal book review disclaimer applies, but probably more so than ever before. Yes, Accelerated C# 2008 is a competitor to C# in Depth. They’re different in many ways, but many people would no doubt be in the target audience for both books. If you meet that criterion, please be aware that as the author of C# in Depth I can’t possibly be 100% objective when reviewing another C# book. That said, I’ll try to justify my opinions everywhere I can.

Target audience and content overview

Accelerated C# 2008 is designed to appeal to existing developers with experience in an OO language. As one of the Amazon reviews notes, you may struggle somewhat if you don’t have any .NET experience beforehand – while it should be possible to read it knowing only Java or C++, there are various times where a certain base level of knowledge is assumed and you’ll want to refer to MSDN for some background material. If you come at the book with no OO experience at all, I expect you’ll have a hard time. Although Chapter 4 does cover the basics of OO in .NET (classes, structs, methods, properties etc), this isn’t really a beginner’s book.

In terms of actual content covered, Accelerated C# 2008 falls somewhere between C# in Depth (almost purely language) and C# 3.0 in a Nutshell (language and then core libraries). It doesn’t attempt to cover all the core technologies (IO, reflection, security, interop etc are absent) but it goes into detail beyond the C# language when it comes to strings, exceptions, collections, threading and more. As well as purely factual information, there’s a lot of guidance as well, including a whole chapter entitled “In Search of C# Canonical Forms.”

General impressions

I’d like to make it clear to start with that I like the book. I have a number of criticisms, none of which I’m making up for the sake of being critical – but that in no way means it’s a bad book at all. It’s very unlikely that you know everything in here (I certainly didn’t) and the majority of the guidance is sound. The code examples are almost always self-contained (a big plus in my view) and Trey’s style is very readable. Where there are inaccuracies, they’re usually pretty harmless, and the large amount of accurate and insightful material makes up for them.

Just as I often compare Java to C# in my book, so Trey often compares C++ to C# in his. While my balance of C# to C++ knowledge is such that these comments aren’t particularly useful to me, I can see them being good for a newcomer to C# from a C++ background. I thought there might have been a few too many comparisons (I understood the point about STL and lambdas/LINQ the first time round…) but that’s just a minor niggle.

Where C# in Depth is primarily a “read from start to finish” book and C# 3.0 in a Nutshell is primarily a reference book (both can be used the other way, of course) Accelerated C# 2008 falls between the two. It actually achieves the best of both worlds to a large extent, which is an impressive feat. The ordering could be improved (more on this later on) but the general feeling is very good.

One quick word about the size of the book in terms of content: if you’re one of those people who judges the amount of useful content in a book on its page count, it’s worth noting that the font in this book is pretty small. I would guess that it packs about 25% more text per page than C# in Depth does, taking its “effective” page count from around 500 to 625. Also, the content is certainly meaty – you’re unlikely to find yourself skimming over loads of simple stuff trying to get to the good bits. Speaking of “getting to the good bits” let’s tackle my first significant gripe.

Material organisation

If you look at the tables of contents for Accelerated C# 2008 and Accelerated C# 2005, you’ll notice that the exact same chapter titles in the 2005 edition carry over in the same order in the 2008 edition. There are three extra chapters in the new edition, covering extension methods, lambda expressions and LINQ. That’s not to say that the content of the “duplicate” chapters is the same as before – C# 3.0 features are introduced in the appropriate place within existing chapters. In terms of ordering the chapters, I think it would have been much more appropriate to keep the last chapter of the old edition – “In Search of C# Canonical Forms” – as the last chapter of the new edition. Apart from anything else, that would allow it to include hints and tips involving the new C# 3 features which are currently covered later. It really feels like a “wrapping up” chapter, and deserves to be last.

That’s not the only time that the ordering felt strange, however. Advanced topics (at least ones which feel advanced to me) are mixed in with fairly basic ones. For instance, in the chapter on exceptions, there’s a section about “exception neutrality” which includes details about constrained execution regions and critical finalizers. All interesting stuff – even though I wish there were more of a prominent warning saying, “This is costly to both performance and readability: only go to these lengths when you really, really need to.” However, this comes before a section about using try/finally blocks and the using statement to make sure that resources are cleaned up however a block is exited. I can’t imagine anyone who knows enough C# to take in the exception neutrality material also not knowing about try/finally or the using statement (or how to create your own custom exception class, which comes between these two topics).

Likewise the chapter which deals with collections, including generic ones, comes before the chapter on generics. If I were a reader who didn’t know generics already, I think I’d get very confused reading about ICollection<T> without knowing what the T meant. Now don’t get me wrong: ordering material so that you don’t get “circular references” is often hard if not impossible. I just think it could have been done better here.

Aiming too deep?

It’s not like me to criticise a book for being too deep, but I’m going to make an exception here. Every so often, I came away from a topic thinking that it would have been better covered a little bit more lightly. Sometimes this was because a running example became laborious and moved a long way from anything you were actually likely to want to do in real life. The sections on “borrowing from functional programming” and memoization/currying/anonymous recursion felt guilty of this. It’s not that they’re not interesting topics, but the examples picked didn’t quite work for me.

The other problem with going deep is that you really, really need to get things right – because your readers are less likely to spot the mistakes. I’ll give three examples here:

Trey works hard on a number of occasions to avoid boxing, and points it out each time. Without any experience in performance tuning, you’d be forgiven for thinking that boxing is the primary cause of poor performance in .NET applications based on this book. While I agree that it’s something to be avoided where it’s possible to do so without bending the design out of shape, it doesn’t deserve to be laboured as much as it is here. In particular, Trey gives an example of a complex number struct and how he’s written appropriate overloads etc to avoid boxing. Unfortunately, to calculate the magnitude of the complex number (used to implement IComparable in a manner which violates the contract, but that’s another matter) he uses Math.Pow(real, 2) + Math.Pow(img, 2). Using a quick and dirty benchmark, I found that using real * real + img * img instead of Math.Pow made far, far more difference than whether or not the struct was boxed. (I happen to think it’s more readable code too, but never mind.) There was nothing wrong with avoiding the boxing, but in chasing the small performance gains, the big ones were missed.
In the chapter on threading, there are some demonstrations of lock-free programming (before describing locking, somewhat oddly – and without describing the volatile modifier). Now, personally I’d want to discourage people from attempting lock-free programming at all unless they’ve got a really good reason (with evidence!) to support that decision – but if you’re going to do it at all, you need to be hugely careful. One of the examples basically has a load of threads starting and stopping, updating a counter (correctly) using Interlocked.Increment/Decrement. Another thread monitors the count and periodically reports it – but unfortunately it uses this statement to do it:

threadCount = Interlocked.Exchange(ref numberThreads, numberThreads);

The explanation states: “Since the Interlocked class doesn’t provide a method to simply read an Int32 value in an atomic operation, all I’m doing is swapping the numberThreads variable’s value with its own value, and, as a side effect, the Interlocked.Exchange method returns to me the value that was in the slot.” Well, not quite. It’s actually swapping the numberThreads variable’s value with a value evaluated at some point before the method call. If you rewrite the code like this, it becomes more obviously wrong:

int tmp = numberThreads;
Thread.Sleep(1000); // What could possibly happen during this time, I wonder?
threadCount = Interlocked.Exchange(ref numberThreads, tmp);

The call to Thread.Sleep is there to make it clear that numberThreads can very easily change between the initial read and the call to Interlocked.Exchange. The correct fix to the code is to use something like this:

threadCount = Interlocked.CompareExchange(ref numberThreads, 0, 0);

That sets numberThreads atomically to the value 0 if (and only if) its value is already 0 – in other words, it will never actually change the value, just report it. Now, I’ve laboured the explanation of why the code is wrong because it’s fairly subtle. Obvious errors in books are relatively harmless – subtle ones are much more worrying.
As a final example for this section, let’s look at iterator blocks. Did you know that any parameters passed to methods implemented using iterator blocks become public fields in the generated class? I certainly didn’t. Trey pointed out that this meant they could easily be changed with reflection, and that could be dangerous. (After looking with Reflector, it appears that local variables within the iterator block are also turned into public fields.) Now, leaving aside the fact that this is hugely unlikely to actually bite anyone (I’d be frankly amazed to see it as a problem in the wild) the suggested fix is very odd.

The example Trey gives is where originally a Boolean parameter is passed into the method, and used in two places. Oh no! The value of the field can be changed between those two uses, which could lead to problems! True. The supposed fix is to wrap the Boolean value in an immutable struct ImmutableBool, and pass that in instead. Now, why would that be any better? Certainly you can’t change the value within the struct – but you can easily change the field’s value to be a completely different instance of ImmutableBool. Indeed, the breakage would involve exactly the same code, just changing the type of the value. The other train of thought which suggests that this approach would fail is that bool is already immutable, so it can’t be the mutability of the type of the field that causes problems. I’m sure there are much more useful things that Trey could have said in the two and a half pages he spent describing a broken fix to an unimportant problem.
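
Here is a rough sketch of why the wrapper doesn't help. It doesn't use a real compiler-generated iterator class: FakeIteratorState is a hand-written stand-in with public fields, and this ImmutableBool is my own guess at the shape of the struct rather than the book's exact code.

```csharp
using System;
using System.Reflection;

// Illustrative immutable wrapper around a bool (an assumption, not the book's listing).
struct ImmutableBool
{
    private readonly bool value;

    public ImmutableBool(bool value) { this.value = value; }

    public bool Value { get { return value; } }
}

// Hand-written stand-in for a generated iterator class with public fields.
class FakeIteratorState
{
    public bool plainFlag;
    public ImmutableBool wrappedFlag;
}

static class ImmutableBoolDemo
{
    // Replaces both fields via reflection and returns the tampered object.
    public static FakeIteratorState Tamper()
    {
        var state = new FakeIteratorState
        {
            plainFlag = true,
            wrappedFlag = new ImmutableBool(true)
        };

        Type type = typeof(FakeIteratorState);
        // Exactly the same reflection call works for both fields:
        // the wrapper is swapped for a different instance wholesale.
        type.GetField("plainFlag").SetValue(state, false);
        type.GetField("wrappedFlag").SetValue(state, new ImmutableBool(false));
        return state;
    }

    static void Main()
    {
        FakeIteratorState state = Tamper();
        Console.WriteLine(state.plainFlag);         // Prints: False
        Console.WriteLine(state.wrappedFlag.Value); // Prints: False
    }
}
```

The immutability of ImmutableBool is never violated; the field holding it is simply overwritten with another value, just as the plain bool field is.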

Sorry, that was getting ranty for a bit… but I hope you understand why. Before concluding this review, let’s look at one chapter which is somewhat different to the rest, and which I’ve mentioned before:

In Search of C# Canonical Forms (aka “Design and Implementation Guidelines” :)

I’d been looking forward to this part of the book. I’m always interested in seeing what other people think the most important aspects of class design are. The book doesn’t go into much detail about abstract orientation (in this chapter, anyway – there’s plenty scattered through the book) but concentrates on core interfaces you might implement, etc. That’s fine. I’m still waiting for a C# book to be written to truly be on a par with Effective Java (I have the second edition waiting to be read at work…) but I wasn’t expecting it all to be here. So, was this chapter worth the wait?

Somewhat. I was very glad to see that the first point around reference types was “Default to sealed classes” – I couldn’t agree more, and the arguments were well articulated. Many other guidelines were either entirely reasonable or at least I could go either way on. There were a few where I either disagreed or at least would have put things differently:

Implementing cloning with copy constructors: one point about cloning which wasn’t mentioned is that (to quote MSDN) “The resulting clone must be of the same type as or a compatible type to the original instance.” The suggested implementation of Clone in the book is to use copy constructors. This means that every subclass must override Clone to call its own copy constructor, otherwise the instance returned will be of the wrong type. MemberwiseClone always creates an instance of the same type. Yes, it means the constructor isn’t called – but frankly the example given (performing a database lookup in the constructor) is a pretty dodgy cloning scenario in the first place, in my view. If I create a clone and it doesn’t contain the same data as the original, there’s something wrong. Having said that, the caveats Trey gives around MemberwiseClone are all valid in and of themselves – we just disagree about their importance. The advice to not actually implement ICloneable in the first place is also present (and well explained).
Implementing IDisposable: Okay, so this is a tough topic, but I was slightly disappointed to see the recommendation that “it’s wise for any objects that implement the IDisposable interface to also implement a finalizer […]” Now admittedly on the same page there’s the statement that “In reality, it’s rare that you’ll ever need to write a finalizer” but the contradiction isn’t adequately resolved. A lot of people have trouble understanding this topic, so it would have been nice to see really crisp advice here. My 20 second version of it is: “Only implement a finalizer if you’re holding on to resources which won’t be cleaned up by their own finalizers.” That actually cuts out almost everything, unless you’ve got an IntPtr to a native handle (in which case, use SafeHandle instead).
- As a side note, Trey repeatedly claims that “finalizers aren’t destructors” which irks me somewhat as the C# spec (the MS version, anyway) uses the word “destructor” exclusively – a destructor is the way you implement a .NET finalizer in C#. It would be fine to say “destructors in C# aren’t deterministic, unlike destructors in C++” but I think it’s worth acknowledging that the word has a valid meaning in the context of C#. Anyway…
Implementing equality comparisons: while this was largely okay, I was disappointed to see that there wasn’t much discussion of inheritance and how it breaks equality comparisons in a hard-to-fix way. There’s some mention of inheritance, but it doesn’t tackle the issue I think is thorniest: If I’m asking one square whether it’s equal to another square, is it enough to just check for everything I know about squares (e.g. size and position)? What about if one of the squares is actually a coloured square – it has more information than a “basic” square. It’s very easy to end up with implementations which break symmetry or transitivity, simply because the question isn’t well-defined. You effectively need to be asking “are these two objects equal in <this> particular aspect” – but you don’t get to specify the aspect. This is an example where I remember Effective Java (first edition) giving a really thorough explanation of the pitfalls and potential implementations. The coverage in Accelerated C# 2008 is far from bad – it just doesn’t meet the gold standard. Arguably it’s unfair to ask another book to compete at that level, when it’s trying to do so much else as well.
Ordering: I mentioned earlier on that the complex number class used for a boxing example failed to implement comparisons appropriately. Unfortunately it’s used as the example specifically for “how to implement IComparable and IComparable<T>” as well. To avoid going into too much detail, if you have two instances x and y such that x != y but x.Magnitude == y.Magnitude, you’ll find x.CompareTo(y) == y.CompareTo(x) (but with a non-zero result in both cases). What’s needed here is a completely different example – one with a more obvious ordering.
Value types and immutability: Okay, so the last bullet on the value types checklist is “Should this struct be immutable? […] Values are excellent candidates to be immutable types” but this comes after “Need to boxed instances of value? Implement an interface to do so […]” No! Just say no to mutable value types to start with! Mutable value types are bad, bad, bad, and should be avoided like the plague. There are a very few situations where it may be appropriate, but to my mind any advice checklist for implementing structs should make two basic points:
- Are you sure you really wanted a struct in the first place? (They’re rarely the right choice.)
- Please make it immutable! Pretty please with a cherry on top? Every time a struct is mutated, a cute kitten dies. Do you really want to be responsible for that?
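
To give a flavour of why mutable value types cause trouble, here is a minimal sketch. MutableCounter and Holder are invented for this illustration and don't come from the book. The property getter returns a copy of the struct, so the mutation happens to a temporary value and is silently lost.

```csharp
using System;

// A deliberately mutable struct (hypothetical example type).
struct MutableCounter
{
    public int Value;

    public void Increment() { Value++; }
}

class Holder
{
    // The getter hands back a copy of the struct, not a reference to it.
    public MutableCounter Counter { get; set; }
}

static class MutableStructDemo
{
    public static int IncrementThroughProperty()
    {
        var holder = new Holder();
        holder.Counter.Increment(); // Mutates a temporary copy, which is then discarded.
        return holder.Counter.Value;
    }

    public static int IncrementLocal()
    {
        var counter = new MutableCounter();
        counter.Increment(); // Mutates the local variable itself.
        return counter.Value;
    }

    static void Main()
    {
        Console.WriteLine(IncrementThroughProperty()); // Prints: 0
        Console.WriteLine(IncrementLocal());           // Prints: 1
    }
}
```

The same call gives different results depending on where the value happens to live, which is exactly the kind of surprise an immutable struct rules out.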

Conclusion

At the risk – nay, certainty – of repeating myself, I’m going to say that I like the book despite the (sometimes subjective) flaws pointed out above. As Shakespeare wrote in Julius Caesar, “The evil that men do lives after them; the good is oft interred with their bones.” So it is with book reviews – it’s a lot easier to give specific examples of problems than it is to report successes – but the book does succeed, for the most part. Perhaps the root of almost all my reservations is that it tries to do too much – I’m not sure whether it’s possible to go into that much detail and cater for those with little or no previous C# experience (even with Java/C++) and keep to a relatively slim volume. It was a very lofty goal, and Trey has done very well to accomplish what he has. I would be interested to read a book by him (and hey, potentially even collaborate on it) which is solely on well-designed classes and libraries.

In short, I recommend Accelerated C# 2008, with a few reservations. Hopefully you can judge for yourself whether my reservations would bother you or not. I think overall I slightly prefer C# 3.0 in a Nutshell, but the two books are fairly different.

Reaction

I sent this to Trey before publishing it, as is my custom. He responded to all my points extremely graciously. I’m not sure yet whether I can post the responses themselves – stay tuned for the possibility, at least. My one problem with reviewing books is that I end up in contact with so many other authors who I’d like to work with some day, and that number has just increased again…