Book review: Pro LINQ – Language Integrated Query in C# 2008, by Joe Rattz

I’m trying something slightly different this time. Joe (the author) has reacted to specific points of my review, and I think it makes sense to show those reactions. I’d originally hoped to present them so that you could toggle them on or off, but this blog server apparently wants to strip out scripts etc, so the comments are now permanently visible.

Introduction and disclaimer

As usual, I first need to give the disclaimer that as the author of a somewhat-competing book, I may be biased and certainly will have different criteria to most people. In this case the competition aspect is less direct than normal – this book is “LINQ with the C# 3 bits as necessary” whereas my book is “C# 2 and 3 with LINQ API where necessary”. However, it’s still perfectly possible that a potential reader may try to choose between the two books and buy just one. If you’re in that camp, I suggest you try to find an impartial opinion instead of trusting my review.

A second disclaimer is needed this time: I didn’t buy my copy of this book; it was sent to me by Apress at the request of Joe Rattz, specifically for review (and because Joe’s a nice guy). I hope readers of my other reviews will be confident that this won’t change the honest nature of the review; where there are mistakes or possible improvements, I’m happy to point them out.

Content, audience and overall approach

This book is simply aimed at existing C# developers who want to learn LINQ. There’s an assumption that you’re already reasonably confident in C# 2 – knowledge of generics is taken as read, for example – but there is brief coverage of using iterator blocks to return sequences. No prior experience of LINQ is required, but the LINQ to XML and LINQ to SQL sections assume (not unreasonably) that you already know XML and SQL.

The book is divided into five parts:

  • Introduction and C# 3.0 features (50 pages)
  • LINQ to Objects (130 pages)
  • LINQ to XML (152 pages)
  • LINQ to DataSet (42 pages)
  • LINQ to SQL (204 pages)

The approach to the subject matter changes somewhat through the book. Sometimes it’s a concept-by-concept “tutorial style” approach, but for most of the book (particularly the LINQ to Objects and LINQ to XML parts) it reads more like an API reference. Joe recommends that readers tackle the book from cover to cover, but that falls down a bit in the more reference-oriented sections.

[Joe] Early in the development of my book, a friend asked me if it was going to be a tutorial-style book or a reference book. I initially found the question odd because I never really viewed books as being exclusively one or the other. Perhaps I am different than most readers, but when I buy a programming book, I usually read a bit, start coding, and then refer to the book as a reference when needed. This is how I envision my book being used by readers and the type of book I would like for it to be. I see it as both a tutorial and a reference. I want it to be a book that gets used repeatedly, not read once and shelved. Some books work better for this than others. I rarely read a programming book cover to cover because I just don’t have time for that. I think ultimately, most authors write the book they would want to read, and that is what I did. I hope that if someone buys my book, in two years it will be tattered and worn from use as a reference, as well as read cover to cover.

I would disagree that the majority of the book reads like an API reference. Certainly, chapters 4 and 5 (deferred and nondeferred operators) work better as a reference because there isn’t a lot of connective context between the approximately 50 different standard query operators. At best it would be an eclectic tutorial with little continuity. So I decided to make those two chapters (the ones covering the standard query operators) function more like a reference. I knew that I (and hopefully my readers) would refer to it time and time again for information about the operators, and based on most of the reviews I have seen, this appears to have been a good choice. I know I refer to it myself quite frequently. I would not consider the chapters on LINQ to XML to be reference oriented although I could see why someone might feel they are. My discussion of LINQ to XML is tutorial based as I approach the different tasks a developer would need to accomplish when working with XML, such as how to construct XML, how to output XML, how to input XML, how to traverse XML, etc. However, within a task, like traversing XML, I do list the API calls and discuss them, so this is probably why it feels reference-like to some readers, and will function pretty well as a reference.

For example, take the ordering operators in LINQ – OrderBy, ThenBy, OrderByDescending and ThenByDescending. (Interestingly, one of the Amazon reviews picks up on the same example. I already had it in mind before reading that review.) These four LINQ to Objects operators take 15 pages to cover because every method overload is used, but a lot of it is effectively repeated between different examples. I think more depth could have been achieved in a shorter space by talking about the group as a whole – we only really need to see what happens when a custom comparison is used once, not four times – whereas every example of ThenBy/ThenByDescending used an identity projection, instead of showing how you can make the secondary ordering use some completely different projection (without necessarily using a custom comparer). Likewise I don’t remember seeing anything about tertiary orderings, or what the descending orderings tend to do with nulls, or emphasis on the fact that descending orderings aren’t just reversed ascending orderings (due to the stability of the sort – the stability was mentioned, but not this important corollary). Having an example for each overload is useful for a reference work, but not for a “read through from start to finish” book.
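To make the ordering points concrete, here is a minimal sketch. The Member type, the sample data and the method names are made up for illustration and are not from the book. It shows a secondary ordering that uses a different projection from the primary one, and that a descending ordering is not the same as a reversed ascending ordering when keys tie, because the LINQ to Objects sort is stable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample type, not taken from the book.
public class Member
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public static class OrderingDemo
{
    // Two pairs of members share an age, so ties are visible in the results.
    static Member[] Members()
    {
        return new[]
        {
            new Member { Name = "Ann", Age = 30 },
            new Member { Name = "Bob", Age = 25 },
            new Member { Name = "Cat", Age = 30 },
            new Member { Name = "Dan", Age = 25 },
        };
    }

    // Primary key: Age. Secondary key: a different projection (Name), descending.
    public static List<string> ByAgeThenNameDescending()
    {
        return Members().OrderBy(m => m.Age)
                        .ThenByDescending(m => m.Name)
                        .Select(m => m.Name)
                        .ToList();
    }

    // Stable descending sort: members with equal ages keep their original order.
    public static List<string> Descending()
    {
        return Members().OrderByDescending(m => m.Age)
                        .Select(m => m.Name)
                        .ToList();
    }

    // Reversing an ascending sort also reverses the order within each tie.
    public static List<string> ReversedAscending()
    {
        return Members().OrderBy(m => m.Age)
                        .Reverse()
                        .Select(m => m.Name)
                        .ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", ByAgeThenNameDescending())); // Dan, Bob, Cat, Ann
        Console.WriteLine(string.Join(", ", Descending()));              // Ann, Cat, Bob, Dan
        Console.WriteLine(string.Join(", ", ReversedAscending()));       // Cat, Ann, Dan, Bob
    }
}
```

A tertiary ordering would simply be another ThenBy or ThenByDescending call chained after the second key.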

The set operators (Distinct, Except, Intersect, Union and SequenceEqual) as applied to DataSets suffer a similar problem – the five descriptions of why custom comparers are needed are all basically the same, and could be dealt with once. In particular, one paragraph is repeated verbatim for each operator. Again, that’s fine for a reference – but cutting and pasting like this makes for an irritating read when you see the exact same text several times in one reading session.

[Joe] A few readers have complained about some of the redundancies that you have pointed out, but I think most of the readers have appreciated my attempt to provide material for each operator/method. I think one of the words you will see most often in the Amazon reviews is “thorough”.

Now, it’s important that I don’t give the wrong impression here. This is certainly not just a reference book, and there’s enough introduction to topics to help readers along. If I’d been coming to C# 3 and LINQ without any other information, I think I’d have followed things, for the most part. (I’m not a fan of the way Joe presented the query expression translations, but I’m enormously pleased that he did it at all. I think I might have got lost at that point, which was unfortunately early in the book. It might have been better as just an appendix.) Anyone reading the book thoroughly should come away with a competent knowledge of LINQ and the ability to use it profitably. They may well be less comfortable with the new features of C# 3, as they’re only covered briefly – but that’s entirely appropriate given the title and target of the book. (To be blunt and selfish, I’m entirely in favour of books which leave room for more depth at a language level – that should be a good thing for sales of my own book!)

[Joe] Jon, if you only knew how difficult it was getting those query expression translations into the book. ;-) You can read in my acknowledgments where I specifically thank Katie Stence and her team for them. They were a very painful effort and in hindsight, I probably would not include them if I were to start the book from scratch. I agree with you that the translations are complex, as the book states. Perhaps the most important part of that section is when I state “Allow me to provide a word of warning. The soon to be described translation steps are quite complicated. Do not allow this to discourage you. You no more need to fully understand the translation steps to write LINQ queries than you need to know how the compiler translates the foreach statement to use it. They are here to provide additional translation information should you need it, which should be rarely, or never.”

However, I would personally have preferred to see a more conceptual approach which spent more time focused on getting the ideas through at a deep level and less time making sure that every overload was covered. After all, MSDN does a reasonable job as a reference – and the book’s web site could have contained an example for every overload if necessary without everything making it into print. The kind of thing I’d have liked to see explored more fully is the buffering vs streaming nature of data flow in LINQ. Some operators – Select and Where, for example – stream all their data through. They never look at more than one item of data at a time. Others (Reverse and OrderBy, for example) have to buffer up all the data in the sequence before yielding any of it. Still others use two sequences, and may buffer one sequence and stream the other – Join and Intersect work that way at the moment, although as we saw in my last blog post, Intersect can be implemented in a way which streams both sequences (but still needs to keep a buffer of data it’s already seen). When you’re working with an infinite (or perhaps just very large – much bigger than memory) sequence you really need to be aware of this distinction, but it isn’t covered in Pro LINQ as far as I remember. In the interests of balance, I should point out that the difference between immediate and deferred execution is explained, repeatedly and clearly – including the semi-immediate execution which can occur sometimes in LINQ to SQL.
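The streaming versus buffering distinction can be seen with a small experiment. This is only a sketch: Naturals, Logged and Trace are hypothetical helpers written for this illustration, not anything from the book or the framework. Logged records each item as it is pulled from the source, so the log shows whether an operator hands items on one at a time or reads everything first.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StreamingDemo
{
    // An effectively endless sequence 0, 1, 2, ... built with an iterator block.
    public static IEnumerable<int> Naturals()
    {
        for (int i = 0; ; i++)
        {
            yield return i;
        }
    }

    // Passes items through unchanged, logging each one as it is pulled.
    public static IEnumerable<int> Logged(IEnumerable<int> source, List<string> log)
    {
        foreach (int item in source)
        {
            log.Add("read " + item);
            yield return item;
        }
    }

    // Where, Select and Take all stream, so this terminates despite the endless source.
    public static List<int> FirstEvenSquares()
    {
        return Naturals().Where(n => n % 2 == 0)
                         .Select(n => n * n)
                         .Take(3)
                         .ToList();
    }

    // Runs either a buffering operator (OrderBy) or a streaming one (Where)
    // over the same logged source and returns the interleaving of reads and results.
    public static List<string> Trace(bool buffered)
    {
        var log = new List<string>();
        IEnumerable<int> source = Logged(new[] { 3, 1, 2 }, log);
        IEnumerable<int> query = buffered
            ? source.OrderBy(x => x)
            : source.Where(x => x > 0);
        foreach (int value in query)
        {
            log.Add("got " + value);
        }
        return log;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", FirstEvenSquares())); // 0, 4, 16
        Console.WriteLine(string.Join(", ", Trace(false)));
        // read 3, got 3, read 1, got 1, read 2, got 2
        Console.WriteLine(string.Join(", ", Trace(true)));
        // read 3, read 1, read 2, got 1, got 2, got 3
    }
}
```

Replacing Where with OrderBy in FirstEvenSquares would never return, since OrderBy has to consume the whole source before yielding its first result.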

[Joe] I wanted my book to cover each overload because I can’t read MSDN in the bathroom, or when at the beach without an internet connection, or when curled up in a chair by the fireplace. I also wanted to provide examples for every method and overload because I find it frustrating when a book shows the simplest one and I have to figure out the one I need. Granted, depth could be added too, but you have to draw the line somewhere. Apress (at the time, not sure if this is still the plan) has the concept of three levels of book; Foundations, Pro, and Expert. I considered some information beyond the scope of the Pro level that my book is aimed at. The buffering versus streaming issue is an interesting one and would make an excellent additional column in Table 3-1, if I can get it to fit.

I’m unable to really judge the depth to which LINQ to SQL was explored, given that a lot of it was beyond my own initial knowledge (which is a good thing!). I’m slightly perturbed by the idea that it can be comprehensively tackled in a couple of hundred pages, whereas books on other ORMs are often much bigger and tackle topics such as session lifetimes and caching in much more depth. I suspect this is more due to the technologies than the writing here – LINQ to SQL is a relatively feature-poor ORM compared with, say, Hibernate – but a bit more attention to “here are options to consider when writing an application” would have been welcome.

Accuracy and code style

Most of Pro LINQ is pretty accurate. Joe is occasionally a bit off in terms of terminology, but that probably bothers most readers less than it bothers me. There are a few things which changed between the beta version of VS2008 against which the book was clearly developed and the release version, which affect the new features of C# 3. For instance, automatically implemented properties aren’t mentioned at all (and would have been much nicer to see in examples than public fields) and collection initializers are described with the old restrictions (the collection type has to implement ICollection<T>) rather than the new ones (the collection type has to implement IEnumerable and have appropriate Add methods). Other errors include trusting the documentation too much (witness the behaviour of Intersect) and an inconsistency (stating correctly that OrderBy is stable on one page, then incorrectly warning that it’s unstable on another). In my normal fashion, I’ll give Joe an exhaustive list of everything I’ve found and leave it up to him to see which he’d like to fix for the next printing, but overall Pro LINQ does pretty well. I suspect this may be partly due to covering a great deal of area but with relatively little depth and some repetition – Accelerated C# had a higher error rate, but was delving into more treacherous waters, for example.
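As a rough illustration of the two C# 3 points above, the following sketch uses automatically implemented properties instead of public fields, and a collection initializer on a type that implements only the nongeneric IEnumerable and has a suitable Add method (so it does not implement ICollection<T>). Person and PersonRegistry are invented names for this example.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Automatically implemented properties: no explicit backing fields needed.
public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

// Implements IEnumerable and exposes an Add method, which is all the
// release-version collection initializer rules require.
public class PersonRegistry : IEnumerable
{
    private readonly List<Person> people = new List<Person>();

    public int Count
    {
        get { return people.Count; }
    }

    public void Add(string name, int age)
    {
        people.Add(new Person { Name = name, Age = age });
    }

    public IEnumerator GetEnumerator()
    {
        return people.GetEnumerator();
    }
}

public static class InitializerDemo
{
    public static PersonRegistry Build()
    {
        // Each nested brace pair is compiled into a call to Add(string, int).
        return new PersonRegistry
        {
            { "Ann", 30 },
            { "Bob", 25 },
        };
    }

    public static void Main()
    {
        foreach (Person person in Build())
        {
            Console.WriteLine(person.Name + " is " + person.Age);
        }
    }
}
```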

[Joe] Since my book is not meant to be a C# 3.0 book, but rather a LINQ book, I only cover the new C# 3.0 features which were added to support LINQ. Since automatic properties were not one of those features, I do not cover them. You may notice that my chapter dedicated to the new C# 3.0 features is titled C# 3.0 Language Enhancements For LINQ. Just for your readers’ knowledge, the ordering is now specified to be stable. Initially it was unstable and was later changed to be stable; I was told it would still be specified as unstable, but apparently at some point the specification was changed to say it is stable. My book was updated, but apparently I missed a spot.

Most of the advice given throughout the book is reasonable, although I take issue with one significant area. Joe recommends using the OfType operator instead of the Cast operator, because when a nongeneric collection contains the “wrong type of object,” OfType will silently skip it whereas Cast will throw an exception. I recommend using Cast for exactly the same reason! If I’ve got an object of an unexpected type in my collection, I want to know about it as soon as possible. Throwing an exception tells me what’s going on immediately, instead of hiding the problem. It’s usually the better behaviour, unless you explicitly have reason to believe that you will legitimately have objects of different types in the collection and you really want to only find objects of the specified type.
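The difference in behaviour is easy to demonstrate. In this minimal sketch (the CastDemo class and its data are made up for illustration), a nongeneric ArrayList that is supposed to hold only strings also contains an int. OfType quietly drops it, whereas Cast throws an InvalidCastException when the query is enumerated.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public static class CastDemo
{
    // A nongeneric collection with one element of the "wrong" type.
    static ArrayList MakeItems()
    {
        return new ArrayList { "one", "two", 3 };
    }

    // OfType silently skips anything that is not a string.
    public static List<string> WithOfType()
    {
        return MakeItems().OfType<string>().ToList();
    }

    // Cast is deferred, so the exception surfaces when ToList enumerates the query.
    public static bool CastThrows()
    {
        try
        {
            MakeItems().Cast<string>().ToList();
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", WithOfType())); // one, two
        Console.WriteLine(CastThrows());                    // True
    }
}
```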

[Joe] Yes, I should have known better than to provide that advice (prefer OfType to Cast) without more explanation, more disclaimers, and more caveats. My preference would be to use Cast in development and debug built code for the exact reasons you mention, but to use OfType in production code. I would prefer my applications to handle unexpected data more gracefully in production than I would in development.

As well as “headline” pieces of advice which are advertised right up to the table of contents, there are many hints and tips along the way, most of which really do add value. I believe they’d actually add more value if they weren’t sometimes buried within reference-like material – but as we’ve already seen, my personal preference is for a more narrative style of book anyway.

The code examples are in “snippet” form (i.e. without using directives, Main method declarations etc) but are complete aside from that. At the start of each chapter there’s a detailed list of which namespaces and references are involved, so there’s no guesswork required. In fact, I’d expect most of them to work in Snippy given an appropriate environment. Some examples are a bit longwinded – we only really need to see the 7 lines showing the list of presidents once or twice, not over and over again – but that’s a minor issue. Another niggle is Joe’s choices when it comes to a few bits of coding convention. There are various areas where we differ, but a few repeatedly bothered me: overuse (to my mind) of parentheses, “old-style” delegate creation (i.e. something.Click += new EventHandler(Foo) instead of just something.Click += Foo) and the explicit specification of type parameters on LINQ operators which don’t need them. Here’s one example which demonstrates the first and the last of these issues – as well as introducing an unnecessary cast:

// This is the code in the book (in listing 7-30)
XElement outOfPrintParticipant = xDocument
  .Element("BookParticipants")
  .Elements("BookParticipant")
  .Where(e => ((string)((XElement)e).Element("FirstName")) == "Joe"
           && ((string)((XElement)e).Element("LastName")) == "Rattz")
  .Single<XElement>();

// This is what I’d have preferred
XElement outOfPrintParticipant = xDocument
  .Element("BookParticipants")
  .Elements("BookParticipant")
  .Where(e => (string)e.Element("FirstName") == "Joe"
           && (string)e.Element("LastName") == "Rattz")
  .Single();

Check out the penultimate line of the original – a whopping 5 opening brackets and 6 closing ones. This issue looks even worse to me when it’s used to make return and throw look like method calls:

// From GetStringFromDb (P388)
throw (new Exception(
            String.Format("Unexpected exception executing query [{0}].", sqlQuery)));

// (Insert more code here) – same listing

return (result);

These just look odd and wrong. Of course they’re perfectly valid, but not pleasant to read in my view. On a more minor matter, Joe tends to close SQL connections, commands etc with an explicit try/finally block instead of the more idiomatic (to my mind) using statement, but again that probably bothers me more than others.
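For comparison, here is a small sketch of the two disposal styles. FakeConnection is a hypothetical stand-in for a real resource such as a SQL connection, so the example runs without a database; it is not code from the book. Both methods release the resource even if the query throws, but the using statement says so with less ceremony.

```csharp
using System;

// Hypothetical stand-in for a disposable resource such as a database connection.
public class FakeConnection : IDisposable
{
    // Number of connections created but not yet disposed.
    public static int LiveCount;

    private bool disposed;

    public FakeConnection()
    {
        LiveCount++;
    }

    public string Query(string sql)
    {
        if (disposed)
        {
            throw new ObjectDisposedException("FakeConnection");
        }
        return "result of " + sql;
    }

    public void Dispose()
    {
        if (!disposed)
        {
            disposed = true;
            LiveCount--;
        }
    }
}

public static class DisposalDemo
{
    // Explicit try/finally, in the style the review describes.
    public static string WithTryFinally()
    {
        FakeConnection connection = new FakeConnection();
        try
        {
            return connection.Query("SELECT 1");
        }
        finally
        {
            connection.Dispose();
        }
    }

    // The using statement expands to the same try/finally pattern.
    public static string WithUsing()
    {
        using (FakeConnection connection = new FakeConnection())
        {
            return connection.Query("SELECT 1");
        }
    }

    public static void Main()
    {
        Console.WriteLine(WithTryFinally());         // result of SELECT 1
        Console.WriteLine(WithUsing());              // result of SELECT 1
        Console.WriteLine(FakeConnection.LiveCount); // 0
    }
}
```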

The source code is all available on the web site, and it’s easy to find each listing. (The zip file is about 10 times larger than it needs to be because it contains all the bin/obj directories with all the compiled code in rather than just the source, but that’s a tiny niggle.)

Writing style

Joe’s writing style is very informal – or at least, while most of the text is in “normal” formal prose, there are plenty of informal pieces of writing there too. As readers of my book will know, I’m much the same – I try to keep things from getting too dry, despite that being the natural state for technical teaching. I have no idea how well I succeed for most readers, but Joe certainly manages. He occasionally takes it a little too far for my personal taste, usually around listing outputs. They’re often introduced as if Joe didn’t really know what the output would be, with a kind of “wow, it worked, who’d have thought?” comment afterwards. I suspect I’ve got some of this in my book too, but Joe lays it on a little too thickly for my liking. I don’t know whether it would be fairer to present a “medium-level” example of this rather than one which really grated, but this is the one (from page 257) that made such an impression that I remembered it over 300 pages later:

This should output the language attribute. Let’s see:


language="English"

Groovy! I have never actually written the word groovy before. I had to let the spelling checker spell it for me.

Now, I really want to stress that that’s a “worst case” rather than the average case, and indeed many listings don’t have anything “cutesy” about them. I just wanted to give an example of the kind of thing that didn’t work for me.

[Joe] Let me see if I get this straight. So you are saying you got to learn something about LINQ and how to spell groovy, and it stuck for over 300 pages and you are upset? Man, you know how to spell groovy now, what’s the problem? 8-D Would it annoy you less if I told you that is a reference to Austin Powers? My book is riddled with references to movies and TV shows, and that one is for Austin Powers. Maybe you didn’t catch that, or maybe you don’t like Austin Powers, or maybe you just still don’t like it. One reader was irritated when I said “Dude, Sweet” because he didn’t recognize that as a reference to Dude, Where’s My Car. I have references to Office Space, Arrested Development, Bottle Rocket, Seinfeld, The Matrix, Wargames, Tron, etc. In fact, on page 455, I actually use the word “moo” instead of “moot” in reference to Friends. My copy editor actually corrected that for me, but once I explained it, she let me have it back. So if you see something goofy, like “groovy” just know it is a reference to something and begin your investigation in your spare time. And if you see an error, it is intentional to make sure you are paying attention. ;-) As you have already pointed out, technical writing can be dry. I made an effort to inject humor into the book in the form of references to pop culture, most specifically movies and television. Sometimes the reference is in a comment like “groovy”, and sometimes it’s in the sample data like a character’s name. Like any comedian, every joke or reference can’t be a hit with everyone. I will say though that I have heard more from those that recognized the references and appreciated them (which helps carry a reader through the less interesting parts) than I have from those that found them annoying.

What really did work was including hints and tips which explicitly said where Joe had received unexpected results with slightly different code. If anything is unexpected to the author, it may well be unexpected to readers too, so I really appreciated reading that sort of thing. (It would be wearing if Joe were stupid and expected all kinds of silly results, but that’s not the case at all.)

Conclusion

Pro LINQ is a good book. It has enough niggles to keep me from using superlatives about it, but it’s good nonetheless. It’s Joe’s first book (just like C# in Depth is the first one I can truly call “mine”) and I hope he writes more. Having read it from cover to cover, I think it’ll be more useful as a reference for individual methods (when MSDN doesn’t quite cut it) than to reread whole chapters, but that’s not a problem. My slight complaints above certainly don’t stop it from being a book I’m pleased to own.

[Joe] I’ll take it as a compliment that you think my book would be useful for those times that MSDN isn’t good enough!

This is the first LINQ book I’ve reviewed – I already have LINQ in Action, which is also on the list to review at some point. (I’ve read large chunks of it in soft copy, but I haven’t been through the finished hard copy yet.) It will be interesting to see how the two compare. Next up will probably be “Programming C# 3.0” by Jesse Liberty, however.

16 thoughts on “Book review: Pro LINQ – Language Integrated Query in C# 2008, by Joe Rattz”

  1. The onclick handler got stripped by the weblog software, it seems.

    (Hello, by the way, from an old friend from W2W. I’ve enjoyed reading your C# stuff for some time now.)

  2. the link below should toggle the visibility of the comments.
    It’s relatively simple stuff, but my CSS and JavaScript are not what
    they might be – so let me know if it all goes pear-shaped

    It seems to have gone pear-shaped. :)

    On Safari (Mac), Opera (Mac), and IE8 (Windows), I was unable to get the comments to be shown by clicking the “Show/hide comments” link.

    On a related note, the comments _do_ show up in the RSS feed, but of course they are unformatted and so very difficult to distinguish from the main article. I’m a fan of summaries in RSS, at least in this sort of context where the article itself may be very long, and in this particular example using a summary would force readers to view the article as it was originally intended (i.e. without comments first, then going back and seeing the “director’s commentary” :) ).

    Of course, I suppose that while the implementation is in fact pear-shaped, the RSS feed makes for a reasonable work-around. :)

  3. By the way, just looking at the HTML source, it seems that the problem is as basic as the attached script simply being incomplete. It reads as follows:

    function hideOrShowComments(link)
    {
    var divs = link.parentNode.parentNode.getElementsByTagName("div");
    for (i=0; i

    Note that it just stops. Probably there should have been a < where you have an angle-bracket. :)

  4. @peted: Yup, looks like it was stripping stuff. (Previously it was stripping input tags.) Oh, and it also changes double quotes into ampersand quot semi-colon. Grr.

    It doesn’t help that of course the HTML I’m uploading works fine locally. It also doesn’t help that I’ve currently only got access to mobile broadband with a poor signal, and I’ve had guests round this evening… Oh, and whenever I post to the blog it spews out a cached copy for a while. Basically it’s a pain in the neck fixing problems like this :(

    @Peter: Originally Joe had written “[Joe]” at the start of each of his comments. Keeping that in might help the RSS feed to be more useful… I’ll reinstate it now. And yes, I’ll get round to CLR via C# some time – I have it now. I didn’t realise it’s just the second edition of another title; I feel less guilty about not having read it before (as I’d read the first edition).

    I’ve got another couple of Apress books on their way too, and Jesse Liberty’s “Programming C# 3.0”.

    @Everyone: Bear with me while I try to get all this fixed. If I can’t sort it out, I’ll just make the comments show up the whole time, at least for now.

    Jon

  5. Ironically, I see that in my comment, the character entities have themselves not been quoted and so show up as the actual characters, not the XML entity code.

    So where I wrote “there should have been a < where you have an angle-bracket”, it should have actually said “there should have been a &lt; where you have an angle-bracket”. There was a similar alteration for the double-quotes, which I posted as &quot; but which actually got displayed as ".

    (I made an attempt to use an explicit character entity for the ampersand in the previous paragraph, but it remains to be seen whether that gets translated somehow too :) ).

    As far as work-around goes until you can get the comment-hiding working right, you might consider placing all of the comments at the end of the post, with links to them via page-relative anchors. A little less elegant than in-line, but not terribly so and perhaps closer to your original intent than just having them shown all the time.

  6. Okay, I’m giving up for the night. I’ve got close to getting it working by putting the script in an external file – but then I need to hook up the click event as well. I’ll have another go tomorrow. The roundtrip time for testing changes is just too painful at the moment.

    Jon

  7. The only thing that displays in MS Outlook 2007 RSS Feed reader is “Overture”. You have to take the option to see the full article in a browser to read further.

    Mike

  8. Really enjoyed that review Jon! I would certainly like to read more reviews written in that style.

    I would also like to +1 Peter’s comment about reviewing CLR via C#, even though the book is now quite old.

  9. @Mike: Not a lot I can do about that, I’m afraid. As we’ve seen with the comments debacle, I can’t control a lot with this blog :(

    @Paul: Glad you enjoyed it. Looks like the “interleaved” comments aren’t too much of a problem. I don’t expect all authors will give full feedback, of course – but when they do (and if they don’t mind), I’ll post it in a similar way.

    Next stop will almost certainly be “Programming C# 3.0” but I’ll try to do “CLR via C#” after that.

    Jon

  10. Looks like Google Reader only shows “Overture” too. I’ve really no idea why. I’ll get rid of all trace of the JavaScript etc and see if that helps.

  11. “[Joe] My preference would be to use Cast in development and debug built code…but to use OfType in production code.”

    Hiding errors in production is just as bad an idea as hiding them in development. If you expect all of the members of an enumeration to be of a particular type, and one or more members are actually of another type, that means reality is not lining up with your expectations, and there is probably a deeper bug somewhere. Hiding the problem doesn’t make it go away, it just makes it harder to find when you need to find it. Fail fast!

  12. @David Nelson:

    Well if you used Cast in development and testing, and the error wasn’t caught by the developer or QA, you missed your chance to fail fast. The next question becomes, what do you want the user experience to be? Sometimes users enter unanticipated input that programmers just don’t think to catch and bad things happen.

    So if you have been editing in Visual Studio for a couple hours without saving (yeah, I know, who would do that?) and some error managed to slip through QA at Microsoft, you would prefer Visual Studio to just throw a fatal exception, shutdown the process, and lose all your work rather than just ignoring the problem?

  13. I’m with David on this: throwing an exception is the right way to go. Throwing an exception is likely to make the user see an error message (hopefully a nice one, etc). That’s not great, but it’s *much* better than the system just proceeding as if nothing had gone wrong.

    Note that throwing an exception doesn’t have to (and shouldn’t) mean shutting down the process without any chance to save things. However, I would far rather that than it *silently* losing data.

    When something is seriously wrong (as it would have to be in this case), the whole operation should be aborted as quickly as possible rather than silently pretending to succeed. Using OfType here could hide the bug for months, leading to customer support calls which would quite possibly have no indications of problems in the logs, and be very hard to track down. An exception is brutal, but when the system is so badly hosed that you’ve got unexpected and incorrect data in it, proceeding as if nothing had happened is simply wrong IMO.

    Jon

  14. @Joe,

    In that particular case, that’s what autosave is for :) There are two problems with the approach you are suggesting. First, hiding problems that seem innocuous can lead to further data corruption. For example, what if one of those instances in your enumeration, actually contained data that needed to be processed, but you ignored it because it was a different type? Now you have lost data, and no one even knows about it! Throwing an exception might also lose data (if the data being processed is not already stored persistently), but at least the user knows that a problem occurred and can take steps to correct it, outside of the context of your application if necessary.

    Second, by hiding the problem where it occurred, you make it harder to determine whether there is actually a problem, and harder to track down the problem once someone figures out that a problem is actually occurring. Your assertion that once your code is in production “you missed your chance to fail fast” is incorrect. The point of failing fast is to make sure that the symptom occurs as close to the cause as possible, so that you can backtrack from the symptom to the cause as quickly and easily as possible. This is true whether the problem is occurring in development or in production. In fact it is even more important to fail fast in production, since getting the system back in proper working order as soon as possible is essential to customer relations (even if your customer is your boss). Having an application crash (with a friendly error message) may not be ideal, but if it helps correct the problem and get the system back in working order more quickly, it is worth it.
