Category Archives: Books

Getting started with F#

Updated 13th November: Robert Pickering has answered the questions. I’ve included his answers inline. Thanks Robert!

The talks I went to at TechEd today (Friday) were mostly related to concurrency and F#. I’ve decided that although it’s fairly unlikely I’ll use it in a serious manner, it’s worth learning F# mostly to become more comfortable with functional programming in general. Apart from anything else, this is likely to help in terms of LINQ and general concurrency.

I was fortunate enough to win a copy of Robert Pickering’s “Foundations of F#” today at the final talk, answering a “pop quiz” question set by Joe Duffy. (Whoever claims that there’s no point in a C# developer knowing details of the CLR memory model is thus proven wrong in a most bizarre way.) I’ll probably buy Don Syme’s “Expert F#” when it comes out. I’ve been reading the Foundations book on the plane, where I’m currently hunched over my laptop, desperately hoping it doesn’t run out of power before we land. My situation has one important impact on this post: I haven’t actually written any F# yet. As it happens, with the VS2008 release being pretty imminent, I’ll probably wait until that’s out before installing F#.

In a way, my complete inexperience with the language is a good thing, for the sake of what I’m writing here: these are my raw impressions of F# based solely on what I’ve read. Now, some of the points I’m going to make here are potentially about the book instead of F# – I want to make it perfectly clear that I’m not trying to criticize the book. In general, it’s been an easy read so far (I’ve read about 70 pages) and my main problem with it is that there are typos. Normally that’s fine, but there are some cases where I don’t know whether I’m missing something about F# or whether there’s just a typo. Errors like this are to be expected, however hard we try to avoid them. I know that I’m still finding typos in chapters of C# in Depth that I’ve read several times, and I’m sure there will be some in the finished product. Anyway, with that disclaimer, along with the reiteration that these are just first impressions, here are my thoughts so far. They’re numbered for the sake of ease of reference, and as I find out answers to any questions, I’ll include them at the end of the point in italics.

  1. Byte literal strings – what encoding is used? This feels like a bad idea, although I’m sure it’s useful sometimes. (At least the normal strings are plain .NET strings; as I understand it IronRuby uses non-Unicode strings internally by default and only converts to/from Unicode when calling .NET code. The things we do in the name of compatibility…)

    Robert: It doesn’t say what encoding is used in the language specification, however I strongly suspect it’s UTF-8 as the spec says that unicode characters are allowed but “A”B comes out as [|65uy|]. I think they were added to help out OCaml users who are used to having mutable strings, thus providing an easier migration for OCaml programs which take advantage of mutable strings (although having non-mutable strings generally feels like a good design choice for an FP language).

  2. #light is used at the start of every listing, and Robert explains that he’s not going to explain the non-light syntax. I know this is only a single book, but isn’t that at least a good indication that perhaps it should be the default syntax style?

    Robert: The light syntax really isn’t that different from the non-light syntax, it just means you can miss out certain tokens such as “in” and semi-colon (;) when the context is clear from the whitespacing. This small change does however make listings look a lot neater and forces you to make whitespacing reflect the structure of the program (which is a very good thing). The choice not to explain it was made for two reasons: 1) I thought explaining it would be more confusing than just telling beginners not to worry about it; 2) it provides beginners very clear guidance on what syntax they should be using. Should the F# compiler be light by default? I can see the advantages, but it adds a little extra complexity for people trying to do F#/OCaml cross compilation.

  3. List comprehensions: why do I need [] round the range when declaring a value, but not for iteration? I can do “for i in 1..5” but not “let x = 1..5”. Isn’t this inconsistent?

    Robert: This is because the surrounding bracket types denote the type of collection: [] is list, [||] is array and {} is Seq/IEnumerable. This is directly copied from creating a literal list of these types. Also you can’t just say “for i in 1..5”; you need to say “for i in 1..5 do …”, which you can also do in a list comprehension: [for i in 1..5 -> (i, i*i) ] for a list of tuples of squares.

  4. The “when” clause in a for loop – I’m sure it’s to look like OCaml, but coming at a time when the non-functional world is used to “where”, it’s unfortunate. I’m not saying that F# has made the wrong decision here – it’s just a pain when two different histories collide.

    Robert: No comment :)

  5. P28 has ‘let objList = [box 1; box 2.0; box “three”]’. The box operator here isn’t fully explained as far as I can see – but the main interesting idea is that a string could be boxed. In “classic” .NET boxing refers to creating a reference type instance out of a value type value – but string is already a reference type. What’s going on here?

    Robert: box is not just for boxing structs, it also performs an upcast to type object. A result of F#’s type inference is that there are no implicit upcasts as there are in C#. F# provides the box keyword and :> operator to compensate for this.

    Jon: That seems pretty unfortunate to me, when box already has a clearly defined meaning in .NET. I suspect this will confuse quite a few people.

  6. P37 looks like it’s using a C-style pre-decrement: ‘| x -> luc (x - 1) + luc (--x - 2)’ – is that --x just a typo for x?

    Robert: --x is just a strange and embarrassing typo (already listed in the errata)

  7. Another possible typo: P41, first listing: the final line is ‘| [] -> ()’ where in the previous example the result had been [] instead of (). The [] makes more sense to me, as otherwise the function returns unit in one case where elsewhere it’s returning a list. Typo, or am I missing something?

    Robert: This is correct, in the other rules of this pattern matching we’re printing to the console, which has type unit as a result, so we need to use unit for the final rule as well.

    Jon: Oops. Doh!

  8. Union/record types: as a C# guy, my natural question is how these things look in IL. Is a union just a name and a value, and if so is it in a struct or a class? Will have to dig out reflector when I’ve got the compiler…

    Robert: Record types are classes with properties for each of the fields. The union type is represented as a class with inner classes to represent each of the cases. The outer class provides methods and properties for working with each of the cases from C#. There is a good description of what a union type looks like in the final chapter of the book.

  9. There are quotes in the output at the bottom of P48, from calling print_any: ‘”one”, “two”, “three”, “four”‘ – are these genuinely in the output? I haven’t found a specific description of print_any yet, but I don’t think we’ve seen quotes round all strings. I could be wrong though :)

    Robert: print_any tries to reconstruct its input into its literal equivalent, so strings come out quoted and lists look like [“one”; “two”; “three”].

    Jon: Ah, handy. A bit like Groovy’s inspect() method.

  10. Another terminology clash: “raise” for exceptions rather than “throw” is likely to confuse me for a while; “raise” sounds like an event to my C#-tuned ears.

    Robert: Again, no comment

  11. Supposedly OCaml is really efficient when it comes to exceptions. Now, I usually find that when people talk about exceptions being expensive in .NET they’re basing that on experience under the debugger. Are OCaml exceptions really significantly faster than under .NET? It’s far from impossible, but I’d be interested to know.

    Robert: I believe OCaml exceptions are a lot faster than .NET exceptions, but then they don’t carry any of the debugging information that .NET exceptions do. It is reasonably common to use exceptions as flow control in OCaml, which is a bit of a no-no in .NET.

  12. Why is “lazy” a keyword but “force” isn’t? It feels odd for part of a feature to be in the language but its other side (which is necessary, as far as I can see) being in the library instead.

    Robert: I believe lazy is a keyword because it helps tidy up the precedence when you are creating lazy values. It maps directly to a function in the F# libraries that can be used instead. I believe this is a design decision inherited from OCaml.

  13. “unit” is an interesting name for what I’d normally think of as “void” – I wonder what the history is here? “void” isn’t as descriptive as it might be, but “unit” is even less obvious…

    Robert: Wikipedia has some more info here: http://en.wikipedia.org/wiki/Unit_type but it doesn’t cover the history of the name.

  14. It took me a little while to get the hang of () really being like void instead of like null – so () as a function return is the equivalent of “return;” which requires a void return type – it’s not “return null;”. Nothing in the book suggests that it is null – that was my own misunderstanding, and I don’t know what caused it.

    Robert: No comment :)

  15. P59 talks about mutable record types, and states that “this operation [updating a record] changes the contents of the record’s field rather than changing the record itself”. Now, I suspect the difference being alluded to is the same as “changing a value within a reference type instance’s data is not the same as making a variable refer to a different instance” – but if it’s not, I don’t know what it is meant to be saying.

    Robert: Yes, you are changing the value within the reference.

  16. P67/68: Do while and for loops require a “done” terminator or not? The text claims they do, but the examples don’t include it. My guess is that the language specification changed while the book was being written, and that the examples are correct, but I’d appreciate clarification :)

    Robert: They only require done when there’s no #light declaration, and yes, #light declarations were added halfway through writing the book.

  17. P69: I assume “for x in y” works where y is any sequence (i.e. anything implementing IEnumerable)?

    Robert: Yes, it works for anything that implements IEnumerable<T> or IEnumerable.

  18. P70: Calling methods and specifying parameter names – in the example, the parameter names are actually in the same order as they’re declared in the method itself. Is this always the case? Can you reorder the use of the parameters? Note that (at least in C#) reordering could have an effect on what the eventual arguments were, if the evaluation of the arguments has side effects. I’d really like to see this feature in C# though – it would save on all those comments explaining which method parameter means what!

    Robert: Just tested:
    System.Console.WriteLine(arg = [| box 1 |], format = "{0}")
    And it compiles okay, so named arguments can be reordered.

    Jon: Ooh, I’ve got feature envy :)

  19. Why can’t I use a .NET method as a value, just as I can use an F# function value? If the method were overloaded I might have to provide some type information to the compiler to tell it which overload I mean, but otherwise I should be able to use it just like anything else without wrapping it. I assume there’s a deep implementation reason why this is impossible, but it seems a bit of a shame.

    Robert: Actually you can now, it’s another case of the language spec changing while [the book was being written].

    Jon: Cool. Always good to hear news like that.

  20. P73: Indexers aren’t always called Item – that’s just the default name for the default indexer. You can have other named ones, although C# will only use whichever has the appropriate attribute applied. (Even in C# you can change the name emitted when you declare an indexer though.)

    Robert: I’ve just read section 10.9 of the C# 3.0 specification and I see no reference to attributes or the ability to change the indexer name. Also section 10.3.9.3 “Member names reserved for indexers” seems to suggest that Item is the only reserved name. What am I missing?

    Jon: The attribute in question is System.Reflection.DefaultMemberAttribute – but it appears that you can’t apply it in C# when there’s already an indexer, contrary to my previous belief. However, if you use IL which specifies a DefaultMemberAttribute other than Item, it works fine.

  21. Also on P73, there’s this line: ‘let temp = new ResizeArray<string>() in’ – is the ‘in’ part here a typo, or is it another bit of syntax that I’ve missed somewhere?

    Robert: “in” is optional here because of the #light syntax

  22. What’s the :> operator shown on P77?

    Update: P82 explains that it’s the upcasting operator

  23. What is ‘try … match’ referred to on P77?

    Robert: try … match is the equivalent of try … catch.

  24. Is box actually a keyword or not? It’s not listed in the list of keywords, but it looks like it should be…

    Robert: box is a function, not a keyword; it’s defined in Microsoft.FSharp.Core.Operators if you want to see its definition (prim-types.fs).

I don’t intend to make notes as I read the rest of the book – it takes too long. However, I hope any F# evangelists find these initial reactions useful to some extent.

I love LINQ: Simplifying a tedious task

As mentioned in my previous post, I’ve been putting together the code samples for C# in Depth. Now, these are spread across several projects in a few solutions. They’re referred to in the book as things like “Listing 6.2” but I’ve given the files “real” names in the projects. When you run any project with multiple executable listings in it, the project offers you options of which listing to run, showing both the type name and the listing, which is embedded using System.ComponentModel.DescriptionAttribute, e.g. [Description("Listing 12.4")]. A few listings have a description which is more than just “Listing x.y” – for instance, “Listing 6.1/6.2/6.3”. These are displayed with no problems.

Now, the problem for the reader would be finding a particular listing – it’s not always obvious from the code what the type name should be, particularly when there are many variations on a theme. Clearly some sort of map is required. Ideally it should be a file looking something like this:

Chapter 1:
1.1: Foo.cs
1.2: Bar.cs
1.3/1.4: Baz.cs

Chapter 2:
2.1: Gronkle.cs

It’s easy enough to work out the directory for any particular file – the projects are helpfully named “Chapter3” and the like. So, the next thing is to create this file. I really didn’t want to do that by hand. After all, there are about 150 listings in the book – and I’ve already done the work of attributing them all. Ah… we could do it programmatically. Sounds like a bit of a slog…

… but it’s a problem which is ideally suited to LINQ. It’s also fairly ideally suited to regular expressions, much as I hate to admit it. The regular expression in question is reasonably complex, but thanks to Jesse Houwing’s advice on adding comments to regular expressions, the results aren’t too bad. Here’s the finished code – which of course is part of the downloadable source code itself.

using System;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

namespace Chapter11.Queries
{
    /// <summary>
    /// The listings are scattered around .cs files within various directories.
    /// This class uses LINQ to find all classes with a suitable Description
    /// attribute, groups them by chapters and orders them by chapter and listing number.
    /// </summary>
    class DisplayListingsMap
    {
        static readonly Regex ListingPattern = new Regex(
            @"# First match the start of the attribute, up to the bit we're interested in
            \[Description\(""Listing\ 
            # The 'text' group is the whole of the description after Listing
            (?<text>
            # The 'chapter' group is the first set of digits in the description, before a dot
            (?<chapter>\d+)\.
            # The 'listing' group is the second set of digits in the description
            (?<listing>\d+)
            # After that we don't care - stop the 'text' group at the double quote
            [^""]*)
            # Now match the end of the attribute
            ""\)\]",
            RegexOptions.Compiled | RegexOptions.IgnorePatternWhitespace);

        static void Main()
        {
            DirectoryInfo directory = new DirectoryInfo(@"..\..\..\..");

            var query = from file in directory.GetFiles("*.cs", SearchOption.AllDirectories)
                        let match = ListingPattern.Match(File.ReadAllText(file.FullName))
                        where match.Success
                        let Details = new
                        {
                            Text = match.Groups["text"].Value,
                            Chapter = int.Parse(match.Groups["chapter"].Value),
                            Listing = int.Parse(match.Groups["listing"].Value)
                        }
                        orderby Details.Chapter, Details.Listing
                        group new { File = file, Description=Details.Text } by Details.Chapter;

            foreach (var chapter in query)
            {
                Console.WriteLine("Chapter {0}", chapter.Key);
                foreach (var listing in chapter)
                {
                    Console.WriteLine("{0}: {1}", listing.Description, listing.File.Name);
                }
                Console.WriteLine();                
            }
        }
    }
}

Isn’t it cool? The regex works out the listing number (first x.y part only) and sorts on that, grouping by chapter – then we just display the results. There are other ways of skinning the same cat – such as grouping and then ordering “inside” and “outside” a chapter separately – but they’ll all boil down to the same sort of thing.
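For comparison, here is a minimal sketch of the “group first, then order” alternative mentioned above. It isn’t part of the downloadable source: ListingEntry, GroupThenOrder and BuildMap are hypothetical names, and the sample data is made up (standing in for the regex matches) so that the example runs without any files on disk.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the details extracted from one source file.
public sealed class ListingEntry
{
    public int Chapter { get; set; }
    public int Listing { get; set; }
    public string Text { get; set; }      // e.g. "1.3/1.4"
    public string FileName { get; set; }
}

public static class GroupThenOrder
{
    // Group first, then order the chapters ("outside") and the
    // listings within each chapter ("inside") as separate steps.
    public static List<string> BuildMap(IEnumerable<ListingEntry> entries)
    {
        var chapters = from entry in entries
                       group entry by entry.Chapter into chapter
                       orderby chapter.Key
                       select new
                       {
                           Number = chapter.Key,
                           Listings = chapter.OrderBy(e => e.Listing)
                       };

        var lines = new List<string>();
        foreach (var chapter in chapters)
        {
            lines.Add("Chapter " + chapter.Number);
            foreach (var listing in chapter.Listings)
            {
                lines.Add(listing.Text + ": " + listing.FileName);
            }
            lines.Add("");
        }
        return lines;
    }

    public static void Main()
    {
        // Made-up sample data, deliberately out of order.
        var sample = new[]
        {
            new ListingEntry { Chapter = 2, Listing = 1, Text = "2.1", FileName = "Gronkle.cs" },
            new ListingEntry { Chapter = 1, Listing = 3, Text = "1.3/1.4", FileName = "Baz.cs" },
            new ListingEntry { Chapter = 1, Listing = 1, Text = "1.1", FileName = "Foo.cs" },
            new ListingEntry { Chapter = 1, Listing = 2, Text = "1.2", FileName = "Bar.cs" }
        };
        foreach (string line in BuildMap(sample))
        {
            Console.WriteLine(line);
        }
    }
}
```

The ordering is spelled out twice here, which is a little more verbose than the single orderby in the original query, but it makes the “inside” and “outside” ordering independent of each other.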

Book news

The book is coming along well, and here are a few snippets which may be of interest:

  • It’s now on Amazon
  • All the chapters and the appendix have been written and given a first set of edits
  • We’re going to “final review” stage soon – that doesn’t mean the text is being finalized just yet, but it means this is probably the last round of peer review
  • I’ve been putting together the downloadable source code (see next post for some fun)
  • I’m hoping that the next couple of chapters will turn up in MEAP soon
  • Daniel Moth very kindly let me plug it at the recent UK MVP Open Day
  • There’s another plug on the flyers I’ll be giving out promoting Iterative Training at TechEd Developer in Barcelona next week
  • Manning are doing a 25% discount when you buy LINQ in Action and C# in Depth together

The last point is particularly cool – it’s something I’ve been suggesting for a while, as the books complement each other very nicely.

C# in Depth: Chapters 6 and 7 now in MEAP

Chapters 6 and 7 have now been included in the Manning Early Access Program. That means that the whole of the C# 2 part of the book is now available. Marc Gravell has been picking holes in it on the forum (and I mean that in a very positive way – it’s great to have more eyes running over it). Can you find more errors? Here’s a rundown of chapters 6 and 7:

 

Chapter 6: Implementing iterators the easy way

In C# 1, it was a pain to implement IEnumerable. C# 2 makes it easy with iterator blocks, and this can make it worthwhile introducing IEnumerable where you might not have done before. Aside from anything else, it’s fun just to watch the C# compiler build a state machine for you!
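As a small illustration of the kind of thing the chapter covers (a sketch written for this summary rather than a listing from the book, with IteratorSketch and Countdown as made-up names), an iterator block lets a single method with yield return stand in for a hand-written IEnumerable implementation:

```csharp
using System;
using System.Collections.Generic;

public static class IteratorSketch
{
    // The compiler turns this method into a state machine implementing
    // IEnumerable<int>; nothing in the body runs until iteration starts.
    public static IEnumerable<int> Countdown(int start)
    {
        for (int i = start; i > 0; i--)
        {
            yield return i;
        }
    }

    public static void Main()
    {
        foreach (int value in Countdown(3))
        {
            Console.WriteLine(value);
        }
    }
}
```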

 

Chapter 7: Concluding C# 2: the final features

Confession: this is really “C# 2: the features which didn’t fit anywhere else”. It’s a round-up of features which didn’t deserve their own chapters, and which could easily wait until the “big” features had been explored before being mentioned. Ironically, C# 3 is exactly the opposite – the “little features” in C# 3 are pretty key to understanding the big features such as lambda expressions and query expressions, which is why they’re in chapter 8. Chapter 7 covers the following areas:

  • Partial types (including partial methods from C# 3)
  • Static classes
  • Separate getter/setter property access (e.g. public getter, private setter)
  • Namespace aliases (using ::, the global namespace alias, and extern aliases)
  • Pragma directives
  • Fixed size buffers

LINQ to Silliness: Generating a Mandelbrot with parallel potential

I’ve been writing about LINQ recently, and in particular I’ve written a small amount about Parallel LINQ. (Don’t get excited – it’s only about a page, just to mention it as a sort of “meta-provider” for LINQ.) I was wondering what to use to demonstrate it – what general task can we perform which could take a lot of CPU?

Well, I used to be quite into fractals, and I’ve written Mandelbrot set generators in various languages. I hadn’t done it in C# before now, however. Calculating the colour of each pixel is completely independent of all the other pixels – it’s an “embarrassingly parallelizable” task. So, a great task for PLINQ. Here’s the “normal LINQ” code:

 

var query = from row in Enumerable.Range(0, ImageHeight)
            from col in Enumerable.Range(0, ImageWidth)
            select ComputeMandelbrotIndex(row, col);

byte[] data = query.ToArray();

Changing this into a parallel query is really simple – although we do need to preserve the ordering of the results:

var query = from row in Enumerable.Range(0, ImageHeight).AsParallel(QueryOptions.PreserveOrdering)
            from col in Enumerable.Range(0, ImageWidth)
            select ComputeMandelbrotIndex(row, col);

byte[] data = query.ToArray();

Without being able to actually use PLINQ yet, I can’t tell how awful the order preservation is – Joe warns that it’s costly, but we’ll see. This is on a pretty giant sequence of data, of course… An alternative would be to parallelize a row at a time, but that loses some of the purity of the solution. This is a very, very silly way of parallelizing the task, but it’s got a certain quirky appeal.

Of course, there’s then the code for ComputeMandelbrotIndex and displaying a bitmap from it – the full code is available for download (it’s a single C# file – just compile and run). Enjoy.
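For readers without the download to hand, here is a minimal sketch of what the per-pixel calculation might look like. It is not the downloadable code: the image size, iteration limit and region of the complex plane are all assumptions, and the parallel version uses AsParallel().AsOrdered() from the PLINQ that later shipped in .NET 4 rather than the preview API shown above.

```csharp
using System;
using System.Linq;

public static class MandelbrotSketch
{
    // Assumed values; the post doesn't give the real ones.
    public const int ImageWidth = 120;
    public const int ImageHeight = 80;
    public const int MaxIterations = 255;
    const double MinX = -2.0, MaxX = 1.0, MinY = -1.0, MaxY = 1.0;

    // Counts iterations of z = z*z + c until |z| exceeds 2
    // or the iteration limit is reached.
    public static byte ComputeMandelbrotIndex(int row, int col)
    {
        double x0 = MinX + (MaxX - MinX) * col / ImageWidth;
        double y0 = MinY + (MaxY - MinY) * row / ImageHeight;
        double x = 0, y = 0;
        int iterations = 0;
        while (x * x + y * y <= 4.0 && iterations < MaxIterations)
        {
            double nextX = x * x - y * y + x0;
            y = 2 * x * y + y0;
            x = nextX;
            iterations++;
        }
        return (byte)iterations;
    }

    public static byte[] RenderSequential()
    {
        var query = from row in Enumerable.Range(0, ImageHeight)
                    from col in Enumerable.Range(0, ImageWidth)
                    select ComputeMandelbrotIndex(row, col);
        return query.ToArray();
    }

    public static byte[] RenderParallel()
    {
        // AsOrdered asks PLINQ to keep results in source order,
        // so the pixels come back row by row.
        var query = from row in Enumerable.Range(0, ImageHeight).AsParallel().AsOrdered()
                    from col in Enumerable.Range(0, ImageWidth)
                    select ComputeMandelbrotIndex(row, col);
        return query.ToArray();
    }

    public static void Main()
    {
        byte[] sequential = RenderSequential();
        byte[] parallel = RenderParallel();
        Console.WriteLine(sequential.SequenceEqual(parallel));
    }
}
```

Because each pixel depends only on its own row and column, both versions should produce identical arrays, which is what Main checks.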

Update!

This blog post has been picked up by Nick Palladinos, who has written his own Parallel LINQ provider (much kudos for that – unfortunately for me the blog is in Greek, which I don’t understand). Apparently on a dual core processor the parallelised version of the Mandelbrot generator is indeed about twice as fast – it works! Unfortunately I can’t tell as my laptop only has a single core… it’s very exciting though :)

C# in Depth: Chapters 4 and 5 now available in MEAP

Chapters 4 and 5 of the book have now been made available for early access.

 

Chapter 4 – Saying nothing with nullable types

Nullable types depend heavily on generics (described in chapter 3) and require both language and runtime changes. In this chapter I explore the problem they solve, the types involved (including runtime changes) and the C# changes (int? meaning Nullable<int> and the various operators and conversions available). I also cover a couple of uses of nullable types which haven’t necessarily hit the mainstream, but can prove useful – the comparisons I wrote about in this blog a little while ago, and using nullable types as an alternative to out parameters for the TryXXX pattern.
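To give a flavour of the last idea, here is a minimal sketch (not a listing from the book; NullableParsing and TryParseInt32 are hypothetical names) of wrapping a TryXXX method so that it returns a nullable value instead of using an out parameter:

```csharp
using System;

public static class NullableParsing
{
    // Returns the parsed value, or null if the text isn't a valid integer.
    public static int? TryParseInt32(string text)
    {
        int value;
        return int.TryParse(text, out value) ? value : (int?)null;
    }

    public static void Main()
    {
        int? parsed = TryParseInt32("42");
        if (parsed != null)
        {
            Console.WriteLine("Parsed {0}", parsed.Value);
        }
        // The null coalescing operator gives a neat way to supply a default.
        Console.WriteLine(TryParseInt32("not a number") ?? -1);
    }
}
```

The caller gets a single return value to test against null, at the cost of not being able to use the method directly in an if condition the way the out-parameter version allows.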

 

Chapter 5 – Fast-tracked delegates

C# 3 relies on delegates a lot. You can’t do any real LINQ work without them. C# 2 laid a lot of the groundwork for the lambda expressions available in C# 3 when it introduced anonymous methods. There are other changes in C# 2 which improve delegates, however – and there are more methods in the .NET 2.0 framework which take advantage of delegates than there were in .NET 1.1. (I’m thinking particularly of List<T>.)

I cover all the improvements in this chapter, but most of the chapter is given over to anonymous methods, and the handling of captured variables in particular. Without wishing to sound like a spoilsport, the use of captured variables can look like magic. Captured variables are still just as useful when the magic is explained away, and they’re somewhat less scary!
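As a taste of the “magic”, here is a minimal sketch (again not a listing from the book; CapturedVariables and CreateCounter are made-up names, and Func<int> is used purely for brevity even though it arrived after .NET 2.0) showing a local variable that outlives the method which declared it:

```csharp
using System;

public static class CapturedVariables
{
    public static Func<int> CreateCounter()
    {
        int count = 0; // captured by the anonymous method below
        return delegate
        {
            count++;
            return count;
        };
    }

    public static void Main()
    {
        Func<int> first = CreateCounter();
        Func<int> second = CreateCounter();
        Console.WriteLine(first());  // 1
        Console.WriteLine(first());  // 2
        Console.WriteLine(second()); // 1
    }
}
```

Each call to CreateCounter gets its own count variable, so the two delegates count independently.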

 

More chapters to come soon, I expect – when chapters 6 and 7 are released, that will cover the whole of C# 2.

Announcing “C# in Depth”

Finally, I can properly talk about what I’ve been working on for about the last 6 months. The book I’ve been writing is called “C# in Depth” and it’s being published by Manning (just like Groovy in Action was). It’s about C# 2 and 3, and pretty much just C# 2 and 3. In particular, it’s aimed at people who already know C# 1 at least reasonably well. I believe there are plenty of people who are comfortable with C# 1 but either don’t know C# 2 at all or are familiar with it but have gaps in their experience. It’s for these people (hopefully many of whom are reading this blog!) that I’ve been writing.

I don’t know whether there are other books in progress in the same vein. I strongly suspect there will be several books which cover C# from scratch – and end up either skimping on detail, or being unliftable tomes. There may be some books which cover just C# 3 – which will make them less useful for developers who may not get to use C# 3 for a while, or don’t have enough C# 2 experience to fully appreciate C# 3.

Anyway, the first few chapters are now available in MEAP (Manning Early Access Program) at the C# in Depth website. The first chapter is available free, and you can get hold of the others by paying for either the e-book or the hard copy now. Obviously you won’t get the hard copy until it’s published, but you’ll get the electronic version as it gets updated, chapter by chapter. Here’s a quick rundown of the chapters which are available so far:

Chapter 1 – The changing face of C# development

This chapter is mostly introductory material, as you’d expect. There’s an outline and examples of some of the biggest features, a brief history of C# and .NET, and a little look at the “snippet” style of listings used in the rest of the book.

Chapter 2 – Core foundations: building on C# 1

Although the rest of the book is written about C# 2 and 3, I wanted to make sure that all the readers had a good understanding of three aspects of C# 1: delegates, value/reference types, and the nature of C#’s type system. While all of them are important in C# 1, they’re often misunderstood – particularly delegates, which aren’t used very often in C# 1 beyond event handling. Delegates are friendlier in C# 2, and understanding C# 3 is practically impossible without a good handle on them.

Chapter 3 – Parameterized typing with generics

Generics are the biggest feature of C# 2, and the biggest change in the .NET 2.0 CLR as well. In this chapter I look at why they’re needed, how to use existing generic types/methods as well as how to write your own. Some more advanced topics are covered such as thinking about what the runtime does with generic types, and I examine the generic collection types provided by .NET 2.0. Finally, I cover some of the limitations of generics in C#, including the lack of covariance/contravariance (which is one of the most frequently asked questions in the C# newsgroup).

 

So, I hope that’s whetted your appetite a bit. Obviously I’d be overjoyed if all of you lovely people bought a copy of the book, but even if you don’t want to part with any cash right now, I’d still appreciate comments. Rather than talking about the book here in my blog, it’s best to use the author forum which has been set up for that very purpose. If you want to keep things private, you can email me directly of course.

Having worked this hard on the book, I reserve the right to plug it on this blog every so often – but I promise not to turn the blog into just a stream of adverts :)