I was explaining data pipelines in LINQ to Objects to a colleague yesterday, partly as the next step after explaining iterator blocks, and partly because, well, I love talking about C# 3 and LINQ. (Really, it’s becoming a serious problem. Now that I’ve finished writing the main text of my book, my brain is properly processing the stuff I’ve written about. I hadn’t planned to be blogging as actively as I have been recently – it’s just a natural result of thinking about cool things to do with LINQ. It’s great fun, but in some ways I hope it stops soon, because I’m not getting as much sleep as I should.)
When I was describing how methods like
Select take an iterator and return another iterator which, when called, will perform appropriate transformations, something clicked. There’s a conceptual similarity between data pipelines using deferred execution, and higher order functions. There’s a big difference though: I can get my head round data pipelines quite easily, whereas I can only understand higher order functions for minutes at a time, and only if I’ve had a fair amount of coffee. I know I’m not the only one who finds this stuff tricky.
So yes, there’s a disconnect in difficulty level – but I believe that the more comfortable one is with data pipelines, the more feasible it is to think of higher order functions. Writing half the implementation of Push LINQ (thanks go to Marc Gravell for the other half) helped my “gut feel” understanding of LINQ itself significantly, and I believe they helped me cope with currying and the like as well.
I have a sneaking suspicion that if everyone who wanted to use LINQ had to write their own implementation of LINQ to Objects (or at least a single overload of each significant method) there’d be a much greater depth of understanding. This isn’t a matter of knowing the fiddly bits – it’s a case of grokking just why
OrderBy needs to buffer data, and what deferred execution really means.
(This is moving off the topic slightly, but I really don’t believe it’s inconceivable to suggest that “average” developers implement most of LINQ to Objects themselves. It sounds barmy, but the tricky bit with LINQ to Objects is the design, not the implementation. Suggesting that people implement LINQ to SQL would be a different matter, admittedly. One idea I’ve had for a talk, whether at an MVP open day or a Developer Developer Developer day is to see how much of LINQ to Objects I could implement in a single talk/lecture slot, explaining the code as I went along. I’d take unit tests but no pre-written implementation. I really think that with an experienced audience who didn’t need too much hand-holding around iterator blocks, extension methods etc to start with, we could do most of it. I’m not saying it would be efficient (thinking particularly of sorting in a stable fashion) but it would be feasible, and I think it would encourage people to think about it more themselves. Any thoughts from any of you? Oh, and yes, I did see the excellent video of Luke Hoban doing
Where. I’d try to cover grouping, joins etc as well.)
I’m hoping that the more experience I have with writing funky little sequence generators/processors, the more natural functional programming with higher order functions will feel. For those of you who are already comfortable with higher order functions, does this sound like a reasonable hope/expectation? Oh, and is the similarity between the two situations anything to do with monads? I suspect so, but I don’t have a firm enough grasp on monads to say for sure. It’s obviously to do with composition, either way.