Query expression syntax: continuations

In this Stack Overflow question, I used a query continuation from a select clause, and one commenter expressed surprise, being unaware of what "select … into" meant. He asked for any references beyond the MSDN "into" page, and I didn’t know of any specific ones. So, here’s a very quick guide.

When "into" is used after either a "group x by y" or "select x" clause, it’s called a query continuation. (Note that "join … into" clauses are not query continuations; they’re very different.) A query continuation effectively says, "I’ve finished one query, and I want to do another one with the results… but all in one expression." This query:

var query = from x in y
            // other query clauses here
            select x.SomeProperty into z
            // other query clauses here (involving z)
            select z.Result;

Has *exactly* the same behaviour as this (leaving aside the visible local variable):

var tmp = from x in y
          // other query clauses here
          select x.SomeProperty;

var query = from z in tmp
            // other query clauses here (involving z)
            select z.Result;

Indeed, the C# language specification defines query continuations by a translation much like that: "from … into z …" is rewritten as "from z in (from …) …". Note that the query continuation starts a clean slate in terms of range variables – after the "into z" part, x is not in scope.
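As a concrete illustration of that equivalence, here is a minimal sketch. The word list and the filters are hypothetical, chosen only to make the result easy to check, and the top-level statements assume C# 9 or later.

```csharp
using System;
using System.Linq;

// Hypothetical sample data; any IEnumerable<string> would do.
string[] words = { "apple", "kiwi", "banana", "fig", "cherry" };

// Form 1: a single expression using a query continuation.
var continued = from word in words
                where word.Length > 3
                select word.Length into length   // "word" is out of scope from here on
                where length % 2 == 0
                select length * 10;

// Form 2: two statements with an explicit intermediate variable.
var tmp = from word in words
          where word.Length > 3
          select word.Length;

var split = from length in tmp
            where length % 2 == 0
            select length * 10;

Console.WriteLine(string.Join(", ", continued));    // prints "40, 60, 60"
Console.WriteLine(continued.SequenceEqual(split));  // prints "True"
```

Referring to word after the "into length" part of the first query would be a compile-time error, which is the "clean slate" behaviour described above.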

Personally I usually split a query up into two statements with a local variable (i.e. the second form) rather than using a "select … into" query continuation, but it’s useful to know about them. I find I use "group … into" much more often, and I suspect others do too – it’s relatively rare to see a "group by" clause ending a LINQ query on its own.
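For comparison, a "group … into" continuation might look like the sketch below. The data and the "more than one item" filter are assumptions made purely for illustration, and the top-level statements again assume C# 9 or later.

```csharp
using System;
using System.Linq;

// Hypothetical sample data.
string[] words = { "apple", "avocado", "banana", "blueberry", "cherry" };

var counts = from word in words
             group word by word[0] into g   // g is an IGrouping<char, string>; "word" is no longer in scope
             where g.Count() > 1            // keep only initial letters that occur more than once
             orderby g.Key
             select $"{g.Key}: {g.Count()}";

foreach (var line in counts)
{
    Console.WriteLine(line);   // prints "a: 2" then "b: 2"
}
```

Without the continuation, the query would have to end at the "group … by" clause, and the filtering and ordering would need a second query over the groups.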

12 thoughts on “Query expression syntax: continuations”

  1. @Felipe: The C# compiler applies the same transformations to query expressions whatever the data type is.

    As for “let” – that introduces an *extra* range variable into the query, whereas “select into” effectively wipes the slate clean – the only range variable in scope directly after a “select into” clause is the one introduced by it.

  2. This was a very surprising post to me, but for an unexpected reason. It’s one of those rare cases where VB is better than C#.

    In VB, the equivalent to “select V into X” is the much more natural “select X = V”.

    The VB style is more natural because it is consistent with the “let X = V” query and with the general look of assignments in the language.

  3. @Strilanc: Are you sure that’s *actually* the equivalent? Note that this isn’t just a normal select clause… what would a query expression which continued after the select in VB look like? Would it definitely *not* have access to the previous range variables?

  4. Interesting, I didn’t know that, though I almost always use the extension-method syntax for LINQ (i.e. seq.Where(x => x < …).Select(x => x * 2)).

    Is there any real advantage in using the query-like syntax?

  5. @ShdNx: I use dot notation for simple queries… but when they get more complex, it becomes unwieldy. In particular, when you start using features of query expressions which introduce transparent identifiers and/or which use multiple delegates, it’s painful. It’s worth being familiar with both, and using each one at the appropriate time.

  6. @skeet Yes, it is equivalent. Anything after select X = V does not have access to range variables preceding it.

    Another difference between VB and C# queries is the omittable select clause. I like to abuse it for reading input into anonymous types:

    Dim input = (From index In Range(count)
                 Let x = br.ReadSingle()
                 Let y = br.ReadSingle()
                 Let n = br.ReadInt32()
                 ).ToArray()
    Dim lastN = input.Last().n

  7. @Strilanc: How odd (the first part) – it looks just like a *normal* select constructing an anonymous type with a property X, unless I’m missing something. (I probably am, I have little experience with VB query expressions.)

    But when it comes to the ability to omit the select clause, I’m absolutely with you… it just shouldn’t be necessary. I wouldn’t be surprised to see that change at some point.

  8. I’m not sure what you mean by a normal select. In C# the Select X = Y style is a compile-time error, so how can it be normal?

    To be precise, the following VB query:
    From x In {1, 2, 3}
    Select y = x + 1
    Select z = y + 1

    is converted into (reflector C#):
    new int[] { 1, 2, 3 }.Select(
        new Func<int, int>(Class1._Lambda$__1) // _Lambda$__1 is return x+1
    ).Select(
        new Func<int, int>(Class1._Lambda$__2) // _Lambda$__2 is return y+1
    );

    compared to the C# query:
    from x in new int[] { 1, 2, 3 }
    select x + 1 into y
    select y + 1 into z
    select z;

    which is converted into the slightly different but equivalent (reflector C#):
    new int[] { 1, 2, 3 }.Select(delegate (int x) {
        return (x + 1);
    }).Select(delegate (int y) {
        return (y + 1);
    }).Select(delegate (int z) {
        return z;
    });

  9. @Strilanc: I mean it looks like a normal VB select clause, not a normal C# one. I think I need to look at VB queries in more detail some time :)

    What decides whether VB is going to create an anonymous type with a single property, or just use it as a sort of range variable?

  10. As far as I know, VB queries with just one named range variable will always simplify to a ‘normal’ sequence (essentially discarding the name).

    The reason it works that way is the implicit select. It would be inconvenient if “from x in {1,2,3} where IsPrime(x)” had type IEnumerable<anonymous type with a single property x> instead of IEnumerable<int>.

    That being said, I consider it a minor bug that, when the last expression is a named select, the compiler throws away the name. In fact, I’m going to report it right now.
