In this Stack Overflow question, I used a query continuation from a select clause, and one commenter expressed surprise, being unaware of what "select … into" meant. He asked for any references beyond the MSDN "into" page, and I didn’t know of any specific ones. So, here’s a very quick guide.
When "into" is used after either a "group x by y" or "select x" clause, it’s called a query continuation. (Note that "join … into" clauses are not query continuations; they’re very different.) A query continuation effectively says, "I’ve finished one query, and I want to do another one with the results… but all in one expression." This query:
var query = // other query clauses here
select x.SomeProperty into z
// other query clauses here (involving z)
select z.Result;
Has *exactly* the same behaviour as this (leaving aside the visible local variable):
var tmp = // other query clauses here
select x.SomeProperty;
var query = from z in tmp
// other query clauses here (involving z)
select z.Result;
Indeed the specification is written in terms of a transformation a bit like that. Note that the query continuation starts a clean slate in terms of range variables – after the "into z" part, x is not in scope.
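To make that concrete, here is a small sketch of both forms side by side. The words array and the variable names are invented for illustration; only the shape of the two queries comes from the text above.

```csharp
using System;
using System.Linq;

// Hypothetical sample data, purely for illustration.
string[] words = { "apple", "kiwi", "banana", "fig" };

// Form 1: one query expression with a "select ... into" continuation.
// After "into length", the range variable "word" is no longer in scope.
var continued = from word in words
                where word.Length > 3
                select word.Length into length
                where length % 2 == 0
                select length * 10;

// Form 2: the same query split into two statements with a visible local.
var tmp = from word in words
          where word.Length > 3
          select word.Length;
var split = from length in tmp
            where length % 2 == 0
            select length * 10;

Console.WriteLine(string.Join(", ", continued)); // 40, 60
Console.WriteLine(string.Join(", ", split));     // 40, 60
```

Both queries yield the same sequence; the only difference is that the second form leaves tmp visible as a local variable.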
Personally I usually split a query up into two statements with a local variable (i.e. the second form) rather than using a "select … into" query continuation, but it’s useful to know about them. I find I use "group … into" much more often, and I suspect others do too – it’s relatively rare to see a "group by" clause ending a LINQ query on its own.
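As a rough sketch of the more common "group … into" continuation (again with made-up sample data and names), the grouping is given a name and the query carries on with it:

```csharp
using System;
using System.Linq;

// Hypothetical sample data, purely for illustration.
string[] words = { "apple", "avocado", "banana", "blueberry", "cherry" };

// "into g" names each group so the query can continue with it;
// "word" is out of scope after the continuation.
var counts = from word in words
             group word by word[0] into g
             orderby g.Key
             select new { Letter = g.Key, Count = g.Count() };

foreach (var entry in counts)
{
    Console.WriteLine($"{entry.Letter}: {entry.Count}");
}
// Prints:
// a: 2
// b: 2
// c: 1
```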
With LINQ to SQL, are the two forms the same too?
And what about the let keyword: what is the difference?
@Felipe: The C# compiler applies the same transformations to query expressions whatever the data type is.
As for “let” – that introduces an *extra* range variable into the query, whereas “select into” effectively wipes the slate clean – the only range variable in scope directly after a “select into” clause is the one introduced by it.
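A minimal sketch of that difference, using an invented words array: with "let" both range variables remain usable, whereas after "select … into" only the new one is.

```csharp
using System;
using System.Linq;

// Hypothetical sample data, purely for illustration.
string[] words = { "apple", "kiwi", "banana" };

// "let" adds a range variable: both word and length are in scope afterwards.
var withLet = from word in words
              let length = word.Length
              where length > 4
              select word + ":" + length;

// "select ... into" starts afresh: only length is in scope afterwards.
// Referring to word in the last two clauses would not compile.
var withInto = from word in words
               select word.Length into length
               where length > 4
               select length;

Console.WriteLine(string.Join(", ", withLet));  // apple:5, banana:6
Console.WriteLine(string.Join(", ", withInto)); // 5, 6
```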
This was a very surprising post to me, but for an unexpected reason. It’s one of those rare cases where VB is better than C#.
In VB, the equivalent to “select V into X” is the much more natural “select X = V”.
The VB style is more natural because it is consistent with the “let X = V” query and with the general look of assignments in the language.
@Strilanc: Are you sure that’s *actually* the equivalent? Note that this isn’t just a normal select clause… what would a query expression which continued after the select in VB look like? Would it definitely *not* have access to the previous range variables?
Interesting, I didn’t know that, though I almost always use the extension-method syntax for LINQ (i.e. seq.Where(x => x < …).Select(x => x * 2)).
Is there any real advantage in using the query-like syntax?
@ShdNx: I use dot notation for simple queries… but when they get more complex, it becomes unwieldy. In particular, when you start using features of query expressions which introduce transparent identifiers and/or which use multiple delegates, it’s painful. It’s worth being familiar with both, and using each one at the appropriate time.
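To illustrate the point about transparent identifiers, here is a sketch comparing a query expression that uses "let" with a hand-written dot-notation version. The sample data is invented, and the anonymous type in the second query only approximates what the compiler generates for the transparent identifier.

```csharp
using System;
using System.Linq;

// Hypothetical sample data, purely for illustration.
string[] words = { "pear", "fig", "banana", "kiwi" };

// Query expression: "let" introduces a second range variable.
var queryForm = from word in words
                let length = word.Length
                where length > 3
                orderby length, word
                select word.ToUpperInvariant();

// Dot notation: the pair (word, length) has to be carried along by hand,
// here in an anonymous type standing in for the transparent identifier.
var dotForm = words
    .Select(word => new { word, length = word.Length })
    .Where(t => t.length > 3)
    .OrderBy(t => t.length)
    .ThenBy(t => t.word)
    .Select(t => t.word.ToUpperInvariant());

Console.WriteLine(string.Join(", ", queryForm)); // KIWI, PEAR, BANANA
Console.WriteLine(string.Join(", ", dotForm));   // KIWI, PEAR, BANANA
```

With one "let" the dot-notation form is still manageable; each extra range variable adds another layer to carry through every lambda.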
@skeet Yes, it is equivalent. Anything after select X = V does not have access to range variables preceding it.
Another difference between VB and C# queries is the omittable select clause. I like to abuse it for reading input into anonymous types:
dim input = (From index in Range(count)
let x = br.ReadSingle()
let y = br.ReadSingle()
let n = br.ReadInt32()
).ToArray()
dim lastN = input.Last().n
@Strilanc: How odd (the first part) – it looks just like a *normal* select constructing an anonymous type with a property X, unless I’m missing something. (I probably am, I have little experience with VB query expressions.)
But when it comes to the ability to omit the select clause, I’m absolutely with you… it just shouldn’t be necessary. I wouldn’t be surprised to see that change at some point.
I’m not sure what you mean by a normal select. In C# the Select X = Y style is a compile-error, so how can it be normal?
To be precise, the following VB query:
From x In {1, 2, 3}
Select y = x + 1
Select z = y + 1
is converted into (reflector C#):
new int[] { 1, 2, 3 }.Select(
new Func<int, int>(Class1._Lambda$__1) //_Lambda$__1 is return x+1
).Select(
new Func<int, int>(Class1._Lambda$__2) //_Lambda$__2 is return y+1
);
compared to the C# query:
from x in new int[] { 1, 2, 3 }
select x + 1 into y
select y + 1 into z
select z;
which is converted into the similar-but-equivalent (reflector C#):
new int[] { 1, 2, 3 }.Select(delegate (int x) {
return (x + 1);
}).Select(delegate (int y) {
return (y + 1);
}).Select(delegate (int z) {
return z;
});
@Strilanc: I mean it looks like a normal VB select clause, not a normal C# one. I think I need to look at VB queries in more detail some time :)
What decides whether VB is going to create an anonymous type with a single property, or just use it as a sort of range variable?
As far as I know, VB queries with just one named range variable will always simplify to a ‘normal’ sequence (essentially discarding the name).
The reason it works that way is the implicit select. It would be inconvenient if “from x in {1,2,3} where IsPrime(x)” had type IEnumerable<anonymous type with a property x> instead of IEnumerable<int>.
That being said, I consider it a minor bug that, when the last expression is a named select, the compiler throws away the name. In fact, I’m going to report it right now.
Hi Jon,
I was just excited that I was able to apply this to one of my answers on SO. I just had to show you. :)
http://stackoverflow.com/questions/3818734/linq-orderby-another-query/3819011#3819011
Jeff