DISCOVERY

September 17th, 2019

Using LINQ in C#

C#

.NET Core

During free time at work, I've been reading a book called C# 7.0 In a Nutshell. When I get home, I write C# programs based on what I learned. One of the really interesting topics I read about was LINQ (Language Integrated Query), which is the integration of query functions and keywords in the C# language1. LINQ can be used to query remote data sources such as an RDBMS or local data structures.

LINQ reminds me of writing PL/SQL, which is a procedural language provided for the Oracle database. PL/SQL is a superset of SQL, allowing for looping, variable declarations, conditional logic, error handling, and more. The best feature of PL/SQL is the integration of SQL queries directly into an imperative programming language. Unfortunately, PL/SQL is strictly tied to the Oracle database and has clunky syntax in my opinion. LINQ on the other hand can be used with multiple different databases along with local data structures. Also, in my opinion, C# has much nicer syntax.

I'll save my exploration of integrating LINQ with a remote database for a future article. Today, I'm focusing on LINQ basics with local data structures.

LINQ is found in the System.Linq namespace in C#. LINQ queries consume an input sequence and produce an output sequence. They provide the ability to query enumerable data in local data structures or remote data sources. For example, LINQ is available for a locally defined dictionary.

Dictionary<string, int> deerSpottedToday = new Dictionary<string, int> { {"Mianus River Park", 2}, {"Cognewaugh Road", 1} };

This dictionary represents the number of deer I saw walking and driving. There are two approaches for using LINQ with this dictionary. The first approach is called fluent syntax, which utilizes chainable static extension methods defined on the IEnumerable interface2. An extension method is a method added to an existing type without modifying the type definition. The second approach is called query syntax, which uses native keywords to build queries.

Below is a filtering query performed on the deerSpottedToday dictionary. It uses fluent syntax and the Where() extension method.

IEnumerable<KeyValuePair<string, int>> multipleDeer = Enumerable.Where( deerSpottedToday, spot => spot.Value > 1 ); Assert(multipleDeer.Count() == 1);

Where() accepts a delegate (lambda function) as an argument. It checks each element in deerSpottedToday to see if they match the delegate predicate. Only "Mianus River Park" matches the predicate because it's the only dictionary element with a value greater than 1.

In this code sample, I called Where() directly on the Enumerable class. However, it's also invokable directly on the dictionary instance. This is because Dictionary implements IEnumerable. The code above can be shortened to use the IEnumerable extension method.

var multipleDeer = deerSpottedToday.Where(spot => spot.Value > 1); Assert(multipleDeer.Count() == 1);

Query syntax uses C# keywords instead of method calls. This is my favorite aspect of LINQ. Rewriting the fluent syntax method chain in query syntax results in the following statement:

var multipleDeer = from spottedDeer in deerSpottedToday where spottedDeer.Value > 1 select spottedDeer; Assert(multipleDeer.Count() == 2);

Query syntax is more reminiscent of SQL, although LINQ and SQL have some major differences (as I'll explore in my next article). Let's quickly run through the different pieces of this query. from {x} in {y} defines the input sequence for the query (y) and loops through its contents, which are assigned to a variable (x). where {condition} defines a filter for the data, in my case a dictionary item must have a value greater than 1. Finally, select {z} defines which items (z) to will be present in the output sequence of the query.

Let's look at some other operators offered by LINQ. orderby provides the ability to sort the output sequence of the query before returning it.

double[] walksPastWeek = { 3.5, 2.48, 3.6, 3.98, 3.59, 1.74, 1.54 }; // Linq query with additional OrderBy() and Select() query operator methods. // Select() is basically a map() function. var greaterThanTwoMiles = walksPastWeek.Where(miles => miles > 2) .OrderBy(miles => miles) .Select(miles => Math.Round(miles)); Assert(greaterThanTwoMiles.Count() == 5); Assert(greaterThanTwoMiles.First() == 2); // The fluent syntax query above can be rewritten in query syntax: var greaterThanTwoMilesAgain = from distance in walksPastWeek where distance > 2 orderby distance select Math.Round(distance); Assert(greaterThanTwoMilesAgain.Count() == 5); Assert(greaterThanTwoMilesAgain.First() == 2);

orderby {x} descending is used to sort in reverse order.

var greaterThanTwoMilesDesc = greaterThanTwoMiles.OrderByDescending(miles => miles); Assert(greaterThanTwoMilesDesc.First() == 4); var greaterThanTwoMilesDescAgain = from miles in greaterThanTwoMiles orderby miles descending select miles; Assert(greaterThanTwoMilesDescAgain.First() == 4);

One interesting aspect of LINQ queries is that they don't produce output sequences when defined. They exhibit deferred execution3. Deferred execution results in some interesting behavior.

Deferred execution is proven by altering the input sequence after the LINQ query is declared. In the following example, the result of the query changes when the input sequence is altered.

var deerSpottedThisWeek = deerSpottedToday; var deerQuery = from deerSpotted in deerSpottedThisWeek where deerSpotted.Value > 1 select deerSpotted; // Right now, there is only one location where more than one deer was spotted. Assert(deerQuery.Count() == 1); // I saw another deer on Cognewaugh Road today. deerSpottedThisWeek["Cognewaugh Road"] = 2; // Due to deferred execution, now there are two locations where more than one deer was spotted. Assert(deerQuery.Count() == 2);

Deferred execution makes LINQ queries reusable for a mutating data structure. Deferred execution also allows LINQ queries to contain local variables that can change between executions. For example, the following query has a configurable where clause.

// If the values of lexical scope variables referenced in a query change before execution, // the query will honor that change. int numOfDeer = 2; var sameDeerQuery = from deerSpotting in deerSpottedThisWeek where deerSpotting.Value >= numOfDeer select deerSpotting; // When numOfDeer is 2, there are two locations that match the query. Assert(sameDeerQuery.Count() == 2); numOfDeer = 3; // When numOfDeer is 3, there are no locations that match the query. Assert(sameDeerQuery.Count() == 0);

If you are curious to learn more about LINQ queries, check out the full code on GitHub. I explore grouping, nested queries, combined queries, anonymous types in queries, and more.

This article introduced LINQ, a framework that integrates queries into C#. I decided to write about LINQ because it's a feature I believe more programming languages should consider adopting. So much code is based around filtering, mapping, and reducing data. The more options developers have to accomplish these task the better. Check out my next LINQ article where I discuss integrated queries with LINQ and SQL Server.

[1] "Language Integrated Query (LINQ)", https://docs.microsoft.com/en-us/dotnet/csharp/programming-guide/concepts/linq/

[2] Joseph Albahari & Ben Albahari, C# 7.0 in a Nutshell (Beijing: O'Reilly, 2018), 355

[3] Ibid., 364