My previous article explored LINQ, a module that brings query syntax to C#. All the examples in that article queried local data structures. While using LINQ on local data is valuable in itself, LINQ can do much more. LINQ really shines when used to query remote data sources. Queries on remote data sources such as relational databases are known as integrated queries. In this article, I explore integrated queries with a SQL Server database. First I create the SQL Server database instance with Docker and then query it using LINQ.
SQL Server is a relational database created and maintained by Microsoft. It's an enterprise centric database comparable to the Oracle database. In order to work with integrated queries, I set up a SQL Server database instance on a Docker container. This Docker container runs on my local machine and is queried from my C# code.
Microsoft provides SQL Server Docker container images on their DockerHub repository1. Their mcr.microsoft.com/mssql/server image creates an empty SQL Server database with exposed environment variables that configure passwords, accept license agreements, and more.
While this Docker image provides most of the functionality I need, I also want to pre-populate SQL Server with some data. I found a Microsoft employee's GitHub repository that builds a similar Docker image with pre-populated data2. With this repository and the official SQL Server Docker image as my base, I created the following Dockerfile:
Since this SQL Server database is only used for demo purposes, I have the database password hard-coded in the Dockerfile. Most of the heavy lifting is done in the base mcr.microsoft.com/mssql/server:2017-latest-ubuntu image and the setup-db.sh Bash script. The Bash script starts the database and populates it with data.
The final piece is createDB.sql which creates database tables and populates them with data. For this demo, I'm using programming language statistics for my data. There are two tables in the database - one describing programming languages and another containing the number of lines coded in a language over the course of a year. Here is an abbreviated version of the SQL script:
The SQL Server Docker image is built and run with the following two commands:
Now that the Docker image is up and running, let's explore how to connect LINQ queries to the SQL Server database instance.
To work with SQL Server in C#, we first need to open a connection to the database. For .NET Core, the package used to establish this connection is Microsoft.EntityFrameworkCore.SqlServer. You can easily install this package with nuget. The class in this package used to create a database session and query the tables is DbContext. I created a custom LanguageContext class which extends DbContext:
OnConfiguring is invoked upon construction of each LanguageContext instance. It establishes a connection to the database with the supplied connection string. LanguageContext also defines two DbSet properties which represent tables in the database. Language and CodeWritten are custom model objects containing all the columns and attributes of my tables.
Model objects are relatively simple, holding the columns in their respective tables. However, they also hold their tables relationships such as foreign keys. This allows for table join operations to be performed in C#.
One potential source of confusion is that DbSet caches a tables contents after performing a query. This means that querying DbSet may not reflect changes made to the database during the lifetime of a LanguageContext instance. I won't be inserting, updating, or deleting any data in SQL Server during the following examples, so we don't need to worry about this scenario.
Now that SQL Server and C# are connected, let's perform some integrated queries with LINQ!
To work with integrated queries, we need an instance of LanguageContext. All of the LINQ examples in this article are enclosed in the following using code block3:
using will properly instantiate and tear down the database context. LanguageSet and CodeWrittenSet are accessible from context. The following code confirms that both contain the expected data counts (each language has lines written statistics for six years).
Now let's begin using LINQ. The following queries get the newest and oldest languages in the Language table.
Here are the equivalent SQL queries:
These queries are relatively simple and use the same LINQ methods discussed in my previous article. The next set of queries build on one another, with the final query performing an inner join.
These queries demonstrate how our model objects make joining tables easier to do.
For certain SQL keywords, LINQ and the EntityFrameworkCore package provide functions with equivalent behavior. For example, using Contains() in a where clause is similar to a SQL IN statement and using EF.Functions.Like() in a where clause is similar to a SQL LIKE statement.
LINQ also has equivalent methods for SQL aggregate functions such as SUM().
If you want to work with set operators, LINQ has the tools to do that as well. It provides a Concat() method which emulates UNION ALL and a Union() method which emulates UNION.
With the help of LINQ and the entire C# ecosystem, you can do queries in C# that wouldn't be possible in SQL alone. For example, the following code takes two queries and zips their contents together, resulting in a string. This string represents a language and the total number of lines written in it.
In my previous article I compared LINQ in C# to PL/SQL in the Oracle database. I believe this article enforces that claim and proves that incorporating query syntax into an imperative language is extremely expressive and powerful. While there are some differences between the LINQ queries and their SQL equivalents, learning LINQ is very easy for someone who already knows SQL. While I don't currently use C# for any of my personal projects, if I ever do I will take full advantage of LINQ and the power it gives developers. You can check out the full code for this article on GitHub.