We all love Dapper micro-ORM.

Many developers have realized that although Entity Framework offers some features that Dapper doesn’t, Dapper is still more flexible (and much, much faster) than EF. And many have realized that the right tool for the right job sometimes involves using both: EF for some things and Dapper for others. And one of the use cases where Dapper is king is building dynamic queries.

Dynamic SQL

Old-school developers (those who have been coding since the late 90s or early 2000s, probably in VB6, ASP3, or PHP) will probably remember code like this:

string sql = "SELECT * FROM [Product] WHERE 1=1";

if (!string.IsNullOrEmpty(Request["ProductName"]))
   sql += " AND Name LIKE '" + Request["ProductName"].Replace("'", "''") + "'"; // this replace!
if (!string.IsNullOrEmpty(Request["SubCategoryId"]))
    sql += " AND ProductSubcategoryID = " + Request["SubCategoryId"].Replace("'", "''");  // this replace!

// here is where you pray that you've correctly sanitized inputs against sql-injection
var products = cn.Query<Product>(sql);

Basically we were all sending dynamic SQL statements directly to our databases. And since user-input can be malicious (see below), we had to manually sanitize user-input to avoid SQL-Injection attacks.

Hello Bobby Tables

If you weren’t sanitizing your input correctly, you were at serious risk. SQL injection is probably one of the most common web-hacking techniques. If you forgot to sanitize something like an “id” querystring, you could be opening a huge vulnerability and possibly exposing your entire database, not only for reading but also for modification.
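To make the risk concrete, here is a minimal sketch that reuses the concatenation pattern from the legacy snippet. The class and method names (InjectionDemo, BuildLegacySql) are made up for illustration. Because the numeric filter is not wrapped in quotes, escaping single quotes does nothing to stop a payload that contains no quotes at all:

```csharp
using System;

public static class InjectionDemo
{
    // Same pattern as the legacy code: escape single quotes, then concatenate.
    public static string BuildLegacySql(string subCategoryId)
    {
        string sql = "SELECT * FROM [Product] WHERE 1=1";
        if (!string.IsNullOrEmpty(subCategoryId))
            sql += " AND ProductSubcategoryID = " + subCategoryId.Replace("'", "''");
        return sql;
    }

    public static void Main()
    {
        // The payload has no quote characters, so Replace() changes nothing.
        Console.WriteLine(BuildLegacySql("1 OR 1=1"));
        // prints: SELECT * FROM [Product] WHERE 1=1 AND ProductSubcategoryID = 1 OR 1=1
    }
}
```

Since OR has lower precedence than AND, the resulting WHERE clause is true for every row, so the filter is bypassed entirely.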

Parameterized SQL

Even if you’re old school (hey VB6 guy, I’m looking at you again) you probably know that dynamically building SQL like that is not only error-prone (vulnerable to SQL injection) but also does not benefit from SQL Server’s cached execution plans. You probably used some kind of code-generation tool that helped you build code like this (maybe to invoke a stored procedure, since 3-tier architecture and “Windows DNA” were all the hype in the early 2000s):

Dim strSQL As String
Dim cmd As New ADODB.Command

strSQL = "UPDATE MyTable SET " & vbNewLine _
    & " NEEDS_ID     = @NEEDS_ID, " & vbNewLine _
    & " OBJ_ID       = @OBJ_ID, " & vbNewLine _
    & " OBJ_COMMENTS = @OBJ_COMMENTS, " & vbNewLine _
    & " TIME21_ID    = @TIME21_ID " & vbNewLine _
    & " WHERE ID     = @WHEREID"

With cmd
    .ActiveConnection = Cn
    .CommandText = strSQL
    .Parameters.Append .CreateParameter("@NEEDS_ID", adInteger, adParamInput, 2, 12)
    .Parameters.Append .CreateParameter("@OBJ_ID", adInteger, adParamInput, 2, 23)
    .Parameters.Append .CreateParameter("@OBJ_COMMENTS", adVarChar, adParamInput, 250, "Some text")
    .Parameters.Append .CreateParameter("@TIME21_ID", adInteger, adParamInput, 2, 34)
    .Parameters.Append .CreateParameter("@WHEREID", adInteger, adParamInput, 18, 456)
    .Execute
End With 

I hope the millennial developers are not so surprised that parameterized SQL already existed in the past century.

Back to modern software

Time flew by. Java and .NET emerged (and maybe submerged a little?), along with reflection, bytecode emission, full ORMs, and micro-ORMs; 3-tier was deprecated in favor of dozens of modern architectures; and now database access is much easier, right?

Now we don’t have to manually describe each one of our parameters to SQL. Our favorite ORMs will do that for us.

Dapper Example:

var dynamicParams = new DynamicParameters();

string sql = "SELECT * FROM [Product] WHERE 1=1";

if (productName != null)
{
    sql += " AND Name LIKE @productName"; 
    dynamicParams.Add("productName", productName);
}


if (subCategoryId != null)
{
    sql += " AND ProductSubcategoryID = @subCategoryId"; 
    dynamicParams.Add("subCategoryId", subCategoryId);
}

var products = cn.Query<Product>(sql, dynamicParams);

Don’t Repeat Yourself

The example above makes me a little upset for 2 reasons:

  • I have to pass productName twice, both to the sql string and to the dynamicParams dictionary.
    Their names should match.
  • I have to keep those two variables (the SQL statement and the list of parameters) separated, even though they are closely related to each other.

String Interpolation Internals

String interpolation was introduced in C# 6, released in 2015.

Instead of doing code like this:

string name = "Rick";
int accesses = 10;
string output = string.Format("{0}, you've been here {1:n0} times.",
                              name, accesses);

You could do like this:

string name = "Rick";
int accesses = 10;
string output = $"{name}, you've been here {accesses:n0} times.";

Internally, when you write an interpolated string (starting with $), the compiler can produce a FormattableString object, which contains both the template (as if it were {0}, you've been here {1:n0} times.) and the list of arguments (string name and int accesses).

If your method expects a regular string, the interpolated string is simply formatted into a regular string, and you get the same behavior as if you had passed the result of string.Format to your method. However, if your method expects a FormattableString, you have access to the format and the arguments isolated from each other.
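As a small sketch of that separation, the hypothetical helper below (Describe is not part of any library) accepts a FormattableString and prints the template and the arguments side by side, using only the Format property and the GetArguments() method of System.FormattableString:

```csharp
using System;

public static class FormattableStringDemo
{
    // Receives the template and the arguments separately, not a finished string.
    public static string Describe(FormattableString fs)
    {
        return fs.Format + " | " + string.Join(", ", fs.GetArguments());
    }

    public static void Main()
    {
        string name = "Rick";
        int accesses = 10;
        Console.WriteLine(Describe($"{name}, you've been here {accesses:n0} times."));
        // prints: {0}, you've been here {1:n0} times. | Rick, 10
    }
}
```

Note that the arguments are still the raw values (a string and an int), not text already formatted into the sentence.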

FormattableString can be useful for example if we wanted to build parameterized SQL statements while letting users build their strings as if it was just regular string concatenation:

QueryBuilder ParseSQL(FormattableString query)
{
    QueryBuilder myWrapper = new QueryBuilder();

    string dapperSql = query.Format;
    // replace {0} by "@p0", 
    // replace {1} by "@p1", etc..
    myWrapper.SQL = dapperSql;

    var dapperArgs = new Dapper.DynamicParameters();
    var args = query.GetArguments();
    // dapperArgs.Add("p0", args[0]);
    // dapperArgs.Add("p1", args[1]); ...
    myWrapper.Parameters = dapperArgs;

    return myWrapper;
    // now anyone can use Dapper like  
    // var pocos = connection.Query<POCO>(myWrapper.SQL, myWrapper.Parameters);
}
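The placeholder replacement that the comments above only hint at could be sketched as follows. This is an illustrative, dependency-free version: InterpolatedSql and Parse are hypothetical names, and a plain dictionary stands in for Dapper's DynamicParameters. It also assumes the SQL text contains no escaped literal braces ({{ or }}).

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class InterpolatedSql
{
    // Turns an interpolated string into SQL text with @p0, @p1, ... placeholders
    // plus a name/value map that could later be copied into DynamicParameters.
    public static (string Sql, Dictionary<string, object> Parameters) Parse(FormattableString query)
    {
        // query.Format looks like "... WHERE Name LIKE {0} AND Id = {1}".
        // Any alignment or format specifier inside the braces is discarded.
        string sql = Regex.Replace(query.Format, @"\{(\d+)[^}]*\}", m => "@p" + m.Groups[1].Value);

        var parameters = new Dictionary<string, object>();
        object[] args = query.GetArguments();
        for (int i = 0; i < args.Length; i++)
            parameters["p" + i] = args[i];

        return (sql, parameters);
    }

    public static void Main()
    {
        string productName = "%Mountain%";
        int subCategoryId = 3;
        var (sql, parameters) = Parse(
            $"SELECT * FROM [Product] WHERE Name LIKE {productName} AND ProductSubcategoryID = {subCategoryId}");
        Console.WriteLine(sql);
        // prints: SELECT * FROM [Product] WHERE Name LIKE @p0 AND ProductSubcategoryID = @p1
        Console.WriteLine(parameters.Count); // prints: 2
    }
}
```

The values never become part of the SQL text, so they can be handed to the database as real parameters.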

DapperQueryBuilder

Based on the idea above, I’ve created DapperQueryBuilder - which is a simple wrapper around Dapper which allows us to pass SQL parameters using string interpolation.

You can code like this:

var query = cn.QueryBuilder($"SELECT * FROM [Product] WHERE 1=1");

if (productName != null)
    query.Append($"AND Name LIKE {productName}"); 

if (subCategoryId != null)
    query.Append($"AND ProductSubcategoryID = {subCategoryId}"); 

var products = query.Query<Product>(); 

Although it looks like you’re just building a dynamic SQL (with values as inline literals), actually what you get is parameterized SQL.

In this case, query will have this underlying statement: SELECT * FROM [Product] WHERE 1=1 AND Name LIKE @p0 AND ProductSubcategoryID = @p1, and will also hold the parameters @p0 = productName and @p1 = subCategoryId.

To sum up, instead of using Dapper’s .Query<T> extension (which extends IDbConnection and accepts a SQL string and a list of parameters), you use the QueryBuilder() extension, which creates a QueryBuilder where you can dynamically add a new parameter and its associated SQL clause in a single statement.
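To show how appended fragments can keep their parameter numbering consistent, here is a miniature builder. It is a hypothetical sketch (MiniQueryBuilder is not the DapperQueryBuilder source, and it stores parameters in a plain dictionary instead of DynamicParameters); the only point is that each fragment's {0}, {1}, ... are shifted by the number of parameters already collected.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical miniature builder, for illustration only.
public class MiniQueryBuilder
{
    private readonly StringBuilder _sql = new StringBuilder();

    public Dictionary<string, object> Parameters { get; } = new Dictionary<string, object>();
    public string Sql => _sql.ToString();

    public MiniQueryBuilder(FormattableString query)
    {
        Append(query);
    }

    public MiniQueryBuilder Append(FormattableString fragment)
    {
        object[] args = fragment.GetArguments();
        int offset = Parameters.Count; // continue numbering after earlier fragments

        string text = Regex.Replace(fragment.Format, @"\{(\d+)[^}]*\}",
            m => "@p" + (offset + int.Parse(m.Groups[1].Value)));

        for (int i = 0; i < args.Length; i++)
            Parameters["p" + (offset + i)] = args[i];

        if (_sql.Length > 0) _sql.Append(' ');
        _sql.Append(text);
        return this;
    }
}

public static class MiniQueryBuilderDemo
{
    public static void Main()
    {
        string productName = "%Mountain%";
        int subCategoryId = 3;

        var q = new MiniQueryBuilder($"SELECT * FROM [Product] WHERE 1=1");
        q.Append($"AND Name LIKE {productName}");
        q.Append($"AND ProductSubcategoryID = {subCategoryId}");

        Console.WriteLine(q.Sql);
        // prints: SELECT * FROM [Product] WHERE 1=1 AND Name LIKE @p0 AND ProductSubcategoryID = @p1
    }
}
```

With something like this in place, the final call would be a regular Dapper query that passes the accumulated SQL and parameters.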

Quickstart

If you like it and want to start using it right now:

  1. Install the NuGet package Dapper-QueryBuilder
  2. Start using it like this:

using DapperQueryBuilder;
// ...

cn = new SqlConnection(connectionString);

// If you have all your parameters at once and just want to benefit from string interpolation:
var products = cn.QueryBuilder($@"
    SELECT ProductId, Name, ListPrice, Weight
    FROM [Product]
    WHERE [ListPrice] <= {maxPrice}
    AND [Weight] <= {maxWeight}
    AND [Name] LIKE {search}
    ORDER BY ProductId").Query<Product>();

Or building dynamic conditions like this:

using DapperQueryBuilder;
// ...

cn = new SqlConnection(connectionString);

// If you want to dynamically add conditions
var q = cn.QueryBuilder($@"
    SELECT ProductId, Name, ListPrice, Weight
    FROM [Product]
    WHERE 1=1 ");

q.AppendLine($"AND [ListPrice] <= {maxPrice}");
q.AppendLine($"AND [Weight] <= {maxWeight}");
q.AppendLine($"AND [Name] LIKE {search}");
q.AppendLine($"ORDER BY ProductId");

var products = q.Query<Product>();

Filters list

Some people believe incorrectly that WHERE 1=1 causes a performance hit. It doesn’t. So using this fake condition is a great trick so that you can just append additional conditions (if any) like shown in the example above.

Another option is to build your whole query but leave the filters to be defined (and rendered) later. DapperQueryBuilder has a special method, Where(), which saves the filters internally; when the query runs, they replace the /**where**/ placeholder.

int maxPrice = 1000;
int maxWeight = 15;
string search = "%Mountain%";

var cn = new SqlConnection(connectionString);

// You can build the query manually and just use QueryBuilder to replace "where" filters (if any)
var q = cn.QueryBuilder($@"SELECT ProductId, Name, ListPrice, Weight
    FROM [Product]
    /**where**/
    ORDER BY ProductId
    ");
    
// You just pass the parameters as if it was an interpolated string, 
// and QueryBuilder will automatically convert them to Dapper parameters (injection-safe)
q.Where($"[ListPrice] <= {maxPrice}");
q.Where($"[Weight] <= {maxWeight}");
q.Where($"[Name] LIKE {search}");

// Query() will automatically build your query and replace your /**where**/ (if any filter was added)
var products = q.Query<Product>();
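The placeholder substitution can be pictured with the sketch below. It is only an approximation of the idea, with hypothetical names (WherePlaceholderDemo, Render), and it assumes the filters have already been converted to parameterized text and are all combined with AND:

```csharp
using System;
using System.Collections.Generic;

public static class WherePlaceholderDemo
{
    // Replaces the /**where**/ marker with "WHERE a AND b ...",
    // or with nothing when no filters were registered.
    public static string Render(string template, IReadOnlyList<string> filters)
    {
        string clause = filters.Count == 0 ? "" : "WHERE " + string.Join(" AND ", filters);
        return template.Replace("/**where**/", clause);
    }

    public static void Main()
    {
        var filters = new List<string> { "[ListPrice] <= @p0", "[Weight] <= @p1" };
        Console.WriteLine(Render("SELECT * FROM [Product] /**where**/ ORDER BY ProductId", filters));
        // prints: SELECT * FROM [Product] WHERE [ListPrice] <= @p0 AND [Weight] <= @p1 ORDER BY ProductId
    }
}
```

Because the marker is a SQL comment, the template is still valid SQL even before any replacement happens.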

You can also create OR filters like this:

q.Where(new Filters(Filters.FiltersType.OR)
{
    new Filter($"[Weight] <= {maxWeight}"),
    new Filter($"[Name] LIKE {search}")
});

Or you can mix and match OR/AND like this:

q.Where(new Filters(Filters.FiltersType.OR)
{
    new Filters(Filters.FiltersType.AND)
    {
        $"[Weight] <= {maxWeight}",
        $"[Weight] >= {minWeight}",
    },
    new Filter($"[Name] LIKE {search}")
});

And even arrays can be used as interpolated parameters:

var categories = new string[] { "Components", "Clothing", "Accessories" };
q.Append($"WHERE c.[Name] IN {categories}");

Fluent API (Chained-methods)

For those who like method-chaining guidance (or for those who allow end-users to build their own queries), there’s a Fluent API that allows you to build queries step-by-step mimicking dynamic SQL concatenation.

So, basically, instead of starting with a full query and just appending new filters (.Where()), the QueryBuilder will build the whole query for you:

var q = cn.QueryBuilder()
    .Select($"ProductId")
    .Select($"Name")
    .Select($"ListPrice")
    .Select($"Weight")
    .From($"[Product]")
    .Where($"[ListPrice] <= {maxPrice}")
    .Where($"[Weight] <= {maxWeight}")
    .Where($"[Name] LIKE {search}")
    .OrderBy($"ProductId");
    
var products = q.Query<Product>();

You would get this query:

SELECT ProductId, Name, ListPrice, Weight
FROM [Product]
WHERE [ListPrice] <= @p0 AND [Weight] <= @p1 AND [Name] LIKE @p2
ORDER BY ProductId

Hope you enjoyed this post as much as I did when I wrote this library!

Full source code here.

This is the second part of a multi-part post series I’m going to write about Code Generation using C#. In the previous part I’ve shown how to invoke C# Scripts (and CSX files) from PowerShell and Visual Studio. In this part, I’ll create a simple extractor to read the physical schema of a SQL Server Database and save it as a JSON file.

Creating the Project

The first step is creating a .NET project.

Running CSX scripts does not require a csproj (or Visual Studio), but it’s very helpful to have a csproj for many reasons:

  • You get intellisense (mostly on CS, not so much on CSX - so try to use CSX as little as possible)
  • You can check for compilation-errors
  • You can configure a Console App (or any other entry point) to launch and debug your scripts
  • csproj is required for automatically restoring NuGet packages (although if you already have the NuGet packages you may just use them without csproj)

All libraries which I’ll use in my generator (Dapper and Newtonsoft) are compatible with netstandard2.0.
This means that the project type can be either .NET Framework (net472, which used to be called “full framework”) or .NET Core (netcoreapp2.1 or netcoreapp3.1) - both should work.

NuGet Packages

I’ll use two third-party libraries: Dapper (micro-ORM, for loading objects from SQL queries) and Newtonsoft.Json (JSON library, just for serialization). All we need is their DLLs, but the easiest way to get them is to install their NuGet packages into our csproj. The packages are then restored automatically for anyone who builds the csproj, which makes collaboration among multiple developers hassle-free.
If you’re not using the Visual Studio IDE, you can install those packages by running dotnet add package Dapper and dotnet add package Newtonsoft.Json in the same folder as your csproj.
If you’re using Visual Studio, you can use the Package Manager Console and run this:

Install-Package Dapper
Install-Package Newtonsoft.Json

Sql Server Schema objects

Next, I’ll create the objects that will represent my physical SQL database. One important thing to remember is that C# scripting engines do NOT allow declaring namespaces, so these classes are declared without one.

SqlServerTable.cs:

using System;
using System.Collections.Generic;

public class SqlServerTable
{
    public string Database { get; set; }
    public string TableSchema { get; set; }
    public string TableName { get; set; }

    /// <summary>
    /// Can be "TABLE" or "VIEW"
    /// </summary>
    public string TableType { get; set; }

    public string TableDescription { get; set; }

    public List<SqlServerColumn> Columns { get; set; } = new List<SqlServerColumn>();

    /// <summary>
    /// FKs which point from THIS (Child) table to the primary key of OTHER (Parent) tables
    /// </summary>
    public List<SqlServerForeignKey> ForeignKeys { get; set; } = new List<SqlServerForeignKey>();

    /// <summary>
    /// FKs which point from OTHER (Child) tables to the primary key of THIS (Parent) table
    /// </summary>
    public List<SqlServerForeignKey> ChildForeignKeys { get; set; } = new List<SqlServerForeignKey>();

}

I’ll omit other classes for brevity, but you can refer to all code here (classes SqlServerColumn.cs, SqlServerForeignKey.cs, SqlServerForeignKeyMember.cs ).

Schema Reader

Finally, I’ll create the class which will read SQL metadata - SqlServerSchemaReader.cs:

using Dapper;
using Newtonsoft.Json; // needed by Clone<T>(), which calls JsonConvert directly
using System;
using System.Data;
using System.IO;
using System.Linq;

public class SqlServerSchemaReader
{
  public Func<IDbConnection> CreateDbConnection { get; set; }

  public SqlServerSchemaReader(Func<IDbConnection> createDbConnection)
  {
    CreateDbConnection = createDbConnection;
  }

  public void ExportSchemaToJSON(string outputJsonSchema)
  {
    Console.WriteLine("Reading Database...");

    using (var cn = CreateDbConnection())
    {
      var tables = cn.Query<SqlServerTable>(@"
        SELECT 
          t.TABLE_CATALOG as [Database], 
          t.TABLE_SCHEMA as [TableSchema], 
          t.TABLE_NAME as [TableName], 
          CASE WHEN t.TABLE_TYPE='VIEW' THEN 'VIEW' ELSE 'TABLE' END as [TableType],
          ep.value as [TableDescription]
        FROM  INFORMATION_SCHEMA.TABLES t
        INNER JOIN sys.schemas sc ON t.TABLE_SCHEMA = sc.[name]
          ... full code omitted for brevity - please refer to: 
          ... https://github.com/Drizin/CodegenCS/tree/master/src/CodegenCS.SqlServer
      ").AsList();

      var allColumns = cn.Query<SqlServerColumn>(@"
          ... full code omitted for brevity... 
      ").AsList();

      var fks = cn.Query<SqlServerForeignKey>(@"
          ... full code omitted for brevity... 
      ").AsList();

      var fkCols = cn.Query<SqlServerForeignKeyMember>(@"
          ... full code omitted for brevity... 
      ").AsList();
      foreach (var fk in fks)
      {
        fk.Columns = fkCols.Where(c => 
            c.ForeignKeyConstraintName == fk.ForeignKeyConstraintName && 
            c.FKTableSchema == fk.FKTableSchema
        ).OrderBy(c => c.PKColumnOrdinalPosition).ToList();
      }

      foreach (var table in tables)
      {
        table.Columns = allColumns.Where(c => c.TableSchema == table.TableSchema && c.TableName == table.TableName).ToList();
        foreach(var column in table.Columns)
          column.ClrType = GetClrType(table, column);
        table.Columns.ForEach(c => { c.Database = null; c.TableSchema = null; c.TableName = null; });

        // We copy FKs and remove redundant properties of the parent object (table) which we're attaching this FK into
        table.ForeignKeys = Clone(fks.Where(fk => fk.FKTableSchema == table.TableSchema && fk.FKTableName == table.TableName).ToList());
        table.ForeignKeys.ForEach(fk => { fk.FKTableSchema = null; fk.FKTableName = null; });

        // We copy FKs and remove redundant properties of the parent object (table) which we're attaching this FK into
        table.ChildForeignKeys = Clone(fks.Where(fk => fk.PKTableSchema == table.TableSchema && fk.PKTableName == table.TableName).ToList());
        table.ChildForeignKeys.ForEach(fk => { fk.PKTableSchema = null; fk.PKTableName = null; });

      }

      SqlServerDatabaseSchema schema = new SqlServerDatabaseSchema()
      {
        LastRefreshed = DateTimeOffset.Now,
        Tables = tables,
      };

      Console.WriteLine($"Saving into {outputJsonSchema}...");
      File.WriteAllText(outputJsonSchema, Newtonsoft.Json.JsonConvert.SerializeObject(schema, Newtonsoft.Json.Formatting.Indented));
    }

    Console.WriteLine("Success!");
  }

  string GetClrType(SqlServerTable table, SqlServerColumn column)
  {
    string sqlDataType = column.SqlDataType;
    switch (sqlDataType)
    {
      case "bigint":
        return typeof(long).FullName;
      case "smallint":
        return typeof(short).FullName;
      case "int":
        return typeof(int).FullName;
      case "uniqueidentifier":
        return typeof(Guid).FullName;
      case "smalldatetime":
      case "datetime":
      case "datetime2":
      case "date":
        return typeof(DateTime).FullName;
      case "time":
        return typeof(TimeSpan).FullName; // SQL Server "time" maps to TimeSpan in ADO.NET
      case "datetimeoffset":
        return typeof(DateTimeOffset).FullName;
      case "float":
        return typeof(double).FullName;
      case "real":
        return typeof(float).FullName;
      case "numeric":
      case "smallmoney":
      case "decimal":
      case "money":
        return typeof(decimal).FullName;
      case "tinyint":
        return typeof(byte).FullName;
      case "bit":
        return typeof(bool).FullName;
      case "image":
      case "binary":
      case "varbinary":
      case "timestamp":
        return typeof(byte[]).FullName;
      case "nvarchar":
      case "varchar":
      case "nchar":
      case "char":
      case "text":
      case "ntext":
      case "xml":
        return typeof(string).FullName;
      // Vendor-specific types
      case "hierarchyid":
        return "Microsoft.SqlServer.Types.SqlHierarchyId"; // requires Microsoft.SqlServer.Types.dll (EF or Dapper 1.34+)
      case "geography":
        return "Microsoft.SqlServer.Types.SqlGeography";  // requires Microsoft.SqlServer.Types.dll (EF or Dapper 1.32+)
      case "geometry":
        return "Microsoft.SqlServer.Types.SqlGeometry";  // requires Microsoft.SqlServer.Types.dll (EF or Dapper 1.33+)

      default:
        Console.WriteLine($"Unknown sqlDataType for {table.TableName}.{column.ColumnName}: {sqlDataType}");
        return null;
    }
  }

  public static T Clone<T>(T source)
  {
    var serialized = JsonConvert.SerializeObject(source);
    return JsonConvert.DeserializeObject<T>(serialized);
  }

}
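The reason the foreign keys are cloned before their redundant properties are set to null is that the same object would otherwise be shared between the child table and the parent table. Here is a minimal sketch of that deep-copy idea, with two assumptions: ForeignKeyInfo is a made-up stand-in for SqlServerForeignKey, and the JSON round trip uses System.Text.Json (available in .NET Core 3.0+) instead of Newtonsoft.Json, just to keep the example free of external packages.

```csharp
using System;
using System.Text.Json;

// Hypothetical stand-in for SqlServerForeignKey, reduced to two properties.
public class ForeignKeyInfo
{
    public string FKTableName { get; set; }
    public string PKTableName { get; set; }
}

public static class CloneDemo
{
    // Deep copy via a JSON round trip: serialize, then deserialize into a new object.
    public static T Clone<T>(T source) =>
        JsonSerializer.Deserialize<T>(JsonSerializer.Serialize(source));

    public static void Main()
    {
        var fk = new ForeignKeyInfo { FKTableName = "Product", PKTableName = "ProductSubcategory" };

        var copy = Clone(fk);
        copy.FKTableName = null; // trim the copy that is attached to the child table

        Console.WriteLine(fk.FKTableName); // prints: Product (the original is untouched)
    }
}
```

Without the clone, clearing FKTableName on the child side would also clear it on the object later attached to the parent table.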

CSX

The idea is that we should put as much as possible into .cs files (and as little as possible into the CSX script), since intellisense and compile-time checks work better in .cs files than in CSX. The script file RefreshDatabaseSchema.csx will basically load libraries, define connection strings and paths, and invoke SqlServerSchemaReader.cs:

/// <summary>
/// This CSX Script will invoke SqlServerSchemaReader, which extracts the schema of SQL database and saves into a JSON file.
/// The easiest way to launch csi.exe (which is shipped with Visual Studio) to run this script is by using PowerShell script RefreshDatabaseSchema.ps1
/// You can do that from Visual Studio (see instructions in RefreshDatabaseSchema.ps1) or you can just execute "Powershell RefreshDatabaseSchema.ps1"
/// </summary>

// System libraries
#r "System.Data.dll"

// Load 3rd-party libraries by their relative paths, relative to "$Env:userprofile\.nuget\packages\"
#r "dapper\2.0.35\lib\netstandard2.0\Dapper.dll"
#r "newtonsoft.json\12.0.3\lib\netstandard2.0\Newtonsoft.Json.dll"

// CS files are better than CSX because Intellisense and compile-time checks work better.
#load "SqlServerTable.cs"
#load "SqlServerColumn.cs"
#load "SqlServerForeignKey.cs"
#load "SqlServerForeignKeyMember.cs"
#load "SqlServerDatabaseSchema.cs"
#load "SqlServerSchemaReader.cs"

using System;
using System.IO;
using System.Runtime.CompilerServices;
using System.Data;
using System.Data.SqlClient;

// Helpers to get the location of the current CSX script
public static string GetScriptPath([CallerFilePath] string path = null) => path;
public static string GetScriptFolder([CallerFilePath] string path = null) => Path.GetDirectoryName(path);


// location relative to the CSX script
string outputJsonSchema = Path.GetFullPath(Path.Combine(GetScriptFolder(), "AdventureWorksSchema.json")); 
string connectionString = @"Data Source=MYWORKSTATION\SQLEXPRESS;
                            Initial Catalog=AdventureWorks;
                            Integrated Security=True;";

Func<IDbConnection> connectionFactory = () => new SqlConnection(connectionString);
var reader = new SqlServerSchemaReader(connectionFactory);
reader.ExportSchemaToJSON(outputJsonSchema);
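The [CallerFilePath] helpers used in the script deserve a short illustration. The attribute makes the compiler substitute the path of the source file that contains the call, which is how the script finds its own folder. The sketch below uses hypothetical names (ScriptLocation, GetFolder, Resolve) and adds a fallback to the current directory for the case where no file path is available, which is an assumption of this example rather than something the original script does.

```csharp
using System;
using System.IO;
using System.Runtime.CompilerServices;

public static class ScriptLocation
{
    // The compiler fills in "path" with the source file of the CALLER.
    public static string GetFolder([CallerFilePath] string path = null) =>
        string.IsNullOrEmpty(path)
            ? Directory.GetCurrentDirectory()       // fallback when no file path is known
            : (Path.GetDirectoryName(path) ?? "");

    // Resolves a file name relative to the caller's source file.
    // "path" is forwarded explicitly; calling GetFolder() with no argument here
    // would capture THIS file's path instead of the original caller's.
    public static string Resolve(string relativeFile, [CallerFilePath] string path = null) =>
        Path.GetFullPath(Path.Combine(GetFolder(path), relativeFile));

    public static void Main()
    {
        Console.WriteLine(Resolve("AdventureWorksSchema.json"));
    }
}
```

The printed path depends on where the source file lives, so the output file always lands next to the script regardless of the working directory.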

Powershell

Last, I’ll create a PowerShell script to invoke the CSX file. This is useful for two reasons: it can locate csi.exe in multiple locations, and it can pass csi.exe the location of the per-user NuGet packages, so that the CSX can load libraries by their relative locations without hard-coding user-specific folders. RefreshDatabaseSchema.ps1:

# To Execute Powershell Scripts from Visual Studio:
# 1) Right-button PS1 file - "Open With..."
# 2) Configure:
#      Program: Powershell.exe
#      Arguments: -noexit -File %1
#      Friendly Name: Execute PowerShell Script

# To execute CSX scripts you'll need CSI.EXE (C# REPL) which is shipped with Visual Studio
# but can also be installed by using the NuGet package Microsoft.Net.Compilers.Toolset - https://www.nuget.org/packages/Microsoft.Net.Compilers.Toolset/

# For more info about launching CSX scripts from PowerShell or from Visual Studio, check https://drizin.io/code-generation-csx-scripts-part1/

$dir = Split-Path $MyInvocation.MyCommand.Path 
$script = Join-Path $dir ".\RefreshDatabaseSchema.csx"


# Locate CSI.EXE by searching common paths
$csi = ( 
    "$Env:userprofile\.nuget\packages\microsoft.net.compilers.toolset\3.6.0\tasks\net472\csi.exe",
    "$Env:programfiles (x86)\Microsoft Visual Studio\2019\Enterprise\MSBuild\Current\Bin\Roslyn\csi.exe",
    "$Env:programfiles (x86)\Microsoft Visual Studio\2019\Professional\MSBuild\Current\Bin\Roslyn\csi.exe",
    "$Env:programfiles (x86)\Microsoft Visual Studio\2019\Community\MSBuild\Current\Bin\Roslyn\csi.exe",
    "$Env:programfiles (x86)\Microsoft Visual Studio\2017\Enterprise\MSBuild\15.0\Bin\Roslyn\csi.exe",
    "$Env:programfiles (x86)\Microsoft Visual Studio\2017\Professional\MSBuild\15.0\Bin\Roslyn\csi.exe",
    "$Env:programfiles (x86)\Microsoft Visual Studio\2017\Community\MSBuild\15.0\Bin\Roslyn\csi.exe"
) | Where-Object { Test-Path $_ } | Select-Object -first 1

if (!$csi)
{
    Write-Host "---------------------------------------" -for red
    Write-Host "Can't find csi.exe" -for red
    Write-Host "Please fix search paths above, or install NuGet Microsoft.Net.Compilers.Toolset" -for red
    Write-Host "---------------------------------------" -for red
    Exit 1
}


$stopwatch = [System.Diagnostics.Stopwatch]::StartNew()

Write-host "Starting csi.exe $script ..." -for yellow
& $csi /lib:"$Env:userprofile\.nuget\packages\" $script

Write-Host "Finished in $($stopwatch.Elapsed.TotalMilliSeconds) milliseconds"

# Since I configured "-noexit" parameter in Visual Studio I don't need this
#if ($host.Name -notmatch 'ISE') { Write-Host -NoNewLine "(Just press Enter to exit)" -for cyan; read-host; }  

Running Powershell:

Result file AdventureWorksSchema.json:

Full Source code available here

This is the first part of a multi-part post series I’m going to write about Code Generation using C#.

I’ve mentioned before that I’ve been a fan of code generation for a long time. I have used CodeSmith generator, MyGeneration Code Generator, and in the past few years, I’ve been using T4 templates.

The main advantage of code generation (which includes DB scaffolding, also called database first by some ORMs) is that it does the work for you, is repeatable, and less error-prone than manually writing everything. So it’s mostly about productivity and consistency. If you’re writing repetitive code by hand, you’re stealing from your employer or clients. And if you have a data-oriented application, almost certainly you have repetitive code to some degree.

The right tool for the job

Until recently the gold standard for code generation in Microsoft stack was using T4 templates, which is the out-of-the-box tool provided by Microsoft and shipped with Visual Studio. The major problem with T4 is that it has terrible syntax, terrible tooling and debugging support, and can get as ugly as this:

C# Script Files (CSX files)

In the search for a modern code-generation tool I’ve explored Razor Templates, Mustache, and Mustache-like templates (like DotLiquid and Handlebars), and others. I’ve noticed a growing trend of doing code generation in pure C# (Roslyn or not), which makes sense: you get strong typing, compile-time checking, full IDE support (with debugging), cross-platform execution (dotnet core), and full access to .NET libraries (SqlServer, Dapper, Newtonsoft JSON, etc.). So you get a full-fledged language instead of a templating engine that offers only a subset of the features of the underlying language.

C# Script Files (CSX) were introduced with Roslyn, and can be executed in Roslyn or in other compatible cross-platform scripting engines like dotnet-script, or even with the C# REPL called csi.exe. Those scripting engines certainly have some limitations (like not being able to declare namespaces), but they allow us to invoke virtually any C# code, with essential features like loading external assemblies and loading code organized across multiple source files. And it’s much easier than using PowerShell to invoke C#.

Sample CSX Script

CSX scripts inside Visual Studio have some support for Intellisense (auto-completion) and compile-time checks, but those features work much better in CS files. So it’s a good idea to put as much as possible into CS files and as little as possible into CSX scripts. I like to use CSX only for basic things like loading libraries, setting connection strings, setting paths, and invoking the real code in CS files.

MyProgram.cs:

using System;

public class MyProgram
{
   public void MyMethod()
   {
      Console.WriteLine("Hello from MyMethod");
   }  
}

MyScript.csx:

#load "MyProgram.cs" 

new MyProgram().MyMethod(); 
Console.WriteLine("Hello Code-Generation!");

Running CSX Script using C# REPL (CSI.EXE)

Visual Studio ships with a command line REPL called CSI that can be used to run .csx scripts.

You can run CSI.EXE directly from Visual Studio Developer Command Prompt (csi MyScript.csx):

Assembly References

In the same sense that it’s a good idea to use simple statements in CSX to invoke more complex CS code, it’s also a good idea to load external assemblies when you can rely on existing libraries.

CSX allows loading assembly references by using the #r directive at the top of your scripts:

// CSI.EXE requires absolute paths for loading external assemblies: 
#r "C:\Users\drizin\.nuget\packages\dapper\2.0.35\lib\netstandard2.0\Dapper.dll" 

#load "File1.cs" 
#load "File2.cs" 
#load "MyProgram.cs" 

new MyProgram().MyMethod(); 
Console.WriteLine("Hello Code-Generation!");

NuGet Packages

If you need to reference a NuGet package, you can just rely on NuGet tools (and Visual Studio build process) to automatically restore the packages required by your script. For achieving that, you can just add the CSX as part of a Visual Studio project, so when each developer tries to build the project Visual Studio will download the missing packages, and the developer just needs to fix the assemblies location.

Another alternative, which does not require a csproj at all, is using PowerShell to download the required NuGet packages (see script at the end of this post).

Invoking C# REPL (CSI running CSX scripts) from PowerShell

Although you can run CSI.exe directly from Visual Studio Developer Command Prompt, invoking it through PowerShell is very helpful for a few reasons:

  • You can run outside of Visual Studio. You don’t even need Visual Studio to run CSX.
  • Allows us to find NuGet locations and reference external assemblies with relative paths on the CSX (more about this below).
  • Restoring NuGet files (example script at the end of the post)

To invoke CSI using Powershell, we must know the location of csi.exe.

CSI is shipped with Visual Studio but can also be installed by using the NuGet package Microsoft.Net.Compilers.Toolset (warning: as described here the installation script for the package changes .csproj file to use the compiler provided by the package - this is likely not what you want, so make sure to roll-back such changes if they are made during installation).

So the first step is to search for csi.exe in multiple locations as I show in the sample Powershell script RunMyScript.ps1 below:

# Locate CSI.EXE by searching common paths
$csi = ( 
    "$Env:userprofile\.nuget\packages\microsoft.net.compilers.toolset\3.6.0\tasks\net472\csi.exe",
    "$Env:programfiles (x86)\Microsoft Visual Studio\2019\Enterprise\MSBuild\Current\Bin\Roslyn\csi.exe",
    "$Env:programfiles (x86)\Microsoft Visual Studio\2019\Professional\MSBuild\Current\Bin\Roslyn\csi.exe",
    "$Env:programfiles (x86)\Microsoft Visual Studio\2019\Community\MSBuild\Current\Bin\Roslyn\csi.exe",
    "$Env:programfiles (x86)\Microsoft Visual Studio\2017\Enterprise\MSBuild\15.0\Bin\Roslyn\csi.exe",
    "$Env:programfiles (x86)\Microsoft Visual Studio\2017\Professional\MSBuild\15.0\Bin\Roslyn\csi.exe",
    "$Env:programfiles (x86)\Microsoft Visual Studio\2017\Community\MSBuild\15.0\Bin\Roslyn\csi.exe"
) | Where-Object { Test-Path $_ } | Select-Object -first 1

$dir = Split-Path $MyInvocation.MyCommand.Path 
$script = Join-Path $dir "MyScript.csx"

& $csi $script

To launch the PowerShell script from the command line, just run Powershell Full-Path-To-Your-Script-ps1.

Running from Visual Studio IDE

To run from Visual Studio, you can just add the PS1 to your project or solution, right-click the file, and click the option “Open with PowerShell ISE”, which is the IDE for editing/running PowerShell scripts.

Another alternative is that you can add new actions to your right-button actions - you can click “Open With…” and configure PowerShell to be executed directly from Visual Studio:

The list of possible actions will include this new option of invoking PS1 scripts directly from the IDE, and you can also set this as the default action for opening PS1 files.

Allowing Unsigned Scripts

If you have never executed unsigned PowerShell scripts, you’ll have to enable them by running PowerShell as Administrator and executing this command:
Set-ExecutionPolicy -ExecutionPolicy Unrestricted
Don’t forget to enable it for both Windows PowerShell (64-bit) and Windows PowerShell (x86), which is the one invoked from inside the Visual Studio IDE.

Relative Assembly References

As we’ve seen before, CSX accepts absolute references like this:

#r "C:\Users\drizin\.nuget\packages\dapper\2.0.35\lib\netstandard2.0\Dapper.dll"

One of the major problems with CSI is that the #r directive (for loading assembly references) doesn’t accept NuGet-like references or environment variables, so all assembly references have to be specified with full paths. This is not a showstopper, but it’s a little annoying: it makes it harder to share code among multiple developers, since each developer has to fix their own references.

One of the advantages of using PowerShell (as described above) is that we can use environment variables there and keep the #r directives relative. In the PowerShell script, we just have to locate the base folder that contains the assemblies and pass it to CSI, which then searches that folder for referenced assemblies, like this:

$assemblies = "${env:userprofile}\.nuget\packages\";
& $csi /lib:"$assemblies" $script

And then in the CSX, you can use relative paths like this:

#r "dapper\2.0.35\lib\netstandard2.0\Dapper.dll"
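
Before handing control to csi.exe, it can help to confirm that a relative reference really exists under one of the base folders. The sketch below only approximates that lookup (the actual resolution logic inside csi may differ), and Resolve-AssemblyReference is a hypothetical helper, not part of csi or PowerShell. It uses $HOME, which on Windows is normally the same folder as ${env:userprofile}.

```powershell
# Hypothetical helper (not part of csi.exe): returns the first full path at which
# a relative reference exists under the given base folders, or $null if none match.
function Resolve-AssemblyReference {
    param(
        [string[]] $BasePaths,      # folders you would pass to csi via /lib
        [string]   $RelativePath    # the path used in the #r directive
    )
    foreach ($base in $BasePaths) {
        $candidate = Join-Path $base $RelativePath
        if (Test-Path -LiteralPath $candidate -PathType Leaf) {
            return $candidate
        }
    }
    return $null
}

# Usage: warn early instead of letting the CSX fail at its #r directive
$assemblies = Join-Path $HOME ".nuget\packages"
$dll = Resolve-AssemblyReference -BasePaths @($assemblies) -RelativePath "dapper\2.0.35\lib\netstandard2.0\Dapper.dll"
if ($dll) { Write-Host "Resolved: $dll" } else { Write-Host "Dapper.dll not found under $assemblies" }
```

If the helper returns nothing, the package probably has not been restored yet, or the version in the #r path does not match the installed one.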

PackageReference (NuGet 4) vs packages.config (NuGet 3)

The new MSBuild format (“SDK-Style”, which uses PackageReference inside the csproj) installs the NuGet packages in the per-user folder shown above (%userprofile%\.nuget\packages).

The old MSBuild format (“non-SDK-Style”, before Visual Studio 2017, which uses packages.config) installs the NuGet packages in the “packages” folder under the Solution folder.

We can adjust our PowerShell scripts according to where our project will restore NuGet packages:

$csi = ... # (locate your csi.exe)
$dir = Split-Path $MyInvocation.MyCommand.Path 
$script = Join-Path $dir "MyScript.csx"

# Call csi.exe and specify that libraries referenced by #r directives 
# should search in a few nuget locations

# New NuGet 4.0+ (PackageReference) saves User-specific packages
# in "%userprofile%\.nuget\packages\"
$nuget1 = "${env:userprofile}\.nuget\packages\";

# New NuGet 4.0+ (PackageReference) saves Machine-wide packages 
# in "%ProgramFiles(x86)%\Microsoft SDKs\NuGetPackages\"
$nuget2 = "${env:ProgramFiles(x86)}\Microsoft SDKs\NuGetPackages\";

# Old NuGet (packages.config) saves packages in "\packages" folder at solution level.
# Locate by searching a few levels above
$nuget3 = ( 
    (Join-Path $dir ".\packages\"),
    (Join-Path $dir "..\packages\"),
    (Join-Path $dir "..\..\packages\"),
    (Join-Path $dir "..\..\..\packages\"),
    (Join-Path $dir "..\..\..\..\packages\")
) | Where-Object { Test-Path $_ } | Select-Object -first 1

# if you're using new NuGet format (PackageReference defined inside csproj) 
& $csi /lib:"$nuget1" $script  

# if you're using old NuGet format (packages.config)
# & $csi /lib:"$nuget3" $script  
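
Instead of commenting lines in and out, the choice between the two formats could be made automatically. This is only a sketch: Get-NuGetFormat is a hypothetical helper, and it assumes that a packages.config file, when present, sits in the same folder as the script (i.e. the project folder).

```powershell
# Hypothetical helper: guesses which NuGet layout a project folder uses.
# Returns 3 when a packages.config file sits in the folder, otherwise 4.
function Get-NuGetFormat {
    param([string] $ProjectFolder)
    if (Test-Path -LiteralPath (Join-Path $ProjectFolder "packages.config") -PathType Leaf) {
        return 3
    }
    return 4
}

# Usage: detect the format for the folder that contains this script
$dir = if ($PSScriptRoot) { $PSScriptRoot } else { (Get-Location).Path }
$format = Get-NuGetFormat -ProjectFolder $dir
Write-Host "Detected NuGet format: $format"
```

With the detected value you could then pass either the per-user folder or the solution-level "packages" folder to /lib.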

And our CSX would use relative references:

// CSX can load libraries by defining their relative paths

// New NuGets (PackageReference) are installed under "${env:userprofile}\.nuget\packages\"
// (or "${env:ProgramFiles(x86)}\Microsoft SDKs\NuGetPackages\")
// and have this format:
#r "dapper\2.0.35\lib\netstandard2.0\Dapper.dll"

// Old NuGets (packages.config) are installed under "(SolutionFolder)\packages"
// and have this format
// #r "Dapper.2.0.35\lib\netstandard2.0\Dapper.dll"

//...
new MyProgram().MyMethod();
Console.WriteLine("Hello Code-Generation!");
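
The difference between the two folder layouts used in the #r paths above can be summarized in a few lines of PowerShell. Get-PackageFolder is a hypothetical helper written just for illustration; it only builds the folder name and does not check that the package exists.

```powershell
# Hypothetical helper: builds the package folder (relative to the packages root)
# for each of the two layouts described above.
function Get-PackageFolder {
    param(
        [string] $Name,
        [string] $Version,
        [int]    $NuGetVersion = 4
    )
    if ($NuGetVersion -ge 4) {
        # PackageReference layout: lowercase id, then a version subfolder
        return "$($Name.ToLowerInvariant())\$Version"
    }
    # packages.config layout: a single "Id.Version" folder
    return "$Name.$Version"
}

Get-PackageFolder -Name "Dapper" -Version "2.0.35"                  # dapper\2.0.35
Get-PackageFolder -Name "Dapper" -Version "2.0.35" -NuGetVersion 3  # Dapper.2.0.35
```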

So cool and so easy, isn’t it?

Other tools

I have played with other tools like dotnet-script, nake, and the popular scriptcs. For different reasons I couldn’t get any of them to work well: some wouldn’t even install in a traditional .NET Framework project, some had complex methods for using NuGet packages, and some simply didn’t work (not finding include files, etc.). So I decided to stick with the plain csi.exe, which most users will already have installed.

Final PowerShell Script

The script below has some nice features:

  • Searches for assemblies in multiple paths, including all NuGet locations
  • Can pass multiple paths to csi.exe, even though csi does not accept spaces in paths
  • Can restore missing NuGet packages (will even download nuget.exe)

# To Execute Powershell Scripts from Visual Studio:
# 1) Right-button PS1 file - "Open With..."
# 2) Configure:
#      Program: Powershell.exe
#      Arguments: -noexit -File %1
#      Friendly Name: Execute PowerShell Script

# To execute CSX scripts you'll need CSI.EXE (C# REPL) which is shipped with Visual Studio
# but can also be installed by using the NuGet package Microsoft.Net.Compilers.Toolset - https://www.nuget.org/packages/Microsoft.Net.Compilers.Toolset/

# For more info about launching CSX scripts from PowerShell or from Visual Studio, check https://drizin.io/code-generation-csx-scripts-part1/

$dir = Split-Path $MyInvocation.MyCommand.Path 
$script = Join-Path $dir ".\GenerateSimplePOCOs.csx"
$requiredLibs = @(
    @{ Name = "Newtonsoft.Json"; Version = "12.0.3" },
    @{ Name = "CodegenCS"; Version = "1.0.5" }
);

# By default we'll only use NuGet 4 locations. But you can change to 3 if you're hosting 
# your scripts in a project with the old packages.config format and want to rely on existing project packages
$NuGetVersion = 4; 



$ErrorActionPreference = "Stop"

# Locate CSI.EXE by searching common paths
$csi = ( 
    "$Env:userprofile\.nuget\packages\microsoft.net.compilers.toolset\3.6.0\tasks\net472\csi.exe",
    "$Env:programfiles (x86)\Microsoft Visual Studio\2019\Enterprise\MSBuild\Current\Bin\Roslyn\csi.exe",
    "$Env:programfiles (x86)\Microsoft Visual Studio\2019\Professional\MSBuild\Current\Bin\Roslyn\csi.exe",
    "$Env:programfiles (x86)\Microsoft Visual Studio\2019\Community\MSBuild\Current\Bin\Roslyn\csi.exe",
    "$Env:programfiles (x86)\Microsoft Visual Studio\2017\Enterprise\MSBuild\15.0\Bin\Roslyn\csi.exe",
    "$Env:programfiles (x86)\Microsoft Visual Studio\2017\Professional\MSBuild\15.0\Bin\Roslyn\csi.exe",
    "$Env:programfiles (x86)\Microsoft Visual Studio\2017\Community\MSBuild\15.0\Bin\Roslyn\csi.exe"
) | Where-Object { Test-Path $_ } | Select-Object -first 1

if (!$csi)
{
    Write-Host "---------------------------------------" -for red
    Write-Host "Can't find csi.exe" -for red
    Write-Host "Please fix search paths above, or install NuGet Microsoft.Net.Compilers.Toolset" -for red
    Write-Host "---------------------------------------" -for red
    Exit 1
}
Write-Host "Found csi.exe: $csi" -for cyan

# List of locations to search for assembly references
$libPaths = @()
$libPaths += $dir

if ($NuGetVersion -eq 4)
{
    # New NuGet 4.0+ (PackageReference) saves User-specific packages in %userprofile%\.nuget\packages\
    $libPaths += "${env:userprofile}\.nuget\packages";
    if (Test-Path "${env:userprofile}\.nuget\packages") { $missingNuGetPackagesLocation = "${env:userprofile}\.nuget\packages" }

    # New NuGet 4.0+ (PackageReference) saves Machine-wide packages in %ProgramFiles(x86)%\Microsoft SDKs\NuGetPackages\"
    $libPaths += "${env:ProgramFiles(x86)}\Microsoft SDKs\NuGetPackages";
}

if ($NuGetVersion -eq 3)
{
    # Old NuGet (packages.config) saves packages in "\packages" folder at solution level.
    # Locate by searching a few levels above the script
    $missingNuGetPackagesLocation = ( 
        (Join-Path $dir ".\packages"),
        (Join-Path $dir "..\packages"),
        (Join-Path $dir "..\..\packages"),
        (Join-Path $dir "..\..\..\packages"),
        (Join-Path $dir "..\..\..\..\packages")
    ) | Where-Object { Test-Path $_ } | Select-Object -first 1
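    # Only add the folder if one was found (a $null entry would break Test-Path later)
    if ($missingNuGetPackagesLocation) { $libPaths += $missingNuGetPackagesLocation }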
}

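# where to download missing NuGet packages
# (check for $null first, because Test-Path throws on a null path)
if (-not $missingNuGetPackagesLocation -or -not (Test-Path $missingNuGetPackagesLocation))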
{
    $missingNuGetPackagesLocation = $dir
}


# csi /lib parameter allows multiple paths but does not accept spaces (or quotes) so we have to use short DOS 8.3 paths
$fso = New-Object -ComObject Scripting.FileSystemObject
$libPaths = ($libPaths | Where-Object { Test-Path $_ } | ForEach { $fso.GetFolder($_).shortpath  });


Write-Host "CSI will use the following paths to search for assembly references:`r`n   - $($libPaths -Join "`r`n   - ")" -for cyan


$missingLibs = @()
$requiredLibs | foreach {
    $requiredLib = $_;
    Write-Host "Checking for $($requiredLib.Name) version $($requiredLib.Version)..." -for Cyan -NoNewLine

    if ($NuGetVersion -eq 4)
    {
        # NuGet 4+ format
        $found = $libPaths | 
        ForEach { Join-Path $_ ($requiredLib.Name + '\' + $requiredLib.Version) } | 
        Where-Object { Test-Path $_ } | Select-Object -first 1

        if ($found -eq $null)
        {
            Write-Host "`n$($requiredLib.Name) not found. Will install using NuGet" -for Yellow
            $missingLibs += $requiredLib
        }
        else
        {
             Write-Host "Found: $found" -for Cyan
        }
    }

    if ($NuGetVersion -eq 3)
    {
        # NuGet <=3 format
        $found = $libPaths | 
        ForEach { Join-Path $_ ($requiredLib.Name + '.' + $requiredLib.Version) } | 
        Where-Object { Test-Path $_ } | Select-Object -first 1

        if ($found -eq $null)
        {
            Write-Host "`n$($requiredLib.Name) not found. Will install using NuGet" -for Yellow
            $missingLibs += $requiredLib
        }
        else
        {
             Write-Host "Found: $found" -for Cyan
        }
    }
}

if ($missingLibs)
{
    $nuget = Join-Path $env:TEMP "nuget.exe"
    if ((Test-Path $nuget) -eq $False)
    {
        Write-Host "Downloading NuGet.exe into $nuget" -for cyan
        $webClient = New-Object System.Net.WebClient 
        $webClient.DownloadFile("https://dist.nuget.org/win-x86-commandline/latest/nuget.exe", $nuget)
    }

    $missingLibs | foreach {
        $missingLib = $_
        $libName = $missingLib.Name
        $libVersion = $missingLib.Version
        # Print name and version (printing the hashtable itself would only show its type name)
        Write-Host "Downloading $libName $libVersion...";
        if ($libVersion -eq $null)
        {
            & $nuget install $libName -OutputDirectory $missingNuGetPackagesLocation
        }
        else
        {
            & $nuget install $libName -Version $libVersion -OutputDirectory $missingNuGetPackagesLocation
        }
        if ($lastExitCode -ne 0)
        {
            Write-host "-------------`nError downloading $libName - aborting...`n-------------" -for red
            Exit 1
        }
    }
}


$stopwatch = [System.Diagnostics.Stopwatch]::StartNew()
Write-host "Starting csi.exe $script ..." -for yellow
& $csi /lib:$($libPaths -Join ';') $script

$stopwatch.Stop()
Write-Host "Finished in $($stopwatch.Elapsed.TotalMilliSeconds) milliseconds"

# Since I configured "-noexit" parameter in Visual Studio I don't need this
#if ($host.Name -notmatch 'ISE') { Write-Host -NoNewLine "(Just press Enter to exit)" -for cyan; read-host; }  
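
The script above does not inspect the result of csi.exe. One possible addition is a small wrapper that fails loudly when a native executable reports an error. This is a sketch under the assumption that csi.exe returns a non-zero exit code when the CSX fails (worth verifying for your version); Invoke-NativeCommand is a hypothetical helper name.

```powershell
# Hypothetical wrapper: runs a native executable and throws if it exits with a non-zero code.
function Invoke-NativeCommand {
    param(
        [string]   $FilePath,
        [string[]] $Arguments = @()
    )
    & $FilePath @Arguments
    if ($LASTEXITCODE -ne 0) {
        throw "$FilePath exited with code $LASTEXITCODE"
    }
}

# Usage in the script above (instead of calling csi.exe directly):
# Invoke-NativeCommand -FilePath $csi -Arguments @("/lib:$($libPaths -Join ';')", $script)
```

Since the script sets $ErrorActionPreference to "Stop", the thrown error would end the run right there instead of printing the elapsed time for a failed generation.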

And my final CSX:

/// <summary>
/// This CSX Script will invoke SimplePOCOGenerator, which builds simple POCO classes based on a JSON file with schema of SQL database
/// The easiest way to launch csi.exe (which is shipped with Visual Studio) to run this script is by using PowerShell script GenerateSimplePOCOs.ps1
/// You can do that from Visual Studio (see instructions in RefreshDatabaseSchema.ps1) or you can just execute "Powershell GenerateSimplePOCOs.ps1"
/// </summary>

// System libraries
#r "System.Data.dll"

// Load third-party libraries by their relative paths, relative to "$Env:userprofile\.nuget\packages\"
#r "newtonsoft.json\12.0.3\lib\netstandard2.0\Newtonsoft.Json.dll"
#r "codegencs\1.0.5\lib\netstandard2.0\CodegenCS.dll"

// CS files are better than CSX because Intellisense and compile-time checks work better.
#load "DatabaseSchema.cs"
#load "SimplePOCOGenerator.cs"

using System;
using System.IO;
using System.Runtime.CompilerServices;
using System.Data;
using System.Data.SqlClient;

// Helpers to get the location of the current CSX script
public static string GetScriptPath([CallerFilePath] string path = null) => path;
public static string GetScriptFolder([CallerFilePath] string path = null) => Path.GetDirectoryName(path);


// locations relative to the CSX script
string inputJsonSchema = Path.GetFullPath(Path.Combine(GetScriptFolder(), "AdventureWorksSchema.json"));
string targetFolder = Path.GetFullPath(Path.Combine(GetScriptFolder(), @".\POCOs\"));
string csProj = Path.GetFullPath(Path.Combine(GetScriptFolder(), @".\CSX-PowerShell-VisualStudio2015.csproj"));

var generator = new SimplePOCOGenerator(inputJsonSchema);
generator.Namespace = "CSX_PowerShell_VisualStudio2015";
generator.Generate(targetFolder, null);

Full source code is available here, with a sample project that reads a JSON file with AdventureWorks database schema and generates POCOs, both for SDK-Style (new csproj format) and for non-SDK-style (old Visual Studio 2015 format).

So cool and so easy, isn’t it?

In my next post, I’ll create a SQL schema extractor, and this schema will be used in later posts to build code generators for POCOs, EFCore, and other related services.

Enjoy!

This was the first part of a multi-part post series I’m going to write about Code Generation using C#: