How to make your T-SQL thirty percent faster

Last week, I mentioned a feature in SQL Server, and now in Azure SQL Database, called Memory-Optimized Table-Valued Parameters, which makes use of In-Memory OLTP structures to work around a dependency on tempdb, and thereby drastically improve performance.

Short version: We can see more than double speed improvement by switching from temp tables to Memory-Optimized TVPs, and a 30% speed improvement by switching from standard TVPs.

This week, I will compare the performance characteristics of the following three data structures:

  • temp tables (and table variables),
  • standard table-valued parameters, and
  • memory-optimised table-valued parameters.

These tests will be performed on my venerable Asus laptop, which is around four years old now, running SQL Server 2016 Developer Edition. It has 16GB of RAM, a 2.4GHz quad-core CPU, and an Intel 530-series SSD.

(Edit: With SQL Server 2016 Service Pack 1, you can now create Memory-Optimized objects in all editions of SQL Server.)

The test is very simple and makes use of a technique that one of my customers uses extensively: writing some rows to a temp table, so that another process can reuse those values as parameters, and perform an action, after which it destroys the temp table.

Unfortunately, the method my customer uses to populate a temp table, does not work on Azure SQL Database, because they explicitly refer to the temp tables with three-part naming (tempdb..#temptable), which is not permitted.

For the sake of the exercise, I will have a process that writes to a data structure and compare the times.

What is a Table-Valued Parameter anyway?

At its heart, a Table-Valued Parameter is a way to pass more than one row of data around in T-SQL. Instead of having many individual parameters in a stored procedure, or passing around values in temp tables, you could simply pass in the TVP, which is defined as a table type.

Warning sounds should already be going off in your head, because, yes, TVPs are just table variables. This means that they don’t have associated statistics and cannot be indexed (apart from a primary key). In other words, you do not want to use TVPs to pass in large amounts of data. Performance is going to suck.

So what do we know this far? Table variables, temp tables, and standard table-valued parameters are pretty much equivalent, in that they are backed by tempdb, which is persisted on the hard drive, which means we will be bound by I/O.

Memory-Optimized TVPs, by definition, are in-memory structures. By making use of the In-Memory OLTP engine, we know that they use optimistic concurrency and have no locks, only lightweight latches. This makes them fast. Because they reside in memory, they are compiled into native code, which means they’re even faster.

Create the test objects!

Let’s start with a temp table. For your amusement, I will be creating both a temp table and a table variable.

I’m lumping these together, because they’re almost the same thing under the covers for this scenario. The table is allocated on tempdb, data is inserted into it—which has to be fully logged (I/O bound)—and if for any reason the query spills to disk (and there are a lot of reasons, including sorts), it will incur an I/O penalty in the data file too.

I was not going to use any indexes or primary keys here, to try and keep the performance comparable, but memory-optimized data structures require indexing, so I’ve added a primary key. As we know, adding a clustered index, common on temp tables, will make it slower because it has to sort the table on the clustering key.

To make this even simpler, my sample data will only comprise one column. I’ve picked a bad example, because developers love GUIDs, but SQL Server does not.

-- Temp table creation
CREATE TABLE #temptable
(ID UNIQUEIDENTIFIER NOT NULL PRIMARY KEY);

-- Table variable creation
DECLARE @tablevariable AS TABLE
(ID UNIQUEIDENTIFIER NOT NULL PRIMARY KEY);

Now we have two tables backed by tempdb. One of them has statistics, and the other (depending on the version) has 1 or 100 rows, according to SQL Server. Fantastic.

Let’s create a standard table-valued parameter with the same structure:

CREATE TYPE dbo.TVPStandard AS TABLE
(ID UNIQUEIDENTIFIER NOT NULL PRIMARY KEY);

DECLARE @TempTVPStandard AS dbo.TVPStandard;

Now let’s create a memory-optimized table-valued parameter. Note that the major difference here is that we specify the MEMORY_OPTIMIZED state (there are some additional requirements for setting up In-Memory OLTP, which are outside of the scope of this post).

CREATE TYPE dbo.TVPInMemory AS TABLE
(ID UNIQUEIDENTIFIER NOT NULL PRIMARY KEY NONCLUSTERED)
WITH (MEMORY_OPTIMIZED = ON);

Note: I had to cheat here. You’ll notice that my previous data structures default to a clustered index on a primary key. Memory-Optimized Table-Valued Parameters cannot use a clustered index, but instead require a non-clustered index.

DECLARE @TempTVPMemory AS dbo.TVPInMemory;

We now have the following structures at our disposal:

  • #temptable (a temp table)
  • @tablevariable (a table variable)
  • @TempTVPStandard (a standard TVP)
  • @TempTVPMemory (a memory-optimized TVP)

Test the Performance!

For repeatable tests, I was going to use STATISTICS IO and STATISTICS TIME, and insert 100, then 1000, then 10,000 rows into each data structure.

I ran into a minor problem at first: I can’t use the GO <n> batch command to make things execute a lot of times, because table variables and TVPs are local only to the current batch. I had to rely on a WHILE loop.

After going through this a few times, I also couldn’t actually measure performance with STATISTICS TIME, because even my old laptop was too quick. It’s a common problem we see in performance tuning: small data structures are fast.

Instead, I used a start and end time, using SYSUTCDATETIME, and a WHILE loop, inserting ten thousand, then a hundred thousand, and then a million rows into each structure. A contrived example, sure, but one necessary to demonstrate how performance differs with larger rowsets.

PRINT SYSUTCDATETIME();

SET NOCOUNT ON;

DECLARE @i INT = 0;
WHILE @i < <number of executions>
BEGIN
INSERT INTO <object name>
SELECT NEWID();
SELECT @i = @i + 1;
END;

SET NOCOUNT OFF;

PRINT SYSUTCDATETIME();

Results

At low row counts, all four data structures performed around the same speed. Provided tempdb’s data and log files are correctly sized and don’t have to grow, it averages out.

Things got more interesting at 10,000 rows. I ran each series three times, and took the lowest value from each run.

Data Structure 10,000 100,000 1,000,000
Temp Table (Clustered) 291 ms 2,872 ms 29,110 ms
Table Variable (Clustered) 173 ms 1,708 ms 17,665 ms
Standard TVP (Clustered) 184 ms 1,694 ms 17,764 ms
Memory-Optimized TVP (Non-Clustered) 120 ms 1,174 ms 12,541 ms

Because of the type of query I’m running, the performance profile is linear. A million rows took 100x more time than ten thousand.

Unsurprisingly, the standard TVP and table variable performed the same, since they’re effectively the same data structure. Why the temp table is slower, is an exercise for a later blog post.

(When all three disk-bound structures used Non-Clustered Primary Keys, they were significantly slower than with clustered indexes. Performance with clustered indexes, versus no primary key at all, was the same.)

Keep in mind that these are row-by-row operations. If we were populating these structures from a set-based operation, it would be a lot faster. However, this demonstrates clearly how much more efficient a memory-optimized table-valued parameter is compared to the other data structures.

Caveat: memory-optimized TVPs reside exclusively in memory. We have to remember that it’s possible to run out of memory when using these structures. The last thing we want to see is the following error message:

The statement has been terminated.
Msg 701, Level 17, State 109, Line 92
There is insufficient system memory in resource pool 'default' to run this query.

Memory-Optimized Table-Valued Parameters are at least 30% faster than any of the other disk-bound data structures and more than twice as fast as temp tables specifically, but we need to ensure they are used correctly.

For more reading, check out the MSDN article “Faster temp table and table variable by using memory optimization”.

Next week, I will present the findings for this same test, but performed on an Azure SQL Database on the Premium Tier (because that’s where In-Memory OLTP lives).

If you have any performance tricks you would like to share, please find me on Twitter at @bornsql .

Author: randolph

Randolph West is a Microsoft Data Platform MVP, and has worked with SQL Server since the late 1990s. When not consulting, he can be seen acting on the stage and screen, or doing voices for independent video games. Randolph is available for talks on SQL Server, and technology in general. He also offers training for junior DBAs. Connect with Randolph on Google+ or Twitter.