The ongoing saga of a slow tempdb on a fast server

Last week, I wrote about what should have been a fast system but was performing very poorly, weighing heavily on tempdb I/O performance.

Our original changes did not make much of a difference, but now we could rule out errant file growth. Our next step was to investigate how tempdb was being used by the application. There was one obvious culprit.

The warehouse frequently* runs a query to find out how much stock* they have on hand. This is a query that has to be almost instantaneous. In our best case scenario, it runs in seven to ten seconds, but it should be sub-second.

Did you notice the asterisks? Those are red flags: a big query, running frequently, and spilling to disk (which means spilling to tempdb).

Our first rule of performance tuning is to keep everything in memory. Disk is two to three orders of magnitude slower than memory. We need to make this query a thousand times faster.

The query itself is based on a view. Nothing wrong with that in principle, assuming you keep your table joins to a minimum (a rule of thumb is fewer than ten). I found 25 tables, two of which were derived tables.

After rewriting the two derived tables as common table expressions (CTEs) at the top of the query, it was much easier to see what was happening.

Asking For Too Much

The first derived table was a partition query using ROW_NUMBER to find records with the latest date by product type, and then fetching the first row. It makes use of a sort operator, which is a blocking operation. In many cases, sorts will spill to tempdb.

The second derived query was the one doing actual stock counts. It used a GROUP BY, which also requires a sort operation, and therefore potentially spilling to tempdb.

Twenty three other tables were joining in some way or another (mostly LEFT JOINs, some INNER JOINs) to the derived tables. Each of these joins was represented in the query plan as a hash join, which also spills to tempdb.

(Hash joins aren’t bad in and of themselves. They are an efficient way to join large rowsets, but they often force the data to spill to tempdb.)

Every time this query runs, the second derived table always reads 8 million rows, or 360MB worth of data, before joining to the first derived table.

Predicate Pushdown

Because both derived tables are only looking for a single row of data, based on the product being counted (the predicate), what we want is for the the predicate to be pushed down into both derived tables, right at the start of the query, so that any joins done on the tables are performed on a single row of data.

This is called predicate pushdown. In our case, the query optimizer could not find the most efficient query plan, and was timing out, so the second derived table never saw the predicate until the end.

Limiting the derived tables to one row also changes the joins in the plan from hash joins to nested loop joins, avoiding tempdb altogether, because nested loops are far more efficient for small rowsets.

The Fix

There is an extremely simple way to get the predicate pushed down to the derived tables, and that is to physically put it there. Unfortunately we cannot specify a value for the product in the view definition.

A stored procedure with a single @ProductID parameter would allow us to add WHERE ProductID = @ProductID to both derived tables, which would make the query really fast. In my testing, it ran in under 100ms.

The problem is that this would require numerous code changes to the existing system. Many of the queries that rely on the view also use additional JOINs and WHERE conditions, meaning that a stored procedure is impractical.

Enter the table-valued user-defined function, or TVF.

I copied the exact SELECT statement from the view definition, making two small changes: I added WHERE ProductID = @ProductID to both derived tables.

Then, using the output of the view as the basis of my table definition, I populated the top of the table-valued function with that.

So now, whenever we query the TVF, it builds a table definition in memory, runs the original query with its 20-odd joins, but limits the data to a single row, which is then returned in the same way if querying the view.

Outcome

Original query with a single ProductID: 46,636 logical reads, 10 seconds.
New query with a single ProductID: 35 logical reads, 109 milliseconds.
One thousand times faster: check.