(This post was updated on 5 February 2022.)
A few months ago I suggested that the following settings should be the default for most SQL Server instances:
- Set cost threshold for parallelism to 50
- Disable lightweight pooling if it is enabled
- Disable priority boost if it is enabled
- Set optimize for ad hoc workloads to enabled
- Set max server memory (MB) to a custom value consistent with Jonathan Kehayias’s algorithm
- Set backup compression default to enabled
- Set the power saving settings on Windows to high performance if possible
- Provide an option to flush the plan cache as needed
Over the next few posts I will dive into the why. This week we look at enabling optimize for ad hoc workloads, which is more of a memory optimization feature than a performance feature.
The plan cache
Memory in SQL Server, very simply, is divided between query plans in the plan cache, and data in the buffer pool (there are other uses for memory which are outside of the scope of this post).
All query plans take up space in the plan cache. There are various ways to flush them once they are there, including but not limited to:
- SQL Server may clear out older plans using a least-used algorithm
- We might issue a manual
DBCC FREEPROCCACHEcommand to clear out one or more plans from the cache
- We might schedule a regular SQL Agent job to clear out all plans
- A configuration change may flush the entire plan cache
- A stored procedure recompilation may force an old plan to be flushed
- Statistics updates and index rebuilds may force a plan to be flushed
- Code changes such as functions or views may force a plan to be flushed
- Issuing the command
ALTER DATABASE SCOPED CONFIGURATION CLEAR PROCEDURE_CACHEon SQL Server 2016 or higher
The term “ad hoc” means “as needed,” and in the case of SQL Server it refers to a single-use query plan, meaning that a plan is generated for a specific query and never used again. Single-use plans are common in database-backed applications where ORMs are used (Object-Relational Mapping tools, such as Entity Framework, Dapper, Hibernate, NHibernate, LINQ to SQL, and many others), and in cases where T-SQL code is generated using string concatenation.
These single-use plans require memory, but once the query has been run, they continue to sit there inside the plan cache, adding up to what we call plan bloat. Since these plans are never going to be used again, we might want to remove them from the cache. The question becomes — especially on a busy instance — how do we do this efficiently?
Note: In modern SQL Server instances, we may not want to remove single-use plans from the cache while troubleshooting query performance. Check out the post Don’t optimize for ad hoc workloads as a best practice for more details.
One of the methods of resolving plan bloat is to enable forced parameterization. This setting forces T-SQL queries to become parameterized. By way of example, it means that this query, containing literal values:
SELECT col1, col2, col3
WHERE col1 = 256
becomes the following:
SELECT col1 , col2 , col3 FROM table WHERE col1 = @i
The first thing we see in this contrived example is that the white space in the original query (line breaks and spaces) are condensed so that the query is on a single line, and single spaces are placed around everything including the commas. Finally, the WHERE condition is parameterized. What this does is aggressively force the queries into a consistent format so that there is a better chance of reusing an existing plan.
The major issue with forced parameterization is a feature called parameter sniffing. If our data is skewed in any way and SQL Server comes up with a specific query plan according to how that data is distributed (let’s say it performs a nested loop join instead of a hash match), that plan will be reused even for queries where a different join type would be better. Erik Darling goes into some detail in this post.
No real-world data is evenly distributed, so forced parameterization is not recommended as a default setting, however (and this is an important note) that doesn’t mean it isn’t useful. If you find that forced parameterization is useful, please use it.
Additionally, if your database is in compatibility mode 150 or higher (available on SQL Server 2019 and Azure SQL Database), this parameter sniffing problem is mitigated to a large extent with adaptive joins. On SQL Server 2017, adaptive joins can be forced using a filtered columnstore index hack, which Kendra Little writes about.
Enter ad-hoc caching
Enabling the optimize for ad hoc workloads configuration setting will reduce the amount of memory used by all query plans the first time they are executed. Instead of storing the full plan, a stub is stored in the plan cache. Once that plan executes again, only then is the full plan stored in memory. What this means is that there is a small overhead for all plans that are run more than once, on the second execution.
The official documentation has this to say:
If the number of single-use plans take a significant portion of SQL Server Database Engine memory in an OLTP server, and these plans are Ad-hoc plans, use this server option to decrease memory usage with these objects.
Single-use plans consume much less memory with this setting enabled, and in the vast majority of database-based applications that use an ORM or string-concatenation to generate T-SQL queries (which is most real-world SQL Server instances I’ve seen), this is a set-and-forget configuration option that will make more memory available where it matters: the buffer pool.
Forced parameterization can be great, especially on SQL Server 2019, but not everyone has access to that just yet. In the meantime, you might want to consider enabling optimize for ad hoc workloads to reduce the memory footprint of single-use plans in the plan cache and keep your system running cleaner. If however you want to troubleshoot single-use queries, this is not a good configuration setting to enable as a best practice.
Share your memory optimization tricks in the comments below.