(This post was updated on 5 February 2022.)
A few months ago I suggested that the following settings should be the default for most SQL Server instances.
- Set cost threshold for parallelism to 50
- Disable lightweight pooling if it is enabled
- Disable priority boost if it is enabled
- Set optimize for ad hoc workloads to enabled (read Don’t optimize for ad hoc workloads as a best practice)
- Set max server memory (MB) to a custom value consistent with Jonathan Kehayias’s algorithm
- Set backup compression default to enabled
- Set the power saving settings on Windows to high performance if possible
- Provide an option to flush the plan cache as needed
Over the next few posts I will dive into the why. Last time we started with cost threshold for parallelism. This week is a quick look at lightweight pooling.
A “go faster” button this is not
Let’s start with the official documentation, which recommends leaving this turned off:
Fiber mode is intended for certain situations in which the context switching of the UMS workers are the critical bottleneck in performance. Because this is rare, fiber mode rarely enhances performance or scalability on the typical system. Improved context switching in Microsoft Windows Server 2003 has also reduced the need for fiber mode. We do not recommend that you use fiber mode scheduling for routine operation. This is because it can decrease performance by inhibiting the regular benefits of context switching, and because some components of SQL Server that use Thread Local Storage (TLS) or thread-owned objects, such as mutexes (a type of Win32 kernel object), cannot function correctly in fiber mode.
and:
The lightweight pooling option should only be enabled after thorough testing, after all other performance tuning opportunities are evaluated, and when context switching is a known issue in your environment.
Unless you have a degree in computer science, a lot of that is meaningless, but there are two things that need highlighting.
- It refers to Windows Server 2003, so clearly this is something that used to matter before then. In context, it means we’re talking about an issue that is almost two decades old. Modern CPU architecture and operating systems are far better equipped to handle task scheduling than they were five years ago, never mind 20 years ago.
- The key takeaway is the sentence “We do not recommend that you use fiber mode scheduling for routine operation.”
Threads, UMS and mutexes, oh my
A thread is a scheduled execution task on a CPU. On computers with more than one CPU core or socket, threads can be scheduled over more than one core, which is referred to as multi-threading.
User Mode Switching (UMS) is a method where instead of relying on the underlying operating system to schedule threads, the application does it instead using something called “fibers” (a lightweight thread, ho ho ho). Instead of one thread per SQL Server SPID (Process ID), lightweight pooling uses one thread to handle several execution contexts.
In theory this allows for improved throughput under certain conditions, so long as the application is not making a lot of system calls. On the first point, there are three conditions Microsoft considers prerequisites, which are listed further below in this post. On the second point, SQL Server makes extensive use of system calls, especially with mutexes.
The word “mutex” stands for “mutual exclusion,” or an algorithm that prevents use of shared resources concurrently. A simple example is if one process (thread) is reading from a data page in memory, you don’t want another process writing to that same data page at the same time and corrupting the data.
Are there any side-effects?
Aside from the system calls mentioned above, let’s say you want to use the .NET Framework Common Language Runtime (CLR). While we might not have any custom CLR code deployed in the database, certain functions make use of the CLR and these would no longer function. Instead we’d see an unhelpful error message along the lines of:
Msg 5846, Level 16, State 2, Line 13
Common language runtime (CLR) execution is not supported under lightweight pooling.
Disable one of two options: "clr enabled" or "lightweight pooling".
When can I enable lightweight pooling then?
Don’t. But if you must, these are the conditions under which Microsoft suggests it may be useful:
- Large multi-processor servers are in use.
- All servers are running at or near maximum capacity.
- A lot of context switching occurs (greater than 20,000 per second).
We can measure context switching with a performance counter in Performance Monitor on Windows, so the last two items on this list can be monitored. Use the Context Switches/sec counter to track context switches.
Summary
For all new SQL Server configurations, I make sure this configuration setting is disabled. It should only be enabled under very specific circumstances, otherwise it might actually make performance worse, as well as disabling other features (like CLR for example).
Next time we will look at priority boost and why it should be turned off as well. Share your thoughts in the comments below.