First though, some background on parallelism.
How do parallel queries work on SQL Server?
SQL Server uses a costing algorithm to decide how “expensive” a query is. The cost is measured in query bucks if you’re Brent Ozar (blog | Twitter), or farkles if you’re Kimberly Tripp (blog | Twitter), but the point is that the number doesn’t relate to physical resources or time, and that’s important to keep in mind.
On a SQL Server instance with access to more than one CPU core, it is possible for parts of a query to run in parallel using multiple cores to hopefully process the data quicker and return a result faster. These parts are then all consolidated at the end.
There are two main settings that the query optimizer considers when deciding whether a query is able to go parallel:
- Cost Threshold for Parallelism, which doesn’t have a neat abbreviation, so let’s call it the cost threshold;
- Maximum Degree of Parallelism, which is abbreviated to MAXDOP.
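Both of these are instance-wide settings, and you can view or change them with sp_configure. As a sketch (the values below are illustrative, not recommendations for your environment):

```sql
-- Both settings are "advanced options", so make them visible first
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- Raise the cost threshold from its default of 5 (example value only)
EXEC sp_configure 'cost threshold for parallelism', 50;

-- Limit the degree of parallelism (example value only)
EXEC sp_configure 'max degree of parallelism', 8;
RECONFIGURE;
```

Running `EXEC sp_configure;` on its own afterwards will show you the current and running values for both settings.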
All queries start out life as a single thread. Most of the time they’ll stay single-threaded. If the cost of a plan happens to go over the cost threshold value, the query optimizer will start costing out new query plan options which can use more than one CPU core, up to the maximum number of cores configured in the MAXDOP setting (but see the next section). If the optimizer finds a plan in that space that has a lower cost than the single-threaded plan, that’s the one it’ll use instead. All of this calculation happens in microseconds.
(If you want to get more technical, SQL Server associates a scheduler with each core so the MAXDOP value actually applies to the number of schedulers, and those schedulers run tasks.)
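You can see those schedulers for yourself. This query against the scheduler DMV lists the ones available to user queries and the CPU each is bound to:

```sql
-- Schedulers available to user queries, one per visible CPU core
SELECT scheduler_id,
       cpu_id,
       status,
       current_tasks_count
FROM sys.dm_os_schedulers
WHERE status = N'VISIBLE ONLINE';
```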
What has changed then?
While I and numerous other people have written that MAXDOP controls the maximum number of CPU cores that a particular query has access to, that isn’t entirely accurate.
In Pedro’s words:
What MAXDOP really controls [is] the number of tasks that can be used in each branch of a query plan.
For practical purposes — as he states later in that section — this doesn’t make much of a difference compared to the old way of describing it, because in the vast majority of cases it doesn’t matter, but it doesn’t tell the whole story.
Let’s say that our SQL Server has 32 cores across 2 sockets (16 cores per socket). If we restrict the MAXDOP configuration setting to 8 — according to the best practices recommendation from Microsoft — there is an additional thread (the thread that started the query) that is not included in this limit, and won’t necessarily be in the same NUMA node either. This is sometimes called the coordinator thread, but the term is misleading in my opinion because it too can be used to run part of the query; it’s not just sitting around waiting.
This post by Pedro was in response to a post by Paul White (blog | Twitter) called “How MAXDOP really works.” In that post Paul refers to the additional thread as well, and calls it the parent thread.
The bottom line is that MAXDOP relates to the number of parallel tasks inside a query that can run at one time, but there is always the first thread — the one that starts the query — that isn’t included in that number. When you see MAXDOP, know that the number of threads in play is effectively DOP + 1.
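You can watch this happen with the task DMVs. As a sketch, run something like the following from a second session while a parallel query is executing (the session_id of 55 is a placeholder — substitute the one running your query). The parent thread shows up as exec_context_id 0, and the dop column of sys.dm_exec_requests is available from SQL Server 2016 onwards:

```sql
-- Count the tasks belonging to one running parallel query.
-- exec_context_id = 0 is the parent thread; the rest are the
-- parallel tasks, so expect up to DOP + 1 rows in total.
SELECT t.session_id,
       t.exec_context_id,
       t.scheduler_id,
       r.dop               -- degree of parallelism (SQL Server 2016+)
FROM sys.dm_os_tasks AS t
JOIN sys.dm_exec_requests AS r
  ON r.session_id = t.session_id
WHERE t.session_id = 55    -- placeholder: use your query's session_id
ORDER BY t.exec_context_id;
```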
Next time you’re watching the CPU graph on a SQL Server instance with a MAXDOP set to 8 and you happen to notice that nine cores are being used for a particular query, now you know why.
Share your thoughts in the comments below.