Nested views are bad. Let’s get that out of the way.
What is a nested view anyway? Imagine that you have a
SELECT statement you tend to use all over the place (a very common practice when checking user permissions). There are five base tables in the join, but it’s fast enough.
Instead of copying and pasting the code wherever you need it, or using a stored procedure (because for some reason you’re allergic to stored procedures), you decide to simplify your code and re-use that
SELECT in the form of a database view. Now whenever you need to run that complicated query, you can instead query the view directly.
What harm could this do?
Ostensibly, none. Your code is cleaner; it’s simpler to debug and maintain.
Then your database grows. The number of users and developers increases, and they start using your magical view everywhere (after all, that’s why you built it!). Unfortunately, they begin nesting your view in their own database views. Instead of selecting directly from your view, the code path has to go through another view first, which selects from other tables, then joins to your view.
Performance begins to suffer, but the new developers can’t understand why. SQL Server must be the problem, so they ask the powers that be for more memory and CPU for the database server. Performance gets better.
Six months or a year later, new developers start using the nested views in their own views, all in the interest of abstracting away complicated code, simplifying code, keeping things easy to maintain.
Performance tanks again. Except this time, the newly requisitioned hardware has a flaw, and the database takes another 5% to 30% performance hit. Users complain about screens timing out. Developers complain that it’s hard to troubleshoot code because the generated query plans are so massive, almost impossible to read in SentryOne’s excellent (and free!) Plan Explorer tool.
You are called in to “fix the database”. You decide to look at one of these query plans and find that there are four or more layers of views, all of which depend on your original view at the bottom, which itself joins against five base tables.
Once you follow the trail all the way down, you realise that over 50 tables are being joined together by the query optimizer. Unsurprisingly, unwinding all the joins through each layer, you find that many tables are being repeated, but the query optimizer times out before excluding those obvious repeated tables, in the interest of bringing data back as quickly as possible. The optimizer can’t spend all day finding the perfect query, so it has to make a decision in microseconds.
Fellow Data Platform MVP Andy Mallon wrote about nested views and table-valued functions, with the following diagram to demonstrate why they can be bad:
Can someone help me tune this query?
Just kidding… I already fixed it. Blog post is forthcoming… pic.twitter.com/KV0w3x1ICV
— Andy Mallon 🌻 #BlackLivesMatter✊🏽✊🏾✊🏿 (@AMtwo) August 18, 2017
The bottom line is that code reuse is often treading a very fine line with performance. If you want your database to perform well, you may have to compromise on some of your coding policies. SQL Server can only do so much before it gets bogged down by really bad query plans, which are caused by really bad queries, which can in turn be caused by nested views.
Take care out there. If you’d like to share your nested views story with me, I’m on Twitter at @bornsql.