Nested views are bad. Let’s get that out of the way.
What is a nested view anyway? Imagine that you have a
SELECT statement you tend to use all over the place (a very common practice when checking user permissions). There are five base tables in the join, but it’s fast enough.
Instead of copying and pasting the code wherever you need it, or using a stored procedure (because for some reason you’re allergic to stored procedures), you decide to simplify your code and re-use that
SELECT in the form of a database view. Now whenever you need to run that complicated query, you can instead query the view directly.
What harm could this do?
Ostensibly, none. Your code is cleaner; it’s simpler to debug and maintain.
Then your database grows. The number of users and developers increases, and they start using your magical view everywhere (after all, that’s why you built it!). Unfortunately, they begin nesting your view in their own database views. Instead of selecting directly from your view, the code path has to go through another view first, which selects from other tables, then joins to your view.
Performance begins to suffer, but the new developers can’t understand why. SQL Server must be the problem, so they ask the powers that be for more memory and CPU for the database server. Performance gets better.
Six months or a year later, new developers start using the nested views in their own views, all in the interest of abstracting away complicated code, simplifying code, keeping things easy to maintain.
Performance tanks again. Except this time, the newly requisitioned hardware has a flaw, and the database takes another 5% to 30% performance hit. Users complain about screens timing out. Developers complain that it’s hard to troubleshoot code because the generated query plans are so massive, almost impossible to read in SentryOne’s excellent (and free!) Plan Explorer tool.
You are called in to “fix the database”. You decide to look at one of these query plans and find that there are four or more layers of views, all of which depend on your original view at the bottom, which itself joins against five base tables.
Once you follow the trail all the way down, you realise that over 50 tables are being joined together by the query optimizer. Unsurprisingly, unwinding all the joins through each layer, you find that many tables are being repeated, but the query optimizer times out before excluding those obvious repeated tables, in the interest of bringing data back as quickly as possible. The optimizer can’t spend all day finding the perfect query, so it has to make a decision in microseconds.
Fellow Data Platform MVP Andy Mallon wrote about nested views and table-valued functions, with the following diagram to demonstrate why they can be bad:
Can someone help me tune this query?
Just kidding… I already fixed it. Blog post is forthcoming… pic.twitter.com/KV0w3x1ICV
— Andy Mallon 🌻 #BlackLivesMatter✊🏽✊🏾✊🏿 (@AMtwo) August 18, 2017
The bottom line is that code reuse is often treading a very fine line with performance. If you want your database to perform well, you may have to compromise on some of your coding policies. SQL Server can only do so much before it gets bogged down by really bad query plans, which are caused by really bad queries, which can in turn be caused by nested views.
Take care out there. If you’d like to share your nested views story with me, I’m on Twitter at @bornsql.
Photo by freestocks.org on Unsplash.
The nested use of views within views is definitely a concern, but so too is standardization and developer and systems analyst time, which is very costly. When appropriately used, views can be great tools for standardizing various manipulations of data. If case statement logic and other logic is encapsulated in a view (or inline TVF), then it needn’t be repeated in hundreds of stored procedures; if it ever needs to change, it can be changed in one place. Of course, this can also be a problem if as time goes by the logic continues to be appropriate for many or most use-cases, but is inappropriate in some use-cases.
So it really is a question of trade-offs. The usage of nested views may be partially addressed by requiring a naming convention for any view (e.g,, prefixing V_ or VW_ in the name), so that developers are aware when they are using views. Corporate policies could also be leveraged.
PS: The same concern about nested views would also apply to nested inline table-valued functions in SQL Server.
This is too radical to be true. One of my (analytically intensive) database is around ~40k lines of code. Yes, it has nested views over nested views over nested views and so forth.
And yes, it’s performance is decent and stable over years on a very consice server.
Had I flattened them, like the author suggests, i’d probably end up with millions of lines of code – good luck maintaining this
Let’s talk about programmable languages. You create one class with one responsibility, implement its interface, and extend it as needed. You don’t nest getters and setters, neither create a unique class with 10 responsibilities, 50 interfaced public methods, 1K LOC, and them create other class multi-inheriting it with 3 another classes.
SQL works differently. As said on the article, if you create a view that makes some joins, it should e meant to be end-used. If you join that view in a query, maybe you won’t need all its joined tables. Instead of using the view you should open it and look at what it does, and try to make the same job on your own query, in a way it’s optimized.