Using indexed views? What is an imprecise or non-deterministic convert?

Green rope meshwork

After last week’s post about using WITH (NOEXPAND) to query indexed views even on SQL Server Enterprise Edition, this week is a short but interesting side-road into deterministic values, and why it is important to get your data types correct. Longtime readers will know I care deeply about data types.

Here is a T-SQL query I wish to make into a view, and then index it because I want to join to the values in question:

The DateSold column in this table has the DATE data type, and there is a default value on this column, because the original designer of this table hates NULL values apparently. While I can’t blame them for trying to avoid adding a null bitmap to a table for whatever reason, it means that there’s now what we call a “magic string,” or default value that has a very specific meaning that is different from the obvious.

In other words, if we see a date in this table where the DateSold is 1900-01-30 (let’s call it a “magic date”), then the item has not been sold yet. This means regular date range queries won’t work unless we take this magic date into account.

Please don’t do this. Firstly, you’re lying to anyone who queries the table and is unfamiliar with the data. Secondly, if you don’t know what value should go there, use NULL which was invented for this kind of thing. I had a yellow sticky note on my desk when I worked at a bank that read “NULL is always unknown.” It’s a good mantra.

So now I need to exclude all items that are unsold, which means excluding anything dated 1900-01-30. As you can see in my query above, I chose 1 January 1901 as my start point, because 30 January 1900 is not round enough for my liking. I could just as easily have looked for the lowest legitimate value and picked a date before that.

When I create this view, I want to add an index on the ItemID so that I can join to it in other queries, which means I need to use the hint WITH SCHEMABINDING. What this does is enforce the structure of the underlying table I’m querying from. In other words, schema-binding prevents changes being made to that table until this view is dropped.

As we already know, we can’t use an ORDER BY clause in a view (even if there are “clever” hacks to do so, we should avoid them because they might affect our index), so we have to remove that. The view creates correctly and I can select from it, but as soon as I try to create my clustered index I get this wonderful message.

It’s referring to the WHERE clause, specifically my DateSold > '1901-01-01' condition. There’s an implicit conversion happening to make the string version of the date work with the DateSold column. To make it actually work as I intended, I need to convert the date into a deterministic value — an unambiguous date.

Using the magic of our favourite search engine and Microsoft Docs, we find ourselves on the CAST and CONVERT (Transact-SQL) page, and, if you’re like me, you’ve been here before and might even have it bookmarked.

On that page we need to cast (or convert) our date into an unambiguous value, which means eliminating any date or time style with 2-digit centuries. We’re left with a lot of options. While going through the list, we come to our old standby, ISO8601, except the conversion style is 23, so it’s probably not going to work. Sure enough, changing the WHERE clause to read WHERE DateSold > CONVERT(DATE, '1901-01-01', 23) results in the same error.

But wait! There’s an old standby that the SQL Server old guard like to use, and that’s yyyyMMdd (in other words, the year as four digits, and the month and day as two digits each, without any hyphens). I happen to remember that it’s style 112, which is easy to check against that list. If I had a 24-hour time as well, I would use 121.

As expected, this version of the query successfully allows me to create an index on the view, which in turn enables the use of WITH (NOEXPAND) as discussed in the previous post.

Leave your thoughts in the comments below.

Photo by Clint Adair on Unsplash.

4 thoughts on “Using indexed views? What is an imprecise or non-deterministic convert?

  • Huh?

    I’m working with pretty much the same problem – dates where the table uses a low default date, but also where time is stored in a separate column of type char(6) – to represent hhmmss.

    However, from your article, I don’t understand:

    * Why does the implicit conversion fail? The supplied string uses a four digit year, so your comment “eliminating any date or time style with 2-digit centuries” seems irrelevant?

    * Why does the convert fail with one code, but not the next?

    I don’t think the article explained *why* the solution is what it is … which means I can’t figure out what the issue is for my 6 number timestamps either.

    Is the issue, more specifically, that a string is being supplied and strings are presumed to be non-deterministic (even though they are, in this case, hard-coded in the view definition?) Why would that be? I suppose the string might vary if you modify some database settings after creating the view (but couldn’t the DB then prevent the setting change, requiring you to drop the view, make the change and re-create the view?)

    • Hi Chris, thanks for stopping by. The answer is “I don’t know”, but I can give you a little more insight.

      The issue isn’t whether or not you have the right length necessarily, it’s that only specific combinations of the CONVERT / CAST function with a style format, will ensure the column is not ambiguous. I used the documentation as a guideline and tried to create indexes on views using all of the conversion formats.

      The following styles failed: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 22, 23, 24, 25, 100, 106, 107, 109, and 113.

      If we look at the ones that succeeded (namely 20, 21, 101, 102, 103, 104, 105, 108, 110, 111, 112, 114, 120, 121, 126, 127, 130, and 131), there doesn’t seem to be anything particularly consistent about them compared to the failures. One might extrapolate rules, but without seeing the source code for myself, I can’t be sure why some formats are allowed and others are not.

      Sorry that I can’t be of more use. Hopefully this will satisfy your curiosity. For what it’s worth, I used 112 in my example because it’s an ISO standard that I happen to know SQL Server can read without any issues, irrespective of regional settings.

  • Thanks Randolph. For the sake of completing the discussion, I worked around this issue by breaking my various expressions down into single columns – so rather than build a view which does all the convoluted converstions (from datetime to date to string then concatenating to substrings of other strings before casting back to datetime!) I did each step separately and then tried creating the index on the view. Where it failed was in casting a datetime value to a date, then a string and concatenating it to a space character – I am still not sure why. But in my case the underlying data type I began with was datetime – just the application was not populating the time part – that was being held in another column. So instead I broke up the time string into hour, minute and second values (as strings) and used try_cast to cast to int, then used the dateadd function to progressively add the hour, minute and second parts to the existing datetime. This resulted in the final value being deterministic – so I presume something to do with type casting from datetime to date, or from date to string (of uncertain data type) may have led to the expresion being non-deterministic initially. *Mayyyybe* this description of approach might at least help someone down the line. Cheers.

    • Great! Thanks for closing the loop on this. One thing’s for sure, I’ll never be bored with the weird stuff in this product.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

%d bloggers like this: