How SQL Server stores data types: DATETIMEOFFSET

If you’d like to check out the previous instalment in this series on storing dates and times, click here.

I avoided mentioning this data type because I didn’t think a lot of people used it, and then my co-author William Assaf (blog | Twitter) told me on Twitter that he uses it, so here we are.

Cast your mind back to our discussion on DATETIME2. As you know, DATETIME2 is basically the same as squishing DATE (3 bytes) and TIME (between 3 and 5 bytes depending on the scale) into the same column. You end up with a persisted value that is between 6 and 8 bytes wide.

DATETIMEOFFSET is kinda sorta the same thing, but with more bytes on the end. If you take a look at the Microsoft Docs page, the similar idea of a varied column size is retained. For a scale of 0 fractions of a second you only need 8 bytes to store your value, while the default scale of 7 decimal places for storing seconds requires the full 10 bytes.

Here’s something really interesting on that page though, which is really important (emphasis added):

“The data is stored in the database and processed, compared, sorted, and indexed in the server as in UTC. The time zone offset will be preserved in the database for retrieval.”

I’m writing this on Saturday 12 June, and I live in Calgary, so let’s use the following example: 2021-06-11 18:00:01.0000000 -06:00.

If we inspect the binary of this data type stored on disk as DATETIMEOFFSET(7), it looks like this: 0x8096980000a0420b98fe

However, the same value — without the explicit time zone, stored as DATETIME2(7), looks like this: 0x80e667e0969f420b

To quote one of the great poets of our time, what’s up with that?

If you convert that original time to a UTC value and store that as a DATETIME2(7) column, you’ll confirm that this binary value converts very comfortably to 2021-06-12 00:00:01 in UTC: 0x8096980000a0420b

So, the documentation bears this out, and the last two bytes of the value (which again I remind you is stored as UTC) are used to figure out the time zone offset in minutes.

Except this isn’t a simple case of “slap on two extra bytes with some meaning relating to the time zone offset.” Nope. Because each value is stored in UTC, even adjusting the time zone by one minute means a completely different UTC value is stored, which can make it a real exercise in patience if you’re inspecting raw binary.

With a different example, let’s use 2021-06-12 00:00:01 +00:00¹, which is 1 second after midnight on 12 June 2021, in the UTC time zone.

Stored as a DATETIMEOFFSET(7), this is no different to the equivalent value in DATETIME2(7), because it is already in UTC: 0x8096980000a0420b0000. It is the same as in the previous example, with the two extra bytes at the end set to zero.

But let’s increment the time zone by 1 minute. Now we have 2021-06-12 00:00:01 +00:01, which is still 1 second after midnight on 12 June 2021, but this time the time zone is 1 minute ahead of UTC.

Looking at the persisted DATETIMEOFFSET(7) value, we see 0x80103f07c99f420b0100. If you’re playing along at home and wondering why those two bytes at the end seem backward, remember that data is processed by x86 CPUs as Little Endian, or byte-reversed. Those bytes are stored as 0x0100, so when you reverse them they are 0x0001 as expected.

In any case, let’s drop those two bytes and see what the UTC value is actually being stored: 2021-06-11 23:59:01.0000000

That makes sense. If you add one minute (the 0x0001 suffix) to the UTC value, you get the original value of 2021-06-12 00:00:01 +00:01, in that time zone. Keep this in mind when you’re spelunking in the hex editor of your choice.

It also means you can’t just slap any old suffix on the end and expect it to work. The preceding date and time value must be valid when taking the time zone into account. The documentation even states:

“[…] both UTC and local (to the persistent or converted time zone offset) datetime value will be validated during insert, update, arithmetic, convert, or assign operations. The detection of any invalid UTC or local (to the persistent or converted time zone offset) datetime value will raise an invalid value error.”

And finally, here’s an interesting note on the offset bytes for negative time zones (i.e., those time zones to the west of the Greenwich meridian). SQL Server recognizes time zones up to 14 hours either side of UTC, but they are stored in minutes, so the bytes only need to keep a value of up to 840 (or -840). However you can’t store a negative value in binary unless you use two’s complement or an extra byte, so (for example) in my time zone -06:00 (or -360 minutes from UTC) the amount is subtracted from 65,536 — 0x10000 in hex — which is where 0xFE98 comes from in the original example. Converting 0xFE98 to decimal we get 65,176. Add 360 to that, and we have our 65,536 maximum value.

The DATETIMEOFFSET data type might be useful to you, especially in combination with the AT TIME ZONE clause, but consider that SQL Server is storing your data in UTC which you should be doing yourself anyway. Is it worth the extra two bytes per column per row to specify the time zone for that value? Maybe. For the kind of systems I design, I normally store the offset in a single configuration table column and save myself the storage, but this might be exactly the thing you’re looking for.

Share your DATETIMEOFFSET tips in the comments below.

Photo by Djim Loic on Unsplash.