If you’d like to check out the previous instalment in this series on storing dates and times, click here.
Cast your mind back to our discussion on
DATETIME2. As you know,
DATETIME2 is basically the same as squishing
DATE (3 bytes) and
TIME (between 3 and 5 bytes depending on the scale) into the same column. You end up with a persisted value that is between 6 and 8 bytes wide.
DATETIMEOFFSET is kinda sorta the same thing, but with more bytes on the end. If you take a look at the Microsoft Docs page, the similar idea of a varied column size is retained. For a scale of 0 fractions of a second you only need 8 bytes to store your value, while the default scale of 7 decimal places for storing seconds requires the full 10 bytes.
Here’s something really interesting on that page though, which is really important (emphasis added):
“The data is stored in the database and processed, compared, sorted, and indexed in the server as in UTC. The time zone offset will be preserved in the database for retrieval.”
I’m writing this on Saturday 12 June, and I live in Calgary, so let’s use the following example:
2021-06-11 18:00:01.0000000 -06:00.
If we inspect the binary of this data type stored on disk as
DATETIMEOFFSET(7), it looks like this:
However, the same value — without the explicit time zone, stored as
DATETIME2(7), looks like this:
To quote one of the great poets of our time, what’s up with that?
If you convert that original time to a UTC value and store that as a
DATETIME2(7) column, you’ll confirm that this binary value converts very comfortably to
2021-06-12 00:00:01 in UTC:
So, the documentation bears this out, and the last two bytes of the value (which again I remind you is stored as UTC) are used to figure out the time zone offset in minutes.
Except this isn’t a simple case of “slap on two extra bytes with some meaning relating to the time zone offset.” Nope. Because each value is stored in UTC, even adjusting the time zone by one minute means a completely different UTC value is stored, which can make it a real exercise in patience if you’re inspecting raw binary.
With a different example, let’s use
2021-06-12 00:00:01 +00:001Note that although there isn’t technically an offset, the specification requires a + or – symbol in front of the offset, which is 1 second after midnight on 12 June 2021, in the UTC time zone.
Stored as a
DATETIMEOFFSET(7), this is no different to the equivalent value in
DATETIME2(7), because it is already in UTC:
0x8096980000a0420b0000. It is the same as in the previous example, with the two extra bytes at the end set to zero.
But let’s increment the time zone by 1 minute. Now we have
2021-06-12 00:00:01 +00:01, which is still 1 second after midnight on 12 June 2021, but this time the time zone is 1 minute ahead of UTC.
Looking at the persisted
DATETIMEOFFSET(7) value, we see
0x80103f07c99f420b0100. If you’re playing along at home and wondering why those two bytes at the end seem backward, remember that data is processed by x86 CPUs as Little Endian, or byte-reversed. Those bytes are stored as
0x0100, so when you reverse them they are
0x0001 as expected.
In any case, let’s drop those two bytes and see what the UTC value is actually being stored:
That makes sense. If you add one minute (the
0x0001 suffix) to the UTC value, you get the original value of
2021-06-12 00:00:01 +00:01, in that time zone. Keep this in mind when you’re spelunking in the hex editor of your choice.
It also means you can’t just slap any old suffix on the end and expect it to work. The preceding date and time value must be valid when taking the time zone into account. The documentation even states:
“[…] both UTC and local (to the persistent or converted time zone offset) datetime value will be validated during insert, update, arithmetic, convert, or assign operations. The detection of any invalid UTC or local (to the persistent or converted time zone offset) datetime value will raise an invalid value error.”
And finally, here’s an interesting note on the offset bytes for negative time zones (i.e., those time zones to the west of the Greenwich meridian). SQL Server recognizes time zones up to 14 hours either side of UTC, but they are stored in minutes, so the bytes only need to keep a value of up to 840 (or -840). However you can’t store a negative value in binary unless you use two’s complement or an extra byte, so (for example) in my time zone
-06:00 (or -360 minutes from UTC) the amount is subtracted from 65,536 —
0x10000 in hex — which is where
0xFE98 comes from in the original example. Converting
0xFE98 to decimal we get 65,176. Add 360 to that, and we have our 65,536 maximum value.
DATETIMEOFFSET data type might be useful to you, especially in combination with the
AT TIME ZONE clause, but consider that SQL Server is storing your data in UTC which you should be doing yourself anyway. Is it worth the extra two bytes per column per row to specify the time zone for that value? Maybe. For the kind of systems I design, I normally store the offset in a single configuration table column and save myself the storage, but this might be exactly the thing you’re looking for.
DATETIMEOFFSET tips in the comments below.
- 1Note that although there isn’t technically an offset, the specification requires a + or – symbol in front of the offset