In my previous posts in this series we’ve seen reference to Transact-SQL (T-SQL) functions that are used to get the specific part of a date and/or time (year, month, day, hour, minute, second, etc.). This week we’ll go through one of them and see how it works.
Introducing DATEPART
, a built-in function that takes two parameters: the date part we want back, and the input date. Per the documentation, the input date parameter must be in a format “that can be resolved to a time, date, smalldatetime, datetime, datetime2, or datetimeoffset value.” In a future post we’ll see just how vague this definition is, however for this post just know that almost anything goes if it looks remotely like a date or time.
DATEPART
can return a lot of parts, and the result is returned as an INT
.
What’s in a number?
Conveniently, our clock and Gregorian calendar system mostly revolves around fairly small integers. There are 60 seconds in a minute, 60 minutes in an hour, 24 hours in a day, up to 31 days in a month, 12 months (and up to 366 days) in a year, and so on. These numbers are whole numbers and can be expressed easily in words and writing.
When it comes to weekdays, it gets more complicated. We have weekdays which (depending on the country and language) might start on a Sunday or a Monday. This can have a knock-on effect on weeks of the year, not to mention the ISO week (see below).
The following table taken from the documentation shows an abbreviation we can use in place of the datepart parameter. I recommend getting out of the habit of using the abbreviation, as there is very little additional effort expended in typing the full parameter name, and avoiding the abbreviations help with writing clearly understandable code.
datepart | Abbreviations |
year | yy, yyyy |
quarter | qq, q |
month | mm, m |
dayofyear | dy, y |
day | dd, d |
week | wk, ww |
weekday | dw |
hour | hh |
minute | mi, n |
second | ss, s |
millisecond | ms |
microsecond | mcs |
nanosecond | ns |
TZoffset | tz |
ISO_WEEK | isowk, isoww |
The documentation also says:
To avoid ambiguity, use four-digit years.
Follow this advice. It is good advice. We can be lazy, but we can’t be negligent.
What’s four bytes between friends?
As we learned some time ago, an INT
(integer) in T-SQL uses four bytes, and has a maximum value greater than zero of over 2 billion (there are more than 4 billion values in an integer if we take the negative values into account).
Why then are date and time parts expressed as an INT
, which have a lot of overhead for values like 24, 31, and 60?
There are two reasons:
- Integers make things simpler. When writing arithmetic in T-SQL and other programming languages, the default data type is usually an integer. Having to memorise which data types are returned from built-in functions becomes unnecessary when we know it will be an
INT
. Yes, it uses extra memory, but four bytes is a reasonable trade-off against complexity. - One of the return values is nanoseconds. Although
DATETIME2(7)
only has a granularity down to 100 nanoseconds,DATEPART
allows us to return this value, which requires a data type large enough to contain an integer up to 1 billion (nanoseconds can range from 0 to 999,999,900 in increments of 100).
What’s the catch?
There are some considerations when dealing with DATEPART
that are affected by server configuration.
- Language: Because the input date can be any value that could be interpreted as a date, the language environment matters. We should take care that our input parameter matches the default language of the instance, otherwise we have to set the
LANGUAGE
setting for the session to ensure compatibility. - Date format: If the input date is a string literal, it is also affected by the
DATEFORMAT
setting. Additionally the documentation tells us that “SET DATEFORMAT does not change the return value when the date is a column expression of a date or time data type”. - Date first: As noted previously, the first day of the week can affect the result provided by the week or weekday output. The default first day of the week is set using the
DATEFIRST
setting.
ISO_WEEK considerations
In 2011, one-off alarms on iPhones across the world stopped working for the first two days of the year. This was a very curious bug that happened to be related to the International Standards Organization (ISO) standard 8601, which deals with date and time.
In the ISO week standard a week starts on a Monday, and when matched to the Gregorian calendar, the first week of the year begins with the week that contains the first Thursday. In January 2011 the new year started on a Saturday, thus the first week of the year according to the ISO standard was not the same as the first week in the calendar. January 1st and 2nd were in a strange limbo, so phone alarms didn’t ring.
This kind of uncommon scenario (what we call an “edge case”) might cause wide-reaching bugs in our own T-SQL code if we are working with the ISO 8601 standard, especially across regions.
The documentation states:
European countries / regions typically use this style of numbering. Non-European countries / regions typically do not use it. […] The numbering systems of different countries/regions might not comply with the ISO standard.
Hopefully this gives you a better understanding of the DATEPART
function in T-SQL. Share your ISO bugs with me on Twitter at @bornsql.