Recently Brent Ozar posted a link to the PostgreSQL “Don’t do this” page, which I am shamelessly reproducing below, re-tailored for a SQL Server audience.
Don’t use -P
with sqlcmd
sqlcmd
is a cross-platform interactive command-line utility that allows you to connect to a SQL Server (or Azure SQL Database) instance and perform database operations. One of the parameters is -P
which allows you to supply a password.
Why not?
If sqlcmd
requires a password at the command-line, it will ask for one. Generally speaking we should be using Active Directory authentication (which is supported on Windows and Linux), so specifying a command-line password is redundant in that case. As well, if we have a password hard-coded into a script, we’re doing it wrong.
When should we?
Never. It’s not worth it to save a network round-trip to the server.
Don’t use NOT IN
It’s a common practice to use the IN
condition in a WHERE
clause, so that we can limit our search arguments to specific values. However using NOT IN
can have some unexpected results for the same reasons that we see in PostgreSQL.
Why not?
The first reason has to do with NULL
values. Let’s use the same example as the PostgreSQL page:
SELECT * FROM foo WHERE col NOT IN (1, NULL);
This type of query will always return zero rows. It is impossible to know if something is NOT NULL
because NULL
is always unknown, thus NOT (NULL)
returns NULL
. The WHERE
condition will never resolve to a valid value in the above example. Since col
cannot be evaluated, the query optimizer simplifies that condition to a Constant Scan against an empty internal table.
The second reason why this is a bad idea is because it makes the query optimizer pick a really bad plan. Consider this example:
SELECT * FROM foo WHERE col1 NOT IN (SELECT col1 FROM bar);
If one of the sub-query values is a NULL
, this query will return zero rows as noted above. More importantly and perhaps surprisingly, the execution plan will use a Hash Match (Right Anti Semi Join). On smaller sets this won’t be noticeable, but it will matter a lot when you have more than a few thousand rows and your performance tanks, as a Hash Match is a very expensive join operator.
When should we?
If we use NOT IN
, we should ensure our list of elements doesn’t contain a NULL
value, and we should also avoid using a sub-query in the WHERE
clause to avoid that blocking Hash Match.
Don’t use TIMESTAMP
We covered this in detail in a previous post, What about TIMESTAMP
? It’s better to pretend that this data type doesn’t exist.
Why not?
It is not what you think it is. TIMESTAMP
is actually a row version value based on the amount of time since SQL Server was started. If you need to record an actual date and time, use DATETIME2
instead.
When should we?
Never.
Don’t use SMALLMONEY
, FLOAT
or REAL
for money
If we need to store values that represent amounts of money (or any values that require accuracy), avoid SMALLMONEY
, FLOAT
and REAL
data types.
Why not?
In SQL Server and Azure SQL Database, the SMALLMONEY
data type has a maximum range of -214,748.3648 to 214,748.3647. It’s clearly inappropriate for larger amounts. It may be tempting to use the MONEY
data type instead, but be wary of rounding when performing calculations, since MONEY
only stores four decimal places.
Speaking of rounding during calculations, FLOAT
and REAL
are terrible choices for calculations on any value that require accurate results. Floating point arithmetic will result in lower accuracy (sometimes significant), and on financial values this can cause incorrect results.
When should we?
Any time we have to deal with money, use the DECIMAL
data type with an appropriate scale. In a pinch we could use the MONEY
data type, but certainly never SMALLMONEY
, FLOAT
or REAL
for money.
Don’t use BETWEEN
This is an easy one to get wrong, especially with dates.
Why not?
BETWEEN
uses a closed-interval comparison: the values of both ends of the specified range are included in the result. That might sound like what we’d want, but consider the following example where we want to get all the values in the next week:
SELECT * FROM table WHERE datecol BETWEEN '2018-08-01' AND '2018-08-08';
This will include results up to where the date and time is exactly 2018-08-08 00:00:00.0000000
, but not dates and times later in that same day. We may not actually want any events that might have occurred at exactly midnight on the second value.
Instead, do:
SELECT * FROM table WHERE datecol >= '2018-08-01' AND datecol < '2018-08-08';
With this construction, we are getting all the values up to but not including midnight on the morning of August 8th.
When should we?
BETWEEN
is safe for discrete quantities like integers or dates, as long as you remember that both ends of the range are included in the result. It is better to use combinations of >
, =
and <
operators instead.
Summary
SQL Server and PostgreSQL have a lot in common, especially when it comes to what not to do. I’d love to hear your own rules for either platform, with a focus on SQL Server of course. Sound out in the comments below.
Don’t use SELECT *…
😀
This wins for the most obvious thing you shouldn’t do in SQL Server! Thanks for the comment.
I may be a basic bitch as well, but how am I supposed to avoid TIMESTAMP in PostgreSQL?
… Did you read their suggestion?
I can’t find anything that says not to use TIMESTAMP/TIMESTAMPTZ. From what I can tell this is the equivalent of DATETIME2 in SQL Server and they chose the name to follow the ANSI standard.
I don’t think TIMESTAMP was intended for that use. We use it as a simple way to filter ETL processes. If the source and destination timestamps do not match then the source row changed. (Destination timestamp in this case is the one copied from, the source when the row was added to/modified on the destination.) Even if the destination row was updated by hand this test forces the row to come back to matching the true source. No date math or having to consider time (like your BETWEEN comments).
TIMESTAMP is the wrong name for the data type, per Microsoft’s own guidance. Call it ROWVERSION. Otherwise your comment is perfectly valid.
Comments are closed.