Data Efficiency in SQL Server: DECIMAL

This is part two of a short series of posts about how I assign efficient data types when designing a new table or database.

Use less space with BIGINT

Last week, I spoke about a more efficient DATETIME than DATETIME. This week I have a more efficient DECIMAL than DECIMAL.

If you’re planning to store a long number that will never have a fraction, and you were considering using DECIMAL, I’m going to ask you to think about BIGINT first.

Granted, this is a very specific use-case, and requires some forethought, but that’s our job: to think ahead.

Here’s the scenario where I opted in for a BIGINT over a DECIMAL: IMEI numbers.

According to Wikipedia, the International Mobile Station Equipment Identity (that long number on your mobile device that identifies it uniquely—well, mostly) is usually from 14 to 16 digits in length, and may be as short as 8 digits in length.

A DECIMAL data definition requires two values: precision and scale.

Let’s say you want to store the value 123,456.789. The number of digits in total (both to the left and right of the decimal point) is nine. Thus, the precision is 9.

The scale is everything to the right of the decimal point, which in this case is 3.

In other words, we would define the this value as DECIMAL(9,3).

A precision of 1–9 digits only requires 5 bytes of storage, which makes this far more efficient than a BIGINT, which requires the full 8 bytes of storage, plus it cannot store the fraction.

However, as the precision increases, the storage requirements increase. A precision of 10–19 digits requires 9 bytes (a 45% jump in storage required), and this is the sweet spot where a BIGINT is an effective alternative.

If you do not need to store a fraction, as in the case of an IMEI number, consider BIGINT. 8 bytes will safely store a value up to 19 digits in length (for a maximum value of 2⁶³-1).

So if you’re storing IMEI numbers for a country’s worth of mobile devices, say 250 million rows, one byte will save you 238MB in memory, queries, storage, indexing, backups, and so on.

I think that’s worth thinking about.

Hit me up on Twitter, at @bornsql, to tell me how you’ve made efficient data design choices.