On Twitter, Michael Dyrynda writes:

⚠️

Never ever ever ever ever ever ever ever ever ever ever ever ever ever ever ever ever ever ever ever ever ever ever ever ever ever ever ever ever ever ever ever ever ever ever ever ever ever ever ever ever ever ever ever ever ever store currency as a float in your database

I completely agree with this statement. Never store values used in financial calculations as floating point values, because a floating point is an approximate representation of a decimal value, stored as binary. In most cases it is inaccurate as soon as you store it. You can read more in this excellent — if a little dry — technical paper.

With that out of the way, we get into an interesting discussion about the **correct** data type to store currency values.

I’ve contributed to several books about SQL Server, and I wrote recently on this website about `MONEY`

and `SMALLMONEY`

. I like to think I know a little bit about picking the right data type for a certain column.

So I was surprised when a friend on that Twitter thread mentioned that he stores currency values as integers (multiplying them by 100 first). Being as literal as I am, I thought he meant the `INTEGER`

(or `INT`

) data type, which is a four-byte data type that goes up to around 2.1 billion (for values greater than zero), but Michael reminded me that “big integers exist,” so of course my friend was talking about `BIGINT`

.

I wrote in a reply that “Decimal will have to do. I’m not going to store money as a `BIGINT`

.” Nevertheless, I found it an interesting thought experiment to write about here.

Assume that your currency has a hundred possible decimal values from 0.00 to 0.99. Now assume that you can eliminate pesky rounding errors by storing currency data in cents as a `BIGINT`

. This requires you to multiply the currency amount by 100 when inserting it, and dividing by 100 when displaying it. That seems counterproductive in my mind, but bear with me as we investigate this further.

What problem is this meant to solve? Why avoid the use of the `DECIMAL`

data type, which is engineered to store decimal values with a decimal precision? Is the risk of rounding errors so great as to avoid a data type completely as the floating point? Surely `DECIMAL`

can handle this? Most of the time we use simple sums (`SUM`

), averages (`AVG`

), and regular arithmetic operations like addition, subtraction, multiplication and division, and in my experience — provided I put the parentheses in the right place — I’ve never run into a rounding error that was significant enough (i.e. outside of the tolerance of the organization calculating the amounts) to be a problem.

As I’ve pointed out before, `BIGINT`

ranges from negative 9 quintillion, to positive 9 quintillion (a number that is 19 digits wide). That’s a lot of squillions. Even storing these values as cents, you can store amounts that could handle even the ill-fated Zimbabwean dollar, which — at the time it was discontinued — was distributed in $100-trillion notes (100,000,000,000,000.00, or 17 digits wide). A `BIGINT`

needs 8 bytes of uncompressed storage for each value.

Now `DECIMAL`

is a different beastie. If you need the highest precision, a `DECIMAL`

can use up to 17 bytes for each value. Generally though, I like using `DECIMAL(19,4)`

for currency, which needs 9 bytes and can store numbers 19 digits wide, where the last four digits are after the decimal place. Coincidentally, this has the same range of values as the `MONEY`

data type. Maybe it’s not a coincidence that I chose that then, eh?

Purists will say that the `MONEY`

data type is woefully inaccurate when it comes to calculations, including multiplication and division. To those purists I say “pish!” Unless we’re talking Bitcoin, a currency amount usually has two decimal places. By using four decimal places, this means that rounding errors are less likely. In the vast majority of cases you will be doing things like summing values, calculating percentages, working out differences between two values, and so on. If you are being caught out by rounding errors, perhaps you should increase the precision of the input values **for the purposes of that calculation only**.

After all, if you’re going to the effort of storing currency values as cents in a `BIGINT`

column, you’re already doing conversions of your data. Why not just eliminate that overhead of multiplication and division on storage and retrieval respectively, store these values as `DECIMAL(19,4)`

, and increase the precision only when you need it? In fact, sometimes all you need to do is change the order of operations in your calculation.

When I design tables on SQL Server that are going to store a currency value, I generally use `DECIMAL(19,4)`

. If I encounter a customer database that stores these values using the `MONEY`

data type, I have no problem whatsoever with that, because they are basically the same thing anyway.

*(Note: I know they’re not identical, but for the sake of storing currency values, it makes no practical difference.)*

I don’t see the point of using `BIGINT`

to store currency values. It causes confusion, and will introduce bugs when multiplying or dividing by 100. This is especially moot since `MONEY`

is stored on disk in exactly the same format as `BIGINT`

anyway.

If you’re trying to save storage space, just compress your data using row or page compression. One thing we can all agree on though, is never to store currency as a floating point.

Share your thoughts in the comments below.

*Photo by Josh Appel on Unsplash.*

Here’s why people use BIGINT and why it may be necessary and should not be dismissed out-of-hand. In many cases in the accounting world we aggregate currency amounts in non-additive ways that may not make sense to the lay person. Aggregating these numbers can easily result in USD that is in the quadrillions which will not fit in dec(19,4). There are also a handful of currencies that do in fact have precision to 4 decimal points. Then there is the Korean Won which is like a billion Won to the dollar (exaggerating and too lazy to check current exchange rates). Converting those numbers also spills a dec(19,4). So the “trick” that most accounting software use is to use BIGINT when a number is stored PLUS a currency id for the column. The currency table has metadata col that holds the currency and a multiplier. An API is written that ensures that ALL DATA ACCESS goes through the API which handles the value for display purposes. So, if I want to store my salary which is $45,876.56 I would store it in a BIGINT as 4587656 with currencyid 2 which has a multiplier of 100. You can quickly see how that is beneficial. I’m not suggesting this solution should be used EVERYWHERE (and definitely not salary), but it’s helpful to not be so dismissive of these solutions which are IMPERATIVE in the acctg world, for some use cases. Understand your use case, understand good logical and physical modeling, and understand the capabilities of your rdbms.

Considering that MONEY and BIGINT are stored in *exactly* the same way by the database engine (see last week’s post), I don’t see any benefit to what you describe. I’d sooner use a wide DECIMAL with a high precision.

Regarding not using float in the database, I would also apply the same principal to programming languages too. If an application stores a value in the database where the datatype in the application is float, accuracy is already lost. It’s just not obvious like it is within the database.

Never using float is a bit of an extreme. It has a use case where an approximate value is good enough. For example, the position of a planet in the sky does not need to be exact for many astronomers. However, if you were plotting a course to orbit a planet, this inaccuracy would cause you to miss the planet.

Steal my thunder for next week’s post, why don’t you? 🙂

That is twice that I have done that to you. Great minds think alike! 🙂

No doubt. Thanks for your thoughts on this week’s post.

Just to counter this point, you could make your decimal larger. It doesn’t need to be (19,4). You could do (38,4) in which case decimal will hold a boatload more numbers than BIGINT would. For example, maximum BIGINT is:

9223372036854775807

While maximum decimal(38,4) is:

9999999999999999999999999999999999.9999

which is substantially larger than a BIGINT value. Mind you it uses more space than a BIGINT, but it also allows for more precision and less “lookups” to see what you should be dividing by. Using decimal(38,19) would give you the same number of digits before the decimal as BIGINT while offering 19 additional digits for precision.

so I think the argument for BIGINT hits a snag there, no?

The old Sybase MONEY datatypes are left over from the first early days of SQL. COBOL includes formatting in its declarations. They use what is called a PICTURE clause that gives floating currency symbols, commas, leading zeros, and other punctuation in currency amounts. Sybase was trying to keep COBOL programmers when they designed the MONEY datatypes, so they built in display features to match the PICTURE clauses. The problem was they also defined the precision of those MONEY datatypes and the rules for doing arithmetic with them. They got the rules wrong! You can Google it, but basically when you try to do multiplication and division with MONEY datatypes, you get rounding errors.. Add to this the number of decimal places is also not correctly defined according to EU standards.. You would really need to sit down with your accounting department and figure out which are required to do by law as well as by EU and GAAP accounting standards.

Thanks Joe. As I covered in my previous post, MONEY is stored as BIGINT anyway, so it makes sense it’ll have the same kind of rounding issues.

please give me a real world example, where you have to multiply two money values.

Multiplying money with integer or decimal (usually an amount) is very common, but I never had to multiply 100$ by 5$ (only 100$ by 5)

It doesn’t matter if YOU see benefits, the fact is, many accounting systems are architected this way. Understanding the WHYs is helpful to understand when/if you see it in the wild. Let me be clear, under no circumstances would I model a greenfield schema this way in 2020, but that doesn’t mean you won’t encounter it in the wild. Remember that many acctg systems were architected in a time when a customer could choose which dbms they wanted to implement it on. Not every dbms internally represents the data structures for MONEY and BIGINT the same way, in fact, some don’t support larger numerical data needed to support things like non-additive aggregations and Korean Won conversions that are required. To make my point again, it’s important to understand use cases and decisions, even poor ones, so we can work with them but ensure we don’t make the mistakes of 1971 again in 2020.

My blog post is titled “How should I store currency values in SQL Server?”. Knowing the history of accountants doing it wrong is interesting, but ultimately irrelevant. This post stems from a series I’m writing about how the storage engine persists data, and as a data professional working with SQL Server in 2020, I’m going to use DECIMAL with a high precision starting at (19,4) when it comes to currencies, unless there is a good use case otherwise. No one is arguing that point. The fact that some people use BIGINT instead of MONEY is an interesting tangent, but ultimately meaningless *to the database engine* because they’re persisted the same way on disk and in memory. That’s the point of this post (and tomorrow’s post explores in detail why FLOAT is a terrible choice for currency). I appreciate your comments, and hope you stop by in the future. You are welcome to continue discussing this with me on Twitter.

What’s your definition of “real world”? I mentioned this in the first comment…accountants can do some odd things with data. Here’s one. Large companies often have the concept of “controls” on their reports and data extracts. Something like, I know this report is accurate if I can do some crazy math against 2 summary/control accounts, and the number gets put in the footer or a control file. Often this is modulo math on multiple columns, but that’s really just multiplication, right? This was done, as Joe Celko mentioned above, b/c older acctg systems may use PIC(x) in cobol when the dbms stores both as MONEY or a column where the underlying UOM is a currency. Having a control file gives a warm fuzzy that something wasn’t rounded/trunc’d. You may say that is stupid in 2020 and computers don’t make basic errors like that and a control file/footer is unnecessary. But then why do most acct systems still actually store their data like an old school green bar accting book with DB/CR entries? Answer: because that’s what it takes to make the accountants feel good. In today’s world we do the same thing with “control files”, just using more modern technologies such as hashing or bin_checksum(). In 2020 this is a terrible implementation but unfortunately we still see and need to understand the WHY of it.