Locks, Blocks, and Isolation Levels

Last week we looked at ACID compliance. This week we dive a little deeper into the Isolation portion, and what it all means.

Isolation levels manage how my transactions are isolated from your transactions, allowing concurrent access. They work by locking sections of the database until a transaction is committed or rolled back, which blocks other access to those sections in the meantime.

In other words, locking and blocking are expected behaviour in an RDBMS.

There are four isolation levels in SQL Server (as quoted from SQL Server Books Online):

  • Read uncommitted (the lowest level where transactions are isolated only enough to ensure that physically corrupt data is not read)
  • Read committed (Database Engine default level)
  • Repeatable read
  • Serializable (the highest level, where transactions are completely isolated from one another)
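
For reference, a session can switch between these levels with SET TRANSACTION ISOLATION LEVEL. A quick sketch (dbo.Orders is a hypothetical table):

    -- Applies to the current session until it is changed again.
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

    BEGIN TRANSACTION;
        -- Under Serializable, this read takes range locks, so no other
        -- session can change the rows that satisfy it until we commit.
        SELECT COUNT(*) FROM dbo.Orders WHERE CustomerID = 42;
    COMMIT TRANSACTION;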

The default on SQL Server (Read committed) means that readers block writers, and writers block readers. If I’m performing an UPDATE, DELETE or INSERT on a table while you’re trying to run a SELECT, and we happen to be accessing the same set of 8 KB data pages that make up that table, whichever transaction acquires its locks first takes precedence, and the database engine will lock all the portions of the table that are being affected. Any other operation that tries to access those portions of the table will be blocked until the transaction is complete. Locks happen thousands of times a second on a busy system, and most of the time we barely notice.
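
We can see this blocking for ourselves with a minimal two-session sketch (dbo.Accounts is a hypothetical table):

    -- Session 1: take an exclusive lock and hold it.
    BEGIN TRANSACTION;
    UPDATE dbo.Accounts SET Balance = Balance - 100 WHERE AccountID = 1;
    -- The transaction is left open, so the exclusive lock is still held.

    -- Session 2: under the default Read committed level, this SELECT
    -- blocks until Session 1 commits or rolls back.
    SELECT Balance FROM dbo.Accounts WHERE AccountID = 1;

    -- Session 1 again: committing releases the lock and unblocks Session 2.
    COMMIT TRANSACTION;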

Deadlocks occur when two operations manage to lock each other out. SQL Server resolves a deadlock by terminating whichever transaction is cheaper to roll back. This means that a long-running SELECT could easily take precedence over a short-running UPDATE. Unless we assign a DEADLOCK_PRIORITY ourselves, there’s no reliable way of knowing in advance who the deadlock victim will be, which is why our code should always have a retry mechanism built in (see the sketch below). Always keep in mind that concurrency, by design, must assume locking, blocking, and deadlocks.
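
A retry mechanism can be as simple as catching error 1205 (the error a deadlock victim receives) and trying again. A minimal sketch, using the same hypothetical dbo.Accounts table:

    DECLARE @retries int = 3;

    WHILE @retries > 0
    BEGIN
        BEGIN TRY
            BEGIN TRANSACTION;
            UPDATE dbo.Accounts SET Balance = Balance - 100 WHERE AccountID = 1;
            UPDATE dbo.Accounts SET Balance = Balance + 100 WHERE AccountID = 2;
            COMMIT TRANSACTION;
            BREAK; -- success, so stop retrying
        END TRY
        BEGIN CATCH
            IF XACT_STATE() <> 0 ROLLBACK TRANSACTION;

            -- Re-raise anything that is not a deadlock (error 1205),
            -- or a deadlock once we are out of retries.
            IF ERROR_NUMBER() <> 1205 OR @retries = 1
                THROW;

            SET @retries -= 1;
            WAITFOR DELAY '00:00:01'; -- brief pause before retrying
        END CATCH;
    END;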

Whither NOLOCK?

We may have used the query hint WITH (NOLOCK) in the past because we noticed that queries seemed to run a lot faster. That’s because the NOLOCK keyword forces the lowest possible isolation level.
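
The hint applies Read Uncommitted to the table it decorates; setting the isolation level for the whole session has the same effect. A quick sketch, with a hypothetical dbo.Orders table:

    -- These two reads happen under the same (lowest) isolation level.
    SELECT OrderID, Total
    FROM dbo.Orders WITH (NOLOCK);

    SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

    SELECT OrderID, Total
    FROM dbo.Orders;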

Referring back to the Books Online link above (or even the Wikipedia article), Read Uncommitted, or NOLOCK, permits the following bad behaviour:

  • dirty reads – a row that is uncommitted (it has not been committed yet, and may still be rolled back) will end up in your query (demonstrated in the sketch after this list).
  • non-repeatable reads – the same row, read more than once in the same transaction, can return different results because another transaction modified it in between; under NOLOCK, the engine can even read a row twice or skip it entirely if it is moved during the scan (by a page split, for example).
  • phantom reads – rows that another transaction is inserting or deleting can appear in, or vanish from, your results between two otherwise identical reads.
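
A dirty read is easy to reproduce with two sessions (dbo.Accounts is hypothetical again):

    -- Session 1: change a row but do not commit.
    BEGIN TRANSACTION;
    UPDATE dbo.Accounts SET Balance = 1000000 WHERE AccountID = 1;

    -- Session 2: NOLOCK happily reads the uncommitted 1,000,000.
    SELECT Balance FROM dbo.Accounts WITH (NOLOCK) WHERE AccountID = 1;

    -- Session 1: roll back. The balance Session 2 read never existed
    -- as committed data.
    ROLLBACK TRANSACTION;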

Consider the reliability of these uncommitted reads. Consider the negative impact of making business or financial decisions based on these results.

Don’t use NOLOCK on data that is changing all the time.

RCSI

There’s a fairly painless solution, though. We use NOLOCK in the first place because the default Read Committed isolation level causes readers and writers to block each other. A database-level setting, called Read Committed Snapshot Isolation (RCSI), will use tempdb to keep track of older versions of committed data.

When a data modification (INSERT, UPDATE, DELETE) takes place, the database engine stores the last committed version of each affected row in the version store, an internal structure in tempdb. When a SELECT comes along, the engine redirects it to the last known committed version of the row, instead of making it wait for the data modification to complete.
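
Enabling it is a single setting per database. A sketch, with a hypothetical database name; note that the ALTER needs exclusive access, which WITH ROLLBACK IMMEDIATE obtains by terminating any open transactions:

    -- Enable Read Committed Snapshot Isolation for one database.
    ALTER DATABASE MyDatabase
        SET READ_COMMITTED_SNAPSHOT ON
        WITH ROLLBACK IMMEDIATE;

    -- Confirm the setting.
    SELECT name, is_read_committed_snapshot_on
    FROM sys.databases
    WHERE name = N'MyDatabase';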

Nothing is for free, though. There is a slight performance impact: tempdb has to do extra work to maintain the version store, and it needs additional disk space to hold it. However, in every case where I have implemented RCSI, the benefits have far outweighed the costs.

You can read more about this feature in a post by Paul White.

Find me on Twitter at @bornsql if you have any dirty read stories to share.


Acids and Databases: A look at ACID and transactions

Relational database management systems (RDBMS) such as SQL Server, Oracle, MySQL, and PostgreSQL use transactions to allow concurrent users to select, insert, update, and delete data without affecting everyone else.

An RDBMS is considered ACID-compliant if it can guarantee data integrity during transactions under the following conditions:

ACID
  • Atomic – transactions are all-or-nothing. If something goes wrong, the entire transaction should go back to a previously known state (see the sketch after this list).

  • Consistent – the data has to follow the internal rules of the database structure, including (but not limited to) data types, foreign key constraints, and default constraints. When a failure of this type occurs, the database must revert to a known state again.

  • Isolated – when I run an operation, it should not impact how you run an operation. My operations are isolated from yours. That does not mean that you and I cannot update the same column in the same table. If your transaction starts before mine, your change will go through first, and then mine will go through. Isolation does not prevent more than one change on the same data, nor is it designed to.

  • Durable – once the transaction has been committed, that data is now persisted, and any failure that occurs after the commit will not affect the durability of the operation.
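
A minimal sketch of atomicity in T-SQL (the table is hypothetical): either both statements take effect, or neither does.

    BEGIN TRY
        BEGIN TRANSACTION;
            -- These two changes succeed together or fail together.
            UPDATE dbo.Accounts SET Balance = Balance - 100 WHERE AccountID = 1;
            UPDATE dbo.Accounts SET Balance = Balance + 100 WHERE AccountID = 2;
        COMMIT TRANSACTION;
    END TRY
    BEGIN CATCH
        -- Any failure sends the data back to its previously known state.
        IF XACT_STATE() <> 0 ROLLBACK TRANSACTION;
        THROW;
    END CATCH;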

Transaction Log

This is why the transaction log on SQL Server is so important. It doesn’t just keep track of successful changes in the database. Every single modification in the database must be recorded in the transaction log as it occurs, and each modification has to have a start and an end marker.

If there’s a failure, the rollback will “undo” the logged changes, working back to the transaction’s start marker (each record in the log is identified by an LSN, or Log Sequence Number, in SQL Server). Yes, that generates even more log records as it makes the compensating changes to the database, until the database is in a state that is equivalent to the moment just before the transaction began.

There are cases where operations from the original query may cause changes to the physical database and file structure, which may not be undone exactly, but the rollback will ensure that the database is at least in a consistent state and that any data modifications performed during a failed transaction are reversed.

Crash Recovery

When SQL Server starts up, sometimes it can take a while for a database to come back online again. That’s because it goes through a process called crash recovery: the service has no way of knowing with 100% certainty that each database was shut down cleanly.

Think of the transaction log as a replay button.

The active portion of the transaction log is scanned for transactions whose changes may not yet have been written to the data files. If a transaction has an end marker, it is rolled forward: the engine replays it, modifying the database so that it matches what’s in the transaction log. If there is no end marker, or the original transaction was rolled back, that transaction is rolled back (again).

The same process happens when a database is restored from a backup because there may have been transactions in progress when the backup started, some of which may have ended by the time the backup completed.

Implicit vs. Explicit Commit

While some RDBMS like Oracle may require explicit commits (in other words, adding a COMMIT keyword to the end of an operation) to end a transaction, SQL Server runs in autocommit mode by default: each statement is its own transaction, and it commits automatically with no COMMIT keyword required. We have to be mindful of this when moving between platforms. On SQL Server, we need to be sure that the change we are making in a production environment is the right one, or at least test our changes beforehand, because by the time the statement completes, it has already been committed.
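
A quick sketch of the difference, with a hypothetical dbo.Orders table:

    -- Autocommit (the SQL Server default): this statement is its own
    -- transaction and is committed the moment it succeeds.
    DELETE FROM dbo.Orders WHERE OrderID = 42;

    -- Explicit transaction: nothing is permanent until COMMIT, so a
    -- mistake can still be rolled back after inspecting the results.
    BEGIN TRANSACTION;
    DELETE FROM dbo.Orders WHERE OrderDate < '20100101';
    -- Check @@ROWCOUNT here, then:
    COMMIT TRANSACTION; -- or ROLLBACK TRANSACTION;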

Stay tuned next week when we discuss locking, blocking, isolation levels, and NOLOCK.

Find me on Twitter at @bornsql.