Microsoft presents: MySQL and PostgreSQL?

For the longest time, MySQL has been the recommended database platform for blogs and other websites. It’s free. It’s available on every version of Linux, which is the most common web server platform. If you need something more complex but still free, there’s PostgreSQL.

But there’s a lot going on beyond that word “free”. Because although the MySQL or PostgreSQL licence might be free, there are hidden costs to running a website these days.

Apart from the security aspect, which alone should make us want to turn off our computers and walk away, we need to keep backups, and if something goes wrong, we need to be able to restore those backups. We need to test those backups. We need to keep the server patched. We need to keep the server running at all times.

If we run a hosting company, and have paying customers, we might be able to roll that administrative overhead into our hosting fee, but it’s hard to remain competitive and still make money in this game. So we stack our customers, and things slow down because it’s only cost effective to have hundreds of customers on a single server. It increases risk because so many more customers might be affected by even planned maintenance. If your customers are global, how do you plan downtime that suits all of them?

This Azure thing starts to look appealing. Perhaps an Azure SQL Database on the Basic tier for 5 DTUs at $6 a month (Canadian) would be able to bring the power of SQL Server to blogging engines without breaking the bank. Who can’t afford $6 a month in this day and age to host a website backed by the best and most secure RDBMS around?

I’d get acceptable DTU performance for the low traffic most hosted websites get and all the benefits, including SSL encryption, fully managed, point-in-time backups, plus the very latest features. No need to maintain a separate server, or set up maintenance jobs to keep the site and its contents safe. No need to worry about testing my backups through phpMyAdmin, which, let’s face it, isn’t as great as SQL Server Management Studio.

But we already know MySQL. It has been around for ages. Why switch to a Microsoft technology?

Microsoft seems to have asked themselves that same question, probably because they understand we’ve all built websites on MySQL. One of my favourite customers uses a LAMP stack (Linux, Apache, MySQL, and PHP) for their site, and they are happy. Sure, they could move to SQL Server or Azure SQL Database, but then I’d have to rewrite a whole bunch of things. There’s no such thing as a drop-in replacement for any RDBMS, no matter what the marketing folks tell you. Well, unless you’re migrating from SQL Server to Azure SQL Database. And even then, there are challenges.

So now Microsoft has this thing where you can host your MySQL (or PostgreSQL) database on the exact same management framework as Azure SQL Database. It’s identical. The only difference is the database engine.

What does that mean? How about acceptable transactional throughput for the low traffic most hosted websites get, and all the benefits, including SSL encryption, fully managed, point-in-time backups, plus the very latest features? (Sound familiar?)

Instead of hosting a Linux-based, self-managed and maintained server with lots of moving parts, you can decouple your website and your database, and run the database on the same RDBMS you know and love (MySQL or PostgreSQL), with all the care and feeding taken care of for you, starting at $20 a month and including 50 GB of space.

It’s good to have options. Fully patched. Fully maintained. Fully backed up. Fully secured.

This is what you can get, today, on Azure. The choice is yours:

  • Azure SQL Database, compatible with SQL Server;
  • Azure Cosmos DB, a globally distributed database with a 99.99% uptime SLA and tunable consistency levels;
  • Azure Database for MySQL; or
  • Azure Database for PostgreSQL.

If your platform of choice isn’t available as a Platform-as-a-Service offering, you can still spin up a virtual machine from a template that includes Oracle or SAP HANA.

The future is looking pretty rosy in Azure.

Talk to me on Twitter at @bornsql about your favourite database platform.

This is how I recovered from a catastrophic failure

I was fresh off the boat* from South Africa, working in a small computer store in an equally small town in Saskatchewan. Five days a week in winter, six days a week in summer, mainly removing malware from laptops, selling new computers (which is why I’m happy to recommend Asus and MSI to this day).

One of the side effects of having an MCSE in my back pocket back then was that I was occasionally pulled in to help the boss, Bill, with bigger jobs.

A notice went out to the entire city from the local electricity provider that they would be performing emergency maintenance on one section of the grid, and that all sensitive electronic devices should be shut down and unplugged from the wall, just to be safe.

A large client of ours, who had only recently signed with us, heeded this warning. They had unplugged every single desktop and laptop in the building. They were conscientious to a fault. The fault here being that they forgot about a Windows Small Business Server in a custom closet that had been specially built.

This server did not have any backups. I know this because the backup hardware was on backorder for them.

We discovered this after a panicked phone call to our shop in the morning after the power notice had gone out. No one could connect to their email (don’t you love how it’s always email?).

Bill and I got in his truck and off we drove, speculating that they had forgotten about the server. We arrived, and I beelined to the custom closet. The client’s most senior board member was there, as was the head of HR, who also happened to be head of IT. Small town Saskatchewan works that way.

After taking photographs of the server in situ, we unplugged and removed it from the closet and I removed the side cover.

Many of you reading this know the smell I’m about to describe. Ozone has a very distinctive odour.

I used a flashlight to look into the power supply unit, because that’s where the smell was coming from. It wasn’t pretty, but a burnt out PSU isn’t meant to be pretty.

I showed Bill what I saw, bearing in mind two high level folks from the client were watching our every move. I made sure to document it all while I went, mainly with photographs.

The inevitable question came from the big boss: “So how bad is it?”

Being seven years ago now, I don’t remember my exact wording, but I looked him in the eye and I said, “There are two words I never like to say together in the same sentence, when it comes to computers. Those words are ‘catastrophic failure’. We will have to take this back to the shop and pull it apart to figure out what happened, but I want you to understand that there is a significant chance that I will be unable to recover any data.”

In that moment I understood a little bit of how emergency room doctors feel when they give bad news. The client knew they would go out of business because of this data loss.

We reattached the side of the server, stuck it in Bill’s truck and went back to the office.

I cleared some room on my desk and set to work, documenting the removal of hardware. There were four drives in the server. Based on the smell alone, I believed that all four drives were irrecoverable, but I had to prove it. Insurance would cover hardware replacement for the client, but we don’t make assumptions about the lifeblood, which is their data.

I want to stop here and talk about dust. Saskatchewan is dusty because it’s flat and there’s a lot of wind and open space. How windy? I have had metal garden furniture flying off my deck.

The server interior was so dusty that the canned air I used on it wasn’t enough. I had to use a static-free cloth to wipe down the components I removed.

The first order of business was to clone the hard drives so that I could work off the clones to attempt data recovery. There was no way of knowing if they’d spin up, and no way of knowing how they’d been configured. The server was not even starting up.

Once the drives were cloned, I replaced the power supply unit and plugged in the cloned drives to the motherboard, to see if I could at least establish the RAID configuration.

There was a warning that the RAID mirror was broken, which initially I thought was because I’d pulled the original drives, but the date didn’t match. The error was showing two months prior, long before we had been asked to look after this new client.

After acknowledging the error, I looked at the RAID configuration. Two separate RAID 1 mirrors, for a C: and D: drive. Well, that was some good news at least. If the drives were striped, my recovery options were close to zero. At least mirrored drives might have some opportunity to recover.

By now I had the cloned drives plugged in and configured in the same way as the originals, just updating the devices in the BIOS.

The server didn’t boot. No surprise there. I tried a few more combinations and nothing worked.

Then, I decided to try one of each drive, with no mirrors in place, and see what that did. After the third combination, the server started up.

Windows Small Business Server 2007. It said Safe Mode on it, but it was a running operating system.

And a login screen.

I logged in, expecting the worst. That’s exactly what I got. Files were missing or corrupt. SharePoint wasn’t working. Exchange Server wasn’t working. (Safe Mode, remember?) It was a mess.

Then I did what anyone would do in this situation (well, I’d do it again if I was in this situation). I turned it off, removed the cloned drives, and stuck the original drives back in, just with the new combination I’d discovered with the cloned drives.

It booted again. The drives spun. The Windows Safe Mode screen was still there, but it booted. I logged in, and stuff was there. It was there, but it was two months old.

Yup, I had managed to somehow get the degraded side of the RAID mirror working. Given the risks, I began a disk2vhd across the network, cloning the drive that way.

Once I had the data in a VHD, I did a full hardware and software check of the drives. The degraded mirror drives, the ones that failed two months prior, were fine. We had, effectively, a backup from two months prior.

There is a lot more to say about rebuilding an entire network because the Active Directory portion of the data was irrecoverable. And about how I extracted PSTs from the corrupt Exchange Server database. But that’s for a much later post. I still have flashbacks.

For the record, the new server and two weeks of my time to rebuild everything for the new client was not cheap. Amusingly, the backup hardware we had on backorder arrived during that first week.

I’m not a hero. I was lucky. The client treated me like a hero, but I was lucky. There was no good reason for this to have worked.

In my postmortem, I figured that the power supply unit had blown up, and taken the “good” side of each RAID mirror with it. I believe that the degraded drives were just lucky to not get nuked by the power spike (maybe the other drives took the brunt of it). There was still some obvious corruption though, which is probably what caused the mirror to degrade in the first place. Active Directory was gone, some email was gone (not to mention two months’ worth of new data).

Test your backups. Please.

* It was a plane, and I had already been in town for about three months.

Locks, Blocks, and Isolation Levels

Last week we looked at ACID compliance. This week we dive a little deeper into the Isolation portion, and what it all means.

Isolation levels work to manage how my transactions may be isolated from your transactions, to allow concurrency. They work by locking sections of the database until a transaction is committed or rolled back, which results in blocking other access to those sections of the database.

In other words, locking and blocking is expected behaviour in an RDBMS.

There are four ANSI isolation levels in SQL Server (as quoted from SQL Server Books Online), plus a snapshot-based option that we will get to shortly:

  • Read uncommitted (the lowest level where transactions are isolated only enough to ensure that physically corrupt data is not read)
  • Read committed (Database Engine default level)
  • Repeatable read
  • Serializable (the highest level, where transactions are completely isolated from one another)
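The isolation level is set per session. Here’s a minimal sketch of the syntax (the table name is made up for illustration):

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;

BEGIN TRANSACTION;
    -- Rows read here stay locked until the transaction ends,
    -- so reading them again returns the same results
    SELECT OrderTotal FROM dbo.SalesOrder WHERE SalesOrderID = 42;
COMMIT TRANSACTION;

The stricter the level, the longer locks are held, and the more blocking we should expect.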

The default on SQL Server (Read committed) means that readers block writers, and writers block readers. If I’m performing an UPDATE, DELETE or INSERT on a table while you’re trying to run a SELECT, and we happen to be trying to access the same set of 8 KB data pages that make up that table, whichever operation acquires the locks first takes precedence, and the database engine will lock all the portions of the table that are being affected. Any other operation that tries to access those portions of the table will be blocked until the transaction is complete. Locks are taken thousands of times a second on a busy system, and most of the time we barely notice.

Deadlocks occur when two operations manage to lock each other out. SQL Server resolves a deadlock by terminating one of the transactions, choosing as the victim the one that is cheapest to roll back. This means that a long-running SELECT could easily take precedence over a short-running UPDATE. There’s no way of knowing in advance which transaction will be the deadlock victim, which is why our code should always have a retry mechanism built in. Always keep in mind that concurrency, by design, must assume locking, blocking, and deadlocks.
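Since the deadlock victim receives error 1205, a retry mechanism might look something like this sketch (the table and amounts are hypothetical):

DECLARE @retries INT = 3;

WHILE @retries > 0
BEGIN
    BEGIN TRY
        BEGIN TRANSACTION;
        UPDATE dbo.Account SET Balance = Balance - 100 WHERE AccountID = 1;
        UPDATE dbo.Account SET Balance = Balance + 100 WHERE AccountID = 2;
        COMMIT TRANSACTION;
        SET @retries = 0; -- success, so stop retrying
    END TRY
    BEGIN CATCH
        IF XACT_STATE() <> 0 ROLLBACK TRANSACTION;
        IF ERROR_NUMBER() = 1205 AND @retries > 1
            SET @retries -= 1; -- deadlock victim: try again
        ELSE
            THROW; -- any other error, or out of retries
    END CATCH;
END;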

Whither NOLOCK?

We may have used the table hint WITH (NOLOCK) in the past because we noticed that queries seemed to run a lot faster. That’s because the NOLOCK hint reads at the lowest possible isolation level, Read Uncommitted.

Referring back to the Books Online link above (or even the Wikipedia article), Read Uncommitted, or NOLOCK, permits the following bad behaviour:

  • dirty reads – a row that is uncommitted (it has not been committed yet, or may still be rolled back) can end up in your query results.
  • non-repeatable reads – the same row, read more than once in the same transaction, can return different results because another transaction modified it in between.
  • phantom reads – rows that are inserted or deleted by another transaction can appear in, or vanish from, the results of a repeated query within the same transaction.
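For the record, the table hint and the session-level setting are equivalent, and both expose a query to all three behaviours (the table name is illustrative):

-- These two queries read at the same (lowest) isolation level
SELECT COUNT(*) FROM dbo.Orders WITH (NOLOCK);

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT COUNT(*) FROM dbo.Orders;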

Consider the reliability of these uncommitted reads. Consider the negative impact of making business or financial decisions based on these results.

Don’t use NOLOCK on data that is changing all the time.


There’s a fairly painless solution, though. We use NOLOCK in the first place because the default Read Committed isolation level causes readers and writers to block each other. A database-level setting, called Read Committed Snapshot Isolation, will use tempdb to keep track of older versions of committed data.

When a data modification (INSERT, UPDATE, DELETE) takes place, the database engine stores the last committed version of each affected row in an internal table on tempdb. When a SELECT comes along, the engine redirects it to see the last known committed version instead of having to wait for the data modification to complete.
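Enabling the setting is a single command, sketched here against a hypothetical database. The change needs exclusive access to the database, hence the ROLLBACK IMMEDIATE clause to clear out open transactions:

ALTER DATABASE [WideWorldImporters]
SET READ_COMMITTED_SNAPSHOT ON
WITH ROLLBACK IMMEDIATE;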

Nothing is free, though. There is a slight performance impact, because tempdb needs additional space and disk throughput to hold the row versions. However, in every case where I have implemented RCSI, the benefits have far outweighed the costs.

You can read more about this feature in a post by Paul White.

Find me on Twitter at @bornsql if you have any dirty read stories to share.


Acids and Databases: A look at ACID and transactions

Relational database management systems (RDBMS) such as SQL Server, Oracle, MySQL, and PostgreSQL use transactions to allow concurrent users to select, insert, update, and delete data without affecting everyone else.

An RDBMS is considered ACID-compliant if it can guarantee data integrity during transactions under the following conditions:

  • Atomic – transactions are all-or-nothing. If something goes wrong, the entire transaction should go back to a previously known state.

  • Consistent – the data has to follow the internal rules of the database structure, including (but not limited to) data types, foreign key constraints, and default constraints. When we see a failure due to this type of error, the database must revert to a known state again.

  • Isolated – when I run an operation, it should not impact how you run an operation. My operations are isolated from yours. That does not mean that you and I cannot update the same column in the same table. If your transaction starts before mine, your change will go through first, and then mine will go through. Isolation does not prevent more than one change on the same data, nor is it designed to.

  • Durable – once the transaction has been committed, that data is now persisted, and any failure that occurs after the commit will not affect the durability of the operation.
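To make the atomicity point concrete, here’s a minimal sketch of an all-or-nothing transaction (the table is hypothetical):

BEGIN TRY
    BEGIN TRANSACTION;
    UPDATE dbo.Account SET Balance = Balance - 500 WHERE AccountID = 1;
    UPDATE dbo.Account SET Balance = Balance + 500 WHERE AccountID = 2;
    COMMIT TRANSACTION; -- durable from this point on
END TRY
BEGIN CATCH
    IF XACT_STATE() <> 0
        ROLLBACK TRANSACTION; -- back to the previously known state
    THROW;
END CATCH;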

Transaction Log

This is why the transaction log on SQL Server is so important. It doesn’t just keep track of successful changes in the database. Every single modification in the database must be recorded in the transaction log as it occurs, and each modification has to have a start and an end marker.

If there’s a failure, the rollback will “undo” all the logs to that point from the starting marker (called an LSN, or Log Sequence Number in SQL Server). Yes, that generates even more logs as it makes changes to the database to perform the rollback, until the database is in a state that is equivalent to the moment just before the transaction began.

There are cases where operations from the original query may cause changes to the physical database and file structure, which may not be undone exactly, but the rollback will ensure that the database is at least in a consistent state and that any data modifications performed during a failed transaction are reversed.

Crash Recovery

When SQL Server starts up, sometimes it can take a while for a database to come back online. That’s because it goes through a process called crash recovery: there’s no way for the service to know with 100% certainty that each database was shut down cleanly.

Think of the transaction log as a replay button.

The transaction log is scanned for its active portion, which contains information about transactions that had not yet been written to the data files. If a transaction has an end marker, it is rolled forward (the engine replays it and modifies the database so that it matches what’s in the transaction log). If there is no end marker, or the original transaction was rolled back, that transaction is rolled back again.

The same process happens when a database is restored from a backup because there may have been transactions in progress when the backup started, some of which may have ended by the time the backup completed.

Implicit vs. Explicit Commit

While some RDBMS like Oracle require an explicit commit (in other words, adding a COMMIT keyword to the end of an operation) to end a transaction, SQL Server runs in autocommit mode by default: each statement commits implicitly, and no COMMIT keyword is required. We have to be mindful of this when moving between platforms. On SQL Server, we need to be sure that the change we are making in a production environment is the right one, or at least test our changes beforehand.
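One way to stay safe is to wrap a risky production change in an explicit transaction, check the result, and only then commit. A sketch (the table is hypothetical):

-- Autocommit (the SQL Server default): this statement commits on its own
DELETE FROM dbo.Staging WHERE LoadDate < '20170101';

-- Safer: inspect the damage before making it permanent
BEGIN TRANSACTION;
    DELETE FROM dbo.Staging WHERE LoadDate < '20170101';
    SELECT @@ROWCOUNT AS RowsDeleted; -- does this number look right?
-- COMMIT TRANSACTION;   -- run this if it does
-- ROLLBACK TRANSACTION; -- run this if it does not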

Stay tuned next week when we discuss locking, blocking, isolation levels, and NOLOCK.

Find me on Twitter at @bornsql.

What is a good Disaster Recovery Plan?

During a Q&A session I hosted at our local Calgary SQL Server User Group last month, one of the attendees mentioned an interview question he’d had and wanted to hear my thoughts.

How would you design a disaster recovery solution for a 1 TB database, using Simple Recovery Model? Replication is not permitted, and switching to the Full Recovery Model is not permitted.

Reminder: Simple Recovery Model means no transaction log backups. Only full and differential backups are possible.

My first question to the attendee was what the Service Level Agreement (SLA) says. As we know from previous posts, a disaster recovery strategy is dictated by business requirements, not technical ones. The Recovery Point Objective (how much data loss is acceptable) and Recovery Time Objective (how much time there is to bring everything back) will guide my proposal.

He told me that the SLA was 24 hours, so I started writing on the white board while I was thinking aloud.

On average, a fast storage layer can read and write around 200 MB/s, so it would take roughly 5 seconds to write 1 GB, or between 85 and 90 minutes to restore the database back to disk, not counting the log file or crash recovery. I never assume that Instant File Initialization is enabled, plus I won’t know how big the transaction log file is going to be, and that file needs to be zeroed out.

The first rule of designing a disaster recovery plan is to expect the worst. Imagine the building has burned down and there’s no Internet access. That means no new server, no operating system, no software installation media. It’s the worst it could be.

Since 85 minutes goes into 24 hours quite comfortably, I then started asking about network speed. Imagine that the backup file has to be copied over an Ethernet connection at (best case) gigabit speeds, which has a theoretical maximum of 125 MB/s. Given that “theoretical” word in there, I always take away 20% and I’m left with 100 MB/s. So assuming the backups are stored on a network device that can deliver our files at gigabit speeds, we’re looking at almost three hours to write the database files, at minimum.
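The back-of-the-envelope arithmetic for those two estimates looks like this:

-- Rough restore-time estimates for a 1 TB (1,048,576 MB) database
SELECT
    (1048576.0 / 200) / 60 AS MinutesFromLocalDisk, -- ~87 minutes at 200 MB/s
    (1048576.0 / 100) / 60 AS MinutesOverGigabit;   -- ~175 minutes at 100 MB/s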

Here we get into assumptions. We have to assume that new hardware can be provided within that 24 hour window. If I don’t have a server to install SQL Server onto, then I won’t be able to meet the SLA. If I don’t have Windows Server (or Linux!) to install onto the server, I won’t meet the SLA. If I cannot install the latest updates to Windows and SQL Server, I won’t meet the SLA.

Now we get into a more challenging aspect: how to take backups so that there’s the minimum amount of downtime and the maximum use of disk space for the backups.

With the Simple Recovery Model, we have our full database backups, which take the entire database and as much of the transaction log as is required to make the database transactionally consistent when restored. In other words, if there are any in-flight transactions that are committed by the time the backup finishes, those should roll forward during restore, and any that don’t commit by then are rolled back. SQL Server handles this for us during the backup itself, and again during the crash recovery phase, but a long-running transaction that has to roll back can delay startup, and I’ve seen cases where this can take several hours.

The other thing we can do with Simple Recovery Model, is differential backups. These are backups that keep track of all the extents in the database that were changed since the last full backup. In many cases, these differential backups will be significantly smaller than the full backup, so it becomes easier to keep a smaller set of files but still have a reasonably up-to-date backup.

Remember that a differential backup is not an incremental backup: it contains every extent changed since the last full backup, not just since the previous differential. If you take a differential backup every night, each one will be bigger than the night before.

Of course I’d recommend using backup compression, which has been a standard feature since SQL Server 2008 R2 in Standard Edition, and before that in Enterprise Edition. Compressed database backups take less time to back up and restore, but at a cost of higher CPU utilisation. I think this trade-off is acceptable.
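A sketch of what that backup strategy could look like (the database name and paths are made up):

-- Weekly full backup, compressed, with checksums to catch corruption early
BACKUP DATABASE [SalesDB]
TO DISK = N'\\backupserver\sql\SalesDB_full.bak'
WITH COMPRESSION, CHECKSUM, INIT;

-- Nightly differential: every extent changed since the last full backup
BACKUP DATABASE [SalesDB]
TO DISK = N'\\backupserver\sql\SalesDB_diff.bak'
WITH DIFFERENTIAL, COMPRESSION, CHECKSUM, INIT;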

I’d also seriously look at having the backups go offsite to Azure Blob Storage or Amazon S3. If you use Amazon Glacier, keep in mind that there is a built-in delay of four hours before the first file comes back from Glacier.

So, to summarise, if the building has burned down:

  • Do we have electricity?
  • Do we have a network?
  • Do we have a Storage Area Network (SAN)?
  • Do we have a server?
  • Do we have software?
  • Do we have Internet access?
  • Do we have a local copy of the latest backups?
  • Can we get a local copy of the latest backups on site within that 24-hour window?
  • Can we restore the backups and verify them (DBCC CHECKDB) within that 24-hour window?
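The last two checklist items can be rehearsed ahead of time. A sketch of a restore-and-verify run (names and paths are made up):

-- Restore the latest full backup, leaving the database ready for more
RESTORE DATABASE [SalesDB]
FROM DISK = N'D:\Restores\SalesDB_full.bak'
WITH NORECOVERY, REPLACE;

-- Apply the latest differential and bring the database online
RESTORE DATABASE [SalesDB]
FROM DISK = N'D:\Restores\SalesDB_diff.bak'
WITH RECOVERY;

-- Verify the restored database before declaring success
DBCC CHECKDB (N'SalesDB') WITH NO_INFOMSGS, ALL_ERRORMSGS;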

Sometimes, if it’s a real disaster, the best thing might be to spin up a virtual machine in Azure, Google Compute Engine, or Amazon Web Services, and restore the backup there. If offsite backups are going to the cloud already, this might be the optimal disaster recovery plan in 2017. A new VM in Azure takes under half an hour to provision. Taking network speed into account, it might take three to six hours to restore the latest full and differential backup files, plus another two hours or so to verify the database with DBCC CHECKDB.

The attendee indicated that he would use SAN replication to bring the database back from the dead. My counter to that was that if it was a true disaster, there wouldn’t be a SAN at all.

If you have any thoughts about how you would tackle this scenario, find me on Twitter at @bornsql.

Be Wary of Date Formatting in T-SQL

Today’s public service announcement is a reminder to be wary of date formatting in SQL Server.

On a recent mailing list discussion, one person demonstrated the following code (which I’ve adapted for this post):

SET LANGUAGE 'us_english';
GO
-- GO separates batches, so each batch can redeclare the variable
DECLARE @StartDate DATETIME = '2017-07-12';
SELECT @StartDate;
GO
DECLARE @StartDate DATE = '2017-07-12';
SELECT @StartDate;
GO
DECLARE @StartDate SMALLDATETIME = '2017-07-12';
SELECT @StartDate;
GO
DECLARE @StartDate DATETIME2 = '2017-07-12';
SELECT @StartDate;
GO

As you might expect, every single SELECT returned a date of 12 July 2017.

Now this is where it gets interesting. Change the language from US English to British English:

SET LANGUAGE 'british';
GO
DECLARE @StartDate DATETIME = '2017-07-12';
SELECT @StartDate;
GO
DECLARE @StartDate DATE = '2017-07-12';
SELECT @StartDate;
GO
DECLARE @StartDate SMALLDATETIME = '2017-07-12';
SELECT @StartDate;
GO
DECLARE @StartDate DATETIME2 = '2017-07-12';
SELECT @StartDate;
GO

Look carefully. DATE and DATETIME2 are showing the date of 12 July 2017 as expected. Unfortunately, the DATETIME and SMALLDATETIME data types are showing a date of 7 December 2017.

That’s not good at all. It means that a date in YYYY-MM-DD format is not interpreted as an ISO 8601 date by the older data types. The reason is simple, if annoying: we need to add a time to the date for SQL Server to treat it as ISO 8601.

Microsoft’s own documentation is very clear about this (emphasis added):

The ISO 8601 date with time format is as follows:

  • YYYY-MM-DDThh:mm:ss[.nnnnnnn][{+|-}hh:mm]
  • YYYY-MM-DDThh:mm:ss[.nnnnnnn]Z (UTC, Coordinated Universal Time)

To use the ISO 8601 format, you must specify each element in the format. This includes the T, the colons (:), the + or – , and the periods (.). The brackets indicate that the fractional seconds or time zone offset components are optional.

In other words, if we want to use a hyphen separator in our date strings, we need to include a time as well.

DECLARE @StartDate DATETIME = '2017-07-12T00:00:00';
SELECT @StartDate;
GO
DECLARE @StartDate DATE = '2017-07-12T00:00:00';
SELECT @StartDate;
GO
DECLARE @StartDate SMALLDATETIME = '2017-07-12T00:00:00';
SELECT @StartDate;
GO
DECLARE @StartDate DATETIME2 = '2017-07-12T00:00:00';
SELECT @StartDate;
GO

We could also use a different date format. Instead of separating the year, month and day with a hyphen, use YYYYMMDD format instead. This is considered unambiguous, if a little trickier to read.

The following code works irrespective of the LANGUAGE setting.

DECLARE @StartDate DATETIME = '20170712';
SELECT @StartDate;
GO
DECLARE @StartDate DATE = '20170712';
SELECT @StartDate;
GO
DECLARE @StartDate SMALLDATETIME = '20170712';
SELECT @StartDate;
GO
DECLARE @StartDate DATETIME2 = '20170712';
SELECT @StartDate;
GO

Moral of the story: don’t make assumptions.

If you want to explode your brain on dates and times in SQL Server, check out Tibor Karaszi’s ultimate guide to the datetime datatypes.

To share more language compatibility weirdness, find me on Twitter at @bornsql.

Balanced Power Saving T-SQL script

We can easily spend tens of thousands of dollars on core licences for SQL Server, and then we go and install the product on an operating system with the default Balanced Power Plan, which is, well, idiotic.

No! Balanced Mode Is Wrong!
No! Don’t do this! Click “High performance” immediately! Exclamation mark!

Imagine buying one of the fastest road cars money can buy, a Bugatti Veyron, which used to cost US$2,700,000 for the Super Sport edition (think of it as the Enterprise Edition of cars).

To make this car hit the top speed of over 265 miles per hour, there is a special mode that must be entered when the car is at rest, by toggling a key to the left of the driver’s seat.

Windows has the same setting. It’s in the Power Options under Control Panel, and for all servers, no matter what, it should be set to High Performance.

Here’s a free T-SQL script I wrote that will check the power settings for you. We don’t always have desktop access to a server when we are checking diagnostics, but it’s good to know if performance problems can be addressed by a really simple fix, one that doesn’t require the vehicle to be at rest.

(The script also respects your settings, so if you had xp_cmdshell disabled, it’ll turn it off again when it’s done.)

DECLARE @isCmdShellEnabled BIT;
DECLARE @isShowAdvanced BIT;

SELECT @isCmdShellEnabled = CAST(value AS BIT)
FROM sys.configurations
WHERE name = 'xp_cmdshell';

SELECT @isShowAdvanced = CAST(value AS BIT)
FROM sys.configurations
WHERE name = 'show advanced options';

IF (@isShowAdvanced = 0)
BEGIN
    EXEC sp_configure 'show advanced options', 1;
    RECONFIGURE;
END;

IF (@isCmdShellEnabled = 0)
BEGIN
    EXEC sp_configure 'xp_cmdshell', 1;
    RECONFIGURE;
END;

-- Run xp_cmdshell to get the power settings
EXEC xp_cmdshell 'powercfg /list';

-- Turn off 'xp_cmdshell' if it was previously disabled
IF (@isCmdShellEnabled = 0)
BEGIN
    EXEC sp_configure 'xp_cmdshell', 0;
    RECONFIGURE;
END;

-- Turn off 'show advanced options' if it was previously disabled
IF (@isShowAdvanced = 0)
BEGIN
    EXEC sp_configure 'show advanced options', 0;
    RECONFIGURE;
END;

On a server that is set correctly, this is the output. Notice that the High performance option is active.

All is right with the Power settings

If a different power setting is active, talk to a server admin (maybe that’s you) to change it to high performance.
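If xp_cmdshell is available (the script above enables it temporarily), the fix itself is one command. SCHEME_MIN is the built-in alias for the High performance plan (minimum power saving):

-- Switch the active power plan to High performance
EXEC xp_cmdshell 'powercfg /setactive SCHEME_MIN';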

I’ve also put this script with my Max Server Memory script on GitHub.

If you have any other stories to share about throwing money away, find me on Twitter at @bornsql.

Where To From Fundamentals?

The Database Fundamentals series is now done. We started with understanding what a database is, and then spent a little time understanding how databases store text, known as collation.

The next step was understanding data types in general, and how SQL Server stores them. After that was understanding normalization, with a focus on only having one version of the truth. There was a lot to process, so we went into a little more detail about normalization and foreign key relationships.

Halfway through the series, someone told me they didn’t actually understand what a byte was, so I went into some detail about bits and bytes.

To wrap up the series, we looked at querying a database. The basic methods to query are SELECT, INSERT, UPDATE and DELETE.

Where to next?

With our very basic understanding of how databases work at a theoretical level, we might want to jump right into the official Microsoft documentation, SQL Server Books Online. Unfortunately, it would take us so long to read it that the next three versions of the product would have been released by the time we finished.

But let’s set our sights on a smaller step up into this wonderful world of databases and check out the ongoing SQL101 series of blog posts. You can jump into each of the posts independently.

I also strongly recommend following the same authors’ Accidental DBA series, a list of 30 posts covering the most likely issues facing a new or accidental database administrator.

Other Websites
Social Media

There is a strong presence of SQL Server experts and community members on Twitter. Post a question with the #sqlhelp hashtag, and it will usually be answered fairly quickly.

Due to Twitter’s length limit, you must be concise with your question, while still including as much relevant information as you can.


SQLSaturday was created by PASS, the organisation with which most SQL Server community members are associated (including me). PASS used to stand for the Professional Association for SQL Server, but it does so much more now (SQL Server, Azure, Business Intelligence, and so on). The PASS Summit is a three-day conference held every year in October / November, in Seattle, Washington.

Before then, though, if you are able to travel to Norfolk in Virginia, the Compañero Conference is happening on 4–5 October 2017, specifically designed for Accidental, Lone or Junior DBAs. It’s a gentle introduction to what’s out there. I will of course be presenting a session there.

Contact Me

Finally, you can always chat to me directly. I spend a lot of time on Twitter, under the name @BornSQL, and my direct messages are open for any questions you might have about SQL Server or being a data professional.

Image credit: Aaron Burden

Why You Need A Test Environment

In 2017, there’s no excuse not to have at least a testing environment, and preferably a development environment as well, in order to create and deploy code changes (software, database, you name it) outside of a production environment. If you’re making changes in a production environment without testing them first, you’re doing it wrong.

Some companies I’ve worked with have different forms of testing environments, including QA (Quality Assurance), IAT (Internal Acceptance Testing), and UAT (User Acceptance Testing). What they are called doesn’t matter, so long as they exist.

In a typical IT deployment, whether using Waterfall, Agile, or another development methodology of the month, it pays to have a basic development → test → production deployment path.


The development environment can be a general playground: building things, writing code, installing weird and wonderful third-party plugins, getting the job done as far as producing new things is concerned.

In the development environment (which might be a shared machine, or a bunch of sticker-encrusted laptops used by developers), there should be something called Source Control.

Entire books have been written about source control, but the premise is this: the development environment is a cesspit of trouble, so any code that is written and compiles (without breaking the build) should be checked into source control as soon as possible.

This serves three purposes:

1) Other developers can access the new code without having to plug in a USB drive and risk malware infection on Steve’s laptop (sorry, Steve).

2) If Steve’s laptop is stolen, or he spills coffee on the keyboard, or it gets infected by malware because he doesn’t keep the operating system up to date, the source code is checked in safely somewhere else, and the team doesn’t lose what could amount to days of productivity.

3) Code in source control enables something called Continuous Integration (CI), which is a fancy way of saying that as soon as code is checked in, an automated process can compile it, deploy it somewhere, and run a battery of tests against it without human intervention. CI is much faster than manual testing.

It’s imperative to keep in mind that all software contains bugs. We may not have a Steve in our company, but developers always make assumptions—they have to; it’s part of the job—so little things can escalate into big things if the software is not used as intended.


The test environment should be set up exactly like production, down to the hardware and operating system configuration. The only differences should be in application configuration, such as pointing to a test instead of a production database server.

Why is it so important to make sure that test and production look and behave the same?

In the case of SQL Server for example, a simple thing like data distribution in a table, along with the amount of RAM or CPU cores, will cause a specific query plan to be selected. If the same query is deployed to production with a different underlying hardware configuration or data distribution, a worse query plan might be selected by the query optimizer, and the developers are going to have to work nights and weekends to fix it.
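One quick way to compare the hardware as SQL Server sees it (a sketch; both DMV columns exist in SQL Server 2012 and later) is to run the same query on test and on production:

```sql
-- Compare the CPU and memory SQL Server sees on test vs. production.
-- Differences here can lead the query optimizer to pick different plans.
SELECT cpu_count,
       physical_memory_kb / 1024 AS physical_memory_mb
FROM sys.dm_os_sys_info;
```

If the two environments report different numbers here, any query plan comparison between them is already on shaky ground.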

Yeah, that'd be great

Whose fault is it, though? In this particular case, it’s the fault of whoever set up the test environment, because it doesn’t mirror production.

When I mentioned the different types of test environments above, this is what they are for:

  • Internal Acceptance Testing – the developers and possibly the product owners have tested the code and it works like the functional specification says it should.
  • User Acceptance Testing – the target audience (users) get their hands on the code and do the things that you never expected them to, and often will go off script or off spec. I worked with a guy called Gordon who I knew would break something. Make your code Gordon-proof.
  • Quality Assurance – the code is tested by a dedicated QA team against the functional specification, as well as a set of company guidelines to ensure the software meets certain standards, without making the assumptions that developers inevitably make. Think about Microsoft products: different teams write Word and Excel, but they have to work in similar ways, with a high level of confidence that data won’t go missing or get corrupted.

Usually, there’s a bit of a cycle in the process here, where if something fails in test, it has to be fixed in development and redeployed, until it passes muster.


Once code has been signed off by the end users and QA, it can be deployed into production. This can be done in many ways, and unfortunately one of those ways is that Steve has a USB drive containing files that he transferred from his coffee-and-malware-infested laptop.

Deployment to production should be the exact same process as deployment to test. Why? Because if it is not, then assumptions are being made and things can go wrong.

Remember source control? The best deployments are automated deployments. They are scripted, which makes them repeatable and therefore predictable. An automated process should be able to grab code from source control, compile it, and deploy it, the same way every time.


Most companies will have a maintenance window during which deployments can take place. These are usually outside of working hours and on a weekend, but in my experience, weekends are a terrible time to deploy new code, so I prefer it to happen on a Wednesday night. That way, Steve is most likely in the office on Thursday morning to fix his bad code.

Source control should be the single version of the truth. If a developer is deploying from a thumb drive, stop doing that right now. There are free and cheap solutions available that can automate deployments for you, requiring only a small amount of effort to achieve repeatable success.


Source Control: Get it now. If you want to use an online provider (which I recommend), check out GitHub (mostly free), Visual Studio Team Services (free for up to five developers), or BitBucket (mostly free). There are probably hundreds more.

Continuous Integration: Check it out. Visual Studio Team Services has it built in, even with the free version. Remember, the whole point of this is automated builds and repeatable results.

If you would like to discuss this with me further, you can find me on Twitter at @bornsql.

How to write a DELETE query

My First DELETE Statement

Here are the links to the previous posts in this series:

  1. My First SELECT Statement
  2. My First INSERT Statement
  3. My First UPDATE Statement

This week is a much shorter post, where we will learn how to remove data from a table using a DELETE statement.

We will also refresh our memories about the importance of the WHERE clause.

DELETE with a WHERE clause

As with the INSERT statement, a DELETE statement affects the entire row in a table.

To delete information, we identify which row(s) to get rid of using the DELETE keyword, the table name, and a WHERE condition:

DELETE FROM [dbo].[Stores]
WHERE [StoreID] = 2;

In this example, we use the Primary Key StoreID as our row identifier, because it is guaranteed to be unique.

Pro tip: Write the WHERE clause before you write the rest of the DELETE statement. It is incredibly easy to accidentally delete every single row in a table.

If that happens, and it’s a production environment, we will have to open up the Run Book to figure out how to restore from a backup.

Remember, the purpose of a WHERE clause is to restrict the data modification using a filter or limit.
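One way to make that pro tip harder to get wrong (a sketch, and not the only approach) is to run the DELETE inside an explicit transaction, so an accidental mass delete can still be rolled back before anyone reaches for the Run Book:

```sql
-- Run the DELETE in a transaction so a mistake can be undone.
BEGIN TRANSACTION;

DELETE FROM [dbo].[Stores]
WHERE [StoreID] = 2;

-- Check the "rows affected" count here. If it looks wrong:
--     ROLLBACK TRANSACTION;
-- If it looks right:
COMMIT TRANSACTION;
```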

DELETE without a WHERE clause

This is an exercise for the reader, with the complete understanding that unless we have a full backup of our demo database, we should never run a DELETE on a table without a WHERE clause.

To run a DELETE without a WHERE clause, simply delete the WHERE condition from the above statement, keeping in mind that the entire table will be emptied out, with no way to go back.

The Final Results

If you’re following along in SQL Server Management Studio, we can run the DELETE statement like so:

DELETE FROM [dbo].[Stores]
WHERE [StoreID] = 2;

Once we have executed the statement (using F5 on the keyboard, or clicking the Execute button in the toolbar), we see a message in the Messages pane:

(1 row(s) affected)

Using our familiar SELECT, we will see the new result set, containing only one row, because the West Edmonton Mall row has been erased from existence:
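That verification query is just the familiar SELECT from earlier in the series:

```sql
-- Confirm the remaining contents of the table after the DELETE.
SELECT *
FROM [dbo].[Stores];
```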

Congratulations! We have deleted a row from our table, and a SELECT query proved that it was deleted.

Next time, we will find out why a test environment is so important (which was hinted at by the warnings about our WHERE clause). Stay tuned.

Look me up on Twitter if you want to ask any questions about this series, at @bornsql.