Remote Storage Sync and Restore updated

Recently, I presented my “Back up and Restore your Database with Azure Blob Storage” session to the Bellingham PASS Chapter.

There are two components to the Azure Blob Storage Sync and Restore utility: Sync and Restore.

I wrote the Sync tool to work around difficulties I’d experienced with Azure Blob Storage’s Infinite Lease, and limitations in AzCopy. Those problems have since been mostly resolved, making the Sync tool less necessary, but the Restore tool remains extremely valuable, and I use it regularly for testing backups.

I’m happy to announce that new versions of both the sync and restore components have been released today, and they have a new name.

Both include new features, mostly in the Restore tool, which I’d like to highlight below.

New Features

Point-In-Time Restore

During the Bellingham session, one of the questions asked was whether the Restore tool could perform a point-in-time restore using the RESTORE ... STOPAT keyword.

You can now perform a point-in-time restore using the Restore tool, and it will only download the files from the latest backup that it needs, up to and including that point in time.

Note: This feature only works on the latest available backup. A feature enhancement will be included in a later release to perform a point-in-time restore for any backup, if the requisite files exist.
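
For context, this is roughly the T-SQL that a point-in-time restore boils down to once the files have been downloaded. It is a sketch only: the database name, file names and stop time are made up, and your backup set may not include a differential.

USE [master];
GO
-- Restore the latest full backup, leaving the database non-operational
RESTORE DATABASE [SalesDB]
FROM DISK = N'D:\Restore\SalesDB_FULL.bak'
WITH NORECOVERY, REPLACE;
GO
-- Restore the latest differential backup, if there is one
RESTORE DATABASE [SalesDB]
FROM DISK = N'D:\Restore\SalesDB_DIFF.bak'
WITH NORECOVERY;
GO
-- Restore transaction log backups in sequence, stopping at the point in time
RESTORE LOG [SalesDB]
FROM DISK = N'D:\Restore\SalesDB_LOG_1.trn'
WITH NORECOVERY, STOPAT = N'2016-10-01 14:30:00';
GO
-- Bring the database online as at that point in time
RESTORE DATABASE [SalesDB] WITH RECOVERY;
GO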

Download Indicator

It’s handy to know how fast a file is downloading, but that isn’t easy to show in a command-line tool, and in a disaster recovery scenario it can be confusing not knowing whether the download is stuck.

AzureBlobStorageRestore will now show your download progress, using a rotating cursor similar to one I first saw on the MS-DOS version of Doctor Solomon’s Antivirus.

Big shout-out to Daniel Wolf for his ProgressBar gist (available free under the MIT License).

Sort Order

Previously, the Restore tool would download files from smallest to largest, but this could be disconcerting in a disaster recovery scenario as it might look like files are missing.

Now files can be downloaded in alphabetical order, or the default of smallest to largest.

One More Thing

There’s a lot of emphasis on off-site storage these days, and not everyone uses Azure Blob Storage.

For example, one of my customers has a 5TB file share, mapped using UNC (in the format \\server\share) and wants to copy files there.

What if you already keep tape backups, managed by a company that takes your tapes off-site in a fireproof lockbox, and all you need to do is get the files to a network share to be backed up to tape?

What if you just want to copy your database backup files to a local network share as well as Azure Blob Storage?

What if you know you should be moving your backup files to a network share but keep forgetting to do so?

More importantly, what if your backup files are already on a local network share and you have a disaster recovery situation? Downloading from Azure Blob Storage would take too long; you just want to restore the latest backup from the local network share.

Introducing File Storage Sync and Restore.

FileStorageSync

Like its older sibling, FileStorageSync synchronises the contents of a local directory with a network share using a UNC path (in the format \\server\share). You can provide a username and password as well, which is convenient for creating scheduled tasks. All other features from AzureBlobStorageSync are included.

FileStorageRestore

Like its older sibling, FileStorageRestore parses all the files in a UNC share, and downloads only the latest full, differential and transaction log backup files required to perform a restore of your database.

It also includes Point-In-Time Restore (for the latest backup) and Sort Order customisation.

A download indicator is not provided in this version because local network speeds are usually so quick that it would add unnecessary overhead.

All other features from AzureBlobStorageRestore are included.

Note: A future enhancement which restores the latest backup from the UNC network share, without copying the files first, is coming soon. You will be able to restore your database as fast as possible.


If you want to play with the new features, you can find them in the GitHub repository.

This project is released free under the MIT License (free for commercial and non-commercial use), and all you have to do is give me credit.

Find me on Twitter at @bornsql if you have any thoughts. Feature requests can be submitted on the GitHub repository itself.

Interview Questions for a SQL Server DBA

When I’m interviewing a SQL Server DBA, I have three questions that I always ask.

My favourite interview question is this:

“What is the difference between a clustered and a non-clustered index?”

My second favourite question is this:

“What is the right disaster recovery strategy for my company?”

The first question is one of basic understanding. Depending on their work experience, the candidate may have an inkling about why such a distinction matters, and this lets me know how much I have to teach. What it does not do is disqualify them. I am not averse to teaching, which is why I write on this website in the first place.

The second question should be answered with “it depends”, followed by the candidate asking me questions about my company so that they can figure out a way to begin answering it. If the candidate blurts out an answer without asking questions, well, their résumé goes in the garbage. This is the disqualification question.

I’m mostly self-taught, with formal training to fill in the gaps much later in my career, so I know the difference between “I know this thing, but not the textbook definition”, and “I read this somewhere on Google and have never used it”.

If the candidate uses words that are blatantly wrong but have the right idea (“the blob is sorted on the disk, but the non-blob one is a copy of the blob, but smaller, because you can only have one blob, and it’s shaped like a Christmas tree”), then that’s a pass for me. SQL Server doesn’t care how you describe an index to an interviewer, as long as the syntax is correct, and that’s what Books Online is for.

Heck, I might even say “if you can’t explain it with words, draw it for me on the board”. I love pictures. I’ve drawn many lopsided databases in my time. That’s what the dry-erase board is for: working through a problem that exists in the real world. It’s not there to make someone regurgitate a textbook definition. Again, that’s what Books Online is for.

Here’s a practical example from my school days, as a student and as a teacher: the notorious open-book exam.

In this scenario, the exam taker is allowed to refer freely to their textbook throughout the exam, and the invigilator is there to make sure no one is distracting their fellow students.

The open book exam relies on the exam takers having a working knowledge of their subject. In the case of English literature, for example, it helps to have read the book. Summarised study guides only get you so far — if I’m asking why the love between Catherine and Heathcliff can be described as “demonic”, providing examples from the text, you would have to know where to look.

Even so, the open book exam is unfair for those people who haven’t read the book but have seen the movie ten times and can quote their favourite lines from it. Or if they prefer the audio book, because they struggle to read words on a page, especially under stressful situations. Or their parents read the story to them many times in their youth. Or English isn’t their first language. Or their second language.

If you haven’t read the book, you will run out of time to answer the questions, because you’ll be frantically reading the textbook for the first time, trying to understand why anyone wants to know what a linked list is for (a very typical interview question for software developers). Meanwhile, you’ve used linked lists before in that job you had writing code for your cousin’s bicycle store, but you didn’t realise that was their actual name.

If I’m asking you, with only a few years of experience administering an Access database for your uncle’s grocery store, to create an index in SQL Server, without using the graphical interface of Management Studio, the first place you’re going to look is Books Online.

As my friend Gail Shaw (b | t) once said many years ago, when you are working on a SQL Server database, you should have two windows open, and one of them is Books Online. Why guess arcane syntax when you can simply look it up?

Which brings me to my third favourite question:

“What was the last mistake you made, and how did you recover from it?”

I’ll go first.

This morning I was working on a stored procedure for a new system. The code called for an upsert-style stored procedure (INSERT and UPDATE in one block of code, to check for the existence of a row of data in order to update it, or insert it if it doesn’t exist). My UPDATE statement was missing a WHERE clause.
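
To illustrate the pattern (a simplified sketch, not the actual procedure; the table, columns and procedure name are made up), the WHERE clause on the UPDATE is exactly the line I had left out:

-- Simplified upsert: update the row if it exists, otherwise insert it
CREATE PROCEDURE dbo.UpsertCustomer
    @CustomerID INT,
    @CustomerName NVARCHAR(200)
AS
BEGIN
    SET NOCOUNT ON;

    IF EXISTS (SELECT 1 FROM dbo.Customer WHERE CustomerID = @CustomerID)
    BEGIN
        UPDATE dbo.Customer
        SET CustomerName = @CustomerName
        WHERE CustomerID = @CustomerID; -- leave this out and every row is updated
    END
    ELSE
    BEGIN
        INSERT INTO dbo.Customer (CustomerID, CustomerName)
        VALUES (@CustomerID, @CustomerName);
    END
END;
GO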

Fortunately, the tools that I have at my disposal managed to catch this before the code was ever run. My point is, even with decades of experience, and being able to recite all of the keywords in an UPDATE statement, even the most battle-tested person forgets a WHERE clause once in a while and updates or deletes the entire table.

This is why I love my second favourite interview question, because I get to ask the candidate this bonus question:

“Does the disaster recovery strategy you came up with in question 2 cater for the scenario you just described?”

After all, backups matter only if you can restore them promptly and accurately.

Final Thought

This discussion has made a resurgence thanks to my friend Janie Clayton, who has posited (in my interpretation anyway) that technical interviews are designed to exclude certain people from applying for a position, for arbitrary reasons (university degree, culture, gender, language, etc.), for the sake of some puritanical ideal.

My take is obvious: If you have working knowledge that you can demonstrate, and some experience, you can learn the rest on the job. Every system is different, with different goals guiding a decision. It takes time to become accustomed to a new system, and even the smartest person in the room will have to ask questions.

The question is not “what have you memorised?”, but “what can you learn?”.

By all means, if you’re applying for a specialised role (for instance, performance tuning), then you need to have specialised domain knowledge. For everything else*, there’s Books Online.

* MasterCard did not pay for this post.

Planning matters

It’s the 2016 PASS Summit at the end of this month, and I’ll be attending for my third year.

Because I’m a glutton for punishment, I’ll also be writing bonus posts for the month of October to celebrate Summit.

Fortunately, because I’m the luckiest person in the world, I am attending the Pet Shop Boys concert in Vancouver on the Monday evening beforehand, so that is going to be exciting, too.

Getting on a plane and flying from Calgary to Seattle, via Vancouver, is such a convenience these days. You book your flights, drive to the airport, breeze through airport security, and the plane jets you, undelayed, to Vancouver, where you hop on a train to your hotel, have dinner nearby, and then walk to the concert. The following morning, after a good night’s uninterrupted sleep, you hop back on the train to the airport as scheduled, pass quickly through passport control (so as not to trouble you with that on the other side, one of the benefits of flying to the US out of Canada), once again breeze through airport security, and the plane jets you to Seattle-Tacoma airport, where you get off the plane quickly, fetch your luggage, breeze through customs, and there you are. A short train ride to the hotel later, you’re ready to face the networking track of the Summit for three days.

If only migrating a SQL Server database to Azure SQL Database were this simple.

Of course, I’m joking. There are many very important steps involved when thinking about flying to another country, even if it’s just for a few days. This is especially complex when you add in a side-trip to Vancouver for an evening. It requires careful planning, making sure enough time exists to get everything done. And you can’t forget your passport either.

So why do so many people in the Information Technology industry want to do things without careful planning?

Every time a new technology comes out and becomes trendy (and I’m guilty of this at times, too), we all jump on the bandwagon and start playing around with it. So far, so good. That’s the equivalent of kicking the car tyres.

And then management says (because management plays golf with vendor sales, or the developer lead likes that it uses tabs instead of spaces) that we have to move to this new technology to save the company lots of money. Less good, but doable.

The next logical step is to do a proof of concept. That’s where we research the requirements of the existing system, the limitations of the new technology, and build a working model to see if it is fit for purpose. If it is not fit for purpose, the amount of time we have expended is minimal, and we continue bug fixing on the older technology stack, letting management know that it is not possible, or practical, to move to this new technology.

If the proof of concept shows merit, then we go back to the project manager and business analysts, and begin planning for migration. We include all the stakeholders, including customers (internal and external), and slowly develop clear goals and timelines. There’s a project lead, and politics don’t enter into it. And unicorns are real.

Why is it that we want so badly to “play” with the new technology that we lose sight of this necessary step?

Microsoft announced Azure SQL Database in 2009. While it has been seven years since that time, and children born in that year are now at school, it’s a very short time in the technology world. SQL Server itself is only 27 years old. COBOL, which is still very much in use, is 57 years old. C is 44 years old. Large chunks of the Internet infrastructure are built on C and C++.

My point is, if we are going to move a company’s systems to newer technology, we have to consider longevity:

  1. How long will the new technology be around?
  2. How long will the system we’re migrating be around?
  3. (Bonus question) Will someone getting upset on the Internet destroy my system?

The bonus question relates to the crisis NPM faced earlier this year, when a single 11-line JavaScript package was unpublished, causing thousands of projects that depended on it to fail.

Moving to Azure SQL Database is a complicated process. While the underlying structure might look familiar, and while most of our code will probably work fine, there are major stumbling blocks that could scupper our plans entirely.

Take for example the fact that you are not allowed to address tempdb directly. You can create local temp tables and table variables just fine, but you cannot check for their existence by referring directly to tempdb.

This works:

CREATE TABLE #test (col1 INT);
DROP TABLE #test;

This does not:

IF EXISTS (SELECT * FROM tempdb..sysobjects
WHERE id = OBJECT_ID('tempdb..#test'))
DROP TABLE #test;
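
One workaround, as a hedged sketch, is to avoid referencing tempdb altogether. This assumes you can use DROP TABLE IF EXISTS, which is available in Azure SQL Database and SQL Server 2016:

-- No reference to tempdb required
DROP TABLE IF EXISTS #test;
CREATE TABLE #test (col1 INT);
DROP TABLE #test;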

This “minor” issue occurs in many more places than you might think it does, because everyone reading this knows about that one legacy system we’re too scared to touch.

If management is pressuring you to migrate to the new stuff because it’s trendy, they’re doing it wrong.

Careful planning is what you need. This includes a proof of concept, a 100% pass rate using the Data Migration Assistant, and a project manager who understands the complexities involved, plus a proper testing plan that includes automated unit tests, edge cases, and humans bashing away at keyboards and mice.

And don’t forget to plan for a disaster, especially for a disaster that occurs during the migration.

Share your migration horror stories with me on Twitter, at @bornsql.

PSA: Check Your Backups

This post comes to you from the Spanish countryside. I travel a lot these days, sometimes even for work. The best part about travelling, apart from food, is knowing that my backups are in good shape.

So my Public Service Announcement for you today is to make sure your SQL Server backups are valid.

Step 1: Back up your SQL Server databases. All of them. That’s your production, test, QA, and development environments, user and system databases. The only database you should not be backing up (and cannot) is tempdb.

Step 2: Test your databases by restoring them. Yes, use the checksum option when backing up, but as we all know, a good backup is one that you can restore.

Step 3: Check the consistency. Run DBCC CHECKDB WITH NO_INFOMSGS, ALL_ERRORMSGS, DATA_PURITY, EXTENDED_LOGICAL_CHECKS. If the database is really massive, you can break this up into DBCC CHECKALLOC, DBCC CHECKCATALOG, and then run DBCC CHECKTABLE per table.
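
As a rough sketch of steps 1 to 3 in T-SQL (the database name, file paths and logical file names are made up, and you would normally restore to a separate server or under a different name):

-- Step 1: back up with a checksum
BACKUP DATABASE [SalesDB]
TO DISK = N'D:\Backup\SalesDB_FULL.bak'
WITH CHECKSUM, COMPRESSION;
GO
-- Step 2: a good backup is one you can restore
RESTORE DATABASE [SalesDB_Verify]
FROM DISK = N'D:\Backup\SalesDB_FULL.bak'
WITH CHECKSUM,
     MOVE N'SalesDB' TO N'D:\Restore\SalesDB_Verify.mdf',
     MOVE N'SalesDB_log' TO N'D:\Restore\SalesDB_Verify_log.ldf';
GO
-- Step 3: check the consistency of the restored copy
DBCC CHECKDB ([SalesDB_Verify])
WITH NO_INFOMSGS, ALL_ERRORMSGS, DATA_PURITY, EXTENDED_LOGICAL_CHECKS;
GO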

Step 4: Take lots of photos of the Spanish countryside.

If you’d like to share your photos, or discuss how you test your backups, leave a comment, or find me on Twitter at @bornsql.

AzCopy 101 – Copying SQL Server backup files to Azure Blob Storage

SQL Server 2016, released last month, now lets us back up our database files to two locations simultaneously. This is very good news if we’re running the latest and greatest (and it really is a good upgrade).

For everyone else, the bad news is that we are targeting an on-site destination for our backups (local network, SAN, NAS, tape) and using some form of automation to push these backups offsite. One of my budget-conscious customers uses Dropbox.

Since the release of Azure Cool Blob Storage, at a third of the price of typical Blob Storage, we now have a cost-effective off-site archive that is competitive with Amazon Glacier, and remains in the Microsoft family.

We can either use my Azure Blob Storage Sync tool to push our database backups to Azure Blob Storage, or if there is a policy to use Microsoft-only tools, we can use AzCopy.

Many tutorials already exist for how to use AzCopy, including the official documentation. This post is more about gotchas that I’ve discovered while working extensively with the tool.

Keys and Containers

Azure Blob Storage stores its blobs (we can think of them as files) in a container, which is addressable by a URL. When we first set up our storage account, we need to create a container to store the files. The path to that container would look something like this:

https://myaccount.blob.core.windows.net/mycontainer

A key is then needed to access that container in that storage account. It is a long string of characters, with == at the end, which looks something like this:

Zx7uEPwA6MfJ5MXML0MYUqc8k78lYYCvq7h+lG0grumpyMG1TvEpp931SQLXWpoZWfgItEzhvWnKzy9RKGTYfA==

In the examples below, I use the word key to represent this key.

Command Line

We can copy files to and from Azure Blob Storage using the following command.

AzCopy /Source: /Dest: [Options]

If we’re copying files to Azure, the /Source switch would have an on-premises path, and the /Dest switch would have a container path. The /DestKey must then be specified.

(If we were copying from Azure, the Source and Destination would be reversed, and the SourceKey would be specified.)

For example, let’s copy files from my local drive to an Azure storage container:

AzCopy /Source:D:\SQLData\Backup /Dest:https://myaccount.blob.core.windows.net/mycontainer /DestKey:key /S /XO /Z:D:\_Ops\AzCopy

Notes:
/Source – my local hard drive
/Dest – the Azure storage container URL
/DestKey – the secure key to access the container
/S – recurses through all the subfolders in the source location
/XO – ignores all older files that already exist
/Z – specify a fixed path for the journal

The Nitty-Gritty

Don’t Run Multiple Copies

If AzCopy is already running, don’t run it again. This is strongly advised by the official AzCopy documentation.

The reason is that a single instance already makes the best use of available resources; in more practical terms, it thrashes the disk and CPU.

Running multiple copies of AzCopy simultaneously will render your machine almost unusable, and it will probably break the journal anyway.

Journal

The biggest headache working with AzCopy is that it uses a journal to keep track of your progress during a copy operation.

The journal normally lives in %LocalAppData%\Microsoft\Azure\AzCopy. This becomes a problem when automating the AzCopy process. Usually we run automated tasks under a service account, which means that we don’t monitor the task until it fails.

Unpredictable journal behaviour

Unfortunately, AzCopy does not fail if a previous copy operation was not completed. Instead, it waits at a Y or N prompt for a response that never comes, because the task is running in a service account context.

AzCopy provides the option to use /Y to suppress confirmation prompts.

There are two main reasons why the journal will await a confirmation:

  • The previous operation failed, and pressing Y will allow it to resume where it stopped;
  • The current operation is similar to a previous operation, and pressing Y will overwrite the journal and start over.

As we can see, these are vastly different results. Therefore I do not recommend using the /Y switch.

If for whatever reason AzCopy is interrupted (usually network-related issues), the journal will have to be cleared out manually.

Note that the %LocalAppData% path is user-specific. I would recommend specifying a journal path with the /Z switch, which allows us to set a default location.

Ignore Older Files

Like Xcopy and Robocopy, we can skip older files, which is useful when archiving transaction log backups every 15 minutes.

Use the /XO switch to only copy new files, and ignore the ones that already exist.

Summary

AzCopy is a great tool, but it has a lot of moving parts and can cause headaches if we don’t manage the journal correctly.

If you would like to continue the discussion, or ask me questions, look me up on Twitter at @bornsql.

The Resilience of the Log Backup Chain

Much has been written about the log backup chain in SQL Server, where we are able to restore a database to a point in time using a combination of full, differential and transaction log backups.

This requires the database to be in the full recovery model, with transaction log backups taken at regular intervals.

If the database is switched to the simple recovery model and then back to the full recovery model, the log backup chain is broken, and a full or differential backup must be taken to fix it.
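
A quick sketch shows how the chain breaks and how to repair it (the database name and paths are made up):

-- Log backups require the full recovery model
ALTER DATABASE [Chain] SET RECOVERY FULL;
BACKUP DATABASE [Chain] TO DISK = N'D:\Backup\Chain_FULL.bak';
BACKUP LOG [Chain] TO DISK = N'D:\Backup\Chain_LOG_1.trn';

-- Switching to simple and back again breaks the log backup chain
ALTER DATABASE [Chain] SET RECOVERY SIMPLE;
ALTER DATABASE [Chain] SET RECOVERY FULL;

-- This log backup should now fail, because the chain is broken
BACKUP LOG [Chain] TO DISK = N'D:\Backup\Chain_LOG_2.trn';

-- A differential (or full) backup restarts the chain, and log backups work again
BACKUP DATABASE [Chain] TO DISK = N'D:\Backup\Chain_DIFF.bak' WITH DIFFERENTIAL;
BACKUP LOG [Chain] TO DISK = N'D:\Backup\Chain_LOG_3.trn';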

I’ve been working on a long-term project to migrate over fifty databases running on SQL Server 2008 R2 to Azure virtual machines running SQL Server 2014.

We use a custom form of Log Shipping to keep the Azure environment in sync with production. The log chain is stored on Azure Blob Storage, and each log file is restored every half hour to the new environment using a custom PowerShell script.

This allows us to have a very small maintenance window for the final cutover, requiring just the tail of the log to be backed up and restored.

Last week, I wondered aloud what would happen if we had to fail back to the original system, after processing transactions on the new environment. Would a log backup taken on the Azure environment work on the older servers? Technically speaking, the log backup chain would not have been broken. Perhaps a version conflict?

Let’s look at a test case.

  1. Create a database in SQL Server 2008 R2 using the full recovery model, and perform a full backup, plus several transaction log backups.
  2. Restore the full backup and log chain to a new SQL Server 2016 instance, perform some actions here, and then perform another log backup from the new instance.
  3. Test if the newest log backup can be applied, along with the older ones, in a recovery on SQL Server 2016. I say “Yes”.
  4. Test if the newest log backup can be applied, along with the older ones, in a recovery on SQL Server 2008 R2. I say “No”.
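
For anyone who wants to try this at home, here is a hedged sketch of steps 1 and 2 (the database name and UNC paths are made up):

-- On the SQL Server 2008 R2 instance
BACKUP DATABASE [ChainTest] TO DISK = N'\\server\share\ChainTest_FULL.bak';
BACKUP LOG [ChainTest] TO DISK = N'\\server\share\ChainTest_LOG_1.trn';
BACKUP LOG [ChainTest] TO DISK = N'\\server\share\ChainTest_LOG_2.trn';

-- On the SQL Server 2016 instance: restore the chain, then take a new log backup
RESTORE DATABASE [ChainTest] FROM DISK = N'\\server\share\ChainTest_FULL.bak' WITH NORECOVERY;
RESTORE LOG [ChainTest] FROM DISK = N'\\server\share\ChainTest_LOG_1.trn' WITH NORECOVERY;
RESTORE LOG [ChainTest] FROM DISK = N'\\server\share\ChainTest_LOG_2.trn' WITH RECOVERY;

-- Perform some transactions, then back up the log from the 2016 instance
BACKUP LOG [ChainTest] TO DISK = N'\\server\share\ChainTest_LOG_3.trn';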

Tune in next week for the results. You’re always welcome to comment here or discuss it on Twitter at @bornsql.

Expiration Date on Database Backups

What does the expiration date mean on SQL Server database backups?

In SQL Server Management Studio, there’s an option in the Back Up Database wizard which explicitly provides an option for “Backup set will expire”. But until this week, I had never given it much thought.

This feature has been there for as long as I can remember. Does it mean that we can’t restore databases if the expiry date has passed? Maybe it has something to do with backing up to tape.

Let’s find out.

Test 1: Restore a database with an old expiry date

Using the EXPIREDATE option as specified in SQL Server Books Online, we will perform a simple backup using a date from last week as our parameter.

USE [master];
GO
CREATE DATABASE [BackupExpiryTest];
GO
BACKUP DATABASE [BackupExpiryTest]
TO DISK = N'D:\Temp\BackupExpiryTest.bak'
WITH EXPIREDATE = N'2016-04-29 00:00:00';
GO

Now let’s see if restoring the database today, 3 May 2016, still works. To be extra sure, we will drop the database we’ve just backed up.

USE [master];
GO
DROP DATABASE [BackupExpiryTest];
GO
RESTORE DATABASE [BackupExpiryTest]
FROM DISK = N'D:\Temp\BackupExpiryTest.bak';
GO

Processed 344 pages for database 'BackupExpiryTest', file 'BackupExpiryTest' on file 1.
 Processed 2 pages for database 'BackupExpiryTest', file 'BackupExpiryTest_log' on file 1.
 RESTORE DATABASE successfully processed 346 pages in 0.068 seconds (39.723 MB/sec).

The EXPIREDATE option has no effect on whether we can restore the database.

What is it for, then? Let’s go and read Books Online, specifically the BACKUP command.

In the section labelled Media Set Options, we see the following explanation:

{ NOINIT | INIT }
Controls whether the backup operation appends to or overwrites the existing backup sets on the backup media. The default is to append to the most recent backup set on the media (NOINIT).

Even further down, under INIT, we see the following:

By default, BACKUP checks for the following conditions and does not overwrite the backup media if either condition exists:

  • Any backup set has not yet expired. For more information, see the EXPIREDATE and RETAINDAYS options.

Now we’re getting somewhere. If you back up your database to an existing file, the default setting (NOINIT) is to append that backup to the file. If you explicitly use INIT, however, it will overwrite the backup file and delete any existing backups, unless the EXPIREDATE value is a date sometime in the future.

Test 2: Append to existing backup file

Let’s append to our existing backup and set this backup to expire on Friday this week:

USE [master];
GO
BACKUP DATABASE [BackupExpiryTest]
TO DISK = N'D:\Temp\BackupExpiryTest.bak'
WITH EXPIREDATE = N'2016-05-07 00:00:00';
GO

Processed 344 pages for database 'BackupExpiryTest', file 'BackupExpiryTest' on file 2.
 Processed 2 pages for database 'BackupExpiryTest', file 'BackupExpiryTest_log' on file 2.
 BACKUP DATABASE successfully processed 346 pages in 0.055 seconds (48.161 MB/sec).

Notice that it says “file 2” in the output. This indicates that the backup we just performed is the second one in the backup file (or media set). If we wanted to restore this particular backup, we would have to specify WITH FILE = 2 in the restore script.
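
For example, restoring that second backup would look something like this (REPLACE is needed here because the database already exists):

USE [master];
GO
RESTORE DATABASE [BackupExpiryTest]
FROM DISK = N'D:\Temp\BackupExpiryTest.bak'
WITH FILE = 2, REPLACE;
GO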

Now let’s try overwriting everything in the existing backup file using the INIT option:

USE [master];
GO
BACKUP DATABASE [BackupExpiryTest]
TO DISK = N'D:\Temp\BackupExpiryTest.bak'
WITH INIT;
GO

Msg 4030, Level 16, State 1, Line 1
The medium on device 'D:\Temp\BackupExpiryTest.bak' expires on May  7 2016 12:00:00:000AM and cannot be overwritten.
Msg 3013, Level 16, State 1, Line 1
BACKUP DATABASE is terminating abnormally.

Oops.

The big hammer

If we look further down the Books Online entry about BACKUP, we will see the following:

To override these checks, use the SKIP option.

Sure, let’s try overriding it:

USE [master];
GO
BACKUP DATABASE [BackupExpiryTest]
TO DISK = N'D:\Temp\BackupExpiryTest.bak'
WITH INIT, SKIP;
GO

Processed 344 pages for database 'BackupExpiryTest', file 'BackupExpiryTest' on file 1.
 Processed 2 pages for database 'BackupExpiryTest', file 'BackupExpiryTest_log' on file 1.
 BACKUP DATABASE successfully processed 346 pages in 0.056 seconds (48.139 MB/sec).

Summary

If you want to store multiple database backups in a single file, and want to control when these backups can be overwritten, you can set an expiration date with EXPIREDATE. The INIT option will then refuse to overwrite any backups that have not yet expired.

However, this can easily be overridden with the SKIP option. So this raises the question: if SKIP simply overrides any expiration date in the backup file, why use EXPIREDATE in the first place?

I have never used expiration dates with my backups, and I don’t think I’m going to start now. I see no benefit in reusing existing files to append database backups. I will stick to using just one database backup per file, with a strong naming convention (including the date and time of the backup).

This way, dates in the file name make it easy to see at a glance when backups were taken, and I can manage the files with more confidence, without having to scan their contents. Additionally, files older than a certain date can be cleaned out by a maintenance plan.
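
As a minimal sketch of that naming convention (the database name and path are made up), the date and time go straight into the file name:

-- Produces a name like SalesDB_20160503_143000.bak
DECLARE @FileName NVARCHAR(260) =
    N'D:\Backup\SalesDB_'
    + REPLACE(REPLACE(REPLACE(CONVERT(NVARCHAR(19), GETDATE(), 120), N'-', N''), N':', N''), N' ', N'_')
    + N'.bak';

BACKUP DATABASE [SalesDB]
TO DISK = @FileName
WITH CHECKSUM;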

On Best Practices

We hear a lot about “best practice”, but what does it mean in the context of SQL Server? Wikipedia describes best practice as follows:

A best practice is a method or technique that has been generally accepted as superior to any alternatives because it produces results that are superior to those achieved by other means or because it has become a standard way of doing things, e.g., a standard way of complying with legal or ethical requirements.

We will take a high-level look at high availability and disaster recovery, an important part of database management and administration and a good example of how a certain way of doing things has become a standard. Then we will discuss what it means to implement a best practice for disaster recovery.

High Availability

There are many ways to achieve high availability (HA) — the idea that our database has zero (or close to zero) downtime — and usually this is dictated by our budget. We want business continuity in the face of planned and unplanned downtime. This may take the form of database mirroring, Availability Groups, clustering, or log shipping.

Disaster Recovery

Disaster recovery (DR) is slightly different. It assumes that a disaster has occurred, our highly-available environment has gone down, and now the clock is ticking. Company losses are mounting by the minute.

How this affects me, the DBA

HA/DR is usually expressed in a Service Level Agreement using two terms:

  • RPO, or recovery point objective: the maximum amount of data loss allowed by the business.
  • RTO, or recovery time objective: the maximum amount of time it should take to recover from a disaster.

What we want is a reliable, predictable and repeatable disaster recovery plan, even if we have a high availability solution that has never failed before. If our Availability Group doesn’t fail over cleanly, or the drive array is on fire, or that Windows Update managed to kill the network interface, we need to roll up our sleeves and activate the DR plan.

So, according to guidance, that means having regular full backups, differential backups, and transaction log backups ready to restore at a moment’s notice, right? Yes, but that’s not where our job stops.

Putting our best foot forward

This is where best practice comes in. It’s all very well having a good disaster recovery plan, but it’s useless if we don’t test it. Best practice in this case would be having both a good backup strategy and a methodology to make sure our DR plan is tested often, so that when something goes wrong, we have a script to follow. That way we will be able to recover with a small RPO and a short RTO, so that business is impacted as little as possible.

We call this a Run Book. Anyone in our organisation should be able to follow the script and perform a successful recovery.

What if we are incapacitated in some way, or in my case recently, stuck on an aircraft with no access to the Internet? What if the only person who is able to restore the environment is the CEO after hearing that her building is on fire?

Our backups must not only be stored securely somewhere (and in my post last week, I provided a budget-friendly solution on how to do off-site backups), but we should be able to restore to a known point in time, within the recovery time objective, losing as little data as possible.

Our run book should be clear enough that the person testing the disaster recovery plan can follow it without having to ask us questions.

Earlier we saw the word “superior” repeated in the definition of best practice. A superior solution doesn’t need to be the most expensive one, using the latest and greatest tools. A run book could be an Excel file containing enough information to rebuild a server and locate magical scripts in the cloud that are tested weekly.

Know the problem we’re trying to solve. Build a solution that is practical and cost-effective for our environment. Test and iterate until it is flawless. That’s best practice.

Compressed and Encrypted Backups on the Cheap

We back up our databases regularly. We store them on a secure network location so that if the database drive fails, we can still satisfy the requirements for a rapid disaster recovery time (RTO, or Recovery Time Objective).

Some of us store these backups off-site as well, as an added security measure, either on tape or using cloud storage solutions (Azure Blob Storage, Amazon S3).

And we encrypt these backups, right?

Right?

SQL Server 2016 has some very neat features for encrypting databases, tables, rows, backups, and even in-memory stuff.
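
As a hedged sketch (the names are made up), native backup encryption, introduced as far as I recall in SQL Server 2014, needs a database master key and a certificate:

USE [master];
GO
-- A database master key and a certificate protect the backup
CREATE MASTER KEY ENCRYPTION BY PASSWORD = N'A$trongPassphraseHere1';
CREATE CERTIFICATE BackupCert WITH SUBJECT = N'Backup encryption certificate';
GO
-- Encrypt (and compress) the backup with AES-256
BACKUP DATABASE [SalesDB]
TO DISK = N'D:\Backup\SalesDB_Encrypted.bak'
WITH COMPRESSION,
     ENCRYPTION (ALGORITHM = AES_256, SERVER CERTIFICATE = BackupCert);
GO
-- Back up the certificate too, or the backup cannot be restored on another server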

But what if we are still slogging away on the ancient SQL Server 2008 or 2008 R2 Standard Edition, and we need to compress and encrypt our backups, using industry standard AES-256 encryption?

Enter 7-Zip. It is free, open-source, and governed by the LGPL licence.

The GUI (referred to as the 7-Zip File Manager) is installed by default. It looks something like this:

7-Zip File Manager GUI

Additionally, there is a command-line version (7za.exe), which has the same features as the 7-Zip GUI. You can get the command-line version by downloading the 7-Zip Extra package.

This command-line tool can be used for scripting in batch files or PowerShell scripts, which means we can automate a whole lot of things with our backups. There is both a 32-bit and a 64-bit version available.

7-Zip (a) [64] 15.14 : Copyright (c) 1999-2015 Igor Pavlov : 2015-12-31

Usage: 7za <command> [<switches>...] <archive_name>
       [<file_names>...] [<@listfiles...>]

[Tip: I highly recommend using the default 7z archive format, instead of Zip. It provides a significant disk savings over the Zip format and requires about the same time to compress at the Normal compression level.]

The scenario

Let’s assume that we have a limited budget. We can afford SQL Server Standard Edition and 1TB of off-site storage, in the form of Azure Blob Storage, Dropbox, or Google Drive, to name just a few.

We can use 7-Zip on the command line to produce an encrypted file, even if we don’t want to compress it (7-Zip can create what I call “container” files, which are simply an encryption wrapper around the file) and push that to our Dropbox sync folder.

Remember that SQL Server 2008 R2 and higher allows us to compress backups in Standard Edition. It makes very little sense to compress an already-compressed backup.

Command-line switches

I use this page as a reference for 7-Zip command line examples. Of particular interest are -mx0 (which stores, but does not compress, the source file) and -pSECRET (which sets a password, where SECRET is the password).

Putting the pieces together

Step 1: Back up our databases (with compression if available)

Step 2: Copy the files to a secure network location

Step 3: Use 7-Zip to encrypt (and maybe compress) these files

Step 4: Copy the encrypted files to off-site storage

There are myriad tools to perform these steps, and there’s no single correct way to do it, though I prefer PowerShell for a number of reasons. It’s modern, it is still updated, and it can interact with a number of Windows features in a consistent way, which means we can automate a lot of our everyday tasks.

For example, if we stay on the Microsoft stack with Azure, we can use PowerShell to iterate through the backup directory, encrypting each backup file, then copy it to Azure using a tool like AzCopy, or even my Azure Blob Storage Sync tool.

Summary

We don’t need the latest and greatest (and expensive) tools to implement best practices. Using 7-Zip, Dropbox and PowerShell, we can easily ensure that our databases are secure in more ways than one.

Update to Azure Blob Storage Sync and Restore

Blob Storage Sync tool updated

During a SQL Server migration this month, I found some inconsistencies in my Azure Blob Storage Sync tool, so I made several improvements, and fixed an outstanding bug.

As you know, it relies on the naming convention provided in Ola Hallengren’s Maintenance Solution and comes in two parts: the AzureBlobStorageSync command-line application, and the AzureBlobStorageRestore command-line application.

New Features

I realised that it was not possible, using this tool, to download every file from the Blob Storage Container in one go. The code only downloaded the files necessary to perform the latest restore for a single database.

To resolve this, and allow all files to be downloaded from Blob Storage, I have added a new configuration key called DownloadFilesFromAzure, which takes True or False values.

Another new feature is an explicit option to upload all local files to Blob Storage during a sync. Previously, it was implied that all local files in the LocalPath should be uploaded, but you may not want to do that. This is implemented as configuration key CopyFilesToAzure, which takes True or False values.

Deleting Files

There are now two ways to delete files from a Blob Storage Container:

  • Files that no longer exist on the source, which must be deleted from the target;
  • Files that are no longer needed on the target.

The first option is a typical synchronisation feature and was implicit in the previous version. It is now implemented using the configuration key DeleteMissingFilesFromAzure, which takes a True or False value. If it is set to True, files that do not exist on the local drive will be deleted from Blob Storage.

The second option is for deleting files that match a certain string in the file name. This is handy for server migrations where the file names generated by Ola’s backup script contain the old server name. While the backup script can perform cleanup tasks based on timestamps, it will ignore files that have a different server name, and you might be left with orphaned files long after the backup retention window has passed.

The configuration key, DeleteExplicitFilesFromAzure, takes True or False values, and is paired with an additional configuration key, ExplicitFilesToDeleteMatchingString. Here you can put a string containing the old server name, and any matching file will be deleted. This particular feature only works with one string at a time. You will have to run it more than once if you need to delete file names that match other strings.

Fixes

Yes, the only logged issue in the repository has been fixed! I now use sp_executesql instead of EXEC for the T-SQL portion of the restore tool. This was probably the easiest thing to fix.

A more critical fix, and the main reason for this blog post and tool update, is to do with downloading files.

Imagine an all-too-common scenario where a download fails before it is complete. In an emergency, the last thing you need is your Restore tool failing. Whereas before I simply compared the file name, I now check the file size as well. If the file sizes do not match, the file will be downloaded from Blob Storage again.

Files now download from smallest to largest in size. This is a major benefit if you have a lot of small transaction log files.

Notes

Reading the list of files from Blob Storage takes about ten seconds for 2,500 files, before parsing can begin. This is not a lot of time, but it’s something to keep in mind.

Feature Requests and Bug Reports

If you have any feature requests, or have found a bug, please log that on the GitHub repository. Even better, if you know C#, you can add your own features and fix your own bugs, submit a pull request, and I’ll merge your code into the tool.

Questions, comments or complaints? Twitter is the place: @bornsql is where you’ll find me.