SQL Server 2016, released last month, now lets us back up our database files to two locations simultaneously. This is very good news if we’re running the latest and greatest (and it really is a good upgrade).
For everyone else, the bad news is that we are limited to targeting an on-site destination for our backups (local network, SAN, NAS, tape) and using some form of automation to push these backups off-site. One of my budget-conscious customers uses Dropbox.
Since the release of Azure Cool Blob Storage, at a third of the price of typical Blob Storage, we now have a cost-effective off-site archive that is competitive with Amazon Glacier, and remains in the Microsoft family.
Many tutorials already exist for how to use AzCopy, including the official documentation. This post is more about gotchas that I’ve discovered while working extensively with the tool.
Keys and Containers
Azure Blob Storage stores its blobs (we can think of them as files) in a container, which is addressable by a URL. When we first set up our storage account, we need to create a container to store the files. The path to that container would look something like this:

https://myaccount.blob.core.windows.net/mycontainer
A key is then needed to access that container in that storage account. It is a long string of characters, ending in ==. In the examples below, I use the word key to represent this key.
We can copy files to and from Azure Blob Storage using the following command:
AzCopy /Source: /Dest: [Options]
If we’re copying files to Azure, the /Source switch would have an on-premises path, and the /Dest switch would have a container path. The /DestKey must then be specified. (If we were copying from Azure, the Source and Destination would be reversed, and the SourceKey would be specified.)
For example, let’s copy files from my local drive to an Azure storage container:
AzCopy /Source:D:\SQLData\Backup /Dest:https://myaccount.blob.core.windows.net/mycontainer /DestKey:key /S /XO /Z:D:\_Ops\AzCopy
/Source – my local hard drive
/Dest – the Azure storage container URL
/DestKey – the secure key to access the container
/S – recurses through all the subfolders in the source location
/XO – ignores all older files that already exist
/Z – specify a fixed path for the journal
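For the reverse direction, the same switches apply with Source and Dest swapped and the key supplied via /SourceKey. A sketch using the same placeholder account, container, and key (the local Restore path is also a placeholder):

```shell
AzCopy /Source:https://myaccount.blob.core.windows.net/mycontainer /Dest:D:\SQLData\Restore /SourceKey:key /S /XO /Z:D:\_Ops\AzCopy
```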
Don’t Run Multiple Copies
If AzCopy is already running, don’t run it again. This is strongly advised by the official AzCopy documentation.
The official reason is that AzCopy makes the best use of available resources; in more practical terms, it thrashes the disk and CPU.
Running multiple copies of AzCopy simultaneously will render your machine almost unusable, and it will probably break the journal anyway.
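One defensive measure is to wrap the scheduled AzCopy call in a simple lock, so a second run refuses to start instead of competing with the first. A minimal POSIX-shell sketch; the lock path and the use of "$@" as a stand-in for the real AzCopy command line are my own assumptions:

```shell
# Refuse to start if another copy operation holds the lock.
# mkdir is atomic, so it doubles as a simple mutex.
run_azcopy() {
    lockdir="${AZCOPY_LOCKDIR:-/tmp/azcopy.lock}"
    if ! mkdir "$lockdir" 2>/dev/null; then
        echo "Another AzCopy operation appears to be running; refusing to start." >&2
        return 1
    fi
    # "$@" stands in for the real AzCopy /Source:... /Dest:... command line.
    "$@"
    status=$?
    rmdir "$lockdir"   # release the lock even if the copy failed
    return $status
}
```

The scheduled task would then call run_azcopy with the full AzCopy command line, and a second concurrent invocation simply exits with an error.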
The biggest headache when working with AzCopy is that it uses a journal to keep track of progress during a copy operation. The journal normally lives in %LocalAppData%\Microsoft\Azure\AzCopy. This becomes a problem when automating the AzCopy process: we usually run automated tasks under a service account, which means that we don’t monitor the task until it fails.
Unpredictable journal behaviour
Unfortunately, AzCopy does not fail if a previous copy operation was not completed. Instead, it will wait with a Y or N prompt, which never comes, because the task is running in a service account context.
AzCopy provides the /Y option to suppress confirmation prompts.
There are two main reasons why the journal will await a confirmation:
- The previous operation failed, and pressing Y will allow it to resume where it stopped;
- The current operation is similar to a previous operation, and pressing Y will overwrite the journal and start over.
As we can see, these are vastly different results. Therefore I do not recommend using the /Y switch in unattended operations.
If, for whatever reason, AzCopy is interrupted (usually by network-related issues), the journal will have to be cleared out manually.
Note that the %LocalAppData% path is user-specific. I would recommend specifying a journal path with the /Z switch, which allows us to set a fixed, predictable location.
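With a fixed journal path in place, a scheduled job can clear any stale journal before starting, so AzCopy never sits waiting for an answer. A sketch; the folder mirrors the /Z example above, and the .jnl extension is an assumption to verify in your environment:

```shell
# Remove any leftover journal so the next run starts clean rather than
# prompting. JOURNAL_DIR should match the path given to the /Z switch.
clear_journal() {
    journal_dir="${JOURNAL_DIR:-D:/_Ops/AzCopy}"
    rm -f "$journal_dir"/*.jnl
}
```

Only do this when you are sure the previous operation should not be resumed; clearing the journal gives up the ability to pick up where a failed copy stopped.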
Ignore Older Files
Use the /XO switch to copy only new files, ignoring the ones that already exist at the destination.
AzCopy is a great tool, but it has a lot of moving parts and can cause headaches if we don’t manage the journal correctly.
If you would like to continue the discussion, or ask me questions, look me up on Twitter at @bornsql.