Microsoft presents: MySQL and PostgreSQL?

For the longest time, MySQL has been the recommended database platform for blogs and other websites. It’s free. It’s available on every version of Linux, which is the most common web server platform. If you need something more complex but still free, there’s PostgreSQL.

But there’s a lot going on beyond that word “free”. Because although the MySQL or PostgreSQL licence might be free, there are hidden costs to running a website these days.

Apart from the security aspect, which alone should make us want to turn off our computers and walk away, we need to keep backups, and if something goes wrong, we need to be able to restore those backups. We need to test those backups. We need to keep the server patched. We need to keep the server running at all times.

If we run a hosting company, and have paying customers, we might be able to roll that administrative overhead into our hosting fee, but it’s hard to remain competitive and still make money in this game. So we stack our customers, and things slow down because it’s only cost effective to have hundreds of customers on a single server. It increases risk because so many more customers might be affected by even planned maintenance. If your customers are global, how do you plan downtime that suits all of them?

This Azure thing starts to look appealing. Perhaps an Azure SQL Database on the Basic tier for 5 DTUs at $6 a month (Canadian) would be able to bring the power of SQL Server to blogging engines without breaking the bank. Who can’t afford $6 a month in this day and age to host a website backed by the best and most secure RDBMS around?

I’d get acceptable DTU performance for the low traffic most hosted websites get and all the benefits, including SSL encryption, fully managed, point-in-time backups, plus the very latest features. No need to maintain a separate server, or set up maintenance jobs to keep the site and its contents safe. No need to worry about testing my backups through phpMyAdmin, which, let’s face it, isn’t as great as SQL Server Management Studio.

But we already know MySQL. It has been around for ages. Why switch to a Microsoft technology?

Microsoft seems to have asked themselves that same question, probably because they understand we’ve all built websites on MySQL. One of my favourite customers uses a LAMP stack (Linux, Apache, MySQL and PHP) for their site, and they are happy. Sure, they could move to SQL Server, or Azure SQL Database, but then I’d have to rewrite a whole bunch of things. There’s no such thing as a drop-in replacement for any RDBMS, no matter what the marketing folks tell you. Well, unless you’re migrating from SQL Server to Azure SQL Database. And even then, there are challenges.

So now Microsoft has this thing where you can host your MySQL (or PostgreSQL) database on the exact same management framework as Azure SQL Database. It’s identical. The only difference is the database engine.

What does that mean? How about acceptable transactional throughput for the low traffic most hosted websites get, and all the benefits, including SSL encryption, fully managed, point-in-time backups, plus the very latest features? (Sound familiar?)

Instead of hosting a Linux-based, self-managed and maintained server with lots of moving parts, you can decouple your website and your database, and run the database on the same RDBMS you know and love (MySQL or PostgreSQL), with all the care and feeding taken care of for you, starting at $20 a month and including 50 GB of space.

It’s good to have options. Fully patched. Fully maintained. Fully backed up. Fully secured.

This is what you can get, today, on Azure. The choice is yours:

  • Azure SQL Database, compatible with SQL Server;
  • Azure Cosmos DB, a globally distributed database with tunable consistency levels and a 99.99% uptime SLA;
  • Azure Database for MySQL; or
  • Azure Database for PostgreSQL.

If your platform of choice isn’t available as a Platform-as-a-Service solution, you can still spin up a VM from a template that contains your Oracle or SAP HANA solution.

The future is looking pretty rosy in Azure.

Talk to me on Twitter at @bornsql about your favourite database platform.

Should I upgrade SQL Server 2008 / R2, or migrate to Azure?

Brent Ozar Unlimited runs a website, called SQL Server Updates, which comes in really handy for keeping your on-premises SQL Server up to date.

Last week I noticed something interesting: if you’re running SQL Server 2008 or 2008 R2, Microsoft’s extended support for both ends on 9 July 2019. That’s two versions of SQL Server, almost two years apart in release cycle, with different feature sets, expiring on the same day.

Granted, by that date SQL Server 2018 will probably have been released, which would make 2008 officially five versions old and 2008 R2 four versions old. But in corporations, time moves both slowly and quickly. By that I mean, planning to upgrade takes a long time, and before you know it, it’s 9 July 2019 and everyone is panicking.

It’s the right time, today, to start your upgrade plan. A lot of customers I speak to, running on 2008 or 2008 R2, are talking about upgrading to 2012 or 2014 right now.

On Twitter, I said that after the end of next year, I would not recommend upgrading any instances to SQL Server 2012, for this reason:

In 2019, moving to SQL Server 2016 buys you 7 more years of support. I wouldn’t consider moving to SQL Server 2012 after the end of 2017.

At the end of 2017, SQL Server 2012 will be more than five years old. Even if your best intentions assume a major upgrade will be quick, and considering that SQL Server licensing costs the same regardless of the version you buy, there’s no practical reason not to go to at least SQL Server 2014, and preferably SQL Server 2016 (which I consider the biggest release since SQL Server 2005). You can always run your database in 2008 Compatibility Mode if you’re running SQL Server 2014.
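If you do take that route, the safety net is a one-line change. Here is a minimal sketch using sqlcmd, with a hypothetical database name and a local default instance; compatibility level 100 corresponds to SQL Server 2008 and 2008 R2:

:: Set an upgraded database back to SQL Server 2008 / 2008 R2 behaviour (compatibility level 100).
sqlcmd -S localhost -E -Q "ALTER DATABASE [MyDatabase] SET COMPATIBILITY_LEVEL = 100"

Remember that compatibility mode only affects query behaviour; the database itself is still upgraded and cannot be restored to the older version.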

I wrote down some reasons to upgrade from SQL Server 2005 last year. These reasons still apply.

Whither Azure?

Which brings up the big question: what about Azure SQL Database? As you know, my first question when customers ask about moving to Azure SQL Database is “Are you sure?”, followed by “Are you very sure?”.

By the time July 2019 rolls around, Azure will be ten years old. That’s hard to fathom right now, but at their current rate of progress, Azure SQL Database will be a mature platform (it already is mature for appropriate workloads, when taking elastic pools into account).

So, if you meet all of the following conditions:

  • you use SQL Server 2008 or 2008 R2,
  • you plan to upgrade after 2017, and
  • you’re running a database smaller than 1TB,

I would recommend that you seriously consider migrating to Azure SQL Database, instead of upgrading to an on-premises database server running SQL Server. The time is right, at the end of 2016, to spend 18 months on an appropriate migration strategy to achieve this.

Yes, 18 months from now we’ll be deep into 2018. Scary, isn’t it?

Act now.

If you want to discuss this on Twitter, look me up at @bornsql.

Azure Storage Throughput

Hot on the heels of DTUs from earlier this week, today we’re going to switch focus a little, while staying with Azure, and have a quick look at storage speeds on SQL Server running on Azure Virtual Machines (Infrastructure-as-a-Service).

Microsoft recently lowered their prices for more popular VMs, which is making this an attractive solution for a lot of folks who don’t want to run their servers on-site.

However, there’s a limit to how fast our storage can get, despite the attraction of solid-state drives on the Premium Storage offering.

We in the industry refer to “latency” when discussing storage. This is measured in seconds per MB (s/MB), or how long it takes to access 1 MB of data. Target latency for SQL Server is 5 ms to 10 ms. In other words, the best case scenario for SQL Server is that we should be able to access 1 MB of data in between 5- and 10-thousandths of a second.

Everything on Azure is virtualised, from the CPU and RAM, down to the storage layer. Behind the scenes, all of these components are connected by what Microsoft calls the “fabric layer”. In simpler terms, it’s the network, stupid.

Yes, it has a low latency. Yes, a customer of mine did save almost 50% of their budget by switching from a data centre to Azure, and yes, they did get a nice performance boost in the process.

However, there’s a limit, and that is the theoretical limit on network throughput. The absolute best case for any local network these days is 10 Gb/s, or in other words, 1.25 GB per second (there are 8 bits in a byte, so we have to divide the bits by eight to get the bytes per second).

In realistic terms though, the connection between our virtual machine and the SSD premium storage, depending on our region and storage tier, is more likely to be 1 Gbps, or 125 MB per second. It’s not really clear from the Azure Documentation what the actual limits are, but I’ve personally seen it max out at 125 MB/s. The region we host in matters a great deal.

Inverting that 125 MB/s figure to get the latency, we’re seeing 8 ms at best. That’s still within the window of 5–10 ms, so it’s not terrible. Now add in the fact that we’re not the only person using that underlying hardware, and the latency starts creeping up. There will be competition on the VM host, the network card, the storage, and it all adds up. Pretty soon we’re looking at 20–30 ms of latency, which is now more than twice as long as the preferred amount.

No matter what the CPU speed is, or how much RAM we have, our bottleneck is always going to be the network. Those of us who have experience with older SANs (storage area network) will recognise this. Even having the fastest spindles (hard drives) in the SAN is meaningless when we’re connected to the server with a 1 Gbps half-duplex network card (no sending traffic is allowed while receiving traffic).

As I say on this blog and in person, Azure is not the answer for everyone (whether we’re looking at SQL Server on an Azure VM or an Azure SQL Database). We need to make sure that we understand the inherent limits of the infrastructure before committing to a long-term project to migrate everything.

I’m not saying Azure isn’t a good idea. I just pointed out above that a client I migrated recently was very happy with their outcome. Their data centre provider offering was terrible, and Azure was a great improvement. I personally run a production system on SQL Server, running on a small Azure VM.

20–30ms might be twice as fast as your current solution. If your database is running on 7200 rpm SATA drives in a RAID 5 configuration, you’ll see a massive increase in performance by switching to an Azure VM with P30 Premium Storage.

There are some fantastic resources available to help you plan your migration better.

It all comes back to the question: Are you sure?

Let me know your Azure migration plans on Twitter, at @bornsql. I look forward to hearing from you.

Azure SQL Database Limits

Let’s talk briefly about resource limits with Azure SQL Database. Because we have to share resources with other users, and because Microsoft doesn’t want us affecting others dramatically, they have implemented some limits.

If for some reason our database does get overeager, the operations relating to CPU, RAM and I/O will be queued up by the underlying resource manager, which will allow those operations to continue once the resources are free. Yes, that will slow down our system.

Requests for resources other than CPU, memory, log I/O, and data I/O will, as noted in the documentation, be denied, and clients will see an error message.

Microsoft uses something called DTUs (Database Throughput Units, a combination of CPU, RAM and log writes) to figure out our Azure SQL Database’s resource usage.

Check out the DTU Calculator to find out more, and see how your on-premises database compares. Next week, we will go into more detail about DTUs, and how to calculate the DTU usage of an on-premises database.

It gets really complicated, really quickly, but using the DTU Calculator before we decide to move to Azure SQL Database, and being conservative with our estimates, will make our lives easier.
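If you want to capture those numbers yourself before running them through the DTU Calculator, here is a rough sketch using the built-in typeperf utility. The counter names assume a default SQL Server instance and an hour-long capture, and the exact set of counters the calculator wants may change, so check its instructions first:

:: Sample CPU, disk and log-write counters once per second for an hour, into a CSV file.
:: For a named instance, replace SQLServer: with MSSQL$<InstanceName>: in the last counter.
typeperf "\Processor(_Total)\% Processor Time" "\LogicalDisk(_Total)\Disk Reads/sec" "\LogicalDisk(_Total)\Disk Writes/sec" "\SQLServer:Databases(_Total)\Log Bytes Flushed/sec" -si 1 -sc 3600 -f CSV -o DtuCounters.csv

Capture this during a representative busy period, not overnight, or the estimate will be uselessly optimistic.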

The biggest an Azure SQL Database can be on the Standard Tier is 250 GB. The biggest it can be on the Premium Tier is 1000 GB.

Coincidentally, the SQL Server Customer Advisory Team (SQLCAT) wrote a blog post titled “Real-World Azure SQL DB: Unexpected Database Maximum Size Limit” last month, about a gotcha when upgrading between service tiers. It is recommended reading.

It helps to draw parallels between Standard Tier and Standard Edition of SQL Server, and Premium Tier and Enterprise Edition of SQL Server.

(Edit: With SQL Server 2016 Service Pack 1, you can now create Memory-Optimized objects in all editions of SQL Server.)

There is a Basic Tier for Azure SQL Database, but the resource limits are so strict that I barely consider it. Perhaps if you run a small MySQL database for a WordPress website, with a caching plugin, this may be appropriate.

Using elastic pools (sharing a pool of DTUs between several databases, which shrinks and grows as your requirements change) can help with balancing costs across several databases, but it is going to seem more expensive at the outset.

The bottom line with Azure SQL Database is that the resource limits may take you by surprise if you don’t plan correctly. Spend the time to evaluate the requirements of your database, and don’t just use Azure SQL Database because it’s new and shiny. If you currently have an Enterprise licence for SQL Server, chances are you’ll spend the equivalent (or more) on a Premium Tier database.

Azure SQL Database is not a way to save money on your SQL Server licences. It is a different technology, with different resource requirements, catering to specific workloads. Choose wisely.

If you have some thoughts about DTUs, come and talk to me on Twitter, at @bornsql.

Moving a database to Azure SQL Database

This week we will actually move a database into Azure SQL Database, using the first of two of Microsoft’s recommended methods.

The main thing to keep in mind is that SQL Server (on-premises, or “earthed”) and Azure SQL Database (“cloud”) are not the same product. They support the same schemas and data, and allow the same scripts and applications to run, but they are not the same thing.

This is important, because when we migrate to Azure SQL Database from an on-premises SQL Server environment, we can’t simply do a backup and restore of our database.

The simple reason is that there is no file system access in Azure SQL Database. We won’t be able to see a backup file to restore it.

The more complicated reason is that we don’t have access to any instance-level features, including server configuration, tempdb access, and so on, and therefore our databases must be self-contained.

The recommended method to migrate a database to Azure SQL Database is the Migration Wizard in SQL Server Management Studio.

For larger databases, we can create an extract first, which can be imported later using the necessary tools.

Under the covers, the migration is using SQL Server Data Tools to extract a Data-tier Application (DAC). This acronym is not to be confused with Dedicated Administrator Connection.

According to MSDN (emphasis added):

A data-tier application (DAC) is a logical database management entity that defines all of the SQL Server objects – like tables, views, and instance objects including logins – associated with a user’s database. A DAC is a self-contained unit of SQL Server database deployment that enables data-tier developers and database administrators to package SQL Server objects into a portable artifact called a DAC package, also known as a DACPAC.

A BACPAC is a related artifact that encapsulates the database schema as well as the data stored in the database.

From this, we can establish that we are creating a BACPAC file, containing the schema in XML format, a few metadata files with version information, and of course the data itself, in BCP (Bulk Copy) format. This is then compressed (using standard PK-ZIP format) in order to save space.

Anyone who has used other products that create database dump files will be familiar with this process.

The BCP file format is not human-readable, unlike, say, MySQL’s dump files, and if we look at the MSDN article on format files, it will make our heads spin.

That said, I was able to pick out string values from the BCP file in my test that corresponded with row values in the source database table.
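If you’d like to poke around inside a BACPAC yourself, a simple sketch (with a hypothetical file name) is to copy it, give the copy a .zip extension, and expand it with any ZIP tool. You’ll find the schema XML, the metadata files, and a folder of BCP data files described above:

:: Work on a copy so the original BACPAC stays intact, then expand it like any other ZIP archive.
copy AdventureWorks2012.bacpac AdventureWorks2012.zip
:: Expand-Archive needs PowerShell 5 or later; any ZIP utility will do the same job.
powershell -Command "Expand-Archive -Path AdventureWorks2012.zip -DestinationPath .\AdventureWorks2012_unpacked"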

To be on the safe side, I like to break up the process into two stages, so that if the second part fails, we can retry it at our convenience (a command-line sketch follows this list):

  • Create the BACPAC file locally and verify it;
  • Import the BACPAC file into Azure SQL Database.
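For the two-stage route, SQL Server Data Tools ships a command-line utility called SqlPackage that handles both steps. A minimal sketch, assuming SqlPackage.exe is on the path and using placeholder server names, credentials and file paths:

:: Stage 1 - export the local database to a BACPAC file (the export will also flag Azure compatibility problems).
SqlPackage.exe /Action:Export /SourceServerName:localhost /SourceDatabaseName:AdventureWorks2012 /TargetFile:D:\Temp\AdventureWorks2012.bacpac

:: Stage 2 - import the BACPAC into Azure SQL Database; the import creates the target database.
SqlPackage.exe /Action:Import /SourceFile:D:\Temp\AdventureWorks2012.bacpac /TargetServerName:myserver.database.windows.net /TargetDatabaseName:AdventureWorks2012 /TargetUser:myadmin /TargetPassword:mypassword

If you need to control the pricing tier during the import, SqlPackage also exposes properties for the database edition and service objective; check the documentation for your version for the exact property names.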

If the database is small enough, we use the Migration Wizard inside the latest version of SQL Server Management Studio and do both steps in one go. Let’s do that now.

(The following images are copyright © 2016 Microsoft Corporation, retrieved from the official documentation pages.)

Inside SQL Server Management Studio, right-click on the database we want to migrate. In this case, it’s [AdventureWorks2012]. Click Tasks > Deploy Database to Microsoft Azure SQL Database….

[Image: migrateusingdeploymentwizard01]

Here we specify the Server connection of our Azure SQL account, by clicking on the Connect… button:

[Image: migrateusingdeploymentwizard02]

This is where we connect to the Azure SQL account. Since this migration wizard is doing all of the heavy lifting, we do not need to create a database beforehand.

[Image: migrateusingdeploymentwizard03]

Now that we have connected to our Azure account, we can create the database on this screen. Remember the different editions and their associated costs, so we won’t pick the $7000 per month option unless we can afford it.

[Image: migrateusingdeploymentwizard04]

Once we start the migration, it can take a while, and it may even fail due to compatibility issues, which have to be resolved.

We briefly touched on these last week, and they must be resolved before migrating. We would use the latest SQL Server Data Tools to identify errors or warnings.

But because we’ve done our compatibility homework, everything works the first time.

It’s really that simple.

If the database is large, or the network connection is unstable or slow, we will have to do the import from our BACPAC file or using BCP directly. That will be covered in next week’s post.

If you have any questions or comments about this process, feel free to ask me on Twitter at @bornsql.

An Introduction to Azure SQL Database

It’s the second half of 2016, and Azure is the new hotness. I’ve raved about Blob Storage, virtual machines, AzCopy, and all those good things. When the dust settles, though, this blog is about managing data, and sometimes we don’t want to worry about the hassle of maintenance and administration.

Azure SQL Database takes the Administrator out of “DBA”. It’s a cute soundbite, but is it true? Let’s go through a short primer.

What’s with the unwieldy name?

Azure SQL Database is a bit of a mouthful, but be thankful that Satya Nadella took over from Steve Ballmer. Before that happened, it was Windows Azure SQL Database, or WASD. Can you imagine?

“ASD” is an entry from the Diagnostic and Statistical Manual of Mental Disorders, and we can’t call it “SQL Azure” either. “Azure SQL Database” is the official name, so let’s stick with it.

Who is Azure SQL Database for?

For a long time, I struggled to answer this question myself, mainly because I did not understand how anyone could get by with a tiny SQL Server database. The Azure SQL Database tiers seem, well, limited on the low end and expensive on the high end:

Basic – 2GB (up to $5 per month)
Standard – 250GB ($15 to $50 per month)
Premium – 500GB–1TB ($465 to $7,001 per month)

That last amount is not a typo. We can easily spend seven thousand US dollars, every month, running a 1TB Azure SQL Database.

Compare and contrast that with a fully-licenced Azure DS13 virtual machine, on Windows Server 2012 R2, with 56GB RAM, 8 CPU cores, and a 400GB system drive, running SQL Server 2016 Enterprise Edition, which (even with four 1TB disks running in a RAID 10 configuration) costs less than US$4,000 per month.

If it’s so expensive, who is it for?

This website, bornsql.ca, runs on a LAMP stack, which stands for Linux, Apache, MySQL and PHP. Linux is the operating system, Apache is the web server, MySQL is the database, and PHP is the programming language. I need all four of these things, with their interdependencies, to provide you with a website to read and comment on. That’s a lot of overhead, and it costs more than $5 a month, not even counting the administration costs.

Microsoft traditionally offers a similar stack, with Windows Server as the operating system, Internet Information Services (IIS) as the web server, SQL Server as the database, and ASP.NET (or another common language like C#, or even PHP if you want) as the programming language.

Vast swaths of the Internet are still hosted on Linux- or Windows-based web servers, with millions of hours dedicated to software updates, malware removal, replacing failing hard drives, taking regular backups, dealing with power failures, and so on. It adds up.

What Azure (the entire service offering) does is completely decouple these and other components from one another. For example, if you want to build a static website, with no underlying database, you just need to use Azure App Service. Want to add a database? Throw in an Azure SQL Database. No operating system required. No server required. It’s all there, under the covers, and we don’t have to worry about it.

With Azure, we’re paying for ease of deployment of new stuff, with no administration and maintenance to worry about. Azure “just works”.

This sounds like virtualisation at a large scale

Sort of.

A funny thing happened in the last few years. Hardware kept getting faster and cheaper, and software stayed the same. This meant that newer hardware was being wasted because software just didn’t need it.

Some enterprising technologists decided that they could virtualise environments, by migrating physical servers to virtual servers, enabling multiple VMs (virtual machines) to run on one physical server.

Over time, they figured out how to configure VMs to make the best use of shared resources so that the VMs can run the same or even better than they did on their old physical servers. Fewer physical servers means lower administration costs, lower electricity bills, and so on.

Cloud computing providers, including Amazon Web Services, Google Cloud Platform, and Microsoft Azure, are basically heaps and heaps of servers that run custom software.

Is ease of administration worth the extra money?

I don’t know. That depends on you. This is why it was a difficult question to answer at first.

Imagine, instead of worrying about software and hardware costs, you can focus on growing your business and only pay for more resources when you need them. That’s the appeal of Azure.

How does Azure SQL Database fit in?

Azure SQL Database is a specific service offering that caters to people and companies who do not want to manage their own database servers and will be fine with “just” 50GB databases to run a particular web or mobile application. No need for a dedicated Database Administrator.

This is exciting! When can I migrate my database to Azure SQL Database?

Tune in next week to find out whether it’s a good idea to move your environment to Azure and how to get your on-premises SQL Server database there.

Comments

If you have any questions or comments, look me up on Twitter, at @bornsql.

AzCopy 101 – Copying SQL Server backup files to Azure Blob Storage

SQL Server 2016, released last month, now lets us back up our database files to two locations simultaneously. This is very good news if we’re running the latest and greatest (and it really is a good upgrade).

For everyone else, the bad news is that we are targeting an on-site destination for our backups (local network, SAN, NAS, tape) and using some form of automation to push these backups offsite. One of my budget-conscious customers uses Dropbox.

Since the release of Azure Cool Blob Storage, at a third of the price of typical Blob Storage, we now have a cost-effective off-site archive that is competitive with Amazon Glacier, and remains in the Microsoft family.

We can either use my Azure Blob Storage Sync tool to push our database backups to Azure Blob Storage, or if there is a policy to use Microsoft-only tools, we can use AzCopy.

Many tutorials already exist for how to use AzCopy, including the official documentation. This post is more about gotchas that I’ve discovered while working extensively with the tool.

Keys and Containers

Azure Blob Storage stores its blobs (we can think of them as files) in a container, which is addressable by a URL. When we first set up our storage account, we need to create a container to store the files. The path to that container would look something like this:

https://myaccount.blob.core.windows.net/mycontainer

A key is then needed to access that container in that storage account. It is a long string of characters, with == at the end, which looks something like this:

Zx7uEPwA6MfJ5MXML0MYUqc8k78lYYCvq7h+lG0grumpyMG1TvEpp931SQLXWpoZWfgItEzhvWnKzy9RKGTYfA==

In the examples below, I use the word key to represent this key.

Command Line

We can copy files to and from Azure Blob Storage using the following command.

AzCopy /Source:<source> /Dest:<destination> [Options]

If we’re copying files to Azure, the /Source switch would have an on-premises path, and the /Dest switch would have a container path. The /DestKey must then be specified.

(If we were copying from Azure, the Source and Destination would be reversed, and the SourceKey would be specified.)

For example, let’s copy files from my local drive to an Azure storage container:

AzCopy /Source:D:\SQLData\Backup /Dest:https://myaccount.blob.core.windows.net/mycontainer /DestKey:key /S /XO /Z:D:\_Ops\AzCopy

Notes:
/Source – my local hard drive
/Dest – the Azure storage container URL
/DestKey – the secure key to access the container
/S – recurses through all the subfolders in the source location
/XO – excludes source files that are older than the matching files already in the destination
/Z – specifies a fixed folder for the journal file
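Going the other way, to pull the archived backups back down, the same command simply reverses, as in this sketch (same placeholder account, container and key as above, with a hypothetical local restore folder):

AzCopy /Source:https://myaccount.blob.core.windows.net/mycontainer /Dest:D:\SQLData\Restore /SourceKey:key /S /Z:D:\_Ops\AzCopy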

The Nitty-Gritty

Don’t Run Multiple Copies

If AzCopy is already running, don’t run it again. This is strongly advised by the official AzCopy documentation.

The reason is that a single AzCopy instance already makes the best use of available resources; in more practical terms, it’s thrashing the disk and CPU.

Running multiple copies of AzCopy simultaneously will render your machine almost unusable, and it will probably break the journal anyway.

Journal

The biggest headache working with AzCopy is that it uses a journal to keep track of your progress during a copy operation.

The journal normally lives in %LocalAppData%\Microsoft\Azure\AzCopy. This becomes a problem when automating the AzCopy process. Usually we run automated tasks under a service account, which means that we don’t monitor the task until it fails.

Unpredictable journal behaviour

Unfortunately, AzCopy does not fail if a previous copy operation was not completed. Instead, it waits at a Y or N prompt for an answer that never comes, because the task is running in a service account context.

AzCopy provides the option to use /Y to suppress confirmation prompts.

There are two main reasons why the journal will await a confirmation:

  • The previous operation failed, and pressing Y will allow it to resume where it stopped;
  • The current operation is similar to a previous operation, and pressing Y will overwrite the journal and start over.

As we can see, these are vastly different results. Therefore I do not recommend using the /Y switch.

If for whatever reason AzCopy is interrupted (usually network-related issues), the journal will have to be cleared out manually.

Note that the %LocalAppData% path is user-specific. I would recommend specifying a journal path with the /Z switch, which allows us to set a default location.
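To make that concrete, here is a minimal sketch of a wrapper script, with placeholder paths, container and key, which assumes the journal files in the /Z folder carry a .jnl extension. It refuses to start if a previous journal is still lying around, rather than letting AzCopy sit silently at a prompt under the service account:

:: Bail out if a previous AzCopy journal exists - a human needs to decide whether to resume or start over.
if exist "D:\_Ops\AzCopy\*.jnl" (
    echo A previous AzCopy journal was found. Investigate and clear it before running again.
    exit /b 1
)
AzCopy /Source:D:\SQLData\Backup /Dest:https://myaccount.blob.core.windows.net/mycontainer /DestKey:key /S /XO /Z:D:\_Ops\AzCopy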

Ignore Older Files

As with Xcopy and Robocopy, we can skip older files, which is useful when archiving transaction log backups every 15 minutes.

Use the /XO switch to copy only the files that are newer than what is already in the destination, and skip the rest.
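To put that on a schedule, something like this sketch (hypothetical task name, service account and wrapper script path) runs the copy every 15 minutes; the /RP * switch makes schtasks prompt for the account’s password when the task is created:

:: Run the AzCopy wrapper script every 15 minutes under a dedicated service account.
schtasks /Create /TN "Archive SQL backups to Azure" /SC MINUTE /MO 15 /RU CONTOSO\svc_backup /RP * /TR "D:\_Ops\AzCopy\sync-backups.cmd"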

Summary

AzCopy is a great tool, but it has a lot of moving parts and can cause headaches if we don’t manage the journal correctly.

If you would like to continue the discussion, or ask me questions, look me up on Twitter at @bornsql.

Forecast: Azure Skies Over the Amazon

Microsoft Azure and Amazon Web Services are taking over.

Last week, a client approached me to migrate their complex environment in two phases from two physical data centers, first to Azure VMs, and ultimately to an Azure SQL Database, doing away with logical servers completely.

In my mind, this trend is only going to accelerate. There are obvious exclusions, around privacy, very large data volumes and so on, that I won’t go into, but in general, I see a mass migration to “the Cloud”.

The primary motivator for virtualisation has always been cost. For example, Azure Blob Storage offers fully redundant global replication, where your data is copied to two other centers in other parts of the world, for a very small price bump and zero additional administration. In a traditional hosted environment, you would be looking at a significant outlay for ensuring global redundancy, not to mention maintaining and administering it.

A trade-off of virtualisation is performance. Virtual machines, by their nature, incur at least a 5% resource overhead, and if set up incorrectly or over-provisioned, can cut performance by as much as 50%. That is slowly changing as we become wise to the needs of SQL Server in a virtual environment, and Microsoft is driving the recommendations.

Licensing is also changing as a result of the Cloud. When setting up a traditional environment, software licenses can easily account for 80% or more of your initial outlay, depending on number of CPU cores, SQL Server edition, and so on.

With Azure and Amazon, just as you would pay a monthly fee for a Windows VM, you can now pay an additional monthly fee for a SQL Server licence, which (depending on configuration) could take up to six years to become more expensive than buying a SQL Server Enterprise licence outright. Plus, in those six years, you can move to the latest version of the product in the time it takes to spin up a new virtual machine.

An operational expense like that makes developing on the SQL Server platform very attractive, especially for companies that don’t necessarily have the capital to spend hundreds of thousands of dollars on licensing.

It behooves us, as data professionals and consultants, to understand these risk factors, to ensure that our customers can get the most bang for their buck. If they want performance, and their budget is limited, Azure virtual machines, or even database-as-a-service, might be the best thing for them.

Yes, the Cloud is just someone else’s data center, but Microsoft and Amazon are lowering the barrier to entry for a number of businesses, large and small, to scale up their environment and help grow their businesses. It would be foolish to dismiss it out of hand.

If I ran a large data center, I’d be concerned. Take me to the clouds above.

Update to Azure Blob Storage Sync and Restore

Blob Storage Sync tool updated

During a SQL Server migration this month, I found some inconsistencies in my Azure Blob Storage Sync tool, so I made several improvements, and fixed an outstanding bug.

As you know, it relies on the naming convention provided in Ola Hallengren’s Maintenance Solution and comes in two parts: the AzureBlobStorageSync command-line application, and the AzureBlobStorageRestore command-line application.

New Features

I realised that it was not possible, using this tool, to download every file from the Blob Storage Container in one go. The code only downloaded the files necessary to perform the latest restore for a single database.

To resolve this, and allow all files to be downloaded from Blob Storage, I have added a new configuration key called DownloadFilesFromAzure, which takes True or False values.

Another new feature is an explicit option to upload all local files to Blob Storage during a sync. Previously, it was implied that all local files in the LocalPath should be uploaded, but you may not want to do that. This is implemented as configuration key CopyFilesToAzure, which takes True or False values.

Deleting Files

There are now two ways to delete files from a Blob Storage Container:

  • Files that do not match the source, which must be deleted from the target;
  • Files that are no longer needed on the target.

The first option is a typical synchronisation feature and was implicit in the previous version. It is now implemented using the configuration key DeleteMissingFilesFromAzure, which takes a True or False value. If it is set to True, files that do not exist on the local drive will be deleted from Blob Storage.

The second option is for deleting files that match a certain string in the file name. This is handy for server migrations where the file names generated by Ola’s backup script contain the old server name. While the backup script can perform cleanup tasks based on timestamps, it will ignore files that have a different server name, and you might be left with orphaned files long after the backup retention window has passed.

The configuration key, called DeleteExplicitFilesFromAzure, takes True or False values, and it is paired with an additional configuration key, called ExplicitFilesToDeleteMatchingString. Here you can put a string containing the old server name, and any matching file with that name will be deleted. This particular feature only works with one string at a time. You will have to run it more than once if you need to delete file names that match other strings.

Fixes

Yes, the only logged issue in the repository has been fixed! I now use sp_executesql instead of EXEC for the T-SQL portion of the restore tool. This was probably the easiest thing to fix.

A more critical fix, and the main reason for this blog post and tool update, is to do with downloading files.

Imagine an all-too-common scenario where a download fails before it is complete. In an emergency, the last thing you need is your Restore tool failing. Whereas before I was simply checking the name of the file, I now check the file size as well. If the file sizes do not match, the file will be downloaded from Blob Storage again.

Files now download from smallest to largest in size. This is a major benefit if you have a lot of small transaction log files.

Notes

Reading the list of files from Blob Storage takes about ten seconds for 2,500 files, before parsing can begin. This is not a lot of time, but it’s something to keep in mind.

Feature Requests and Bug Reports

If you have any feature requests, or have found a bug, please log that on the GitHub repository. Even better, if you know C#, you can add your own features and fix your own bugs, submit a pull request, and I’ll merge your code into the tool.

Questions, comments or complaints? Twitter is the place: @bornsql is where you’ll find me.

Pre-Announcement Announcement

On Saturday, 27 June 2015, I will be presenting at SQLSaturday #407 in Vancouver BC. The session is titled “Restoring a SQL Server Database from Azure Blob Storage”.

I will be releasing some new content and code at the same time as my talk, which will be available from this site, assuming WordPress plays along.

Once the content is available, you’ll understand why I’ve been so quiet on the blog.

Stay tuned.