Forecast: Azure Skies Over the Amazon

Microsoft Azure and Amazon Web Services are taking over.

Last week, a client approached me to migrate their complex environment in two phases from two physical data centers, first to Azure VMs, and ultimately to an Azure SQL Database, doing away with logical servers completely.

In my mind, this trend is only going to accelerate. There are obvious exclusions that I won’t go into around privacy, large data and so on, but in general, I see a mass migration to “the Cloud”.

The primary motivator for virtualisation has always been cost. For example, Azure Blob Storage offers fully redundant global replication, where your data is copied to two other centers in other parts of the world, for a very small price bump and zero additional administration. In a traditional hosted environment, you would be looking at a significant outlay for ensuring global redundancy, not to mention maintaining and administering it.

A trade-off of virtualisation is performance. Virtual machines, by their nature, incur at least a 5% resource overhead, and if set up incorrectly or over-provisioned, can cut performance by as much as 50%. That is slowly changing as we become wise to the needs of SQL Server in a virtual environment, and Microsoft is driving the recommendations.

Licensing is also changing as a result of the Cloud. When setting up a traditional environment, software licenses can easily account for 80% or more of your initial outlay, depending on number of CPU cores, SQL Server edition, and so on.

With Azure and Amazon, just as you would pay a monthly fee for a Windows VM, you can now pay an additional monthly fee for a SQL Server license, that (depending on configuration) could take up to six years to become more expensive than buying a SQL Server Enterprise license outright. Plus, in those six years, you can get the latest version of the product as quickly as it takes to spin up a new virtual machine.

An operational expense like that makes developing for a SQL Server platform very attractive for developers, especially for companies that don’t necessarily have capital to pay hundreds of thousands of dollars on licensing.

It behooves us, as data professionals and consultants, to understand these risk factors, to ensure that our customers can get the most bang for their buck. If they want performance, and their budget is limited, Azure virtual machines, or even database-as-a-service, might be the best thing for them.

Yes, the Cloud is just someone else’s data center, but Microsoft and Amazon are lowering the barrier to entry for a number of businesses, large and small, to scale up their environment and help grow their businesses. It would be foolish to dismiss it out of hand.

If I ran a large data center, I’d be concerned. Take me to the clouds above.

Update to Azure Blob Storage Sync and Restore

Blob Storage Sync tool updated

During a SQL Server migration this month, I found some inconsistencies in my Azure Blob Storage Sync tool, so I made several improvements, and fixed an outstanding bug.

As you know, it relies on the naming convention provided in Ola Hallengren’s Maintenance Solution and comes in two parts: the AzureBlobStorageSync command-line application, and the AzureBlobStorageRestore command-line application.

New Features

I realised that it was not possible, using this tool, to download every file from the Blob Storage Container in one go. The code only downloaded the files necessary to perform the latest restore for a single database.

To resolve this, and allow all files to be downloaded from Blob Storage, I have added a new configuration key called DownloadFilesFromAzure, which takes True or False values.

Another new feature is an explicit option to upload all local files to Blob Storage during a sync. Previously, it was implied that all local files in the LocalPath should be uploaded, but you may not want to do that. This is implemented as configuration key CopyFilesToAzure, which takes True or False values.

Deleting Files

There are now two ways to delete files from a Blob Storage Container:

  • Files that do not match the source, that must be deleted off the target;
  • Files that are no longer needed on the target.

The first option is a typical synchronisation feature and was implicit in the previous version. It is now implemented using the configuration key DeleteMissingFilesFromAzure, which takes a True or False value. If it is set to True, files that do not exist on the local drive will be deleted from Blob Storage.

The second option is for deleting files that match a certain string in the file name. This is handy for server migrations where the file names generated by Ola’s backup script contain the old server name. While the backup script can perform cleanup tasks based on timestamps, it will ignore files that have a different server name, and you might be left with orphaned files long after the backup retention window has passed.

The configuration key, called DeleteExplicitFilesFromAzure takes True or False values, plus an additional configuration key, called ExplicitFilesToDeleteMatchingString. Here you can put a string containing the old server name, and any matching file with that name will be deleted. This particular feature only works with one string at a time. You will have to run it more than once if you need to delete file names that match other strings.

Fixes

Yes, the only logged issue in the repository has been fixed! I now use sp_executesql instead of EXEC for the T-SQL portion of the restore tool. This was probably the easiest thing to fix.

A more critical fix, and the main reason for this blog post and tool update, is to do with downloading files.

Imagine an all-too-common scenario where a download fails before it is complete. In an emergency, the last thing you need is your Restore tool failing. Whereas before I was simply using the name of the file, I now also check file size as well. If the file sizes do not match, the file will be downloaded from Blob Storage again.

Files now download from smallest to largest in size. This is a major benefit if you have a lot of small transaction log files.

Notes

Reading the list of files from Blob Storage takes about ten seconds for 2,500 files, before parsing can begin. This is not a lot of time, but it’s something to keep in mind.

Feature Requests and Bug Reports

If you have any feature requests, or have found a bug, please log that on the GitHub repository. Even better, if you know C#, you can add your own features and fix your own bugs, submit a pull request, and I’ll merge your code into the tool.

Questions, comments or complaints? Twitter is the place: @bornsql is where you’ll find me.

Basic Availability Groups in SQL Server 2016

One of the new features in SQL Server 2016 Standard Edition, which may appeal to users of the deprecated Database Mirroring feature, is Basic Availability Groups, a variation of Availability Groups found in the Enterprise Edition.

At a high level, Availability Groups are an amalgamation of Database Mirroring and Windows Failover Clustering. The Windows Servers underneath the Availability Group are clustered, but the SQL Servers installed on each server are installed as standalone instances.

The feature is restricted to two nodes, a primary and secondary. One of the big differences in Standard Edition is that the secondary node cannot be a read-only replica, which means you cannot run it as a reporting instance. However, you might be able to do it off a database snapshot on the secondary.

(As licencing is not clear yet until the product is released, it is important to bear in mind that any or all of this may change.)

Fellow community member, Brent Ozar, has written a how-to for setting up Basic Availability Groups on SQL Server 2016, if you would like to read more about it.

For more information about other new features of SQL Server 2016, you can read my article about Temporal Tables.

Filmmaking as a Metaphor for the DBA

I worked on four films in 2015, three shorts and one feature-length movie, all shot in Calgary where I live. That has resulted in seven IMDb credits for me, someone who earns a living as a DBA.

If nothing else, that experience has scratched an itch I’ve had since I was old enough to wonder what it would be like to act in a movie.

But acting isn’t filmmaking. It’s a very small part of the big picture, along with directing, producing, set building, makeup, lights, cameras, craft services, animal trainers, and so on.

DBAs also do a lot of work behind the scenes to make sure everything works the way it should. The sign of a good DBA is a system that works as expected. The sign of an excellent DBA is recovering from failure, affecting anyone else as little as possible.

Like being an excellent DBA, making films is hard work. Purely from an acting perspective, there are lots of lines to learn, repeating them over and over again, and then having to wait for someone to reset the camera, move some lights or the boom mic, and then do it all over again.

Exactly the same way.

Acting is the antithesis of automation. For example, it can take nine hours to film five pages of a script. Each page in a screenplay equates roughly to one minute of screen time. When I directed our last short, we shot eighteen pages in seven hours. That’s almost unheard of.

In information technology, we are encouraged to automate any repetitive task.

In front of the camera, we can’t automate our lines. Continuity is critical, so that the cup you’re holding at 8:15am during the master shot, is in the same hand at the same line, with the same level of liquid, as the close-up shot at 11pm.

I have also done a little bit of voice acting. Have you seen the film Singin’ in the Rain, starring Gene Kelly and Debbie Reynolds? She plays a voice-over actor who must redo all the voice parts for Jean Hagen’s character, in a process called ADR (automated dialogue replacement) or Looping.

There’s nothing automated about it. You see the scene and the current audio, and get a metronome counting you in for two or three beats, then you record your dialogue, trying to match against the picture. It’s expensive and time-consuming, and never quite matches.

Sometimes you have to do it in voice acting too. Except, excluding some very minor exceptions, there’s no picture to watch yet. You are in a booth, with headphones, a microphone, and pop filter in front of you. In my case, there’s also an HD web cam in there so that the outside world can see in. In other studios, the booth may be soundproof glass and have the recording equipment and director in view. It’s a very lonely space.

Either way, if I have to do ADR for a movie like Debbie Reynolds did, she’d have a picture to lip sync with. In voice acting, if you have to do ADR, there’s no picture. You hear the original track, you get counted in, and then you do your line while the old one is playing in your headphones.

Try recording yourself, playing it back, and then saying the same line over again, exactly the same way.

Being a DBA has a lot of similarities:

  1. Repetitive tasks
  2. Attention to detail
  3. Troubleshooting with no visual guides
  4. Trying to do something complicated while someone is talking in your ear
  5. Someone is always judging you
  6. You have to go with your instincts sometimes.

Someone asked me recently whether I would choose between being a SQL Server professional, or a filmmaker. I answered that I couldn’t choose. They complement each other and keep me sane.

Thanks for reading. If you’d like to comment on Twitter, find me under @bornsql.