2016 in review

Lots of interesting things happened this year in the world of data.

As 2016 draws to a close, I’d like to encourage you to find a piece of Azure SQL Database, or SQL Server, and play with it, understand it, and teach someone else how it works.

One of the projects I’m looking forward to in 2017, is dealing with a massive volume of data from Internet of Things (IoT) devices. SQL Server seems up to the task.

See you on the other side!

Remote Storage Sync and Restore updated

Recently, I presented my “Back up and Restore your Database with Azure Blob Storage” session to the Bellingham PASS Chapter.

There are two components to the Azure Blob Storage Sync and Restore utility: Sync and Restore.

I wrote the Sync tool to work around difficulties I’d experienced with Azure Blob Storage’s Infinite Lease, and limitations in AzCopy. Lately these problems have been made mostly redundant, but the Restore tool remains extremely valuable, and I use it regularly for testing backups.

I’m happy to announce that new versions of both the sync and restore components have been released today, and they have a new name.

Both include some new features, which I’d like to highlight below, mostly to the Restore tool.

New Features

Point-In-Time Restore

During the Bellingham session, one of the questions asked was whether the Restore tool could perform a point-in-time restore using the RESTORE ... STOPAT keyword.

You can now perform a point-in-time restore using the Restore tool, and it will only download the files from the latest backup that it needs, up to and including that point in time.

Note: This feature only works on the latest available backup. A feature enhancement will be included in a later release to perform a point-in-time restore for any backup, if the requisite files exist.

Download Indicator

It’s handy to know how fast a file is downloading, but there aren’t easy ways to do this on a command-line tool, and it can get confusing in a disaster recovery scenario if you don’t know if the download is stuck.

AzureBlobStorageRestore will now show your download progress, using a rotating cursor similar to one I first saw on the MS-DOS version of Doctor Solomon’s Antivirus.

Big shout-out to Daniel Wolf for his ProgressBar gist (available free under the MIT License).

Sort Order

Previously, the Restore tool would download files from smallest to largest, but this could be disconcerting in a disaster recovery scenario as it might look like files are missing.

Now files can be downloaded in alphabetical order, or the default of smallest to largest.

One More Thing

There’s a lot of emphasis on off-site storage these days, and not everyone uses Azure Blob Storage.

For example, one of my customers has a 5TB file share, mapped using UNC (in the format \\server\share) and wants to copy files there.

What if you already keep tape backups, managed by a company that takes your tapes off-site in a fireproof lockbox, and all you need to do is get the files to a network share to be backed up to tape?

What if you just want to copy your database backup files to a local network share as well as Azure Blob Storage?

What if you know you should be moving your backup files to a network share but keep forgetting to do so?

More importantly, what if your backup files are already on a local network share and you have a disaster recovery situation. Downloading from Azure Blob Storage will take too long, and you just want to restore the latest backup from a local network share.

Introducing File Storage Sync and Restore.


Like its older sibling, FileStorageSync synchronises the contents of a local directory with a network share using a UNC path (in the format \\server\share). You can provide a username and password as well, which is convenient for creating scheduled tasks. All other features from AzureBlobStorageSync are included.


Like its older sibling, FileStorageRestore parses all the files in a UNC share, and downloads only the the latest full, differential and transaction log backup files required to perform a restore of your database.

It also includes Point-In-Time Restore (for the latest backup) and Sort Order customisation.

A download indicator is not provided in this version because local network speeds are usually so quick that it would add unnecessary overhead.

All other features from AzureBlobStorageRestore are included.

Note: A future enhancement which restores the latest backup from the UNC network share, without copying the files first, is coming soon. You will be able to restore your database as fast as possible.

If you want to play with the new features, you can find them in the GitHub repository.

This project is released free under the MIT License (free for commercial and non-commercial use), and all you have to do is give me credit.

Find me on Twitter at @bornsql if you have any thoughts. Feature requests can be submitted on the GitHub repository itself.

Look, Ma, No Surprises

Last week I demonstrated at least 30% performance improvement by switching to memory optimised table-valued parameters on SQL Server 2016.

This week I will demonstrate the same test using Azure SQL Database, on the Premium tier, where In-Memory OLTP is supported.

My test harness is the same. I will create a temp table and a table variable:

-- Temp table creation
CREATE TABLE #temptable

-- Table variable creation
DECLARE @tablevariable AS TABLE

Now I will create a standard table-valued parameter:


DECLARE @TempTVPStandard AS dbo.TVPStandard;

Finally, I will create a memory-optimized table-valued parameter (there is no requirement to separately enable In-Memory OLTP in Azure SQL Database):


DECLARE @TempTVPMemory AS dbo.TVPInMemory;

So far, nothing is different from last week’s scenario. We now have the following structures at our disposal:

  • #temptable (a temp table)
  • @tablevariable (a table variable)
  • @TempTVPStandard (a standard TVP)
  • @TempTVPMemory (a memory-optimized TVP)

I’m going to use the same WHILE loop again, but instead of a million runs, I’ll do 1000, 10,000 and 100,000, because I’m paying for this instance of Azure SQL Database (I picked a Premium P1, with 125 DTUs) and I’m a cheapskate. I doubt the 125 DTUs is even enough to run a million times for the fourth option.



WHILE @i < <number of executions>
INSERT INTO <object name>
SELECT @i = @i + 1;




Like last week, at low row counts all four data structures performed around the same speed. Tempdb is shared with other Azure SQL Database customers, so I expected to see that slower.

I ran each series three times and took the lowest value from each run.

Data Structure 1,000 10,000 100,000 1,000,000
Temp Table (Clustered) 94 ms 453 ms 4,266 ms 44,955 ms
Table Variable (Clustered) 93 ms 344 ms 3,484 ms 34,673 ms
Standard TVP (Clustered) 94 ms 343 ms 3,500 ms 34,610 ms
Memory-Optimized TVP
78 ms 203 ms 1,797 ms No Time

Unsurprisingly, because Azure SQL Databases share tempdb with other customers, the IO-bound figures are higher than those on my dedicated laptop, no matter how old it is.

The big winner here, again, is the memory-optimized table-valued parameter. More than twice as fast as a temp table, and almost twice as fast as a table variable or standard TVP.

Note, however, that because my Azure SQL Database was resource-constrained, I was unable to run the one-million step WHILE loop.

This is an excellent example of how In-Memory OLTP is a trade-off, no matter whether you’re running on premises or in the cloud. While you do consistently get much better performance, it is not reliable for large data structures.

If you have any In-Memory OLTP tweaks you’d like to share, find me on Twitter at @bornsql.

Memory-Optimized Table-Valued Parameters

November brings me back to my regular weekly posting schedule.

I attended PASS Summit 2016 last week, and saw a lot of great sessions. Aside from the new stuff, what I also noticed is how people are making use of existing technology to solve performance problems.

I have a customer that is targeting their vendor app to run in Azure SQL Database. They’ve run the Advisor and have discovered a number of places where they make use of temp tables. While this is not in itself a problem, how they check for their existence is a problem.

Azure SQL Database does not allow you to refer to tempdb by name. Any T-SQL construct referring to a temp table using the form tempdb..#table is explicitly forbidden.

One of the workarounds I recommended to them was to replace their use of temp tables with TVPs (table-valued parameters). There is no performance benefit to doing so, but it works.

Then I remembered that Azure SQL Database Premium Tier now supports In-Memory OLTP.

If you recall in a previous post, I suggested that Standard Tier and Premium Tiers are functionally equivalent to Standard Edition and Enterprise Edition respectively. This is no exception. You can create Memory Optimized TVPs in the Premium Tier of Azure SQL Database.

(Edit: With SQL Server 2016 Service Pack 1, you can now create Memory-Optimized objects in all editions of SQL Server.)

In other words, for these short-lived temp tables, there’s not only an added benefit of no longer worrying about referring to tempdb..#table, but you also get a massive performance improvement as well.

There are some legitimate concerns, though. Any Memory Optimized structures naturally require dedicated memory, so if you are already memory-bound on your Azure SQL Database (see the DTU post I wrote last month), this is going to hurt you. On the other hand, by not using tempdb at all, the performance improvement (as much as 100% in what I saw in the sessions last week) may offset that enough to warrant its use.

Next week, I will compare the performance characteristics of temp tables, standard TVPs, and Memory Optimised TVPs, both on-premises (using SQL Server 2016 Developer Edition, which is the same as Enterprise Edition) and Azure SQL Database (using a Premium Tier database).

If you have any other quick performance wins using new features of SQL Server 2016 or Azure SQL Database, find me on Twitter at @bornsql .

Should I upgrade SQL Server 2008 / R2, or migrate to Azure?

Brent Ozar Unlimited runs a website, called SQL Server Updates, which comes in really handy for keeping your on-premises SQL Server up to date.

Last week I noticed something interesting: if you’re running SQL Server 2008 or 2008 R2, Microsoft’s extended support for it ends on 9 July 2019. That’s two versions of SQL Server, almost two years apart in release cycle, with different feature sets, expiring on the same day.

Granted, by that date SQL Server 2018 will probably have been released, which would make 2008 officially five versions old and 2008 R2 four versions old. But in corporations, time moves both slowly and quickly. By that I mean, planning to upgrade takes a long time, and before you know it, it’s 9 July 2019 and everyone is panicking.

It’s the right time, today, to start your upgrade plan. A lot of customers I speak to, running on 2008 or 2008 R2, are talking about upgrading to 2012 or 2014 right now.

On Twitter, I said that after the end of next year, I would not recommend upgrading any instances to SQL Server 2012, for this reason:

In 2019, moving to SQL Server 2016 buys you 7 more years of support. I wouldn’t consider moving to SQL Server 2012 after the end of 2017.

At the end of 2017, SQL Server 2012 will be more than five years old. Despite your best intentions assuming a major upgrade will be quick, and considering that SQL Server licensing costs the same regardless of the version you buy, there’s no practical reason not to go to at least SQL Server 2014, and preferably SQL Server 2016 (which I consider the biggest release since SQL Server 2005). You can always run your database in 2008 Compatibility Mode if you’re running SQL Server 2014.

I wrote down some reasons to upgrade from SQL Server 2005 last year. These reasons still apply.

Whither Azure?

Which brings up the big question: what about Azure SQL Database? As you know, my first question when customers ask about moving to Azure SQL Database is “Are you sure?”, followed by “Are you very sure?”.

By the time July 2019 rolls around, Azure will be ten years old. That’s hard to fathom right now, but at their current rate of progress, Azure SQL Database will be a mature platform (it already is mature for appropriate workloads, when taking elastic pools into account).

So, if you meet all of the following conditions:

  • you use SQL Server 2008 or 2008 R2,
  • you plan to upgrade after 2017, and
  • you’re running a database smaller than 1TB,

I would recommend that you seriously consider migrating to Azure SQL Database, instead of upgrading to an on-premises database server running SQL Server. The time is right, at the end of 2016, to spend 18 months on an appropriate migration strategy to achieve this.

Yes, 18 months from now we’ll be deep into 2018. Scary, isn’t it?

Act now.

If you want to discuss this on Twitter, look me up at @bornsql.

Temporal Tables in Azure SQL Database

In the latest Microsoft Azure newsletter I received last week was this most excellent news:

Azure SQL Database Temporal Tables generally available

Temporal Tables let customers track the full history of data changes in Azure SQL Database without custom coding. Customers can focus data analysis on a specific point in time and use a declarative cleanup policy to control retention of historical data. Designed to improve productivity when customers develop applications, Temporal Tables can help:

– Support data auditing in applications.
– Analyze trends or detect anomalies over time.
– Implement slowly changing dimension patterns.
– Perform fine-grained row repairs in cases of accidental data errors made by humans or applications.

For more information on how to integrate Temporal Tables in an application, please visit the Getting Started with Temporal Tables in Azure SQL Database documentation webpage. To use temporal retention, please visit the Manage temporal history with retention policy documentation webpage.

Long-time readers of my blog will remember my short series about Temporal Tables in SQL Server 2016. Now it’s time to play with them on Azure SQL Database too!

Changes to Table Schema

With Azure SQL Database, just like SQL Server 2016, you can change the table schema without breaking the link to the history table. From the above Getting Started link, it states that you perform standard ALTER TABLE statements, “and Azure SQL Database will appropriately propagate changes to the history table”. It’s good to see feature parity across products like this. Temporal tables even work on Basic-sized databases.

Go forth and play with Temporal Tables. You’ll no longer audit data changes the same way. Get the benefits of Change Data Capture without the need for massive complexity.

Share your CDC horror stories with me on Twitter, at @bornsql.

Azure Storage Throughput

Hot on the heels of DTUs from earlier this week, today we’re going to switch focus a little, while staying with Azure, and have a quick look at storage speeds on SQL Server running on Azure Virtual Machines (Infrastructure-as-a-Service).

Microsoft recently lowered their prices for more popular VMs, which is making this an attractive solution for a lot of folks who don’t want to run their servers on-site.

However, there’s a limit to how fast our storage can get, despite the attraction of solid-state drives on the Premium Storage offering.

We in the industry refer to “latency” when discussing storage. This is measured in seconds per MB (s/MB), or how long it takes to access 1 MB of data. Target latency for SQL Server is 5 ms to 10 ms. In other words, the best case scenario for SQL Server is that we should be able to access 1 MB of data in between 5- and 10-thousandths of a second.

Everything on Azure is virtualised, from the CPU and RAM, down to the storage layer. Behind the scenes, all of these components are connected by what Microsoft calls the “fabric layer”. In simpler terms, it’s the network, stupid.

Yes, it has a low latency. Yes, a customer of mine did save almost 50% of their budget by switching from a data centre to Azure, and yes, they did get a nice performance boost in the process.

However, there’s a limit, and that is the theoretical limit on network throughput. The absolute best case for any local network these days is 10 Gb/s, or in other words, 1.25 GB per second (there are 8 bits in a byte, so we have to divide the bits by eight to get the bytes per second).

In realistic terms though, the connection between our virtual machine and the SSD premium storage, depending on our region and storage tier, is more likely to be 1 Gbps, or 125 MB per second. It’s not really clear from the Azure Documentation what the actual limits are, but I’ve personally seen it max out at 125 MB/s. The region we host in matters a great deal.

Inverting that 125 MB/s figure to get the latency, we’re seeing 8 ms at best. That’s still within the window of 5–10 ms, so it’s not terrible. Now add in the fact that we’re not the only person using that underlying hardware, and the latency starts creeping up. There will be competition on the VM host, the network card, the storage, and it all adds up. Pretty soon we’re looking at 20–30 ms of latency, which is now more than twice as long as the preferred amount.

No matter what the CPU speed is, or how much RAM we have, our bottleneck is always going to be the network. Those of us who have experience with older SANs (storage area network) will recognise this. Even having the fastest spindles (hard drives) in the SAN is meaningless when we’re connected to the server with a 1 Gbps half-duplex network card (no sending traffic is allowed while receiving traffic).

As I say on this blog and in person, Azure is not the answer for everyone (whether we’re looking at SQL Server on an Azure VM or an Azure SQL Database). We need to make sure that we understand the inherent limits of the infrastructure before committing to a long-term project to migrate everything.

I’m not saying Azure isn’t a good idea. I just pointed out above that a client I migrated recently was very happy with their outcome. Their data centre provider offering was terrible, and Azure was a great improvement. I personally run a production system on SQL Server, running on a small Azure VM.

20–30ms might be twice as fast as your current solution. If your database is running on 7200 rpm SATA drives in a RAID 5 configuration, you’ll see a massive increase in performance by switching to an Azure VM with P30 Premium Storage.

Here are some fantastic resources to help you plan your migration better:

It all comes back to the question: Are you sure?

Let me know your Azure migration plans on Twitter, at @bornsql . I look forward to hearing from you.

Database Throughput Units

Last time we briefly touched on the metric that Microsoft uses to keep your Azure SQL Database in check: the DTU, or database throughput unit.

It uses a combination of CPU, I/O and log flushes/second, using a special algorithm, to calculate a single unit.

The reason for this is simple: Our databases don’t use the same resources all the time. We will have busy periods and quiet periods. Some of us are fortunate and only need the database to be available during business hours.

This leaves a bit of mystery around what sort of resources our database may need, if we want to run it in Azure instead of on premises.

The DTU Calculator, a third-party service created by Justin Henriksen (a Microsoft employee), will calculate the DTU requirements for our on-premises database that we want to migrate to Azure, by firstly capturing a few performance monitor counters, and then performing a calculation on those results, to provide the recommended service tier for our database.

Justin provides a command-line application or PowerShell script to capture these performance counters:

  • Processor – % Processor Time
  • Logical Disk – Disk Reads/sec
  • Logical Disk – Disk Writes/sec
  • Database – Log Bytes Flushed/sec

For best results, we should run this session during the busiest time of the day for our database, so as to get a realistic idea of what the load is. Justin recommends running it for one hour.

Once the results are captured, a CSV file containing the output is then uploaded to the DTU Calculator site. It will take a few seconds and then provide a recommended service tier for our database. We can then see how much it would cost to run our on-premises database in an Azure SQL Database environment (assuming of course that it is compatible).

In the sample I captured for an existing client for the purposes of this post, I was surprised to find that the Premium Tier was recommended, at level P2.

DTU Calculator graph

Based on your database utilization, we recommend you migrate your SQL Server workload to Premium – P2. This Service Tier/Performance Level should cover approximately 91.83 % of your utilization.

NOTE: There is approximately 8.17 % of your workload that falls into a higher Service Tier/Performance Level. After migrating your database to Azure, you should evaluate your database’s performance using the guidance mentioned in the  information section above.

This means that it would cost more than $1,350 (CAD) per month (assuming a 24/7 uptime) to run this database in the cloud at the P2 level. A SQL Server Standard Edition licence for the existing hardware, when it is time to upgrade, will cost a shade under $10,000 (CAD) for four cores. In other words, the licence will pay for itself within eight months.

Perhaps this is worthwhile for the customer, if they do away with hardware and other on-site costs, but if their database load increases for any reason and resource limits are repeatedly hit, they will have to move up to the next level (P3), which is immediately double the monthly cost.

Please make use of the DTU Calculator if you intend to migrate an existing database to Azure SQL Database. It should form an important part of your migration plan. There is no reason not to, and it will save you the pain of making a costly and time-consuming mistake.

Let me hear your DTU results on Twitter at @bornsql .

Azure SQL Database Limits

Let’s talk briefly about resource limits with Azure SQL Database. Because we have to share resources with other users, and because Microsoft doesn’t want us affecting others dramatically, they have implemented some limits.

If for some reason our database does get overeager, the operations relating to CPU, RAM and I/O will be queued up by the underlying resource manager, which will allow those operations to continue once the resources are free. Yes, that will slow down our system.

Resources other than CPU, Memory, Log I/O, and Data I/O, as noted by the documentation, will be denied, and clients will see an error message.

Microsoft uses something called DTUs (Database Throughput Units, a combination of CPU, RAM and log writes) to figure out our Azure SQL Database’s resource usage.

Check out the DTU Calculator to find out more, and see how your on-premises database compares. Next week, we will go into more detail about DTUs, and how to calculate the DTU usage of an on-premises database.

It gets really complicated, really quickly, but using the DTU Calculator before we decide to move to Azure SQL Database, and being conservative with our estimates, will make our lives easier.

The biggest an Azure SQL Database can be on the Standard Tier, is 250 GB. The biggest it can be on the Premium Tier, is 1000 GB.

Coincidentally, the SQL Server Customer Advisory Team (SQLCAT) wrote a blog post titled “Real-World Azure SQL DB: Unexpected Database Maximum Size Limit” last month, about a gotcha when upgrading between service tiers. It is recommended reading.

It helps to draw parallels between Standard Tier and Standard Edition of SQL Server, and Premium Tier and Enterprise Edition of SQL Server.

(Edit: With SQL Server 2016 Service Pack 1, you can now create Memory-Optimized objects in all editions of SQL Server.)

There is a Basic Tier for Azure SQL Database, but the resource limits are so strict that I barely consider it. Perhaps if you run a small MySQL database for a WordPress website, with a caching plugin, this may be appropriate.

Using elastic pools (the ability to share the count of resource DTUs between several databases in a pool, that shrinks and grows as your requirements change) can help with balancing costs across several databases, but it is going to seem more expensive at the outset.

The bottom line with Azure SQL Server is that the resource limits may take you by surprise if you don’t plan correctly. Spend the time to evaluate the requirements of your database, and don’t just use Azure SQL Database because it’s new and shiny. If you currently have an Enterprise licence for SQL Server, chances are you’ll spend the equivalent (or more) on a Premium Tier database.

Azure SQL Database is not a way to save money on your SQL Server licences. It is a different technology, with different resource requirements, that cater to a specific workload. Choose wisely.

If you have some thoughts about DTUs, come and talk to me on Twitter, at @bornsql.

Planning matters

It’s the 2016 PASS Summit at the end of this month, and I’ll be attending for my third year.

Because I’m a glutton for punishment, I’ll also be writing bonus posts for the month of October to celebrate Summit.

Fortunately, because I’m the luckiest person in the world, I am attending the Pet Shop Boys concert in Vancouver on the Monday evening beforehand, so that is going to be exciting, too.

Getting on a plane and flying from Calgary to Seattle is such a convenience these days. You book your flights, drive to the airport, breeze through airport security, and the plane jets you, undelayed, to Vancouver, where you hop on a train to your hotel, have dinner nearby, and then walk to the concert. The following morning after a good night’s uninterrupted sleep, you hop back on the train to the airport as scheduled, pass quickly through passport control (so as not to trouble you with that on the other side, one of the benefits of flying to the US out of Canada), once again breeze through airport security, and the plane jets you to Seattle-Tacoma airport, where you get off the plane quickly, fetch your luggage, breeze through customs, and there you are. A short train ride to the hotel later, you’re ready to face the networking track of the Summit for three days.

If only migrating a SQL Server database to Azure SQL Database were this simple.

Of course, I’m joking. There are many very important steps involved when thinking about flying to another country, even if it’s just for a few days. This is especially complex when you add in a side-trip to Vancouver for an evening. It requires careful planning, making sure enough time exists to get everything done. And you can’t forget your passport either.

So why do so many people in the Information Technology industry want to do things without careful planning?

Every time (and I’m guilty of this at times, too), a new technology comes out, and it becomes trendy, we all jump on the bandwagon and start playing around with it. So far, so good. That’s the equivalent of kicking the car tyres.

And then management says (because management plays golf with vendor sales, or the developer lead likes that it uses tabs instead of spaces) that we have to move to this new technology to save the company lots of money. Less good, but doable.

The next logical step is to do a proof of concept. That’s where we research the requirements of the existing system, the limitations of the new technology, and build a working model to see if it is fit for purpose. If it is not fit for purpose, the amount of time we have expended is minimal, and we continue bug fixing on the older technology stack, letting management know that it is not possible, or practical, to move to this new technology.

If the proof of concept shows merit, then we go back to the project manager and business analysts, and begin planning for migration. We include all the stakeholders, including customers (internal and external), and slowly develop clear goals and timelines. There’s a project lead, and politics don’t enter into it. And unicorns are real.

Why is it that we want so badly to “play” with the new technology that we lose sight of this necessary step?

Microsoft announced Azure SQL Database in 2009. While it has been seven years since that time, and children born in that year are now at school, it’s a very short time in the technology world. SQL Server itself is only 27 years old. COBOL, which is still very much in use, is 57 years old. C is 44 years old. Large chunks of the Internet infrastructure are built on C and C++.

My point is, if we are going to move a company’s systems to newer technology, we have to consider longevity:

  1. How long will the new technology be around?
  2. How long will the system we’re migrating be around?
  3. (Bonus question) Will someone getting upset on the Internet destroy my system?

The bonus question relates to the crisis NPM faced earlier this year, when a single 11-line JavaScript file caused thousands of projects, that depended on it, to fail.

Moving to Azure SQL Database is a complicated process. While the underlying structure might look familiar, and while most of our code will probably work fine, there are major stumbling blocks that could scupper our plans entirely.

Take for example the fact that you are not allowed to address tempdb directly. You can create local temp tables and table variables just fine, but you cannot check for their existence by referring directly to tempdb.

This works:

CREATE TABLE #test (col1 INT);

This does not:

IF EXISTS (SELECT * FROM tempdb..sysobjects
WHERE id = OBJECT_ID('tempdb..#test')

This “minor” issue occurs in many more places than you might think it does, because everyone reading this knows about that one legacy system we’re too scared to touch.

If management is pressuring you to migrate to the new stuff because it’s trendy, they’re doing it wrong.

Careful planning is what you need. This includes a proof of concept, a 100% pass rate using the Data Migration Advisor, and a project manager who understands the complexities involved, plus a proper testing plan that includes automated unit tests, edge cases, and humans bashing away at keyboards and mice.

And don’t forget to plan for a disaster, especially for a disaster that occurs during the migration.

Share your migration horror stories with me on Twitter, at @bornsql .