In February 2011, Pat Wright invited us to talk about Automation:
So the topic I have chosen for this month is Automation! It can be Automation with T-SQL or with Powershell or a mix of both. Give us your best tips/tricks and ideas for making our lives easier through Automation.
I’ve spoken about this topic here previously (Automation is the new black), and in order to avoid repeating myself I’ll use a specific example of where automation has helped a customer of mine in a small but significant way.
I’m working on a long-term project to help a customer currently on SQL Server 2014 upgrade to SQL Server 2019. The main stumbling block is that they have many years of archive data, in the form of XML audit records, in their production database. Every time a row is updated or deleted by a stored procedure, the previous version of the row is written as XML to an append-only audit table in its own filegroup. The customer wants to reduce the amount of storage used by this archive, which by law they must keep for ten years, so one of the things I’ve been investigating (and wrote about previously here as well) is what kind of data compression makes the most sense with the least impact on performance.
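To make the pattern concrete, here is a minimal sketch of that audit design. The table, procedure, and filegroup names below are illustrative (the customer’s real schema is obviously different), but the shape is the same: before a stored procedure modifies a row, it serializes the old version to XML and appends it to the audit table.

```sql
-- Illustrative schema only; the real audit table and procedures differ.
CREATE TABLE dbo.AuditArchive
(
    AuditID     BIGINT IDENTITY(1, 1) NOT NULL,
    SourceTable SYSNAME NOT NULL,
    AuditDate   DATETIME2(7) NOT NULL DEFAULT SYSUTCDATETIME(),
    RowData     XML NOT NULL,
    CONSTRAINT PK_AuditArchive PRIMARY KEY CLUSTERED (AuditID)
) ON [AuditData]; -- the append-only audit data lives in its own filegroup
GO

CREATE PROCEDURE dbo.UpdateCustomerEmail
    @CustomerID INT,
    @Email      NVARCHAR(256)
AS
BEGIN
    SET NOCOUNT ON;

    -- Write the previous version of the row to the audit table as XML
    -- before the modification happens.
    INSERT INTO dbo.AuditArchive (SourceTable, RowData)
    SELECT  N'dbo.Customer',
            (
                SELECT old.*
                FROM dbo.Customer AS old
                WHERE old.CustomerID = c.CustomerID
                FOR XML PATH('row'), ROOT('Customer')
            )
    FROM dbo.Customer AS c
    WHERE c.CustomerID = @CustomerID;

    UPDATE dbo.Customer
    SET Email = @Email
    WHERE CustomerID = @CustomerID;
END;
GO
```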
We have finally settled on using a clustered columnstore index on the archive table, but this requires converting the XML to NVARCHAR(MAX) first. There will need to be a period of time during the upgrade / migration when the applications are all offline for this conversion process. The stored procedures don’t need to be modified because they’ll still be writing XML to the table; it will just be stored as NVARCHAR(MAX). However, the way we read from this table will need some modification, because we’re also doing away with all but one of the non-clustered indexes.
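As a rough sketch (again using the illustrative dbo.AuditArchive table from above, and assuming the existing clustered primary key and the redundant non-clustered indexes have already been dropped), the offline conversion looks something like this on SQL Server 2019:

```sql
-- Add an NVARCHAR(MAX) column and copy the XML into it
-- (in practice this UPDATE runs in batches).
ALTER TABLE dbo.AuditArchive ADD RowDataText NVARCHAR(MAX) NULL;
GO
UPDATE dbo.AuditArchive
SET RowDataText = CONVERT(NVARCHAR(MAX), RowData);
GO

-- Swap the columns around so the existing stored procedures keep working.
ALTER TABLE dbo.AuditArchive DROP COLUMN RowData;
GO
EXEC sp_rename N'dbo.AuditArchive.RowDataText', N'RowData', N'COLUMN';
GO

-- Compress the table with a clustered columnstore index. NVARCHAR(MAX)
-- columns are supported in columnstore, XML columns are not, which is why
-- the conversion is needed in the first place.
CREATE CLUSTERED COLUMNSTORE INDEX CCI_AuditArchive ON dbo.AuditArchive;
GO
```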
So how does this relate to the topic of automation? Let’s take a step back in time to 2013 when this auditing design pattern was first established.
The technical specification required that all stored procedures that perform data modifications must write the old row as XML, which is then written to the audit table. Owing to a major oversight on the application side, no security was implemented at the API level (the application programming interface the web service uses to speak to the database). In other words, anyone with the right permissions on the application side could read any data from the database.
The fastest way to get a fix in place was for every stored procedure to get a new session GUID parameter. The stored procedure would check that session GUID against the UserSession table, which contains all current user sessions and their associated expiry dates. If the expiry date was still in the future, the procedure would update it to the current time plus a configurable timeout (in this case, 20 minutes); otherwise it would raise an error and not return any data.
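The pattern looked something like the following sketch. The @SessionID parameter, the UserSession table, and the 20-minute timeout are from the description above; everything else (procedure, table, and column names) is illustrative.

```sql
CREATE PROCEDURE dbo.GetCustomer
    @SessionID  UNIQUEIDENTIFIER,
    @CustomerID INT
AS
BEGIN
    SET NOCOUNT ON;

    -- Reject the call if the session doesn't exist or has already expired.
    IF NOT EXISTS
    (
        SELECT 1
        FROM dbo.UserSession
        WHERE SessionID = @SessionID
          AND ExpiryDate > SYSUTCDATETIME()
    )
    BEGIN
        RAISERROR(N'Invalid or expired session.', 16, 1);
        RETURN;
    END;

    -- Sliding expiry: push the expiry date out by the configurable timeout
    -- (20 minutes here). This UPDATE is the data modification that ends up
    -- in the audit table on every single call.
    UPDATE dbo.UserSession
    SET ExpiryDate = DATEADD(MINUTE, 20, SYSUTCDATETIME())
    WHERE SessionID = @SessionID;

    -- Only now does the procedure do its actual work.
    SELECT CustomerID, Email
    FROM dbo.Customer
    WHERE CustomerID = @CustomerID;
END;
GO
```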
This solved the immediate security problem, but it introduced an unintended consequence which only became apparent during this upgrade project: because all data modifications were being tracked in the audit table, this included modifications to the UserSession table, and therefore every update to the expiry date. In other words, every time someone accessed data using a stored procedure, the UserSession table was updated, and that row was dutifully converted to XML and written to the audit table, even if the data access itself was not a modification.
On the upside, this meant the application was fully tracking all data access via stored procedures. On the downside, after I ran a small SELECT query it turned out that more than 50% of all rows in the audit table, and between 35% and 40% of its storage, are session state modifications that are useless after 30 days.
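For what it’s worth, that query was nothing more sophisticated than a grouped count over the audit table; something along these lines, using the illustrative schema from earlier:

```sql
-- How many audit rows, and how much storage, does each source table account for?
SELECT  SourceTable,
        COUNT_BIG(*) AS AuditRows,
        SUM(DATALENGTH(RowData)) / 1024. / 1024. AS AuditMB
FROM dbo.AuditArchive
GROUP BY SourceTable
ORDER BY AuditMB DESC;
```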
Our fix, using good old automation, was twofold. First, a manual task went through the audit table and removed all audit records originating from the UserSession table that were older than a month. Second, we created a SQL Agent job that runs every night before the regular maintenance tasks; it deletes all rows from the audit table where the source is the UserSession table and the row is older than seven days.
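At its heart the nightly job is a single DELETE. A sketch of what it runs (again with illustrative names), batched so it doesn’t hammer the transaction log:

```sql
-- Remove session-state audit records older than seven days, in batches.
DECLARE @rows INT = 1;

WHILE @rows > 0
BEGIN
    DELETE TOP (5000)
    FROM dbo.AuditArchive
    WHERE SourceTable = N'dbo.UserSession'
      AND AuditDate < DATEADD(DAY, -7, SYSUTCDATETIME());

    SET @rows = @@ROWCOUNT;
END;
```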
Automation isn’t always glamorous or clean. If a task is repeatable, it should be automated. In this example we were able to automate away a problem that had a significant impact on archival storage, without requiring any code changes.
Share your scary automation story in the comments below.
One of my better automation tricks was when I implemented Service Broker. Implementation on the test server went perfectly, and my light load testing showed no performance impact on either of the SQL instances. When it went to a real-world implementation, though, things went sideways with tons of deadlocks. The problem: I was starting a new conversation with each message and closing the conversation as soon as I was done with the message. On test, this was maybe one per hour. On live, we were looking at 70,000+ messages in an 8-hour work day, so deadlocks were happening and things were going sideways quickly.
First step: disable Service Broker while I investigated. Second step: adjust the Service Broker messaging to reuse conversation handles that were already open. This works great and gives the added benefit of messages arriving in order (they ONLY arrive in order on the same conversation!). The problem that arises: conversations can get slow if they are left open for too long.
The automation: closing all conversation handles and starting new conversations on a daily basis via a SQL Agent job. I tested this manually (after hours), which proved the solution would work, and then automating it meant I didn’t have a daily evening task to do. We get the performance benefit of uncluttered conversation handles and the benefit of in-order messaging. Win-win!
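For anyone curious, the daily recycle boils down to ending every open conversation for the service so the next message starts a fresh one. A rough sketch (the service name here is made up):

```sql
-- End all open conversations for the service; the next message sent will
-- start a new conversation, keeping the per-conversation ordering guarantee.
DECLARE @handle UNIQUEIDENTIFIER;

DECLARE conversations CURSOR LOCAL FAST_FORWARD FOR
    SELECT conversation_handle
    FROM sys.conversation_endpoints
    WHERE far_service = N'//Example/TargetService'
      AND state_desc <> N'CLOSED';

OPEN conversations;
FETCH NEXT FROM conversations INTO @handle;

WHILE @@FETCH_STATUS = 0
BEGIN
    END CONVERSATION @handle;
    FETCH NEXT FROM conversations INTO @handle;
END;

CLOSE conversations;
DEALLOCATE conversations;
```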
Some inspired thinking there! Thanks for sharing.