In the following video, I take a look at the three most important transaction modes in SQL Server: autocommit, explicit transactions, and implicit transactions. Sorry, batch-scoped transactions, but nobody loves you.
If you’d prefer a textual rendering of the video, here are the pre-demo and post-demo sections, lightly edited for narrative flow.
Setting the Stage: Transactions and Modes
What I want to do in today’s post is to cover the different sorts of transaction modes and get into the debate about whether you should use explicit transactions or rely on auto-committed transactions for data modification in SQL Server. This came from an interesting discussion at work, where some of the more recent database engineers were curious about our company policy around transaction modes and understanding the whys behind it. I didn’t come up with the policy, but my thinking isn’t too far off from the people who did.
But before I get too far off course, let’s briefly lay out some of the basics around transactions.
When you modify data (that is, run a command like INSERT, UPDATE, MERGE, or TRUNCATE) or tables (CREATE, ALTER, DROP, etc.), that operation takes place inside a transaction. A transaction is, according to Microsoft Docs, a single unit of work. Everything in a transaction will either succeed as a whole or fail as a whole–you won’t end up with some operations succeeding and others not–it’s really all or nothing. The importance of this goes back to relational databases having ACID properties, but because that’s a little far afield of where I want to go, I’ll give you a link if you’d like to learn more about the topic, as it helps explain why transactions are useful for relational database developers.
What I do want to get to is that there are three kinds of transactions: autocommit transactions, explicit transactions, and implicit transactions. There’s actually a fourth kind, batch-scoped transactions, but that only applies to Multiple Active Result Sets transactions and if you find yourself there, you’ve got bigger problems than deciding how you want to deal with transactions.
In the demo for the video, I show off each of the three transaction modes, including how you enable them, how you work with them, and any important considerations around them.
The easy recommendation is, don’t use implicit transactions. For SQL Server developers and database administrators, this is unexpected behavior–the default is to use autocommit, so that if you run an INSERT statement by itself, the transaction automatically commits at the end. If you set implicit transactions on, there is no UI indication that this is on and it becomes really easy to forget to commit a transaction. I understand that if you come from an Oracle background, where implicit transactions are the norm, it might feel comfortable to enable this, but it becomes really easy to start a transaction, forget to commit or rollback, and leave for lunch, blocking access to a table for a considerable amount of time. And if you’re using Azure Data Studio, it appears that implicit transactions might not even work, so you’d be in a world of hurt if you were relying upon them. So let’s throw this one away as a recommendation.
My recommendation, whenever you have data modification on non-temporary tables, is to use explicit transactions over autocommit. I have a few reasons for this.
First, consistency. Sometimes you will need explicit transactions. For example, if I need to ensure that I delete from table A only if an insert into table B and an update of table C are successful, I want to link those together with an explicit transaction. That way, either all three operations succeed or none of them succeed. Given that I need explicit transactions some of the time, I’d rather be in the habit of using them; so to build that habit, I’d prefer to use them for all data modification queries.
Second, explicit transactions give you clarity around what is actually necessary in a transaction. Suppose you query a table and load the results into a temporary table. From there, you make some modifications, join to other tables, and reshape the data a bit. So far, nothing I’ve mentioned requires an explicit transaction because you’re only working with temp tables here. When you take the final results and update a real table, now we want to open a transaction. By using an explicit transaction, I make it clear exactly what I intend to have in the transaction: the update of a real table, but not the temp table shenanigans.
Third, as an implication of the second point, explicit transactions can help you reduce the amount of time you’re locking tables. You can do all of your heavy lifting in temp tables before opening the transaction, and that means you don’t have to do that while locking the real table. In the best case, autocommit will behave the same, but saying “Here is where I want my transaction to be” also lets you think about whether you really want to do everything at one statement or break it up into smaller chunks.
Finally, if you use a loop, whether that be a cursor or WHILE statement, you can control whether you want one transaction per loop iteration or one transaction in total, and that’s entirely to do with whether you begin and commit the transaction outside of the loop or inside. Having one transaction in total can be considerably faster in some circumstances, but if you have an expensive action in the loop, you can commit after each loop iteration. This will minimize the amount of time you block any single operation waiting to access this table. It will increase the total runtime of your query, but minimize the pain to other users, and that’s a trade-off you can only make if you use explicit transactions.
Rules of Thumb
First, if you have a stored procedure which is simply running a SELECT statement, use autocommit. There’s no real advantage to putting this into an explicit transaction and there is the downside that you might forget to commit.
Second, if you have a stored procedure which performs data modification on non-temporary tables, use an explicit transaction only over the area which modifies data. Don’t begin the transaction until you’re ready to start modifying tables; this will minimize the amount of time you need to keep the transaction open and resources locked.
As a corollary of the second point, note that you can use explicit transactions to control parent-child relationships with stored procedures, where the parent begins a transaction, calls each child, and rolls back or commits at the end depending upon the results. That’s something you can’t do with autocommit, as each data modification statement would run in its own auto-committed transaction.
Third, if you are working with non-global temporary tables beforehand, don’t include any modification of those inside the explicit transaction. If you are working with global temporary tables, I suppose you should treat them like non-temporary tables here if you expect other sessions to use them and care about blocking, though there’s a pretty small number of cases where it makes sense to have global temporary tables with multiple users, so I’d call that an edge case.
Fourth, in a loop, choose whether you want to put the explicit transaction around the loop or inside it. In most cases, I prefer to put the transaction inside the loop to minimize the amount of time that I’m blocking others. This is probably the smarter move to make in busy transactional environments, where you want to prevent blocking as much as possible. Also, If one loop iteration fails, you’ll have less you need to roll back, so you can fix the issue and pick back up where you left off. Note that at that point, you trade atomicity on the entire set of data for atomicity on a block of data, so if that’s a big enough concern, bite the bullet and put an explicit transaction around the loop. Or see if you can make it faster without a loop.
Fifth, outside of a stored procedure—that is, when I’m just writing ad hoc statements in a client tool—use explicit transactions if you’re doing something potentially risky. I know this brings up the question of “Why are you doing risky things in a client tool to begin with?” But that’s a story for a different day.
Sixth, watch out for nested transactions. In SQL Server, there’s very little utility in them and their behavior is weird. Paul Randal explains in great detail just how broken they are, and I’d rather the product never have had them. Anyhow, check to see if you’re in a transaction before opening one. The pattern I like to use comes from my Working Effectively with Legacy SQL talk (which, ironically enough, needs some changes to be brought up to date) and originally from smart people in the office who put it together before I got there. Here’s a simplified version of it for a sample stored procedure:
CREATE OR ALTER PROCEDURE dbo.GetFraction @Divisor INT = 5 AS DECLARE @AlreadyInTransaction BIT; BEGIN TRY IF ( @@TRANCOUNT > 0 ) BEGIN SET @AlreadyInTransaction = 1; END ELSE BEGIN SET @AlreadyInTransaction = 0; BEGIN TRANSACTION; END; -- Note: this is where you'd put your data modification statements. SELECT 1.0 / @Divisor AS Quotient; IF ( @AlreadyInTransaction = 0 AND @@TRANCOUNT > 0 ) BEGIN COMMIT TRANSACTION; END; END TRY BEGIN CATCH IF ( @AlreadyInTransaction = 0 AND @@TRANCOUNT > 0 ) BEGIN ROLLBACK TRANSACTION; END; THROW; END CATCH GO --Test the procedure EXEC dbo.GetFraction @Divisor = 5; --Start an explicit transaction BEGIN TRANSACTION EXEC dbo.GetFraction @Divisor = 5; SELECT @@TRANCOUNT; ROLLBACK TRANSACTION
Finally, make sure you roll back the transaction on failure. If you write code using try-catch blocks, commit at the end of the TRY block or rollback at the beginning of the CATCH. Explicit transactions offer you more power, but come with the responsibility of handling transactions appropriately.
What are your thoughts on explicit transactions versus autocommit? Do you prefer the ease of autocommit or the power of explicit transactions? Or where do you draw the line between the two? Leave your thoughts in the comments section below—either here or on the video—and let me know.