This is part seven in a series on near-zero downtime deployments.
Identity columns are common enough in the SQL Server world that I figured I would spend some time on them today. I’m also going to include sequences in here because it’s a pretty short post otherwise. And if I learned nothing else from German philosophers, turgid and obscurantist prose is the best prose.
Reseeding an Existing Identity Column
|Database release||Reseed value|
There are some tables where you create an identity value and expect to cycle through data. An example for this might be a queue table, where the data isn’t expected to live permanently but it is helpful to have a monotonically increasing function to determine order (just watch out for those wrap-arounds and you’re fine). An example of reseeding is below:
DBCC CHECKIDENT('dbo.MyTable', RESEED, 1);
This operation needs to take a
LCK_M_SCH_M lock, otherwise known as a schema modification lock. Any transactions which are writing to the table will block your transaction but so will any readers unless you have Read Committed Snapshot Isolation turned on or the reader is in the
READ UNCOMMITTED or
SNAPSHOT transaction isolation level.
If you are using RCSI and don’t have extremely long-running transactions, this is an in-and-out operation, so even though there’s a little bit of blocking, it’s minimal.
Adding an Identity Column
|Database pre-release||Create new table|
|Database release||Partition switch|
|Database post-release||Drop old table|
Kenneth Fisher has a great technique for switching out tables to avoid having to write a complicated backfill process. So let’s try it out with limiting downtime in mind.
First, I want to create a table. I’m going to load this with a pretty fair amount of data, too.
CREATE TABLE [dbo].[PredictionTest] ( [PredictedQuantitySold] [decimal](8, 3) NOT NULL, [ActualQuantitySold] [decimal](8, 3) NULL ) GO INSERT INTO dbo.PredictionTest ( PredictedQuantitySold, ActualQuantitySold ) SELECT TOP(1000000) RAND(CHECKSUM(NEWID())), RAND(CHECKSUM(NEWID())) FROM sys.all_columns ac1 CROSS JOIN sys.all_columns ac2;
Now we have a million rows and nowhere to call home. Let’s create a new table with an identity column and the same schema otherwise (including indexes, key constraints, and everything else). For more notes on this, check out Kenneth’s other blog post on changing identity column increments.
Here is our create statement for the pre-release phase:
IF (OBJECT_ID('dbo.PredictionTestNew') IS NULL) BEGIN CREATE TABLE [dbo].[PredictionTestNew] ( [PredictedQuantitySold] [DECIMAL](8, 3) NOT NULL, [ActualQuantitySold] [DECIMAL](8, 3) NULL, [PredictionID] BIGINT IDENTITY(5000000,1) NOT NULL ); END GO
Now, if you’re eagle-eyed, you noticed a little bit of a complication I added: PredictionID did not exist in the old table but it does in the new. If I simply try to perform a partition switch, I’ll get an error:
Msg 4943, Level 16, State 1, Line 42
ALTER TABLE SWITCH statement failed because table ‘tempdb.dbo.PredictionTest’ has 2 columns and table ‘tempdb.dbo.PredictionTestNew’ has 3 columns.
This means that I need to add the PredictionID column to the new table before performing this swap. Because I have to add a column at the end, our identity column needs to be the last one—columns between the two tables much be in the same order. Here is the script to perform the partition swap:
IF NOT EXISTS ( SELECT 1 FROM sys.columns sc WHERE sc.name = N'PredictionID' AND sc.object_id = OBJECT_ID('dbo.PredictionTest') ) BEGIN ALTER TABLE dbo.PredictionTest ADD PredictionID BIGINT NOT NULL DEFAULT(0); ALTER TABLE dbo.PredictionTest SWITCH TO dbo.PredictionTestNew; EXEC sp_rename N'dbo.PredictionTest', N'PredictionTest_DELETEMEAFTER20190501'; EXEC sp_rename N'dbo.PredictionTestNew', N'PredictionTest'; END
We are doing four things here:
- Add PredictionID to the existing table and filling it with 0 values so that we have something.
- Switching our data from
All of these are metadata operations and we will need to take a schema modification lock. All told, if there’s no upstream blocking, this is done in milliseconds.
In the post-release process, drop
PredictionTest_DELETEMENOW. There won’t be any rows in this table.
Here are a few notes as I wrap up this section:
- Your existing rows will not get identity values; only new rows will take advantage of the identity attribute. This means that if you don’t want a million values with a prediction ID of 0, you’ll need to fix that yourself with a backfill process.
- This also works to remove an identity integer, such as if you’re migrating from identity values to sequences.
- This also also works if you’re changing the increment value, so you might go from jumping 1 each insert to 2 each insert because you really hate odd numbers.
- If you want to add a primary key constraint as well, you’re probably better off taking the backfill route, which we saw in the post on table modifications.
Creating a Sequence
|Database pre-release||Create sequence|
|Database release||Alter tables|
Creating a sequence is pretty easy:
IF NOT EXISTS ( SELECT 1 FROM sys.sequences s WHERE s.name = N'PredictionTestSequence' ) BEGIN CREATE SEQUENCE [dbo].[PredictionTestSequence] AS BIGINT START WITH 10 INCREMENT BY 1; END GO
Once we have a sequence, we can use it in stored procedures to populate something like PredictionID (supposing we don’t use an identity integer). We might add something like this to an insert procedure for the prediction test table:
INSERT INTO dbo.PredictionTest ( PredictedQuantitySold, ActualQuantitySold, PredictionID ) VALUES (3, 2, NEXT VALUE FOR dbo.PredictionTestSequence);
Application code would not care how PredictionID gets populated, so all of our changes are subsumed into stored procedure changes during the database release.
Restarting a Sequence
|Database pre-release||Alter sequence|
|Database release||Alter procedures to reference sequence|
Restarting a sequence is as simple as running an
ALTER SEQUENCE command. I include it separately because it’s probably the most common change and I want to save “alteration” for changing the data type.
ALTER SEQUENCE [dbo].[PredictionTestSequence] RESTART WITH 50;
Any processes trying to get the next value for the sequence will require a schema stability lock on the sequence object itself, so there can be some blocking, but only for a moment until it clears up.
One interesting thing: processes selecting the next value for a sequence will not block your sequence restart. Unlike some of the changes we’ve seen in supporting objects (like many constraints), you can have operations in progress pulling values before the restart occurs.
Also, note that you can change the increment, min value, max value, cycling, and cache options the same way.
Altering a Sequence Data Type
|Database release||Alter procedures (use new data type)|
|Database post-release||Drop constraints|
You cannot use the
ALTER SEQUENCE syntax to change the data type for a sequence. Microsoft’s recommendation is to drop and re-create the sequence, so let’s see how that works for us.
If all you have is stored procedures calling the sequence, then the first step would be to update your stored procedures to use the new data type. For example, suppose you have a sequence of type
INT but you’re bumping up to
BIGINT (probably the most common change by far). Change your variables which read the data type from
BIGINT and you’ll be fine. If your tables are defined as
INTs, you’ll want to review my notes on changing a column’s data type and possibly reflect upon how that job as a street sweeper would be really nice right now. Street sweepers have zero stress. What’s the worst thing that happens to you when you’re driving a street sweeper? You hit the curb. Big whoop; the curb gets more damage than your vehicle. And if some yokel’s parked in the middle of the street on street sweeping day, you do your best to try to avoid the car but if they can’t follow the signs, what’s it to you?
Now that I’ve gotten the less hardcore types thinking about new positions in street maintenance, let’s suppose we have a table called
SomeTbl which uses a sequence for its default like so:
CREATE TABLE dbo.SomeTbl ( Id BIGINT DEFAULT(NEXT VALUE FOR dbo.PredictionTestSequence) );
What happens when I try to drop and create the sequence as an
Msg 3729, Level 16, State 1, Line 85
Cannot DROP SEQUENCE ‘dbo.PredictionTestSequence’ because it is being referenced by object ‘DF__SomeTbl__Id__68487DD7’.
If you guessed error, you win the prize, which is to fix the problem. The fix is that we have to drop all of those default constraints which directly reference sequences. But if you have a default covering some null data in an otherwise non-nullable column, I leave that as an exercise to you, dear reader.
Otherwise, the steps in a transaction are:
- Drop any default constraints referencing the sequence.
- Drop the existing sequence.
- Define the sequence as a new type.
- Re-add default constraints.
Let me give you an alternative, though: just create a new sequence and modify any calling code to use it instead. Make life easy.
In this post, we took a look at identity columns and sequences. There is some special behavior here for both which we can exploit for low-downtime environments.