Approaching Zero: Identity Column Changes

This is part seven in a series on near-zero downtime deployments.

Identity columns are common enough in the SQL Server world that I figured I would spend some time on them today. I’m also going to include sequences in here because it’s a pretty short post otherwise. And if I learned nothing else from German philosophers, turgid and obscurantist prose is the best prose.

Reseeding an Existing Identity Column

PhaseProcess
Database pre-release
Database releaseReseed value
Code release
Database post-release

There are some tables where you create an identity value and expect to cycle through data. An example for this might be a queue table, where the data isn’t expected to live permanently but it is helpful to have a monotonically increasing function to determine order (just watch out for those wrap-arounds and you’re fine). An example of reseeding is below:

DBCC CHECKIDENT('dbo.MyTable', RESEED, 1);

This operation needs to take a LCK_M_SCH_M lock, otherwise known as a schema modification lock. Any transactions which are writing to the table will block your transaction but so will any readers unless you have Read Committed Snapshot Isolation turned on or the reader is in the READ UNCOMMITTED or SNAPSHOT transaction isolation level.

If you are using RCSI and don’t have extremely long-running transactions, this is an in-and-out operation, so even though there’s a little bit of blocking, it’s minimal.

Adding an Identity Column

PhaseProcess
Database pre-releaseCreate new table
Database releasePartition switch
Rename tables
Code release
Database post-releaseDrop old table

Kenneth Fisher has a great technique for switching out tables to avoid having to write a complicated backfill process. So let’s try it out with limiting downtime in mind.

First, I want to create a table. I’m going to load this with a pretty fair amount of data, too.

CREATE TABLE [dbo].[PredictionTest]
(
    [PredictedQuantitySold] [decimal](8, 3) NOT NULL,
    [ActualQuantitySold] [decimal](8, 3) NULL
)
GO
INSERT INTO dbo.PredictionTest 
(
	PredictedQuantitySold,
	ActualQuantitySold
)
SELECT TOP(1000000)
	RAND(CHECKSUM(NEWID())),
	RAND(CHECKSUM(NEWID()))
FROM sys.all_columns ac1
	CROSS JOIN sys.all_columns ac2;

Now we have a million rows and nowhere to call home. Let’s create a new table with an identity column and the same schema otherwise (including indexes, key constraints, and everything else). For more notes on this, check out Kenneth’s other blog post on changing identity column increments.

Here is our create statement for the pre-release phase:

IF (OBJECT_ID('dbo.PredictionTestNew') IS NULL)
BEGIN
	CREATE TABLE [dbo].[PredictionTestNew]
	(
		[PredictedQuantitySold] [DECIMAL](8, 3) NOT NULL,
		[ActualQuantitySold] [DECIMAL](8, 3) NULL,
		[PredictionID] BIGINT IDENTITY(5000000,1) NOT NULL
	);
END
GO

Now, if you’re eagle-eyed, you noticed a little bit of a complication I added: PredictionID did not exist in the old table but it does in the new. If I simply try to perform a partition switch, I’ll get an error:

Msg 4943, Level 16, State 1, Line 42
ALTER TABLE SWITCH statement failed because table ‘tempdb.dbo.PredictionTest’ has 2 columns and table ‘tempdb.dbo.PredictionTestNew’ has 3 columns.

This means that I need to add the PredictionID column to the new table before performing this swap. Because I have to add a column at the end, our identity column needs to be the last one—columns between the two tables much be in the same order. Here is the script to perform the partition swap:

IF NOT EXISTS
(
	SELECT 1
	FROM sys.columns sc
	WHERE
		sc.name = N'PredictionID'
		AND sc.object_id = OBJECT_ID('dbo.PredictionTest')
)
BEGIN
	ALTER TABLE dbo.PredictionTest ADD PredictionID BIGINT NOT NULL DEFAULT(0);
	ALTER TABLE dbo.PredictionTest SWITCH TO dbo.PredictionTestNew;
	EXEC sp_rename
		N'dbo.PredictionTest',
		N'PredictionTest_DELETEMEAFTER20190501';
	EXEC sp_rename
		N'dbo.PredictionTestNew',
		N'PredictionTest';
END

We are doing four things here:

  1. Add PredictionID to the existing table and filling it with 0 values so that we have something.
  2. Switching our data from PredictionTest over to PredictionTestNew.
  3. Renaming PredictionTest to PredictionTest_DELETEMENOW.
  4. Renaming PredictionTestNew to PredictionTest.

All of these are metadata operations and we will need to take a schema modification lock. All told, if there’s no upstream blocking, this is done in milliseconds.

In the post-release process, drop PredictionTest_DELETEMENOW. There won’t be any rows in this table.

Here are a few notes as I wrap up this section:

  • Your existing rows will not get identity values; only new rows will take advantage of the identity attribute. This means that if you don’t want a million values with a prediction ID of 0, you’ll need to fix that yourself with a backfill process.
  • This also works to remove an identity integer, such as if you’re migrating from identity values to sequences.
  • This also also works if you’re changing the increment value, so you might go from jumping 1 each insert to 2 each insert because you really hate odd numbers.
  • If you want to add a primary key constraint as well, you’re probably better off taking the backfill route, which we saw in the post on table modifications.

Creating a Sequence

PhaseProcess
Database pre-releaseCreate sequence
Database releaseAlter tables
Code release
Database post-release

Creating a sequence is pretty easy:

IF NOT EXISTS
(
	SELECT 1
	FROM sys.sequences s
	WHERE
		s.name = N'PredictionTestSequence'
)
BEGIN
	CREATE SEQUENCE [dbo].[PredictionTestSequence]
	AS BIGINT
	START WITH 10
	INCREMENT BY 1;
END
GO

Once we have a sequence, we can use it in stored procedures to populate something like PredictionID (supposing we don’t use an identity integer). We might add something like this to an insert procedure for the prediction test table:

INSERT INTO dbo.PredictionTest
(
	PredictedQuantitySold,
	ActualQuantitySold,
	PredictionID
)
VALUES
(3, 2, NEXT VALUE FOR dbo.PredictionTestSequence);

Application code would not care how PredictionID gets populated, so all of our changes are subsumed into stored procedure changes during the database release.

Restarting a Sequence

PhaseProcess
Database pre-releaseAlter sequence
Database releaseAlter procedures to reference sequence
Code release
Database post-release

Restarting a sequence is as simple as running an ALTER SEQUENCE command. I include it separately because it’s probably the most common change and I want to save “alteration” for changing the data type.

ALTER SEQUENCE [dbo].[PredictionTestSequence]
RESTART WITH 50;

Any processes trying to get the next value for the sequence will require a schema stability lock on the sequence object itself, so there can be some blocking, but only for a moment until it clears up.

One interesting thing: processes selecting the next value for a sequence will not block your sequence restart. Unlike some of the changes we’ve seen in supporting objects (like many constraints), you can have operations in progress pulling values before the restart occurs.

Also, note that you can change the increment, min value, max value, cycling, and cache options the same way.

Altering a Sequence Data Type

PhaseProcess
Database pre-release
Database releaseAlter procedures (use new data type)
Code release
Database post-releaseDrop constraints
Drop sequence
Create sequence
Create constraints

You cannot use the ALTER SEQUENCE syntax to change the data type for a sequence. Microsoft’s recommendation is to drop and re-create the sequence, so let’s see how that works for us.

If all you have is stored procedures calling the sequence, then the first step would be to update your stored procedures to use the new data type. For example, suppose you have a sequence of type INT but you’re bumping up to BIGINT (probably the most common change by far). Change your variables which read the data type from INT to BIGINT and you’ll be fine. If your tables are defined as INTs, you’ll want to review my notes on changing a column’s data type and possibly reflect upon how that job as a street sweeper would be really nice right now. Street sweepers have zero stress. What’s the worst thing that happens to you when you’re driving a street sweeper? You hit the curb. Big whoop; the curb gets more damage than your vehicle. And if some yokel’s parked in the middle of the street on street sweeping day, you do your best to try to avoid the car but if they can’t follow the signs, what’s it to you?

Now that I’ve gotten the less hardcore types thinking about new positions in street maintenance, let’s suppose we have a table called SomeTbl which uses a sequence for its default like so:

CREATE TABLE dbo.SomeTbl
(
	Id BIGINT DEFAULT(NEXT VALUE FOR dbo.PredictionTestSequence)
);

What happens when I try to drop and create the sequence as an INT?

Msg 3729, Level 16, State 1, Line 85
Cannot DROP SEQUENCE ‘dbo.PredictionTestSequence’ because it is being referenced by object ‘DF__SomeTbl__Id__68487DD7’.

If you guessed error, you win the prize, which is to fix the problem. The fix is that we have to drop all of those default constraints which directly reference sequences. But if you have a default covering some null data in an otherwise non-nullable column, I leave that as an exercise to you, dear reader.

Otherwise, the steps in a transaction are:

  1. Drop any default constraints referencing the sequence.
  2. Drop the existing sequence.
  3. Define the sequence as a new type.
  4. Re-add default constraints.

Let me give you an alternative, though: just create a new sequence and modify any calling code to use it instead. Make life easy.

Conclusion

In this post, we took a look at identity columns and sequences. There is some special behavior here for both which we can exploit for low-downtime environments.

Advertisements

2 thoughts on “Approaching Zero: Identity Column Changes

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s