Approaching Zero: Constraint Changes

This is part six in a series on near-zero downtime deployments.

In the last post, we looked at making minimally-blocking changes to tables. Today, we’re going to focus on constraints: primary keys, unique keys, foreign keys, default constraints, and check constraints to be precise.

Setup Phase

For this set of tests, I’m going to create two artificial tables, LookupTable and LargeTable. The former will have 10 rows and the latter 2 million.

CREATE TABLE dbo.LookupTable
(
	Id TINYINT NOT NULL,
	CONSTRAINT [PK_LookupTable] PRIMARY KEY CLUSTERED(Id)
);
INSERT INTO dbo.LookupTable
(
	Id
)
VALUES
	(0),(1),(2),
	(3),(4),(5),
	(6),(7),(8),(9);

CREATE TABLE dbo.LargeTable
(
	Id INT IDENTITY(1,1) NOT NULL,
	LookupTableId TINYINT NOT NULL,
	SomeChar CHAR(1) NOT NULL,
	SomeVal DECIMAL(10,2) NOT NULL,
	SomeNullableInt INT NULL
);

INSERT INTO dbo.LargeTable
(
	LookupTableId,
	SomeChar,
	SomeVal,
	SomeNullableInt
)
SELECT TOP(2000000)
	lt.Id,
	LEFT(ac.name, 1),
	150 * RAND(CHECKSUM(NEWID())),
	CASE WHEN RAND(CHECKSUM(NEWID())) > 0.5 THEN 350 * RAND(CHECKSUM(NEWID())) ELSE NULL END
FROM dbo.LookupTable lt
	CROSS JOIN sys.all_columns ac
	CROSS JOIN sys.all_columns ac1
	CROSS JOIN sys.all_columns ac2;

At this point, LargeTable is a heap with no indexes anywhere.

Creating a New Primary Key Constraint

Phase                 | Process
----------------------|------------------
Database pre-release  | Create constraint
Database release      |
Code release          |
Database post-release |

We’re going to create a combination primary key constraint and clustered index on the Id column.

IF NOT EXISTS
(
	SELECT 1
	FROM sys.key_constraints kc
	WHERE
		kc.name = N'PK_LargeTable'
)
BEGIN
	ALTER TABLE dbo.LargeTable ADD CONSTRAINT [PK_LargeTable]
		PRIMARY KEY CLUSTERED(Id)
		WITH (ONLINE = ON, DATA_COMPRESSION = PAGE);
END

This is fundamentally the same as creating a clustered index (as we did in the last post): you create the constraint during the pre-release step, and there is a short amount of blocking once the clustered index / primary key is ready to go. Writers will block the start of this operation, just as they do with an online clustered index build.

If you create a non-clustered primary key constraint on a heap, the process is the same and so is the blocking story. Avoiding a clustered index won’t save you from that short bout of blocking.
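For reference, the non-clustered variant is the same statement with one keyword changed (the existence guard from above is omitted here for brevity):

```sql
-- Non-clustered primary key on a heap: same short burst of blocking
-- at the end, but the table remains a heap.
ALTER TABLE dbo.LargeTable ADD CONSTRAINT [PK_LargeTable]
	PRIMARY KEY NONCLUSTERED(Id)
	WITH (ONLINE = ON, DATA_COMPRESSION = PAGE);
```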

Let’s say that we have a clustered index on SomeVal and want to create our primary key on Id. You’re probably not surprised when I tell you it’s exactly the same behavior. In short, expect a small amount of blocking and try to create this primary key when nobody’s looking (or at least writing).

Creating a New Unique Key Constraint

Phase                 | Process
----------------------|------------------
Database pre-release  | Create constraint
Database release      |
Code release          |
Database post-release |

Again I have a clustered index on SomeVal. This time, after having dropped the primary key constraint, I’m going to create a unique key constraint on Id.

IF NOT EXISTS
(
	SELECT 1
	FROM sys.key_constraints kc
	WHERE
		kc.name = N'UKC_LargeTable'
)
BEGIN
	ALTER TABLE dbo.LargeTable ADD CONSTRAINT [UKC_LargeTable]
		UNIQUE(Id)
		WITH (ONLINE = ON, DATA_COMPRESSION = PAGE);
END

The locking story is the same as with primary key constraints above. We need to hold a lock once the unique key constraint is ready to go into action.

Creating a New Foreign Key Constraint

New constraint, same process:

IF NOT EXISTS
(
	SELECT 1
	FROM sys.foreign_keys fk
	WHERE
		fk.name = N'FK_LargeTable_LookupTable'
)
BEGIN
	ALTER TABLE dbo.LargeTable WITH CHECK ADD CONSTRAINT [FK_LargeTable_LookupTable]
	FOREIGN KEY (LookupTableId)
	REFERENCES dbo.LookupTable(Id);
END

The locking story is not the same as with the primary and unique key constraints. First, there’s one extra piece: the transition will block access to dbo.LookupTable as well as the table we create the constraint on. That’s to keep us from deleting rows in our lookup table before the key is in place.

Second, the locks begin as soon as we hit F5. Even SELECT statements get blocked, waiting on an LCK_M_SCH_S lock. Bad news, people.

So what can we do to get around this problem? Two routes: the ineffectual way and the ugly way.

The Ineffectual Way

Phase                 | Process
----------------------|------------------------------------------
Database pre-release  | Create index (online); create constraint
Database release      |
Code release          |
Database post-release |

Knowing that there will be blocking, we can reduce the length of blocking by including a non-clustered index on our foreign key constraint like so:

CREATE NONCLUSTERED INDEX [IX_LargeTable_LookupTableId] ON dbo.LargeTable
(
    LookupTableId
) WITH (ONLINE = ON, DATA_COMPRESSION = PAGE);

Doing that can considerably reduce the amount of time it takes SQL Server to scan LookupTableId for invalid values. This doesn’t get you down to milliseconds, but it might turn minutes into seconds, and that may be acceptable.

The Ugly Way

Phase                 | Process
----------------------|---------------------------------------------------------
Database pre-release  | Create new table with constraint; create temporary trigger; create backfill process; disable backfill process
Database release      | Rename tables; drop temporary trigger
Code release          |
Database post-release | Drop old table; drop backfill process

This is similar to what we did when making a big column change in the last post. As you can see, there’s a lot more work here and it also requires that we have twice as much disk space available as normally needed for this table. But let’s trudge through this one step by step because sometimes the ineffectual way really won’t work.

Step 1: Create the Table

First, we need to create a new table. Let’s call it dbo.LargeTableTemp:

CREATE TABLE dbo.LargeTableTemp
(
	Id INT IDENTITY(1,1) NOT NULL,
	LookupTableId TINYINT NOT NULL,
	SomeChar CHAR(1) NOT NULL,
	SomeVal DECIMAL(10,2) NOT NULL,
	SomeNullableInt INT NULL,
	CONSTRAINT [PK_LargeTableTemp] PRIMARY KEY CLUSTERED(Id) WITH(DATA_COMPRESSION = PAGE),
	CONSTRAINT [FK_LargeTableTemp_LookupTable]
		FOREIGN KEY(LookupTableId)
		REFERENCES dbo.LookupTable(Id)
);

Make sure to use the new name with these constraints so we don’t have constraint name collisions, as we would if we tried to create another PK_LargeTable. Also, add the foreign key here. If you have non-clustered indexes, you can add them here or wait until the very end of the pre-release process. Just have them in place before you get to the database release phase.

Step 2: Create the Data Migration Process

Our data migration process will be a two-pronged approach: we create a trigger to handle new records and a backfill process to take care of existing records. Let’s first look at the backfill.

In the backfill, we’ll batch insert records. Ideally, you want to have a BackfillMetadata table like I showed in the last post so that your backfill can look a bit like this:

DECLARE
	@LatestBackfilledID BIGINT,
	@BackfillTable SYSNAME = N'LargeTableTemp',
	@BatchSize INT = 1000,
	@NumberOfBatches INT = 5;

SET NOCOUNT ON;

SELECT
	@LatestBackfilledID = MaximumProcessedValue
FROM dbo.BackfillMetadata bm
WHERE
	bm.BackfillTable = @BackfillTable;

IF (@LatestBackfilledID IS NULL)
BEGIN
	-- If you start your identity columns at MIN_VALUE, this would change.
	SET @LatestBackfilledID = ISNULL(@LatestBackfilledID, 0);

	INSERT INTO dbo.BackfillMetadata
	(
		BackfillTable,
		MaximumProcessedValue
	)
	VALUES
	(
		@BackfillTable, @LatestBackfilledID
	);
END

DECLARE
	@i INT = 0,
	@LoadThroughID BIGINT,
	@RowsAffected INT = 0;

SET IDENTITY_INSERT dbo.LargeTableTemp ON;

WHILE (@i < @NumberOfBatches)
BEGIN;
	WITH records AS
	(
		SELECT TOP (@BatchSize)
			lt.Id
		FROM dbo.LargeTable lt
		WHERE
			lt.Id > @LatestBackfilledID
		ORDER BY
			lt.Id ASC
	)
	SELECT
		@LoadThroughID = MAX(Id)
	FROM records;

	IF (@LoadThroughID IS NULL)
		-- We have finished processing.
		BREAK;

	INSERT INTO dbo.LargeTableTemp
	(
		Id,
		LookupTableId,
		SomeChar,
		SomeVal,
		SomeNullableInt
	)
	SELECT
		lt.Id,
		lt.LookupTableId,
		lt.SomeChar,
		lt.SomeVal,
		lt.SomeNullableInt
	FROM dbo.LargeTable lt
		LEFT OUTER JOIN dbo.LargeTableTemp ltt
			ON lt.Id = ltt.Id
	WHERE
		lt.Id > @LatestBackfilledID
		AND lt.Id <= @LoadThroughID
		AND ltt.Id IS NULL;

	SET @RowsAffected = @@rowcount;
	RAISERROR ('%i rows inserted.', 10, 1, @RowsAffected) WITH NOWAIT;

	SET @i = @i + 1;
	SET @LatestBackfilledID = @LoadThroughID;
END

SET IDENTITY_INSERT dbo.LargeTableTemp OFF;

UPDATE dbo.BackfillMetadata
SET
	-- @LatestBackfilledID holds the last batch's high-water mark, so a run
	-- which processed zero batches won't overwrite this value with NULL.
	MaximumProcessedValue = @LatestBackfilledID
WHERE
	BackfillTable = @BackfillTable;
GO

This is functionally very similar to the backfill we used for changing a column type and that’s on purpose: the mechanics behind backfills are similar even as the particularities make them unique. Once that process is in place and running to load data into your temp table, we can also create a trigger which inserts new records into dbo.LargeTableTemp. As a side note, the backfill and trigger will need some hefty powers in this case because SET IDENTITY_INSERT ON requires ALTER on the table. We won’t get into security in this series because I have enough rabbit holes to go down as it is, but certificate signing can be your friend here.
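The post doesn’t show the trigger itself, but the release script later drops one named tr_LargeTable_Backfill_IU, so a minimal sketch might look like this. This is my sketch, not a canonical version: it handles inserts and updates (treating updates as delete-plus-reinsert) and ignores deletes.

```sql
CREATE TRIGGER [dbo].[tr_LargeTable_Backfill_IU]
ON dbo.LargeTable
AFTER INSERT, UPDATE
AS
BEGIN
	SET NOCOUNT ON;

	-- Treat updates as delete-plus-reinsert: clear any existing copies first.
	DELETE ltt
	FROM dbo.LargeTableTemp ltt
		INNER JOIN inserted i
			ON ltt.Id = i.Id;

	-- IDENTITY_INSERT requires ALTER on the table, hence the security note above.
	SET IDENTITY_INSERT dbo.LargeTableTemp ON;

	INSERT INTO dbo.LargeTableTemp
	(
		Id,
		LookupTableId,
		SomeChar,
		SomeVal,
		SomeNullableInt
	)
	SELECT
		i.Id,
		i.LookupTableId,
		i.SomeChar,
		i.SomeVal,
		i.SomeNullableInt
	FROM inserted i;

	SET IDENTITY_INSERT dbo.LargeTableTemp OFF;
END
GO
```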

Step 3: Disable the Backfill when Finished

Here we twiddle our thumbs for a bit. Remember that “pre-release” is not a defined amount of time, so this can take a while. I’ve worked on backfills which took weeks to complete because the tables were so large and so heavily-used that we didn’t want to move more than a tiny number of rows at a time.

Once the backfill is done, you can disable it. Here is where having one more column on dbo.BackfillMetadata might help: something like MaximumValueToProcess, so an automated check can disable the process once MaximumProcessedValue reaches MaximumValueToProcess. Or you can check on the table once in a while and disable the process by hand when it’s done. You probably aren’t doing so many backfills that you lose track of them, after all. Hopefully.
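That automated check could be a short step at the top of the backfill job itself. As a sketch, assuming the hypothetical MaximumValueToProcess column exists and the backfill runs as a SQL Agent job (the job name here is invented):

```sql
-- Disable the backfill job once the high-water mark catches up.
-- Assumes dbo.BackfillMetadata has a MaximumValueToProcess column.
IF EXISTS
(
	SELECT 1
	FROM dbo.BackfillMetadata bm
	WHERE
		bm.BackfillTable = N'LargeTableTemp'
		AND bm.MaximumProcessedValue >= bm.MaximumValueToProcess
)
BEGIN
	EXEC msdb.dbo.sp_update_job
		@job_name = N'Backfill - LargeTableTemp',
		@enabled = 0;
END
```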

Once the backfill is disabled, you are ready for the database release portion of the show. That’s where we pick up in:

Step 4: Rename Tables and Drop Temp Trigger in a Single Bound

We have the dbo.LargeTableTemp table finished and it is in sync with dbo.LargeTable. We’re ready to make a move. In our release phase, we need to do all of these as one transaction:

  1. Rename dbo.LargeTable to something like dbo.LargeTable_DELETEMEAFTER20190501. We want to keep the table around for a little bit if we have the disk space, just in case there’s a catastrophic problem which requires rollback. If we are running on borrowed space, delete it sooner. But don’t drop it immediately.
  2. Drop the temp trigger.
  3. Rename dbo.LargeTableTemp to dbo.LargeTable.

Do this as one transaction so that people just get blocked rather than errors when trying to run procedures which query dbo.LargeTable. Here’s a sample:

BEGIN TRANSACTION
EXEC sp_rename
	'dbo.LargeTable',
	'LargeTable_DELETEMEAFTER20190501';
-- Triggers are schema-scoped, so we drop by schema and trigger name.
DROP TRIGGER [dbo].[tr_LargeTable_Backfill_IU];
EXEC sp_rename
	'dbo.LargeTableTemp',
	'LargeTable';
COMMIT TRANSACTION

This will cause blocking for a moment, but the pain will subside quickly and users will otherwise be unaware of your switcheroo.

After this, we’ve done everything we need to do in the release phase. On to the post-release phase.

Step 5: Drop Obsolete Objects

At some point, it becomes safe to drop the old dbo.LargeTable which we’ve renamed to have a DELETEMEAFTER indicator. We can also drop the backfill process now that we’re done with it.
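A post-release cleanup script might look something like this. A sketch under a couple of assumptions: DROP TABLE IF EXISTS requires SQL Server 2016 or later, and the bookkeeping delete assumes you want to retire the dbo.BackfillMetadata row along with the process.

```sql
-- Post-release cleanup, once we're confident we won't need to roll back.
DROP TABLE IF EXISTS dbo.LargeTable_DELETEMEAFTER20190501;

-- Retire the backfill's bookkeeping row as well.
DELETE FROM dbo.BackfillMetadata
WHERE BackfillTable = N'LargeTableTemp';
```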

The moral of this story is, try to get your domain model right the first time, or at least when tables are small and you can absorb a couple seconds of blocking.

Creating a New Check Constraint

Phase                 | Process
----------------------|---------------------------------------------------------
Database pre-release  | Create new table with constraint; create temporary trigger; create backfill process; disable backfill process
Database release      | Rename tables; drop temporary trigger
Code release          |
Database post-release | Drop old table; drop backfill process

Here’s an example of a check constraint, rejecting any row where SomeVal is not less than 200:

IF NOT EXISTS
(
	SELECT 1
	FROM sys.check_constraints cc
	WHERE
		cc.name = N'CK_LargeTable_SomeVal_LT200'
)
BEGIN
	ALTER TABLE dbo.LargeTable ADD CONSTRAINT [CK_LargeTable_SomeVal_LT200] CHECK
	(
		SomeVal < 200
	);
END

Bad news: check constraints cause blocking during creation, so they follow the same process as foreign key constraints above: the query takes a schema modification lock on the dbo.LargeTable table.

Schema modification locks are a necessary evil for us.

Because we take a schema modification lock, users can’t even query the table: SELECT operations require a schema stability (LCK_M_SCH_S) lock, which is incompatible with a schema modification lock. In other words, readers want to know that the world won’t change out from under them as they query. Even though I can reason through this and say that the check constraint ought not change reader behavior at all (because either it succeeds and data & column definitions remain the same, or it fails and data & column definitions remain the same), check constraints take that lock to prevent writes, so we pay the price during reads.

That leads us back to foreign keys and the two routes: ugly and ineffectual. My table shows the ugly route rather than the ineffectual route because I’m a pessimist that way.

Creating a New Default Constraint (Non-Nullable Edition)

Phase                 | Process
----------------------|------------------
Database pre-release  | Create constraint
Database release      |
Code release          |
Database post-release |

Creating a default constraint on a non-nullable column will lead to blocking but just momentarily: we need a schema modification lock to put the constraint in place, but because the column is not allowed to be NULL, there’s no work to do.

Here is an example of creating a default constraint:

IF NOT EXISTS
(
	SELECT 1
	FROM sys.default_constraints dc
	WHERE
		dc.name = N'DF_LargeTable_SomeVal'
)
BEGIN
	ALTER TABLE dbo.LargeTable ADD CONSTRAINT [DF_LargeTable_SomeVal]
		DEFAULT(17) FOR SomeVal;
END

Creating a New Default Constraint (Nullable Edition)

Phase                 | Process
----------------------|------------------
Database pre-release  | Create constraint
Database release      |
Code release          |
Database post-release |

What if we decide to create a new default constraint on a nullable column? It turns out the behavior is the same—the default constraint does not require writing any values to the table as of SQL Server 2012. Remus Rusanu covers this in detail.
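For completeness, here’s what that looks like against our nullable column, following the same guarded pattern as above. The constraint name and default value here are my own invention:

```sql
IF NOT EXISTS
(
	SELECT 1
	FROM sys.default_constraints dc
	WHERE
		dc.name = N'DF_LargeTable_SomeNullableInt'
)
BEGIN
	-- Existing NULLs stay NULL; the default only applies to new rows
	-- which don't specify a value, so this is a metadata-only change.
	ALTER TABLE dbo.LargeTable ADD CONSTRAINT [DF_LargeTable_SomeNullableInt]
		DEFAULT(0) FOR SomeNullableInt;
END
```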

Dropping Constraints (Not Clustered Index Edition)

Phase                 | Process
----------------------|----------------
Database pre-release  |
Database release      |
Code release          |
Database post-release | Drop constraint

Here I’m going to bundle them all together. Drop constraints in the database post-release phase. Dropping foreign key, check, and default constraints is just a metadata operation which requires a schema modification lock, so blocking is limited. That means we don’t need to do anything special here.
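As a sketch, dropping the foreign key from earlier follows the same guarded pattern we’ve used throughout (assuming the FK_LargeTable_LookupTable name from above):

```sql
IF EXISTS
(
	SELECT 1
	FROM sys.foreign_keys fk
	WHERE
		fk.name = N'FK_LargeTable_LookupTable'
)
BEGIN
	-- Metadata-only: needs a schema modification lock, but only briefly.
	ALTER TABLE dbo.LargeTable DROP CONSTRAINT [FK_LargeTable_LookupTable];
END
```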

Dropping primary or unique key constraints means dropping an index. If the constraint is tied to a non-clustered index, that’s another metadata operation requiring a schema modification lock.

Dropping Constraints (Clustered Index Edition)

Phase                 | Process
----------------------|---------------------------------------------------------
Database pre-release  | Create new table without constraint; create temporary trigger; create backfill process; disable backfill process
Database release      | Rename tables; drop temporary trigger
Code release          |
Database post-release | Drop old table; drop backfill process

Uh-oh. Remember this bad boy? Yeah, we’re back to the land of pain (Land of Pain: population, 7 billion).

Dropping a primary key which is also the clustered index means that the database engine needs to rebuild the table before we can move forward, converting from a clustered index to a heap. During that time, SELECT statements won’t run because readers need to take a schema stability lock and our current schema is anything but stable.
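For illustration, the drop itself is a single statement; the table rebuild happens behind the scenes:

```sql
-- Dropping a clustered primary key rebuilds the table as a heap,
-- blocking readers and writers for the duration of the rebuild.
ALTER TABLE dbo.LargeTable DROP CONSTRAINT [PK_LargeTable];
```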

Unfortunately, there’s no way to get around this if your clustered index and primary key are the same. That’s not me saying you shouldn’t combine the two, but be aware of this cost.

Conclusion

Unlike indexes, working with constraints can leave you with unacceptable amounts of downtime even on the latest versions of SQL Server Enterprise Edition. Constraints are definitely something you want to get right on the first try. Barring that, if you still have a small enough table, adding a blocking constraint (foreign key or check) or changing a clustered unique/primary key can block for just a few seconds. On large tables, you’re going to want to stage a big backfill operation. Well, dread might be a better word than want here…
