Approaching Zero: Stored Procedure Changes

This is part four in a series on near-zero downtime deployments.

In the first three parts of this series, I’ve laid out conceptual groundwork around what helps in reducing database downtime for releases. Now I’m going to start getting into specific details. The next several posts will look at making different types of database changes and walk through a number of examples. In each case, I intend to show a quick table of what steps happen when in the process. This allows you to chain together the individual elements in these posts and create more complicated results, seeing what needs to happen in order.

Today’s topic is stored procedures. Dealing with stored procedures is pretty easy, all things considered. But even with these, we have some nuance. Let’s look at a number of different scenarios and see how we can handle each.

New Stored Procedure

PhaseProcess
Database pre-release
Database releaseDeploy new procedure
Code releaseDeploy calling code (optional)
Database post-release

New stored procedures can go out in the database release step. There’s no need to pre-release them because new code won’t call the procedure yet. We can create the stored procedure and deploy it. If you’re generating your own deployment scripts, the CREATE OR ALTER PROCEDURE syntax is a wonderful addition to SQL Server 2016 SP1 and lets you create a procedure quite easily:

CREATE OR ALTER PROCEDURE dbo.SomeProcedure
(
@Parameter1 INT,
@Parameter2 BIGINT
)
AS
SET NOCOUNT ON;

SELECT
	a.Col1,
	a.Col2
FROM dbo.TblA a
WHERE
	a.Parameter1 = @Parameter1
	AND a.Parameter2 = @Parameter2;
GO

Pretty much any automated deployment process will handle this scenario well. Then, once the procedure is out, new code released in the code deployment phase can reference it. The reason I like CREATE OR ALTER so much here is that it is an idempotent process: every time I execute, I get to exactly the same end state. That means my continuous integration process can deploy that change dozens of times over the course of a { sprint / month / week / day } and nobody will care.

A Note on Ordering

In the scenario above, I assumed that your stored procedure lives on its own and the only caller is some block of code someplace or manual execution. There are a few cases where that won’t be true, like nested stored procedures or SQL Agent jobs (or other scheduled tasks) executing stored procedures. In that case, we have to make sure that we deploy things in the right order. Creating procedures in the wrong order isn’t a catastrophe but it does lead to a warning:

CREATE OR ALTER PROCEDURE dbo.p1
AS
EXEC dbo.p2;
GO
CREATE OR ALTER PROCEDURE dbo.p2
AS
SELECT
	1 AS Val;
GO

The module ‘p1’ depends on the missing object ‘dbo.p2’. The module will still be created; however, it cannot run successfully until the object exists.

But if you create a SQL Agent job expecting a procedure to exist before it actually exists and that Agent job runs, you’ll get errors. Again, this is new code so it won’t break existing stuff but this is easy to get right.

New Result Set Column on an Existing Procedure

PhaseProcess
Database pre-release
Database releaseDeploy procedure
Code releaseDeploy new calling code (optional)
Database post-release

Adding a new column to an existing procedure’s result set is trivial as long as the calling code does not demand an exact signature for procedure outputs. For example, suppose we have procedure p2 above and we want to add a new column to its output set:

CREATE OR ALTER PROCEDURE dbo.p2
AS
SELECT
	1 AS Val,
	2 AS Val2;
GO

ADO.Net and most ORMs which accept stored procedures can handle the addition of a new field, so I’ll work from the assumption that your code can as well. If not, you’d need to version the procedure, which I’ll describe below.

Assuming your code can safely handle adding new fields, you update the code and deploy it. Old code doesn’t care about the new field and new code knows how to reference it.

One note here is that it’s probably safest to add new columns to the bottom of result sets. Some data access layers look at the order of items rather than a keyed name (like the column alias in the result set). I’m not a fan of using order to determine which column is which, but I know these technologies exist. So add new columns to procedures after existing columns unless you know how the calling code knows which column is which.

Removing a Column from a Stored Procedure Result Set

There are two scenarios I could come up with around removing a column from a stored procedure. Let’s hit them in order of difficulty.

Manual Execution or No Code Dependency

PhaseProcess
Database pre-release
Database releaseDeploy altered procedure
Code release
Database post-release

This is the easiest scenario: you have a manually run procedure or a procedure whose result set doesn’t need to have a set shape. Maybe it’s a logging procedure or writing data out to a table or whatever. You can safely change the procedure because it won’t break any existing code.

Code Dependency

PhaseProcess
Database pre-release
Database release
Code releaseDeploy new calling code
Database post-releaseDeploy altered procedure

If the code is currently referencing a column which can go away, we need to change application code first. Only after application code is completely finished will we be able to deploy an altered version of the stored procedure missing that column from its result set.

Changing Input Parameters

PhaseProcess
Database pre-release
Database releaseDeploy new procedure version
Code releaseDeploy new calling code
Database post-releaseDrop old procedure version

Any change to input parameters is liable to break existing code. This can include adding new parameters, removing existing parameters, changing parameter types, and possibly even changing parameter names. Depending upon how flexible your data mapping layer is, you might need to version a procedure. I’ll assume for the sake of simplicity that your data mapping layer is quite inflexible and breaks with any input parameter change, so we’ll go ahead with versioning.

Versioning a procedure is pretty simple: you create a new procedure with alterations you want. Corporate naming standards where I’m at have you add a number to the end of versioned procedures, so if you have dbo.SomeProcedure, the new version would be dbo.SomeProcedure01. Then, the next time you version, you’ll have dbo.SomeProcedure02 and so on. For frequently-changing procedures, you might get up to version 05 or 06, but in practice, you’re probably not making that many changes to a procedure’s signature. For example, looking at a directory with exactly 100 procedures in it, I see 7 with a number at the end. Two of those seven procedures are old versions of procedures I can’t drop quite yet, so that means that there are only five “unique” procedures that we’ve versioned in a code base which is two years old. Looking at a different part of the code with 879 stored procedures, 95 have been versioned at least once in the 15 or so years of that code base’s existence. The real number is a bit higher than that because we’ve renamed procedures over time and renamings tend to start the process over as we might go from dbo.SomeProcedure04 to dbo.SomeNewProcedure when we redesign underlying tables or make other drastic architectural changes.

As far as pushing out new versions of procedures goes, we deploy the new procedure dbo.SomeProcedure01, then deploy code which uses the new input interface. Once all of the old code which used the old stored procedure dbo.SomeProcedure is gone, we can drop the old version of the procedure and now our code base has just one version of the procedure left, dbo.SomeProcedure01.

As a quick side note, you might wonder why we use incrementing numbers. We could probably just get away with two values: dbo.SomeProcedure and dbo.SomeProcedure01. Then, the next change could just go back to dbo.SomeProcedure because that’s available. The reason we don’t do this is because we want to keep old versions in source control. We store old versions of procedures in an Archive folder so that we can see how the procedures morph over time.

Changing Input Parameters With Table Types

PhaseProcess
Database pre-release
Database releaseDeploy new table type version
Deploy new procedure version
Code releaseDeploy new calling code
Database post-releaseDrop old procedure version
Drop old table type version

This is a bonus scenario. Let’s say you use table-valued parameters and need to change the TVP. In the best-case scenario, your TVP allows nulls and your calling code can handle changes with aplomb. Well, too bad—you still need to follow this process because any change to a table type requires that you drop and recreate the type. We don’t want to drop a table type because we’d need to drop all stored procedures referencing the type first, and dropping procedures which the code uses is a no-no.

Creating a new type which is CI-safe is pretty easy. Here’s an example of a product type which takes a ProductID input:

IF NOT EXISTS
(
	SELECT 1
	FROM sys.types t
	WHERE
		T.name = N'ProductType'
)
BEGIN
	CREATE TYPE [dbo].[ProductType] AS TABLE
	(
		ProductID BIGINT NOT NULL
	);
END
GO

We would deploy the new table type and then create a versioned procedure which references this table type as two database deployment-time scripts. Then, in the code deployment phase, we transition from the old procedure-type combo to the new procedure-type combo. After the code release completes, we can run the database post-release process and drop the procedure and then the type in that order. If you try to drop a type which is in use, you’ll get an error like:

Msg 3732, Level 16, State 1, Line 30
Cannot drop type ‘dbo.ProductType’ because it is being referenced by object ‘<Calling procedure name>’. There may be other objects that reference this type.

Therefore, in this case, order really matters.

Refactoring a Procedure

PhaseProcess
Database pre-release
Database releaseDeploy altered procedure
Code release
Database post-release

This scenario is one of my key reasons for preferring stored procedures over ad hoc SQL in the code base: I can refactor stored procedures and they simply go out. In this case, I mean “refactoring” in the truest sense: the inputs and outputs remain the same, but the process by which we translate inputs into outputs can change. In a complicated stored procedure, this might involve creating and working with temp tables, reshaping the T-SQL query, or even adding query hints.

Conclusion

In today’s post, we looked at different procedure modification scenarios and covered how we can make minimally disruptive changes. Using the four-phase model works well here, and we haven’t run into a case where we absolutely need to use pre-release…yet. That will come soon.

Advertisements

2 thoughts on “Approaching Zero: Stored Procedure Changes

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s