36 Chambers – The Legendary Journeys: Execution to the max!

February 26, 2014

T-SQL Anti-Patterns: Row By Agonizing Row (RBAR) Queries

Filed under: Database Administration — Kevin Feasel @ 10:49 pm

The last couple of days, I poked a bit of fun at object-oriented developers who move to a relational platform and try to do things like nesting views or using functions inappropriately (although, to be fair, it’s not so much that developers are trying to use functions as that the way Microsoft implemented functions leads generally to horribly inefficient outcomes).  Today’s anti-pattern will be quick but important.  Today, I’m going to talk about Row-By-Agonizing-Row (RBAR) queries, something I tend to rant about (such as in my tally tables presentation).

Most students learning to program tend to get introduced first to structural or object-oriented languages like C and C#/Java, respectively.  With these languages, we see certain structural patterns emerge:  conditional statements, method redirection, and loops.  These work wonders in C-based languages, as well as other structural/OO languages, and so it makes sense for a developer whose primary focus is C# to import that paradigm into writing SQL.  It is also a devastating performance problem.

Here’s a concrete example:  suppose that you have to loop through a set of users and set their IsActive flag to be false if their last recorded login is earlier than three months ago.  Armed solely with structural programming tools, there’s an obvious solution here.  The obvious solution is to use two separate loops:

  1. Loop through each user in the User table, or at least those users who are currently active.
  2. For each user, we create another loop which goes through the UserLogin table.  We store as a local variable the latest login date for that user which we have found.

As far as the logic goes, this is sound reasoning.  Unfortunately, it’s also a performance hog within SQL Server if you actually try to implement this as two separate cursors.  The reason is that we’ll be scanning the User table N times (once for each user) and the UserLogin M*N times (once for each user login—for each user).  We can improve this a bit by putting filters on, like getting only active users and making M*N into m*N, where m is the number of UserLogin records associated with that user.  Nevertheless, you’re hitting the database N + M*N (or N + m*N) times.

Some people see cursors and say, “Hey, I’ll improve performance here:  I’m going to get rid of this cursor and replace it with a WHILE loop.”  Those people are wrong.  At best, WHILE loops are the same thing as cursors; at worst, you might actually write a cursor which performs better than the equivalent WHILE loop.  In either event, it’s not the appropriate solution.  The appropriate solution is to start thinking in terms of sets.

For our previous example, we can re-write the problem to be handled in one query, whose pseudocode would look a bit like:

UPDATE dbo.User
SET IsActive = 0
WHERE
	NOT EXISTS
	(
		SELECT *
		FROM dbo.UserLogin ul
		WHERE
			ul.UserID = UserID
			AND ul.LoginDate >= DATEADD(MONTH, -3, CURRENT_TIMESTAMP)
	);

In this case, we’re hitting the database one time.  With an appropriate index on dbo.UserLogin, we can make the non-existence query fast, so we’re scanning the User table and seeking against UserLogin.  Performance will be tremendously better than the RBAR solution.  And fortunately for us, almost every cursor-based solution can be re-written using set-based logic.  There are a few maintenance functions that you’d probably want to use a cursor for (and that’s how things like sp_msforeachdb work), but those should be few and far between.

Also, it’s important to note that in the Oracle world, cursors typically perform much better than in SQL Server.  This means that even a skilled Oracle developer, upon entering the world of SQL Server, could mess this up pretty badly.  But if you’re going to spend a lot of time writing T-SQL, you really need to be able to think in terms of sets.

About these ads

2 Comments »

  1. […] By Agonizing Row (RBAR, pronounced “re-bar”) was our last anti-pattern.  Today, I’m going to extend it to a term I’m coining, Query By Agonizing Query (QBAQ, […]

    Pingback by T-SQL Anti-Patterns: Query By Agonizing Query (QBAQ) | 36 Chambers - The Legendary Journeys: Execution to the max! — February 28, 2014 @ 10:05 pm

  2. Everyone loves it when folks get together and share thoughts.
    Great site, continue the good work!

    Comment by google — June 24, 2014 @ 3:19 pm


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

The Silver is the New Black Theme. Create a free website or blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

Join 94 other followers

%d bloggers like this: