Not too long ago, I had an interesting discussion about how to troubleshoot performance problems in large procedures. By “large procedure,” I’m talking about something which can be thousands of lines of code, or a procedure which might have dozens of potential choke points. This is not an exhaustive article, but instead gives an idea of thought processes and acts as a starting point.
The first thing I want to do is format the code. I’m going to use a tool to format code; that way, I know that I will be able to understand the code more quickly and be less likely to make a mistake due to improper tabbing, multiple columns on a single line, etc. This obviously won’t improve any performance problems and I might need to maintain historical formatting when I make my changes, but I’ll at least make the changes locally for me to understand.
Once I have formatted code, the next step is to do a quick code review and look for certain patterns. Any cursors or WHILE loops I see become automatically suspect. I’m also on the lookout for Common Table Expression chains (especially ones which aggregate data), temp table chains (taking data from one temp table into another into another, etc.), and user-defined function calls. Looking for cursors and WHILE loops is obvious: procedural code tends to perform poorly in SQL Server, so those sections of code can act as performance choke points. Common Table Expression chains can lead to lazy spooling, in which we read a subset of data over and over, and that subset of data can end up in tempdb, which is the single biggest point of contention on a typical SQL Server. Temp table chains follow a similar logic: we’re using tempdb over and over here, which might lead to a contention problem, or possibly a problem in which we are not filtering the base data sets well enough. Finally, I look at user-defined functions because they tend to perform significantly less efficiently than equivalent queries without functions. These are the types of things I’m looking for when I do my initial review, and when I find them, I mark their locations.
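As a rough sketch of what those review patterns look like in practice (table, column, and function names here are hypothetical, not from any particular procedure):

```sql
-- 1) A WHILE loop processing rows one at a time: automatically suspect.
DECLARE @ID INT = 1;
WHILE @ID <= (SELECT MAX(OrderID) FROM dbo.Orders)
BEGIN
    UPDATE dbo.Orders SET Processed = 1 WHERE OrderID = @ID;
    SET @ID = @ID + 1;
END;

-- 2) A temp table chain: each step pushes data back through tempdb.
SELECT o.CustomerID, SUM(o.Amount) AS Total
INTO #Stage1
FROM dbo.Orders o
GROUP BY o.CustomerID;

SELECT s.CustomerID, s.Total
INTO #Stage2
FROM #Stage1 s
WHERE s.Total > 1000;

-- 3) A scalar UDF in the WHERE clause: typically executes once per row.
SELECT o.OrderID
FROM dbo.Orders o
WHERE dbo.fn_CustomerRegion(o.CustomerID) = N'EMEA';
```

None of these is automatically wrong, but each gets a mark so I can come back to it once I see what the execution plan says.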
Next, I want to take a look at an execution plan. Ideally, I want to use SQL Sentry Plan Explorer to run the query and collect all of the relevant plan metrics, including CPU and I/O per operation. Because this is a very large procedure, I’d expect there to be a fairly large number of independent operations, and I want to focus my efforts on the highest-cost offenders. Once I narrow down to the one or two calls which are most expensive, I’m going to look for a few specific things first. First of all, I want to look for lazy spooling, as that tends to be a performance killer. When I see lazy spooling, I almost always want to re-write the relevant query in such a way as to remove the spooling. I also want to look for key lookups, which tell me that indexing might need to be updated to handle this query. That could mean one of three things: adding a new index, modifying an existing index, or modifying the query to use an existing index (for example, if I know a query should use a particular filtered index but it’s using a different index which doesn’t have all relevant columns). Key lookups don’t hurt very much when we’re only looking at a few rows in an infrequently-called procedure, but high volume or high frequency can easily harm an otherwise-acceptable procedure. I also want to see if I can find index scans where I expect seeks—or the other way around. If I know that I am retrieving less than 0.5% of the data in a non-trivial table, I want a seek. If I know I’m retrieving more than 20% of the data, I want a scan. If I’m seeing the wrong type of operation, that could cause performance to suffer.
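When a key lookup does turn out to be the problem, the typical fix is to make an existing index cover the query. A minimal sketch, assuming a hypothetical `dbo.Orders` table with an existing index on `CustomerID`:

```sql
-- Add the looked-up columns as INCLUDEd columns so the seek covers
-- the query and the key lookup disappears from the plan.
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID
    ON dbo.Orders (CustomerID)
    INCLUDE (OrderDate, Amount)   -- the columns the lookup was fetching
    WITH (DROP_EXISTING = ON);    -- modify the existing index in place
```

Included columns widen the index, so this is a trade-off: worth it for a hot query, not necessarily for one that runs once a day.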
The next step for execution plans is looking at join types. If I expect a nested loops join but am getting a hash match, the hash operation might end up spilling over to tempdb, which harms performance. I also want to look for sort operations, as those tend to be relatively slow to begin with, and they get much worse if they spill to tempdb. Really, tempdb is a common performance killer, so it’s not surprising to see some of the most expensive, performance-killing operations involve tempdb in some fashion.
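One quick way to test a join-type hypothesis is to force the strategy with a query hint and compare the plans and runtimes. This is a diagnostic technique, not something I generally leave in production code; the query below is a hypothetical example:

```sql
-- Force a nested loops join to see whether the optimizer's choice
-- of hash match (and its potential tempdb spill) was actually better.
SELECT o.OrderID, c.CustomerName
FROM dbo.Orders o
    INNER JOIN dbo.Customers c
        ON o.CustomerID = c.CustomerID
WHERE o.OrderDate >= '2016-01-01'
OPTION (LOOP JOIN);   -- or HASH JOIN / MERGE JOIN for comparison
```

If the hinted plan is dramatically faster, that usually points at stale statistics or a cardinality estimate problem rather than a need for a permanent hint.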
After looking at the execution plan, I want to go back to my initial code walkthrough because execution plans can sometimes lie. Execution plans lie most often when dealing with cursors and functions, because they tend to show you one call rather than all calls, meaning that they severely under-count how much CPU and I/O function calls and cursors/WHILE loops really use. This means that I want to look at ways of removing those cursors and functions, even if the execution plan tells me that they aren’t very expensive. If performance doesn’t change, I can always put them back in where they were, but inlining a table-valued function can improve performance of a query by a factor of 3-9x and could potentially improve scalar UDFs by much more (if they’re in the WHERE clause and make separate I/O operations like querying from some other table).
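Inlining a function usually means replacing the per-row call with a join or APPLY that the optimizer can cost as part of the main plan. A hedged before-and-after sketch, with a hypothetical scalar UDF that does its own I/O:

```sql
-- Before: a scalar UDF that queries another table, once per row.
SELECT o.OrderID, dbo.fn_CustomerRegion(o.CustomerID) AS Region
FROM dbo.Orders o;

-- After: the same lookup inlined as a join, which the optimizer
-- can reorder, index, and parallelize like any other query.
SELECT o.OrderID, c.Region
FROM dbo.Orders o
    INNER JOIN dbo.Customers c
        ON o.CustomerID = c.CustomerID;
```

The plan for the first version hides the function's cost; the plan for the second shows it, which is exactly why the rewrite is worth trying even when the plan claims the function is cheap.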
The easiest way to make these changes is to simplify the procedure for testing purposes. That means hardcoding parameters or incoming data sets and stubbing out child procedure calls. Obviously this technique can be somewhat limiting, but it does allow you to focus on the critical section of code without needing to immerse yourself in other points of complexity. Take care of one critical section at a time, create unit tests to ensure that refactoring doesn’t break existing calls, and focus on the pain points. That’s the method I use to tackle a very large problem.
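A minimal sketch of that kind of test harness, with hypothetical parameter names and a hypothetical child procedure stubbed out as a temp table:

```sql
-- Hardcode what used to be parameters so the critical section
-- runs standalone.
DECLARE
    @StartDate DATE = '2016-01-01',   -- was a procedure parameter
    @EndDate   DATE = '2016-06-30';   -- was a procedure parameter

-- Stub: instead of EXEC dbo.LoadCustomerStaging (a hypothetical
-- child procedure), load a small, known data set directly.
CREATE TABLE #CustomerStaging (CustomerID INT, Region NVARCHAR(30));
INSERT INTO #CustomerStaging VALUES (1, N'EMEA'), (2, N'AMER');

-- ...then exercise just the critical section against the stub.
SELECT o.OrderID
FROM dbo.Orders o
    INNER JOIN #CustomerStaging cs
        ON o.CustomerID = cs.CustomerID
WHERE o.OrderDate >= @StartDate
    AND o.OrderDate < @EndDate;
```

Because the inputs are fixed and small, every plan change I see afterward comes from my rewrite rather than from data drift.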
What do you do? I’m interested in hearing how other people’s methods may differ from mine, as I consider this a hard problem and am always up for learning more.