This is part ten in a series on window functions in SQL Server.

The Need for Speed

SQL Server typically operates in row mode, which means that an operator processes one row at a time. This sounds inefficient, but tends to work out pretty well in practice. However, something which may work out even better is to process more than one row at a time, especially when the number of rows gets to be fairly large. Enter batch mode.

Batch mode was introduced in SQL Server 2012 alongside non-clustered columnstore indexes. It became interesting in SQL Server 2016 and very interesting in SQL Server 2019. That’s because 2016 introduced writable clustered columnstore indexes and 2019 gives us batch mode outside of columnstore indexes.

Batch mode lets us work on up to 900 rows of data or 64KB of data at a time per operator. Each operator also evaluates metadata only once per batch versus once per row. Finally, a function operating on a batch can handle all up-to-900 rows at once.

You Must Be This Tall to Ride the Ride

Before we can use batch mode, we need to pass certain requirements. If you are using a columnstore index, the requirements are:

  • Batch mode cost must be less than row mode cost

Not too onerous, eh? If you are using SQL Server 2019 and have a rowstore index, the requirements are:

  • At least 2^17 rows (131,072) in at least one table in the query
  • At least one operator which benefits from batch mode
  • At least one input of a batch has at least 2^17 rows
  • Batch mode cost is less than row mode cost

In other words, if I have a 1 billion row table but only get 1800 rows from it, I won’t be able to use batch mode on the rowstore index because I need at lest 131,072 rows to pass through an operator. This limits batch mode on rowstore to larger queries, and that’s on purpose: the idea is that we know how row mode on rowstore indexes operates and it does a really good job on smaller data sets. Batch mode might potentially be slower for whatever reason on tiny datasets, and so the optimizer has fairly stringent limts.

The Cheater’s Guide to Batch Mode

If you are using SQL Server 2016 or 2017 and don’t want to work with a columnstore index, you can still take advantage of them to allow for batch mode queries using your rowstore indexes. In particular, you create a filtered columnstore index on an impossible condition. See, the optimizer doesn’t actually check if the columnstore index is the one you use; it checks if the columnstore index exists and assumes that’s good enough for rowstore, because what kind of dummy would put a columnstore index on a table which was horribly unsuited for it?

Here’s an example of such a query:

WHERE ID = -1;

Create that index and you’re in Batch Mode City, where the cars zoom past and everybody needs to wear sunglasses even at night.

Batches of Data

To show off batch mode, let’s use a table which might actually benefit from it. My Warehouse.ColdRoomTemperature table in WideWorldImporters is about 40 million rows large. Let’s suppose that I want to run a window function which partitions by cold room sensor number. Here’s a query which does just that. Note that I do have an optimizer hint to force this into SQL Server 2017 mode, which guarantees no batch mode because the table does not have a columnstore index on it.

WITH records AS
        ROW_NUMBER() OVER (
            PARTITION BY c.ColdRoomSensorNumber
            ORDER BY c.ColdRoomTemperatureID) AS rownum
    FROM Warehouse.ColdRoomTemperatures_Archive c
FROM records r
    r.rownum = 0

This query returns zero rows because it’s not possible to have a rownum value of 0, but it sure takes its sweet time doing so.

On the plus side, it only took 4ms to compile, so we’ve got that going for us.

Let’s remove that option and run the query again.

The sweet smell of victory.

This particular query took about 1/6th the time to run as the prior, with about 1/3 the total CPU and half the I/O. In practice, on a good query, you can cut the time down to about 1/3 of what it would run in row mode. It’s not an order of magnitude better, but it is a nice chunk of time savings.

Now let’s see how things look against the invoice lines table. Note that this is not the stock WideWorldImporters invoice lines table; I’ve expanded it out to 1.75 million rows so we can see some good examples of batch mode in process.

First up, here’s your daily reminder that just because you can have oodles of partitions doesn’t mean you should.

	ROW_NUMBER() OVER (PARTITION BY il.InvoiceID ORDER BY il.InvoiceLineID) AS rownum1,
	ROW_NUMBER() OVER (PARTITION BY il.StockItemID ORDER BY il.InvoiceLineID) AS rownum2,
	ROW_NUMBER() OVER (PARTITION BY il.PackageTypeID ORDER BY il.InvoiceLineID) AS rownum3,
	ROW_NUMBER() OVER (PARTITION BY il.TaxAmount ORDER BY il.InvoiceLineID) AS rownum4,
	ROW_NUMBER() OVER (PARTITION BY il.LineProfit ORDER BY il.InvoiceLineID) AS rownum5
FROM Sales.InvoiceLines il;
On the plus side, all of the aggregates and sorts are in batch mode…

If you have one partition, you’re more likely to get one window aggregate.

WITH records AS
		ROW_NUMBER() OVER (PARTITION BY il.InvoiceID ORDER BY il.InvoiceLineID) AS rownum,
		RANK() OVER (PARTITION BY il.InvoiceID ORDER BY il.InvoiceLineID) AS rk,
	FROM Sales.InvoiceLines il
FROM records
	rownum = 0;
No sort, no sweat.


Window functions can use batch mode for quick gains with zero developer effort. But do understand that not all queries will be eligible for batch mode.


2 thoughts on “A Slice of Time: Batch Mode Power-Ups

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s