A Slice of Time: Overview

This is part one in a series on window functions in SQL Server.

What is a Window Function?

A window function is, quite simply, a function applied over a window of data. If that doesn’t sound impressively reductionist, I don’t know what will. So let’s unpack this a little bit further.

A window function operates over a window of data. Here’s an example of a tumbling window.

An example of a tumbling window.

What we see here is a set of events running over time. We have a window of a fixed size, perhaps one minute in length. We capture all of the events during this stretch and operate on the elements within each window.

What is a Window?

Another way to think about a window is that it is the context of operation for a function call. Suppose we have a dataset and we want to perform some analysis for each invoice ID.

Breaking out windows by invoice ID.

The way that we define the context of our window is using the OVER() clause in SQL.

What Constitutes a Window Function?

Speaking of OVER() there are four things which are critical to a window function. They are:

  1. Use of the OVER() clause, which defines the window we use.
  2. A PARTITION BY clause, which allows us to filter rows in a window function.
  3. An ORDER BY clause provides sorting when we need an ordered set, but it also provides meaning to dataframes when aggregating data, as we’ll see in future posts.
  4. A frame, using ROWS or RANGE, to filter within a partition.

Here’s a sample of what a window function looks like:

SELECT
	Col1,
	Col2,
	WINDOW_FUNCTION() OVER(
			PARTITION BY Col1
			ORDER BY Col2
			ROWS BETWEEN UNBOUNDED PRECEDING
				AND CURRENT ROW
		) AS wf
FROM dbo.Table;

Our window function uses the OVER() clause to define that we want to PARTITION BY column 1, meaning that as column 1’s value changes, we create new windows. We sort within each window by column 2. Finally, we have a frame which includes all rows from the beginning of our window up to and including the current row.

Why Should I Care?

Window functions can solve a variety of use cases in an efficient manner. Over the course of this series, I’ll show you how we can use window functions to solve the following problems:

  • Deleting duplicate rows in a table
  • Calculating running totals
  • Calculating moving averages
  • Calculating percents of a total
  • Getting the latest N items from a table
  • Calculating the distinct number of customers who have purchased a product over time

In addition to these, we’ll cover a whole lot more.