Skip to main content

Overview of Window Functions (SQL)

Functions that specify a per-row "window frame" for calculating aggregates and ranking.

Window Functions and Aggregate Functions

A window function operates on the rows selected by a SELECT query after the WHERE, GROUP BY, and HAVING clauses have been applied.

A window function combines the values of a field (or fields) from a group of rows and return a value for each row in a generated column in the result set.

While window functions are like aggregate functions in that they combine results from multiple rows, they are distinct from aggregates in that they do not combine the rows themselves. However, the aggregate functions AVG(), MIN(), MAX(), and SUM() can also be invoked as window functions. Within this context, each row receives the result of calling the function on the group of rows in its corresponding window frame.

Window Functions Syntax

A window function is specified as a select-item in a SELECT query. A window function can also be specified in the ORDER BY clause of a SELECT query.

A window function performs a task in relation to a per-row window specified by a PARTITION BY clause, an ORDER BY clause, and a ROWS clause, and returns a value for each row. All three of these clauses are optional, but if specified must be specified in the orders shown in the following syntax:

window-function() OVER (
                      [ PARTITION BY partfield ]
                      [ ORDER BY orderfield ]
                      [ ROWS framestart ] | [ ROWS BETWEEN framestart AND frameend ]
                      )

where framestart and frameend can be:

UNBOUNDED PRECEDING |
offset PRECEDING |
CURRENT ROW |
UNBOUNDED FOLLOWING |
offset FOLLOWING

Keywords and window function names are not case-sensitive.

A Simple Example

CityTable contains rows with the following values:

Name City
Able New York
Betty Boston
Charlie Paris
Davis Boston
Eve Paris
Francis Paris
George London
Beatrix Paris

The ROW_NUMBER() window function assigns a unique sequential integer to each row based on the specified window.

SELECT Name,City,ROW_NUMBER() OVER (PARTITION BY City) FROM CityTable

This example partitions the rows by the City value and returns the following:

Name City Window_3
Able New York 1
Betty Boston 1
Charlie Paris 1
Davis Boston 2
Eve Paris 2
Francis Paris 3
George London 1
Beatrix Paris 4
SELECT Name,City,ROW_NUMBER() OVER (ORDER BY City) FROM CityTable

This example treats all the rows as a single partition. It orders the rows by the City value and returns the following:

Name City Window_3
Able New York 4
Betty Boston 1
Charlie Paris 5
Davis Boston 2
Eve Paris 6
Francis Paris 7
George London 3
Beatrix Paris 8
SELECT Name,City,ROW_NUMBER() OVER (Partition BY City ORDER BY Name) FROM CityTable

This example partitions the rows by the City value, orders each City partition by Name values, and returns the following:

Name City Window_3
Able New York 1
Betty Boston 1
Charlie Paris 2
Davis Boston 2
Eve Paris 3
Francis Paris 4
George London 1
Beatrix Paris 1

NULL

The PARTITION BY clause treats rows with fields that are NULL (have no assigned value) as a partitioned group. For example, ROW_NUMBER() OVER (Partition BY City) would assign rows with no City value sequential integers, just as it assigns sequential integers to rows with a City value of 'Paris'.

The ORDER BY clause treats rows with fields that are NULL (have no assigned value) as ordered before any assigned value (having the lowest collation value). For example, ROW_NUMBER() OVER (ORDER BY City) would first assign sequential integers to rows with no City value, then assign sequential integers to rows with a City value in collation sequence.

The ROWS clause treats with fields that are NULL (have no assigned value) as having a value of zero. For example, SUM(Scores) OVER (ORDER BY Scores ROWS 1 PRECEDING)/2 would assign 0.00 to all rows with no Scores value ((0 + 0) / 2), and handle the first Scores value by adding 0 to it then dividing by 2.

Supported Window Functions

The following window functions are supported:

  • AVG(field) — assigns the average of the values in the field column for rows within the specified window frame to all rows in that window frame. For example, AVG(Salary) OVER (PARTITION BY Department) FROM Company.Employee could be used to compare each employee’s salary against the average of the salaries earned by employees within that employee’s department. AVG() supports the ROWS clause. Refer to the AVG() function’s reference page for further details on use.

  • FIRST_VALUE(field) — assigns the value of the field column for the first row (ROW_NUMBER()=1) within the specified window frame to all rows in that window frame. For example, FIRST_VALUE(Country) OVER (PARTITION BY City). FIRST_VALUE() supports the ROWS clause. Note that NULL collates before all values, so if the value of field in the first row is NULL, all of the rows in the window will be NULL.

  • LAST_VALUE(field) — assigns the value of the field column for the last row within the specified window frame to all rows in that window frame. LAST_VALUE() supports the ROWS clause.

  • LAG(field[, offset[, default]]) — assigns the value of the field column for the row that is offset rows before the given row within the specified window frame. If no offset is specified, the function assigns the value of the field column 1 row before the given row by default. By default, LAG() will assign the value NULL if the given row does not have a row offset rows before it within its window frame. The user has the option to assign an alternate value under these conditions by including a value default.

  • LEAD(field[, offset[, default]]) — assigns the value of the field column for the row that is offset rows after the given row within the specified window frame. If no offset is specified, the function assigns the value of the field column 1 row after the given row by default. By default, LEAD() will assign the value NULL if the given row does not have a row offset rows after it within its window frame. The user has the option to assign an alternate value under these conditions by including a value default.

  • MAX(field) — assigns the maximum value of the field column within the specified window frame to all rows in that window frame. For example, MAX(Salary) OVER (PARTITION BY Department) FROM Company.Employee could be used to compare each employee’s salary against the highest salary earned by an employee within that employee’s department. MAX() supports the ROWS clause. Refer to the MAX() function’s reference page for further details on use.

  • MIN(field) — assigns the minimum value of the field column within the specified window frame to all rows in that window frame. For example, MIN(Salary) OVER (PARTITION BY Department) FROM Company.Employee could be used to compare each employee’s salary against the lowest salary earned by an employee within that employee’s department. MIN() supports the ROWS clause. Refer to the MIN() function’s reference page for further details on use.

  • NTH_VALUE(field, n) — assigns the value of the field column for row number n within the specified window frame (counting from 1) to all rows in that window frame. NTH_VALUE() supports the ROWS clause.

  • PERCENT_RANK() — assigns a ranking percentage as a fractional number between 0 and 1 (inclusive) to each row within the same window frame. Ranking percentages can include duplicate values if multiple rows contain the same value for the window function field.

  • RANK()— assigns a ranking integer to each row within the same window frame, starting with 1. Ranking integers can include duplicate values if multiple rows contain the same value for the window function field.

  • ROW_NUMBER() — assigns a unique sequential integer to each row within the same window frame, starting with 1. If multiple rows contain the same value for the window function field, each row is assigned a unique sequential integer.

  • SUM(field) — assigns the sum of the values of the field column within the specified window frame to all rows in that window frame. SUM() supports the ROWS clause. Refer to the SUM() function’s reference page for further details on use.

The following example compares the values returned by the ORDER BY clause in these window functions:

SELECT Name,City,ROW_NUMBER() OVER (ORDER BY City) AS RowNum,
  RANK() OVER (ORDER BY City) AS RankNum,
  PERCENT_RANK() OVER (ORDER BY City) AS RankPct
  FROM CityTable ORDER BY City

This example treats all the rows as a single partition. It orders the rows by the City value and returns the following:

Name City RowNum RankNum RankPct
Harriet   1 1 0
Betty Boston 2 2 .1111111111111111111
Davis Boston 3 2 .1111111111111111111
George London 4 4 .3333333333333333333
Able New York 5 5 .4444444444444444444
Charlie Paris 6 6 .5555555555555555555
Eve Paris 7 6 .5555555555555555555
Francis Paris 8 6 .5555555555555555555
Beatrix Paris 9 6 .5555555555555555555
Jackson Rome 10 10 1

The ROWS Clause

The ROW clause can be used with the AVG(), FIRST_VALUE(), LAST_VALUE(), NTH_VALUE(), MIN(), MAX(), and SUM() windows functions. It can be specified for the other windows functions, but performs no operation (result is the same with or without the ROWS clause).

The ROWS clause has two syntactic forms:

ROWS framestart
ROWS BETWEEN framestart AND frameend

framestart and frameend have five possible values:

UNBOUNDED PRECEDING                    /* start at beginning of the current partition */
offset PRECEDING   /* start offset number of rows preceding the current row */
CURRENT ROW                                      /* start at the current row */
offset FOLLOWING  /* continue offset number of rows following the current row */
UNBOUNDED FOLLOWING                  /* continue to the end of the current partition */

ROWS clause syntax can specify a range in either direction. For example, ROWS BETWEEN UNBOUNDED PRECEDING AND 1 FOLLOWING and ROWS BETWEEN 1 FOLLOWING AND UNBOUNDED PRECEDING are exactly equivalent.

The ROWS framestart syntax defaults to CURRENT ROW as the unspecified second bound of the range. Therefore the following are equivalent:

ROWS UNBOUNDED PRECEDING ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
ROWS 1 PRECEDING ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
ROWS CURRENT ROW ROWS BETWEEN CURRENT ROW AND CURRENT ROW
ROWS 1 FOLLOWING ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING
ROWS UNBOUNDED FOLLOWING ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING

The default if the ROWS clause is not specified is ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.

ROWS Clause Example

The following query returns scores that contain a lot of “noise” (random variation). The ROWS clause is used to “smooth” these variations by summing each score with the one immediately preceding it and the one immediately following it in collation sequence, then dividing by 3 to get a rolling average score:

SELECT Item,Score,SUM(Score)
  OVER (ORDER BY Score ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING)/3
  AS CohortScore FROM Sample.TestScores

The operation is: (PrecedingScore + CurrentScore + FollowingScore)/3. Note that the bottom and top CohortScore values will not be accurate, because they are adding 0 to two Score values, then dividing by 3: (0 + CurrentScore + FollowingScore)/3 and (PrecedingScore + CurrentScore + 0)/3.

Using Window Functions

An window function can be used in:

  • SELECT list as a listed select-item.

    A window function cannot be embedded in a subquery or an aggregate function in the select-item list.

  • ORDER BY clause.

A window function cannot be used in and ON, WHERE, GROUP BY, or HAVING clause. Attempting to do so results in an SQLCODE -367 error.

Column Names and Aliases

By default, the column name assigned to the results of a window function is Window_n, where the n number suffix is the column order number, as specified in the SELECT list. Thus, the following example creates column names Window_3 and Window_6:

SELECT Name,State,ROW_NUMBER() OVER (PARTITION BY State),Age,AVG(Age),ROW_NUMBER() OVER (ORDER BY Age)
FROM Sample.Person

To specify another column name (a column alias), use the AS keyword:

SELECT Name,State,ROW_NUMBER() OVER (PARTITION BY State) AS StateRow,Age
FROM Sample.Person

You can use a column alias to specify a window field in an ORDER BY clause:

SELECT Name,State,ROW_NUMBER() OVER (PARTITION BY State) AS StateRow,Age
FROM Sample.Person
ORDER BY StateRow

You cannot use a default column name (such as Window_3) in an ORDER BY clause.

For further details on column aliases, refer to the SELECT statement.

With ORDER BY

Because an ORDER BY clause is applied to the query result set after window functions are evaluated, ORDER BY does not affect the values assigned by a select-item window function.

See Also


FeedbackOpens in a new tab