Caché SQL Reference
Overview of Aggregate Functions
[Home]  [Next]
InterSystems: The power behind what matters   
Class Reference   

Functions that evaluate all of the values of a column and return a single aggregate value.
Supported Aggregate Functions
An aggregate function performs a task in relation to one or more values from a single column and returns a single value. The supported functions are:
Aggregate functions ignore fields that are NULL. For example, LIST and %DLIST do not include elements for rows in which the specified field is NULL. COUNT only counts non-null values of the specified field.
All aggregate functions support the optional DISTINCT keyword clause. This keyword limits the aggregate operation to only distinct (unique) field values. The default is to perform the aggregate operation on all non-NULL values, including duplicate values. The MIN and MAX aggregate functions support the DISTINCT keyword, although it perform no operation.
Aggregate functions support the full DISTINCT keyword clause syntax, including the optional BY(item-list) subclause. Refer to the DISTINCT clause for details.
The aggregate function DISTINCT field1 clause ignores field1 values that are NULL. This differs from the DISTINCT clause of the SELECT statement: a SELECT DISTINCT clause returns one row for the distinct NULL, just as it returns one row for each distinct field value. However, an aggregate function DISTINCT BY(field2) field1 does not ignore the distinct NULL for field2. For example, if FavoriteColors has 50 distinct values and multiple NULLs, the number of DISTINCT rows returned is 51, the COUNT(DISTINCT FavoriteColors) is 50, and the COUNT(DISTINCT BY(FavoriteColors) %ID) is 51:
       COUNT(DISTINCT FavoriteColors),
       COUNT(DISTINCT BY(FavoriteColors) %ID)
 FROM Sample.Person
Aggregate functions (with the exception of COUNT) cannot be applied to a stream field. Attempting to do so generates an SQLCODE -37 error. You can use COUNT to count stream field values, with some restrictions.
Using Aggregate Functions
An aggregate function can be used in:
An aggregate function can be specified in a DISTINCT BY clause without error, though this usage is not meaningful and always returns a single row.
An aggregate function cannot be used directly in:
However, you can supply an aggregate function value to a WHERE or HAVING clause by using a subquery. For example, to use a WHERE clause to select Age values that are less than the average Age value, you can place the AVG aggregate function in a subquery:
SELECT Name,Age,AvgAge
FROM (SELECT Name,Age,AVG(Age) AS AvgAge FROM Sample.Person)
WHERE Age < AvgAge
Combining Aggregates and Fields
Caché SQL allows you to specify an aggregate function with other SELECT items in a query. An aggregate such as COUNT(*) does not need to be in a separate query.
FROM Sample.Person
When you specify an aggregate function and specify no field select items in the select list, Caché SQL returns one row. A TOP clause is ignored, unless it is TOP 0 (return no rows):
FROM Sample.Person
WHERE Age > 75
When you specify an aggregate function and specify one or more field select items in the select list, Caché SQL returns as many rows as required for the field item:
FROM Sample.Person
WHERE Age > 75
Column Names and Aliases
By default, the column name assigned to the results of an aggregate function is Aggregate_n, where the n number suffix is the column order number, as specified in the SELECT list. Thus, the following example creates column names Aggregate_2 and Aggregate_5:
SELECT TOP 5 Home_State,COUNT(*),Name,Age,AVG(Age)
FROM Sample.Person
To specify another column name (a column alias), use the AS keyword:
SELECT COUNT(*) AS PersonCount
FROM Sample.Person,Sample.Employee
You can use a column alias to specify an aggregate field in an ORDER BY clause. The following example lists people in the order that their ages diverge from the average age:
SELECT Name,Age,
    AVG(Age) AS AvgAge,
    ABS(Age - AVG(Age)) AS RelAge
FROM Sample.Person
For further details on column aliases, refer to the SELECT statement.
The LIST, %DLIST, XMLAGG, and JSON_ARRAYAGG functions combine the values of a table column from multiple rows into a single aggregate value. By default, these aggregates list these values in the same sequence as the table rows (RowID sequence). You can use the ORDER BY clause to list these values in ascending sequence. Use of ORDER BY is subject to the following conditions:
A SELECT DISTINCT with a select-item aggregate function and a GROUP BY clause returns the same results as if the DISTINCT keyword were not present. To achieve the desired results, put the aggregate function in a subquery.
For example, you wish to return the number of distinct counts of persons in states (there are states with 4 people, there are states with 6 people, etc.). You would expect to achieve this result as follows:
FROM Sample.Person
GROUP BY Home_State
Instead, you get a person count for each state, the same as if the DISTINCT keyword were not present:
SELECT COUNT(*) AS PersonCounts
FROM Sample.Person
GROUP BY Home_State
To achieve your intended result, you need to use a subquery, as follows:
FROM (SELECT COUNT(*) AS PersonCounts FROM Sample.Person
      GROUP BY Home_State)
Row Counts
In a query that contains one or more aggregate functions the %ROWCOUNT value depends on the query:
Aggregates, Transactions, and Locking
Including an aggregate function in a query causes the query to return the current state of the data to all result set fields, including uncommitted changes to the data. Thus, an ISOLATION LEVEL READ COMMITTED setting is ignored for a query containing an aggregate function. The current state of uncommitted data is as follows:
Because aggregate functions usually involve data from a large number of rows, it is not acceptable to issue a transaction lock on all of the rows involved in an aggregate calculation. It is therefore possible that another user may be performing a transaction that modifies the data while an aggregate calculation is in process.
See Also