Caché SQL Reference
GROUP BY
|
|
A SELECT clause that groups the resulting rows of a query according to one or more columns.
Synopsis
SELECT ...
GROUP BY field
The
GROUP BY clause takes the resulting rows of a query and breaks them up into individual groups according to one or more database columns. When you use
SELECT in conjunction with
GROUP BY, one row is retrieved for each distinct value of the
GROUP BY fields.
GROUP BY treats fields with NULL (no specified value) as a separate distinct value group.
The
GROUP BY clause is conceptually similar to the Caché extension %FOREACH, but
GROUP BY operates on an entire query, while %FOREACH allows selection of aggregates on sub-populations without restricting the entire query population.
The simplest form of a
GROUP BY clause specifies a single field, such as
GROUP BY Home_State. The field must be specified by its column name. You cannot specify a field by column alias or by column number. You cannot specify an aggregate field; attempting to do so generates an SQLCODE -19 error.
Specifying a literal in a
GROUP BY clause returns 1 row; which row is returned is indeterminate. Thus, specifying 7, 'Chicago', '', 0, or NULL all return 1 row.
Aggregate Functions with GROUP BY and DISTINCT BY
SELECT Home_State,COUNT(Home_State)
FROM Sample.Person
GROUP BY Home_State
The
DISTINCT BY clause is applied after aggregate functions are calculated. In the following example, the
COUNT aggregate function counts the number of rows in the entire table:
SELECT DISTINCT BY(Home_State) Home_State,COUNT(Home_State)
FROM Sample.Person
In order to calculate an aggregate function for the entire table, rather than a
GROUP BY group, you can specify a
select-item subquery:
SELECT Home_State,(SELECT COUNT(Home_State) FROM Sample.Person)
FROM Sample.Person
GROUP BY Home_State
A
GROUP BY clause should not be used with a
DISTINCT clause when the select list consists of an aggregate field. For example, the following query is
intended to return the distinct numbers of people who share the same Home_State:
/* This query DOES NOT apply the DISTINCT keyword */
/* It is provided as a cautionary example */
SELECT DISTINCT COUNT(*) AS mynum
FROM Sample.Person
GROUP BY Home_State
ORDER BY mynum
This query did not return the expected results because it did not apply the DISTINCT keyword. To apply both a
DISTINCT aggregate and a
GROUP BY clause, use a subquery as shown in the following example:
SELECT DISTINCT *
FROM (SELECT COUNT(*) AS mynum
FROM Sample.Person
GROUP BY Home_State) AS Sub
ORDER BY Sub.mynum
This example successfully returns the distinct numbers of people who share the same Home_State. For instance, if any Home_State is shared by 8 people, the query returns an 8.
Collation, Letter Case, and Optimization
By default,
GROUP BY groups together string data type fields based on the
collation specified for the
field when it was created. Caché has a
default string collation, which can be set for each namespace; the initial string collation default for all namespaces is SQLUPPER, which is not case-sensitive. Therefore, commonly,
GROUP BY collation is not case-sensitive unless otherwise specified.
GROUP BY groups together the values of a field with SQLUPPER collation based on their uppercase letter collation. Field values that differ only in letter case are grouped together. Grouped field values are returned in all uppercase letters. To group values by original letter case, or to display the returned values for a grouped field in their original letter case, use the
%EXACT collation function. This is shown in the following examples, which assume that Sample.Person contains records with a Home_City field with SQLUPPER collation and values of ‘New York’ and ‘new york’:
SELECT Home_City FROM Sample.Person GROUP BY Home_City
/* groups together Home_City values by their uppercase letter values
returns the name of each grouped city in uppercase letters.
Thus, 'NEW YORK' is returned. */
SELECT %EXACT(Home_City) FROM Sample.Person GROUP BY Home_City
/* groups together Home_City values by their uppercase letter values
returns the name of each grouped city in original letter case.
Thus, 'New York' or 'new york' may be returned, but not both. */
SELECT Home_City FROM Sample.Person GROUP BY %EXACT(Home_City)
/* groups together Home_City values by their original letter case
returns the name of each grouped city in original letter case.
Thus, both 'New York' and 'new york' are returned.
Optimization is not used. */
WRITE $SYSTEM.SQL.SetFastDistinct(1)
This optimization takes advantage of indices for the selected field(s). It is therefore only meaningful if an index exists for one or more of the selected fields. It collates field values as they are stored in the index; alphabetic strings are returned in all uppercase letters. You can set this system-wide option, then override it for specific queries by using the %EXACT collation function to preserve letter case.
Specifying a GROUP BY clause causes a
cursor-based Embedded SQL query to not set the
%ROWID variable. %ROWID is not set even when GROUP BY does not limit the rows returned. This is shown in the following example:
SET %ROWID=999
&sql(DECLARE EmpCursor CURSOR FOR
SELECT Name, Home_State
INTO :name,:state FROM Sample.Person
WHERE Home_State %STARTSWITH 'M'
GROUP BY Home_State)
&sql(OPEN EmpCursor)
FOR { &sql(FETCH EmpCursor)
QUIT:SQLCODE
WRITE !,"RowID: ",%ROWID," row count: ",%ROWCOUNT
WRITE " Name=",name," State=",state
}
&sql(CLOSE EmpCursor)
This change of query behavior only applies to cursor-based Embedded SQL
SELECT queries. Dynamic SQL
SELECT queries and non-cursor Embedded SQL
SELECT queries never set %ROWID.
Transaction Committed Changes
A query containing a
GROUP BY clause does not support READ COMMITTED isolation level. In a transaction defined as READ COMMITTED, a
SELECT statement without a
GROUP BY clause returns only data modifications that have been committed; in other words, it returns the state of the data before the current transaction. A
SELECT statement with a
GROUP BY clause returns all data modifications made, whether or not they have been committed.