Caché SQL Reference
GROUP BY
[Back] [Next]
   
Server:docs2
Instance:LATEST
User:UnknownUser
 
-
Go to:
Search:    

A SELECT clause that groups the resulting rows of a query according to one or more columns.
Synopsis
SELECT ...
GROUP BY field
Arguments
field One or more fields from which data is being retrieved. Either a single field name or a comma-separated list of field names.
Description
GROUP BY is a clause of the SELECT statement. The optional GROUP BY clause appears after the FROM clause and the optional WHERE clause, and before the optional HAVING and ORDER BY clauses.
The GROUP BY clause takes the resulting rows of a query and breaks them up into individual groups according to one or more database columns. When you use SELECT in conjunction with GROUP BY, one row is retrieved for each distinct value of the GROUP BY fields. GROUP BY treats fields with NULL (no specified value) as a separate distinct value group.
The GROUP BY clause is conceptually similar to the Caché extension %FOREACH, but GROUP BY operates on an entire query, while %FOREACH allows selection of aggregates on sub-populations without restricting the entire query population.
Specifying a Field
The simplest form of a GROUP BY clause specifies a single field, such as GROUP BY Home_State. The field must be specified by its column name. You cannot specify a field by column alias or by column number. You cannot specify an aggregate field; attempting to do so generates an SQLCODE -19 error.
Valid field values include the following: A column name (GROUP BY Home_State); an %ID (which returns all rows); a scalar function specifying a column name (GROUP BY ROUND(Age,-1)); a collation function specifying a column name (GROUP BY %EXACT(Home_City)).
A GROUP BY clause can use the arrow syntax (–>) operator to specify a field in a table that is not the base table. For example: GROUP BY Company->Name. For further details, refer to Implicit Joins in Using Caché SQL.
Specifying a literal in a GROUP BY clause returns 1 row; which row is returned is indeterminate. Thus, specifying 7, 'Chicago', '', 0, or NULL all return 1 row.
Aggregate Functions with GROUP BY and DISTINCT BY
The GROUP BY clause is applied before aggregate functions are calculated. In the following example, the COUNT aggregate function counts the number of rows in each GROUP BY group:
SELECT Home_State,COUNT(Home_State)
FROM Sample.Person
GROUP BY Home_State
 
The DISTINCT BY clause is applied after aggregate functions are calculated. In the following example, the COUNT aggregate function counts the number of rows in the entire table:
SELECT DISTINCT BY(Home_State) Home_State,COUNT(Home_State)
FROM Sample.Person
 
In order to calculate an aggregate function for the entire table, rather than a GROUP BY group, you can specify a select-item subquery:
SELECT Home_State,(SELECT COUNT(Home_State) FROM Sample.Person)
FROM Sample.Person
GROUP BY Home_State
 
A GROUP BY clause should not be used with a DISTINCT clause when the select list consists of an aggregate field. For example, the following query is intended to return the distinct numbers of people who share the same Home_State:
/* This query DOES NOT apply the DISTINCT keyword */
/* It is provided as a cautionary example  */
SELECT DISTINCT COUNT(*) AS mynum
FROM Sample.Person 
GROUP BY Home_State
ORDER BY mynum
 
This query did not return the expected results because it did not apply the DISTINCT keyword. To apply both a DISTINCT aggregate and a GROUP BY clause, use a subquery as shown in the following example:
SELECT DISTINCT *
FROM (SELECT COUNT(*) AS mynum
      FROM Sample.Person 
      GROUP BY Home_State) AS Sub
ORDER BY Sub.mynum
 
This example successfully returns the distinct numbers of people who share the same Home_State. For instance, if any Home_State is shared by 8 people, the query returns an 8.
Collation, Letter Case, and Optimization
By default, GROUP BY groups together string data type fields based on the collation specified for the field when it was created. Caché has a default string collation, which can be set for each namespace; the initial string collation default for all namespaces is SQLUPPER, which is not case-sensitive. Therefore, commonly, GROUP BY collation is not case-sensitive unless otherwise specified.
GROUP BY groups together the values of a field with SQLUPPER collation based on their uppercase letter collation. Field values that differ only in letter case are grouped together. Grouped field values are returned in all uppercase letters. To group values by original letter case, or to display the returned values for a grouped field in their original letter case, use the %EXACT collation function. This is shown in the following examples, which assume that Sample.Person contains records with a Home_City field with SQLUPPER collation and values of ‘New York’ and ‘new york’:
SELECT Home_City FROM Sample.Person GROUP BY Home_City
/* groups together Home_City values by their uppercase letter values
   returns the name of each grouped city in uppercase letters. 
   Thus, 'NEW YORK' is returned.                          */
 
SELECT %EXACT(Home_City) FROM Sample.Person GROUP BY Home_City
/* groups together Home_City values by their uppercase letter values
   returns the name of each grouped city in original letter case. 
   Thus, 'New York' or 'new york' may be returned, but not both. */
 
SELECT Home_City FROM Sample.Person GROUP BY %EXACT(Home_City)
/* groups together Home_City values by their original letter case
   returns the name of each grouped city in original letter case. 
   Thus, both 'New York' and 'new york' are returned.
   Optimization is not used. */
 
You can configure optimization for queries that contain a GROUP BY clause using the Management Portal. Select [Home] > [Configuration] > [General SQL Settings]. View and set the DISTINCT Optimization Turned ON option. (This optimization also works for the DISTINCT clause.) The default is “Yes”. When this optimization is in effect, grouping of alphabetic values is done using their uppercase letter collation. For further details, refer to SQL configuration settings described in Caché Advanced Configuration Settings Reference.
You can also set this system-wide option to 1 or 0 with the $SYSTEM.SQL.SetFastDistinct() method:
  WRITE $SYSTEM.SQL.SetFastDistinct(1)
This optimization takes advantage of indices for the selected field(s). It is therefore only meaningful if an index exists for one or more of the selected fields. It collates field values as they are stored in the index; alphabetic strings are returned in all uppercase letters. You can set this system-wide option, then override it for specific queries by using the %EXACT collation function to preserve letter case.
%ROWID
Specifying a GROUP BY clause causes a cursor-based Embedded SQL query to not set the %ROWID variable. %ROWID is not set even when GROUP BY does not limit the rows returned. This is shown in the following example:
   SET %ROWID=999
   &sql(DECLARE EmpCursor CURSOR FOR 
        SELECT Name, Home_State
        INTO :name,:state FROM Sample.Person
        WHERE Home_State %STARTSWITH 'M' 
        GROUP BY Home_State)
   &sql(OPEN EmpCursor)
   FOR { &sql(FETCH EmpCursor)
        QUIT:SQLCODE  
        WRITE !,"RowID: ",%ROWID," row count: ",%ROWCOUNT
        WRITE " Name=",name," State=",state
  }
   &sql(CLOSE EmpCursor)
 
This change of query behavior only applies to cursor-based Embedded SQL SELECT queries. Dynamic SQL SELECT queries and non-cursor Embedded SQL SELECT queries never set %ROWID.
Transaction Committed Changes
A query containing a GROUP BY clause does not support READ COMMITTED isolation level. In a transaction defined as READ COMMITTED, a SELECT statement without a GROUP BY clause returns only data modifications that have been committed; in other words, it returns the state of the data before the current transaction. A SELECT statement with a GROUP BY clause returns all data modifications made, whether or not they have been committed.
See Also