Skip to main content

CREATE AGGREGATE (SQL)

Creates a user-defined aggregate function.

Synopsis

CREATE [OR REPLACE] AGGREGATE name(parameter_list) [ RETURNS datatype ]
   [ INITIALIZE WITH function-name ]
     ITERATE WITH function-name
   [ MERGE WITH function-name ]
   [ FINALIZE WITH function-name ]

Description

The CREATE AGGREGATE command creates a user-defined aggregate function (UDAF). When invoked, this user-defined aggregate function iterates through the row values and invokes one or more user-defined functions to compute an aggregate value. You can use CREATE AGGREGATE to provide aggregate operations not provided by the standard InterSystems IRIS SQL aggregate functions.

If you invoke CREATE AGGREGATE to create a UDAF that already exists, SQL issues an SQLCODE -428 error, with a %msg such as: User Defined Aggregate Function SQLUser.MyUDAF already exists. If you specify the optional OR REPLACE keyword clause (CREATE OR REPLACE AGGREGATE), specifying the name of an existing UDAF does not generate an error. Instead, the existing UDAF is updated with the specified definition.

To delete a user-defined aggregate function, use the DROP AGGREGATE command.

Privileges

The CREATE AGGREGATE command is a privileged operation. Before using CREATE AGGREGATE you must have Execute privilege for the UDAF and all referenced user-defined functions. Failing to do so results in an SQLCODE -99 error (Privilege Violation).

Aggregate Function Name

The UDAF name must be a valid identifier. Aggregate function names are not case-sensitive.

The UDAF name can be qualified (schema.aggname), or unqualified (aggname). An unqualified name takes the default schema name.

The UDAF name cannot be the same as the name of an existing stored procedure. Attempted to create a UDAF name that duplicates a stored procedure name generates an SQLCODE -428 error, with a %msg such as User Defined Aggregate Function SQLUser.MyFunction conflicts with existing stored procedure name.

INITIALIZE WITH Clause

The optional INITIALIZE WITH clause invokes the specified user-defined function or class method to compose the initial state object. The state object value is used to pass interim aggregate values or other variables required to perform the end calculation. If this clause is not specified, a null object is passed as the initial state object to the function specified in the ITERATE WITH clause.

The specified user-defined function-name must exist when CREATE AGGREGATE is invoked; otherwise an SQLCODE -428 error is generated and %msg specifies the UDAF function, the clause, and the non-existent function name.

The following is a user-defined function that defines an initial state object:

CREATE FUNCTION MyAggregateInit() returns varchar language ObjectScript { RETURN "^" }

ITERATE WITH Clause

The ITERATE WITH clause invokes the specified user-defined function or class method once for each row being aggregated. It take a state object representing the interim result and the current row's column value(s) as input parameters and performs its operation on that state object, which accumulates the aggregate value. When all rows have been processed it returns the new state value.

The specified user-defined function-name must exist when CREATE AGGREGATE is invoked; otherwise an SQLCODE -428 error is generated and %msg specifies the UDAF function, the clause, and the non-existent function name.

MERGE WITH Clause

The optional MERGE WITH clause can be specified to enable parallel processing of the user-defined aggregate function. If not specified, the query invoking the UDAF uses single-thread processing. For further details, see Parallel Processing.

FINALIZE WITH Clause

The optional FINALIZE WITH clause invokes the specified user-defined function or class method once, at the end of processing, to perform any final calculations based on the state value returned from the last call to the ITERATE WITH clause function. If the invoking query specifies a GROUP BY clause, this user-defined function is invoked once for each GROUP BY grouped value.

The specified user-defined function-name must exist when CREATE AGGREGATE is invoked; otherwise an SQLCODE -428 error is generated and %msg specifies the UDAF function, the clause, and the non-existent function name.

Arguments

name

The name of the user-defined aggregate function to be created. The name must be a valid identifier. The name can be qualified (schema.aggname), or unqualified (aggname). An unqualified name takes the default schema name. Aggregate function names are not case-sensitive. The name must be followed by parentheses containing one or more parameters.

parameter_list

A list of parameters used to pass values to the aggregate function. The parameter list is enclosed in parentheses. You can specify a single parameter, or a list of parameters separated by commas. Each parameter in the list consists of a parameter name and a data type. For example: (param1 INTEGER,param2 NUMERIC).

RETURNS datatype

An optional argument that specifies the data type to return the aggregate function value. If omitted, the data type defaults to the data type of the first parameter in the parameter_list.

function-name

The name of an existing user-defined function created using the CREATE FUNCTION command, or a class method that returns a value and is projected as an SQL procedure. A user-defined function is stored as a method in a stored procedure class. For example, the user-defined function MyFunction takes the default schema name: SQLUser.MyFunction, which corresponds to the class User.funcMyFunction which contains the classmethod MyFunction().

Invoking a User-defined Aggregate Function

User-defined aggregate functions follow the same usage rules as standard aggregate functions.

A UDAF is invoked in a SELECT list, either as a listed select-item or in a subquery select-item. It can specify a column alias; if a column alias is not specified, it defaults to Aggregate_n. For example,

SELECT Home_State,AVG(Age) AS AvgAge,MAX(Age) AS MaxAge,SecondHighest(Age) AS SecondMaxAge 
FROM Sample.Person GROUP BY Home_State 

A UDAF cannot be used directly in an ORDER BY clause. Attempting to do so generates an SQLCODE -73 error. However, you can use a user-defined aggregate function in an ORDER BY clause by specifying the corresponding column alias or select-item sequence number.

A UDAF can be used directly in a HAVING clause. However, a HAVING clause must explicitly specify the user-defined aggregate function; it cannot specify a UDAF using the corresponding select-item column alias or select-item sequence number.

An aggregate function cannot be used directly in:

  • a WHERE clause. Attempting to do so generates an SQLCODE -19 error.

  • a GROUP BY clause. Attempting to do so generates an SQLCODE -19 error.

  • a TOP clause. Attempting to do so generates an SQLCODE -1 error.

  • a JOIN. Attempting to specify an aggregate in an ON clause generates an SQLCODE -19 error. Attempting to specify an aggregate in a USING clause generates an SQLCODE -1 error.

Unlike a standard aggregate function, a user-defined aggregate function cannot specify a DISTINCT, %FOREACH, or %AFTERHAVING clause.

Parallel Processing

If the optional MERGE WITH clause is specified, the MERGE WITH function merges the supplied state objects coming from the ITERATE WITH functions of two or more parallel subqueries, returning a single merged values that represents the aggregated state. The MERGE WITH function is automatically invoked as many times as the number of parallel processes. The result of these merges is supplied to the FINALIZE WITH clause.

When declaring a MERGE WITH function, it is assumed the state object supports implicit serialization. For example, by implementing the %SerialObject interface in ObjectScript.

If a MERGE WITH function is not provided, the user-defined aggregate function is not processed by parallel threads when %PARALLEL or sharding is specified. It is porcessed as a single thread.

Listing User-defined Aggregate Functions

The INFORMATION.SCHEMA.USERDEFINEDAGGREGATESOpens in a new tab persistent class displays information about all user-defined aggregate functions in the current namespace. It provides a number of properties including the names of the user-defined functions specified in its clauses.

The following example returns the schema name, user-defined aggregate name, ITERATE clause function name, and returned data type for all user-defined aggregate functions in the current namespace:

SELECT AGGREGATE_SCHEMA,AGGREGATE_NAME,ITERATE_FUNCTION,RETURN_TYPE 
FROM INFORMATION_SCHEMA.USER_DEFINED_AGGREGATES

If no RETURNS clause is specified, the RETURN_TYPE value is NULL.

Example

The following example creates a user-defined aggregate function that sums values by adding all high (>=5) values and subtracting 5 for all low (<5) values. All values are data type NUMERIC(4,1). The first step is to create the iterate function, specifying a state variable (tot) and an input variable (num):

CREATE FUNCTION Sample.AddSub(tot NUMERIC(4,1),IN num NUMERIC(4,1)) RETURNS NUMERIC(4,1) 
LANGUAGE OBJECTSCRIPT {IF num>=5 {SET tot=tot+num} ELSE {SET tot=tot-5} QUIT tot}

You can then define the user-defined aggregate function:

CREATE AGGREGATE Sample.SumAddSub(arg NUMERIC(4,1)) 
   ITERATE WITH Sample.AddSub

You can then invoke this user-defined aggregate function for the Score field as follows:

SELECT TestSubject,Score,SUM(Score) AS ScoreSum,Sample.SumAddSub(Score) AS ScoreAddHighSubtractLow

To avoid negative values, add a FINALIZE WITH function:

CREATE FUNCTION Sample.NoNeg(tot NUMERIC(4,1)) RETURNS NUMERIC(4,1) 
LANGUAGE OBJECTSCRIPT {IF num>0 {QUIT tot} ELSE {SET tot=0 QUIT tot}}
CREATE OR REPLACE AGGREGATE Sample.SumAddSub(arg NUMERIC(4,1))
   ITERATE WITH Sample.AddSub
   FINALIZE WITH Sample.NoNeg

See Also

FeedbackOpens in a new tab