Caché SQL Reference
VARIANCE, VAR_SAMP, VAR_POP
[Home] [Back] [Next]
InterSystems: The power behind what matters   
Class Reference   
Search:    

Aggregate functions that return the statistical variance of a data set.
Synopsis
VARIANCE([ALL | DISTINCT [BY(col-list)]] expression [%FOREACH(col-list)] [%AFTERHAVING])

VAR_SAMP([ALL | DISTINCT [BY(col-list)]] expression [%FOREACH(col-list)] [%AFTERHAVING])

VAR_POP([ALL | DISTINCT [BY(col-list)]] expression [%FOREACH(col-list)] [%AFTERHAVING])
Arguments
ALL Optional — Specifies that statistical variance functions return the variance of all values for expression. This is the default if no keyword is specified.
DISTINCT Optional — A DISTINCT clause that specifies that statistical variance functions return the variance of the distinct (unique) expression values. DISTINCT can specify a BY(col-list) subclause, where col-list can be a single field or a comma-separated list of fields.
expression Any valid expression. Usually the name of a column that contains the data values to be analyzed for variance.
%FOREACH(col-list) Optional — A column name or a comma-separated list of column names. See SELECT for further information on %FOREACH.
%AFTERHAVING Optional — Applies the condition found in the HAVING clause.
Description
These three variance aggregate functions return the statistical variance of the values of expression, after discarding NULL values. That is, the amount of variation from the mean value of the data set, expressed as a positive number. The larger the return value, the more variation there is within the data set of values. Caché SQL also supplies aggregate functions to return the standard deviation corresponding to each of these variance functions.
There are slight variations in how this statistical variation is derived:
These variance aggregate functions can be used in a SELECT query or subquery that references either a table or a view. They can appear in a SELECT list or HAVING clause alongside ordinary field values.
These variance aggregate functions cannot be used in a WHERE clause. They cannot be used in the ON clause of a JOIN, unless the SELECT is a subquery.
These variance aggregate functions return a value of data type NUMERIC with a precision of 36 and a scale of 17, unless expression is data type DOUBLE in which case the function returns data type DOUBLE.
These variance aggregate functions are normally applied to a field or expression that has a numeric value. They evaluate nonnumeric values, including the empty string (''), as zero (0).
These variance aggregate functions ignore NULL values in data fields. If no rows are returned by the query, or the data field value for all rows returned is NULL, they return NULL.
The statistical variance functions, like all aggregate functions, can take an optional DISTINCT clause. VARIANCE(DISTINCT col1) returns the variance of those col1 field values that are distinct (unique). VARIANCE(DISTINCT BY(col2) col1) returns the variance of the col1 field values in records where the col2 values are distinct (unique). Note however that the distinct col2 values may include a single NULL as a distinct value.
Changes Made During the Current Transaction
Like all aggregate functions, the variance functions always returns the current state of the data, including uncommitted changes, regardless of the current transaction’s isolation level. For further details, refer to SET TRANSACTION and START TRANSACTION.
Examples
The following example uses VARIANCE to return the variance in the ages of the employees in Sample.Employee, and the variance in the distinct ages represented by one or more employees:
SELECT VARIANCE(Age) AS AgeVar,VARIANCE(DISTINCT Age) AS PerAgeVar
     FROM Sample.Employee
The following example uses VAR_POP to return the population variance in the ages of the employees in Sample.Employee, and the variance in the distinct ages represented by one or more employees:
SELECT VAR_POP(Age) AS AgePopVar,VAR_POP(DISTINCT Age) AS PerAgePopVar
     FROM Sample.Employee
See Also