SELECT
Synopsis
[(] SELECT [%NOFPLAN] [%NOLOCK] [DISTINCT [BY (item {,item2})] | ALL] [TOP {int | ALL}] select-item {,select-item, ...} [INTO host-variable-list] [FROM [optimize-option] table-ref [[AS] t-alias] {,table-ref [[AS] t-alias]} ] [WHERE condition-expression [{AND | OR condition-expression}] ] [GROUP BY scalar-expression] [HAVING condition-expression [{AND | OR condition-expression}] ] [ORDER BY item-order-list [ASC | DESC] ] [)] select-item ::= [t-alias.]* | [t-alias.]scalar-expression [[AS] c-alias]
Arguments
Argument | Description |
---|---|
%NOFPLAN | Optional — The %NOFPLAN keyword specifies that Caché will ignore the frozen plan (if any) for this query and generate a new query plan. The frozen plan is retained, but not used. For further details, refer to Frozen Plans in Caché SQL Optimization Guide. |
%NOLOCK | Optional — The %NOLOCK keyword specifies that Caché will perform no locking on any of the specified tables. If you specify this keyword, the query retrieves data in READ UNCOMMITTED mode, regardless of current transaction’s isolation mode. For further details, refer to Transaction Processing in the “Modifying the Database” chapter of Using Caché SQL. |
DISTINCT DISTINCT BY (item) ALL |
Optional — The DISTINCT clause specifies that each row returned must contain a unique value for the specified field or combination of fields. A DISTINCT keyword specifies that the select-item value(s) must be unique. A DISTINCT BY keyword clause specifies that item value(s) must be unique. An item (or a comma-separated list of items) is enclosed in parentheses. Commonly, an item is the name of a column. It may or may not also be listed as a select-item. Optional — The ALL keyword specifies that all rows that meet the SELECT criteria be returned. This is the default for Caché SQL. The ALL keyword performs no operation; it is provided for SQL compatibility. See DISTINCT clause for more details. |
TOP int TOP ALL |
Optional — The TOP clause limits the number of rows returned to the number specified in int. If no ORDER BY clause is specified in the query, which records are returned as the “top” rows is unpredictable. If an ORDER BY clause is specified, the top rows accord to the specified order. The DISTINCT keyword (if specified) is applied before TOP, specifying that int number of unique values are to be returned. The int argument can be either a positive integer or a Dynamic SQL ? input parameter that resolves to a positive integer. If no TOP keyword is specified, the default is to display all the rows that meet the SELECT criteria. TOP ALL is only meaningful in a subquery or in a CREATE VIEW statement. It is used to support the use of an ORDER BY clause in these situations, fulfilling the requirement that an ORDER BY clause must be paired with a TOP clause in a subquery or a query used in a CREATE VIEW. TOP ALL does not restrict the number of rows returned. See TOP clause for more details. |
select-item | One or more columns (or other values) to be retrieved. Multiple select-items are specified as a comma-separated list. You can also retrieve all columns by using the * symbol. |
INTO host-variable-list | Optional — (Embedded SQL only): One or more host variables into which select-item values are placed. Multiple host variables are specified as a comma-separated list or as a single host variable array. See INTO clause for more details. |
FROM table-ref |
Optional — A reference to one or more tables from which data is being retrieved. A valid table-ref is required for every FROM clause, even if the SELECT makes no reference to that table. A SELECT that makes no references to table data can omit the FROM clause. A table-ref can be specified as one or more tables, views, table-valued functions, or subqueries, specified as a comma-separated list or with JOIN syntax. Some restrictions apply on using views with JOIN syntax. A subquery must be enclosed in parentheses. A table-ref is either qualified (schema.tablename) or unqualified (tablename). An unqualified table-ref is supplied either the default schema name, or a schema name from the schema search path. Multiple tables can be specified as a comma-separated list or associated with ANSI join keywords. Any combination of tables or views can be specified. If you specify a comma between two table-refs here, Caché performs a CROSS JOIN on the tables and retrieves data from the results table of the JOIN operation. If you specify ANSI join keywords between two table-refs here, Caché performs the specified join operation. For further details, refer to the JOIN page of this manual. You can optionally assign an alias (t-alias) to each table-ref. The AS keyword is optional. You can optionally specify one or more optimize-option keywords to optimize query execution. The available options are: %ALLINDEX, %FIRSTTABLE, %FULL, %INORDER, %IGNOREINDEX, %NOFLATTEN, %NOMERGE, %NOREDUCE, %NOSVSO, %NOTOPOPT, %NOUNIONOROPT, %PARALLEL, and %STARTTABLE. See FROM clause for more details. |
WHERE condition-expression | Optional — A qualifier specifying one or more predicate conditions for what data is to be retrieved. See WHERE clause for more details. |
GROUP BY scalar-expression | Optional — A comma-separated list of one or more scalar expressions specifying how the retrieved data is to be organized; these may include column names. See GROUP BY clause for more details. |
HAVING condition-expression | Optional — A qualifier specifying one or more predicate conditions for what data is to be retrieved. See HAVING clause for more details. |
ORDER BY item-order-list | Optional — A select-item or a comma-separated list of items that specify the order in which rows are displayed. Each item can have an optional ASC (ascending order) or DESC (descending order). The default is ascending order. An ORDER BY clause is used on the results of a query. An ORDER BY clause in a subquery (for example, in a UNION statement) must be paired with a TOP clause. If no ORDER BY clause is specified, the order of the records returned is unpredictable. See ORDER BY clause for more details. |
scalar-expression | A field identifier, an expression containing a field identifier, or a general expression, such as a function call or an arithmetic operation. |
AS t-alias | Optional — An alias for a table or view name (table-ref). An alias must be a valid identifier; it can be a delimited identifier. For further details see the “Identifiers” chapter of Using Caché SQL. The AS keyword is optional. |
AS c-alias | Optional — An alias for a column name (select-item). An alias must be a valid identifier. For further details see the “Identifiers” chapter of Using Caché SQL. The AS keyword is optional. |
Description
The SELECT statement performs a query that retrieves data from a Caché database. In its simplest form, it retrieves data from one or more columns (fields) of a single table. The columns are specified by the select-item list, the table is specified by the FROM table-ref clause, and optionally a WHERE clause supplies one or more limiting conditions that select which rows return their column values.
In more complex queries, a SELECT can retrieve column, aggregate, and non-column data, can retrieve data from multiple tables using joins, and can retrieve data using views.
A SELECT can also be used to return a value from an SQL function, a host variable, or a literal. A SELECT query can combine returning these non-database values with retrieving values from tables or views. When a SELECT is only used to return such non-database values, the FROM clause is optional. See FROM clause for more details.
The values returned from a SELECT query are known as a result set. In Dynamic SQL, SELECT retrieves values into the %SQL.StatementOpens in a new tab class. Refer to the Dynamic SQL chapter of Using Caché SQL, and the %SQL.StatementOpens in a new tab class in the InterSystems Class Reference.
Caché sets a status variable SQLCODE, which indicates the success or failure of the SELECT. In addition, the SELECT operation sets the %ROWCOUNT local variable to the number of selected rows. Successful completion of a SELECT generally sets SQLCODE=0 and %ROWCOUNT to the number of rows selected. In the case of an embedded SQL containing a simple SELECT, data from (at most) one row is selected, so SQLCODE=0 and %ROWCOUNT is set to either 0 or 1. However, in the case of an embedded SQL SELECT that declares a cursor and fetches data from multiple rows, the operation completes when the cursor has been advanced to the end of the data (SQLCODE=100); at that point, %ROWCOUNT is set to the total number of rows selected. Refer to the FETCH command for further details.
Uses of SELECT
You can use a SELECT statement in the following contexts:
-
As an independent query prepared as a Dynamic SQL query, an Embedded SQL query, or a Class Query.
-
As a subquery, a SELECT statement that supplies values to a clause of an enclosing SELECT statement. A subquery in a SELECT statement can be specified in the select-item list, in a FROM clause, or in a WHERE clause with an EXISTS or IN predicate. A subquery can also be specified in an UPDATE or DELETE statement. A subquery must be enclosed in parentheses.
-
As a leg of a UNION. The UNION statement allows you to combine two or more SELECT statements into a single query.
-
As part of a CREATE VIEW defining the data available to the view.
-
As part of a DECLARE CURSOR used with Embedded SQL.
-
As part of an INSERT with a SELECT. An INSERT statement can use a SELECT to insert data values for multiple rows into a table, selecting the data from another table.
You can enclose the entire SELECT statement with one or more sets of parentheses, as follows:
-
Parentheses are optional for an independent SELECT query, a UNION leg SELECT query, a CREATE VIEW SELECT query, or a DECLARE CURSOR SELECT query. Enclosing a SELECT query in parentheses causes it to follow the syntax rules for a subquery; specifically, an ORDER BY clause must be paired with a TOP clause.
-
Parentheses are mandatory for a subquery. One set of parentheses is mandatory; you can specify additional optional sets of parentheses.
-
Parentheses are not permitted for an INSERT statement SELECT query.
Specifying optional parentheses generates a separate cached query for each set of parentheses added.
Privileges
To perform a SELECT query on one or more tables, you must either have column-level SELECT privileges for all of the specified select-item column(s), or table-level SELECT privileges for the specified table-ref table(s) or view(s). A select-item column specified using a table alias (such as t.Name or "MyAlias".Name) only requires column-level SELECT privilege, not table-level SELECT privilege.
When using SELECT *, note that column-level privileges cover all table columns named in the GRANT statement; table-level privileges cover all table columns, including those added after the privilege was assigned.
Failing to have the necessary privileges results in an SQLCODE -99 error (Privilege Violation). You can determine if the current user has SELECT privilege by invoking the %CHECKPRIV command. You can determine if a specified user has table-level SELECT privilege by invoking the $SYSTEM.SQL.CheckPriv()Opens in a new tab method. For privilege assignment, refer to the GRANT command.
Having table-level SELECT privilege for a table is not a sufficient test that the table actually exists. If the specified user has the %All role, CheckPriv() returns 1 even if the specified table or view does not exist.
A SELECT query that does not have a FROM clause does not require any SELECT privileges. A SELECT query that contains a FROM clause requires SELECT privilege, even if no column data is accessed by the query.
Required Clauses
The following are required clauses for all SELECT statements:
-
A select-item list, a comma-separated list of one or more items (the select-item arguments) to be retrieved from the table or otherwise generated. Most commonly, these items are the names of columns in a table. The select-item consists of either a scalar expression specifying one or more individual items, or an asterisk (*) referring to all the columns of a base table.
-
A FROM clause specifies one or more tables, views, or subqueries from which rows are to be retrieved. These tables may be associated by JOIN expressions. In Caché SQL a FROM clause with a valid table-ref is required for a SELECT that makes any reference to table data. A FROM clause is optional for a SELECT that does not access table data. The optional FROM clause is further described in the FROM clause reference page.
Optional Clauses
The following optional clauses operate on the virtual table that a FROM clause returns. All are optional, but, if used, must appear in the order specified:
-
A DISTINCT clause, which specifies that only distinct (non-duplicate) values should be returned.
-
A TOP clause, which specifies how many rows to return.
-
A WHERE clause, which specifies boolean predicate conditions that rows must match. The WHERE clause predicate condition both determines which rows are returned and limits the values supplied to aggregate functions to the values from those rows. These conditions are specified by one or more predicates linked by logical operators; the WHERE clause returns all records that satisfy these predicate conditions. A WHERE clause predicate cannot include aggregate functions.
-
A GROUP BY clause, which specifies a comma-delimited list of columns. These organize a query’ result set into subsets with matching values for one or more columns and determine the ordering of the rows returned. GROUP BY allows scalar expressions as well as columns.
-
A HAVING clause, which specifies boolean predicate conditions that rows must match. These conditions are specified by one or more predicates linked by logical operators. The HAVING clause predicate condition determines which rows are returned, but (by default) it does not limits the values supplied to aggregate functions to the values from those rows. This default can be overridden using the %AFTERHAVING keyword. A HAVING clause predicate can specify aggregate functions. These predicates typically operate on each group specified by a GROUP BY clause.
-
An ORDER BY clause, which specifies the order in which rows should be displayed. An ORDER BY clause in a subquery or a CREATE VIEW query must be paired with a TOP clause.
Specifying SELECT clauses in the incorrect order generates an SQLCODE –25 error.
The SELECT syntax order is not the same as the SELECT clauses semantic processing order. For further details, refer to SELECT Clause Order of Execution.
The DISTINCT Clause
The DISTINCT keyword clause causes redundant field values to be eliminated. It has two forms:
-
SELECT DISTINCT: Returns one row for each unique combination of select-item values. You can specify one or more than one select-items. For example, the following query returns a row with Home_State and Age values for each unique combination of Home_State and Age values:
SELECT DISTINCT Home_State,Age FROM Sample.Person
-
SELECT DISTINCT BY (item): Returns one row for each unique combination of item values. You can specify a single item or a comma-separated list of items. The select-item list may, but does not have to, include the specified item(s). For example, the following query returns a row with Name and Age values for each unique combination of Home_State and Age values:
SELECT DISTINCT BY (Home_State,Age) Name,Age FROM Sample.Person
An item can be any valid select-item value, except an asterisk. It cannot be a column name alias.
Either type of DISTINCT clause can specify more than one item to test for uniqueness. Listing more than one item retrieves all rows that are distinct for the combination of both items. DISTINCT does consider NULL a unique value. For further details, see the DISTINCT clause reference page.
The TOP Clause
The TOP keyword clause specifies that the SELECT statement return only a specified number of rows. It returns the specified number of rows that appear at the “top” of the returned virtual table. By default, which rows are the “top” rows of the table is unpredictable. However, Caché applies the DISTINCT and ORDER BY clauses (if specified) before selecting the TOP rows. For further details, see the TOP clause reference page.
The select-item
This is a mandatory element for all SELECT statements. Commonly, a select-item refers to a field in the table(s) specified in the FROM clause. A select-item consists of one or more of the following items, with multiple items separated by commas:
-
A column name (field name), with or without a table name alias:
SELECT Name,Age FROM Sample.Person
Field names are not case-sensitive. However, the label associated with the field in the result set uses the letter case of the SqlFieldName as specified in the table definition, not the letter case specified in the select-item. See “Field Column Alias” for further details on letter case resolution.
A field name containing one or more underscores references an embedded serial object property. For example, for the field name Home_City, the table contains a referencing field Home that references an embedded serial object that defines the property City. For the field name Home_Phone_AreaCode, the table contains a referencing field Home that references an embedded serial object property Phone that references a nested embedded serial object that defines the property AreaCode. If you select a referencing field such as Home or Home_Phone, you receive the values of all of properties in the serial object in %List data type format.
To list all of the column names defined for a specified table, refer to Column Names and Numbers in the “Defining Tables” chapter of Using Caché SQL.
To display the RowID (record ID), you can use the %ID pseudo-field variable alias, which displays the RowID regardless of what name it is assigned. By default, the name of the RowID is ID, but Caché may rename it if there is a user-defined field named ID. By default, RowID is a hidden field.
A SELECT on a stream field returns the oref (object reference) of the opened stream object:
SELECT Name,Picture FROM Sample.Employee WHERE Picture IS NOT NULL
When the FROM clause specifies more than one table or view, you must include the table name (or a table name alias) as part of the select-item, using periods, as shown in the following two examples:
Full table name:
SELECT Sample.Person.Name,Sample.Employee.Company FROM Sample.Person, Sample.Employee
Table name alias:
SELECT p.Name, e.Company FROM Sample.Person AS p, Sample.Employee AS e
However, you cannot use a full table name as part of a select-item if an alias has been assigned to that table name. Attempting to so results in an SQLCODE -23 error.
You can use a collation function to specify the sorting and display of a select-item field. You can supply the collation function without parentheses (SELECT %SQLUPPER Name) or with parentheses (SELECT %SQLUPPER(Name)). If the collation function specifies truncation, the parentheses are required (SELECT %SQLUPPER(Name,10)).
When the select-item references an embedded serial object property (embedded serial class data), use underline syntax. Underline syntax consists of the name of the object property, an underscore, and the property within the embedded object: for example, Home_City and Home_State. (In other contexts, index tables for example, these are represented using dot syntax: Home.City.)
SELECT Home_City,Home_State FROM Sample.Person
You can use SELECT to directly query a referencing field (such as Home), rather than using underline syntax. Because the data returned is in list format, you may want to use a $LISTTOSTRING or $LISTGET function to display the data. For example:
SELECT $LISTTOSTRING(Home,'^') AS HomeAddress FROM Sample.Person
-
A subquery. A subquery returns a single column from a specified table. This column can be the values of a single table field (SELECT Name), or the values of multiple table fields returned as a single column, either by using concatenation (SELECT Home_City||Home_State) or by specifying a container field (SELECT Home). A subquery can use arrow syntax. A subquery cannot use asterisk syntax, even when the table cited in the subquery has only one data field.
One common use of a subquery is to specify an aggregate function that is not subject to the GROUP BY clause. In the following example, the GROUP BY clause groups ages by decades (for example, 25 through 34). The AVG(Age) select-item gives the average age of each group, as defined by the GROUP BY clause. In order to get the average age of all of the records in all groups, it uses a subquery:
SELECT Age AS Decade, COUNT(Age) AS PeopleInDecade, AVG(Age) AS AvgAgeForDecade, (SELECT AVG(Age) FROM Sample.Person) AS AvgAgeAllDecades FROM Sample.Person GROUP BY ROUND(Age,-1) ORDER BY Age
-
Arrow syntax, used to access a field from a table other than the FROM clause table. This is known as an implicit join. In the following example, the Sample.Employee table contains a Company field containing the RowID for the corresponding company name in the Sample.Company table. The arrow syntax retrieves the company name from that table:
SELECT Name,Company->Name AS CompanyName FROM Sample.Employee
In this case, you must have SELECT privileges for the referenced table: either table-level SELECT privilege, or column-level SELECT privilege for both the referenced field and the RowID column of the referenced table. For further details on arrow syntax, refer to Implicit Joins (Arrow Syntax) in Using Caché SQL.
-
Asterisk syntax (*), which selects all the columns in a table in column number order:
SELECT TOP 5 * FROM Sample.Person
Asterisk syntax selects embedded serial object properties (fields), including properties from a serial object nested within a serial object. A field referencing a serial object is not selected. For example, the Home_City property from an embedded serial object is selected, but the Home referencing field used to access the Sample.Address embedded serial class (which contains the City property) is not selected.
Asterisk syntax does not select hidden fields. By default, the RowID is hidden (not displayed by SELECT *). However, if the table was defined with %PUBLICROWID, SELECT * returns the RowID field and all non-hidden fields. By default, the name of this field is ID, but Caché may rename it if there is a user-defined field named ID.
Note:Most of the example tables supplied in the Samples namespace are defined with %PUBLICROWID.
If the select-item is an asterisk and more than one table is specified, it selects all the columns in all of the joined tables:
SELECT TOP 5 * FROM Sample.Company,Sample.Employee
Asterisk syntax can be qualified or unqualified. If the select-item is qualified by prefixing a table name (or table name alias) and period (.) before the asterisk, the select-item selects all the columns in the specified table. Qualified asterisk syntax can be combined with other select items for other tables.
In the following example, select-item consists of an unqualified asterisk syntax that selects all columns from the table. Note that you can also specify duplicate column names (in this case Name) and non-column select-item elements (in this case {fn NOW}):
SELECT TOP 5 {fn NOW} AS QueryDate, Name AS Client, * FROM Sample.Person
In the following example, select-item consists of qualified asterisk syntax that selects all columns from one table, and a list of column names from another table.
SELECT TOP 5 E.Name AS EmpName, C.*, E.Home_State AS EmpState FROM Sample.Employee AS E, Sample.Company AS C
Note:SELECT * is a fully supported part of Caché SQL that can be extremely convenient during application development and debugging. However, in production applications the preferred programming practice is to explicitly list the selected fields, rather than using the asterisk syntax form. Explicitly listing fields makes your application clearer and easier to understand, easier to maintain, and easier to search for fields by name.
-
A select-item containing one or more SQL aggregate functions. An aggregate function always returns a single value. The argument of an aggregate function may be any of the following:
-
A single column name—computes the aggregate for all non-null values of the rows selected by the query:
SELECT AVG(Age) FROM Sample.Person
-
A scalar expression is also permitted to compute an aggregate:
SELECT SUM(Age) / COUNT(*) FROM Sample.Person
-
Asterisk syntax (*) — used with the COUNT function to compute the number of rows in the table:
SELECT COUNT(*) FROM Sample.Person
-
A select distinct function — computes the aggregate by eliminating redundant values:
SELECT COUNT(DISTINCT Home_State) FROM Sample.Person
-
While ANSI SQL does not allow the combination of column names and aggregate functions in a single SELECT statement, Caché SQL extends the standard by allowing this:
SELECT Name, COUNT(DISTINCT Home_State) FROM Sample.Person
-
An aggregate function using %FOREACH. This causes the aggregate to be computed for each distinct value of a column or columns:
SELECT DISTINCT Home_State, AVG(Age %FOREACH(Home_State)) FROM Sample.Person
-
An aggregate function using %AFTERHAVING. This causes the aggregate to be computed on a sub-population specified with the HAVING clause:
SELECT Name,AVG(Age %AFTERHAVING) FROM Sample.Person HAVING (Age > AVG(Age))
would return those records where Age is greater than average age, giving the average age for those persons whose age is above the average for all persons in the database.
-
-
A user-defined class method stored as a procedure. May be an unqualified method name or a qualified method name. The following are all valid class method names: RandLetter() an unqualified name for which a schema is supplied; Sample.RandLetter() a qualified class method name; and Sample.Rand_Letter() invoking class method "Rand_Letter"(). In the following example, RandCaseLetter() is a class method that returns a random letter, in either uppercase (‘U’) or lowercase ('L'):
SELECT RandCaseLetter('U')
The return value from the method is automatically converted from Logical format to Display/ODBC format. An input value to the method is, by default, not converted from Display/ODBC format to Logical format. However, input display-to-logical conversion can be configured system-wide using the $SYSTEM.SQL.SetSQLFunctionArgConversion()Opens in a new tab method. You can use $SYSTEM.SQL.GetSQLFunctionArgConversion()Opens in a new tab to determine the current configuration of this option.
If the specified method does not exist in the current namespace, the system generates a SQLCODE -359 error. If the specified method is ambiguous (could refer to more than one method), the system generates a SQLCODE -358 error. For further details on creating a class method, refer to CREATE METHOD.
-
A user-supplied ObjectScript function call (extrinsic function) operating on a database column:
SELECT $$REFORMAT(Name)FROM MyTable
You can only invoke user-supplied (extrinsic) functions within an SQL statement if the Allow extrinsic functions in SQL statements option has been configured system-wide. The default is “No”; by default, attempting to invoke user-supplied functions generates an SQLCODE -372 error.
You cannot use a user-supplied function to call a % routine (a routine with a name that begins with the % character). Attempting to do so generates an SQLCODE -373 error.
-
A select-item that applies additional processing to a field value:
Arithmetic operations:
SELECT Name, Age,Age-AVG(Age) FROM Sample.Person
If a select-item arithmetic operation includes division, and there are any values for that field in the database that could produce a divisor with a value of zero or a NULL value, you cannot rely on order of testing to avoid division by zero. Instead, use a case statement to suppress the risk.
SQL functions:
SELECT Name,$LENGTH(Name) FROM Sample.Person
SQL case conversion functions:
SELECT Name,UCASE(Name) FROM Sample.Person
An XMLELEMENT, XMLFOREST, or XMLCONCAT function, which place XML (or HTML) tags around the data values retrieved from specified column names. Refer to XMLELEMENT for further details.
-
A select-item that returns the same value for all records.
When all of the select-items references no table data, the FROM clause is optional. If you include the FROM clause, the specified table must exist. For further details on optional FROM clause, refer to the FROM clause reference page.
-
Arithmetic operations:
SELECT 7 * 7, 7 * 8 FROM Sample.Person
SELECT Name, Age, 9 - 6 FROM Sample.Person
-
A string literal or a function operating on a string literal:
SELECT UCASE('fred') FROM Sample.Person
String literals can be used to produce a more readable output, as shown in the following example:
SELECT TOP 10 Name,'was born on',%EXTERNAL(DOB) FROM Sample.Person
The way a numeric literal is specified determines its data type. Therefore, the string '123' is reported as data type VARCHAR, and the numeric 123 is reported as data type INTEGER or NUMERIC.
-
A %TABLENAME, or %CLASSNAME pseudo-field variable keyword. %TABLENAME returns the current table name. %CLASSNAME returns the name of the class corresponding to the current table. If the query references multiple tables, you can prefix the keyword with a table alias. For example, t1.%TABLENAME.
-
One of the following ObjectScript special variables (or their abbreviations): $HOROLOG, $JOB, $NAMESPACE, $TLEVEL, $USERNAME, $ZHOROLOG, $ZJOB, $ZNSPACE, $ZPI, $ZTIMESTAMP, $ZTIMEZONE, $ZVERSION.
-
The Column Alias
When specifying a select-item, you can use the AS keyword to specify an alias for the name of a column:
SELECT Name AS PersonName, DOB AS BirthDate, ...
The column alias is displayed as the column header in the result set. Specifying a column alias is optional; a default is always provided. A column alias is displayed with the specified letter case; it is not, however, case-sensitive when referenced in an ORDER BY clause. The c-alias name must be a valid identifier. A c-alias name can be a delimited identifier. For further details see the “Identifiers” chapter of Using Caché SQL.
The AS keyword is not required, but makes the query text easier to read. Thus the following is also valid syntax:
SELECT Name PersonName, DOB BirthDate, ...
SQL does not perform uniqueness checking for column aliases. It is possible (though not desirable) for a field column and a column alias to have the same name, or for two column aliases to be identical. Such non-unique column aliases may cause an SQLCODE -24 “Ambiguous sort column” error when referenced by an ORDER BY clause. Column aliases, like all SQL identifiers, are not case-sensitive.
Use of column aliases in other SELECT clauses is governed by query semantic processing order. You can reference a column by its column alias in an ORDER BY clause. You cannot reference a column alias in another select-item in the select list, in a DISTINCT BY clause, a WHERE clause, a GROUP BY clause, or a HAVING clause. You cannot reference a column alias in a JOIN operation’s ON clause or USING clause. You can, however, use a subquery to make a column alias available for use by other these other SELECT clauses, as shown in the “Querying the Database” chapter of Using Caché SQL.
Field Column Aliases
A select-item field name is not case-sensitive. However, unless you supply a column alias, the name of a field column in the result set follows the letter case of the SqlFieldName associated with the column property. The letter case of the SqlFieldName corresponds to the field name as specified in the table definition, not as specified in the select-item list. Therefore, SELECT name FROM Sample.Person returns the field column label as Name. Using a field column alias allows you to specify the letter case to display, as shown in the following example:
SELECT name,name AS NAME
FROM Sample.Person
Letter case resolution takes time. To maximize SELECT performance, you can specify the exact letter case of the field name, as specified in the table definition. However, determining the exact letter case of a field in the table definition is often inconvenient and prone to error. Instead, you can use a field column alias to avoid letter case issues. Note that all references to the field column alias must match in letter case.
The following Dynamic SQL example requires letter case resolution (the SqlFieldNames are “Latitude” and “Longitude”):
ZNSPACE "SAMPLES"
SET myquery = "SELECT latitude,longitude FROM Sample.USZipCode"
SET tStatement = ##class(%SQL.Statement).%New()
SET qStatus = tStatement.%Prepare(myquery)
IF qStatus'=1 {WRITE "%Prepare failed:" DO $System.Status.DisplayError(qStatus) QUIT}
SET rset = tStatement.%Execute()
WHILE rset.%Next() {WRITE rset.latitude," ",rset.longitude,! }
The following Dynamic SQL example does not requires letter case resolution, and therefore executes faster:
ZNSPACE "SAMPLES"
SET myquery = "SELECT latitude AS northsouth,longitude AS eastwest FROM Sample.USZipCode"
SET tStatement = ##class(%SQL.Statement).%New()
SET qStatus = tStatement.%Prepare(myquery)
IF qStatus'=1 {WRITE "%Prepare failed:" DO $System.Status.DisplayError(qStatus) QUIT}
SET rset = tStatement.%Execute()
WHILE rset.%Next() {WRITE rset.northsouth," ",rset.eastwest,! }
The t-alias table alias prefix is not included in the column name. Therefore, in the following example, both columns are labeled as Name:
SELECT p.Name,e.Name
FROM Sample.Person AS p LEFT JOIN Sample.Employee AS e ON p.Name=e.Name
To distinguish the columns in a query that specifies multiple tables, you should specify column aliases:
SELECT p.Name AS PersonName,e.Name AS EmployeeName
FROM Sample.Person AS p LEFT JOIN Sample.Employee AS e ON p.Name=e.Name
You may also wish to provide a column alias to make the data easier to understand. In the following example, the table column “Home_State” is renamed “US_State_Abbrev”:
SELECT Name,Home_State AS US_State_Abbrev
FROM Sample.Person
Note that %ID references a specific column, and therefore returns that field name (ID, by default), or a specified column alias, as shown in the following example:
SELECT %ID,%ID AS Ident,Name
FROM Sample.Person
Non-Field Column Aliases
Non-field columns are automatically assigned a column name. If you provide no alias for such fields, Caché SQL supplies a unique column name, such as “Expression_1”, or “Aggregate_3”. The integer suffix refers to the select-item position as specified in the SELECT statement. They are not a count of fields of that type.
The following are automatically assigned column names (n is an integer). These are listed in increasingly inclusive order. For example, adding a plus or minus sign to a number promotes it from a HostVar to an Expression; concatenating a HostVar and a Literal promotes it to an Expression; specifying a Literal, HostVar, Aggregate, or Expression in a subquery promotes it to a SubQuery:
-
Literal_n: a pseudo-field variable such as %TABLENAME, or the NULL specifier. Note that %ID is not Literal_n; it is given the column name of the actual RowID field.
-
HostVar_n: a host variable. This may be a literal, such as ‘text’, 123, or the empty string (''), an input variable (:myvar), or a ? input parameter replaced by a literal. Note that any expression evaluation on a literal, such appending a sign to a number, string concatenation, or an arithmetic operation, makes it an Expression_n. A literal value supplied to a ? parameter is returned unchanged without expression evaluation. For example, supplying 5+7 returns the string '5+7' as HostVar_n.
-
Aggregate_n: an aggregate function, such as AVG(Age) or COUNT(*). A column is named Aggregate_n if the outermost operation is an aggregate function, even when this aggregate contains an expression. For example, COUNT(Name)+COUNT(Spouse) is Expression_n, but MAX(COUNT(Name)+COUNT(Spouse)) is Aggregate_n, -AVG(Age) is Expression_n, but AVG(-Age) is Aggregate_n.
-
Expression_n: any operation in the select-item list on a literal, a field, or on an Aggregate_n, HostVar_n, Literal_n, or Subquery_n select-item changes its column name to Expression_n. This includes unary operations on numbers (-Age), arithmetic operations (Age+5), concatenation ('USA:'||Home_State), data type CAST operations, SQL collation functions (%SQLUPPER(Name) or %SQLUPPER Name), SQL scalar functions ($LENGTH(Name)), user-defined class methods, CASE expressions, and special variables (such as CURRENT_DATE or $ZPI).
-
Subquery_n: the result of a subquery that specifies a single select-item. The select-item may be a field, aggregate function, expression, or literal. You specify the column alias after the subquery, not within the subquery.
In the following example, the aggregate field column created by the AVG function is given the column alias “AvgAge”; its default name is “Aggregate_3” (an aggregate field in position 3 in the SELECT list).
SELECT Name, Age, AVG(Age) AS AvgAge FROM Sample.Person
The following example is identical to the previous, except that the AS keyword is here omitted. The use of this keyword is recommended, but not required.
SELECT Name, Age, AVG(Age) AvgAge FROM Sample.Person
The following example show how to specify a column alias for a select-item subquery:
SELECT Name AS PersonName,
(SELECT Name FROM Sample.Employee) AS EmpName,
Age AS YearsOld
FROM Sample.Person
FROM Clause
The FROM table-ref clause specifies one or more tables, views, table-valued functions, or subqueries. You can specify any combination of these table-ref types as a comma-separated list or with JOIN syntax. If you specify a single table-ref, the specified data is retrieved from that table or view. If you specify multiple table-refs, SQL performs a join operation on the tables, merging their data into a results table from which the specified data is retrieved.
If you specify more than one table-ref, you can separate these table names with commas or with explicit join syntax keywords. For further details on specifying a comma-separated list of table names, see the FROM clause reference page. For further details on specifying multiple table names with explicit JOIN syntax (such as RIGHT JOIN or *=) see the JOIN reference page.
You can use the $SYSTEM.SQL.TableExists()Opens in a new tab or $SYSTEM.SQL.ViewExists()Opens in a new tab method to determine whether a table or view exists in the current namespace. You can use the $SYSTEM.SQL.CheckPriv()Opens in a new tab method to determine if you have SELECT privileges for that table or view.
The Table Alias
When specifying a table-ref, you can use the AS keyword to specify an alias for that table name or view name:
FROM Sample.Person AS P
The AS keyword is not required, but makes the query text easier to read. The following is valid equivalent syntax:
FROM Sample.Person P
The t-alias name must be a valid identifier. A t-alias name can be a delimited identifier. A t-alias must be unique among table aliases within the query. A t-alias, like all identifiers, is not case-sensitive. Therefore, you cannot specify two t-alias names that differ only in letter case. This results in an SQLCODE -20 “Name conflict” error. For further details see the “Identifiers” chapter of Using Caché SQL.
The table alias is used as a prefix (with a period) to a field name to indicate the table to which the field belongs. For example:
SELECT P.Name, E.Name
FROM Sample.Person AS P, Sample.Employee AS E
You must use a table reference prefix when a query specifies multiple tables that have the same field name. A table reference prefix can be a t-alias (as shown above) or it can be the fully qualified table name, as shown in the following equivalent example:
SELECT Sample.Person.Name, Sample.Employee.Name
FROM Sample.Person, Sample.Employee
However, you cannot use a full table name as part of a select-item if a t-alias has been assigned to that table name. Attempting to so results in an SQLCODE -23 error.
Specifying a table alias is optional when a query references only one table (or view). Specifying a table alias is optional (but recommended) when a query references multiple tables (and/or views) and the field names referenced are unique to each table. Specifying a table alias is required when a query references multiple tables (and/or views) and the field names referenced are the same in different tables. Failing to specify a t-alias (or fully qualified table name) prefix results in an SQLCODE -27 “Field %1 is ambiguous among the applicable tables” error.
A t-alias can be used, but is not required, when specifying a subquery such as the following:
SELECT Name,(SELECT Name FROM Sample.Vendor)
FROM Sample.Person
A t-alias only uniquely identifies a field for query execution; to uniquely identify a field for query result set display you must also use a column alias (c-alias). The following example uses both table aliases (Per and Emp) and column aliases (PName and Ename):
SELECT Per.Name AS PName, Emp.Name AS EName
FROM Sample.Person AS Per, Sample.Employee AS Emp
WHERE Per.Name %STARTSWITH 'G'
You can use the same name for a field, a column alias, and/or a table alias without a naming conflict.
A t-alias prefix is used wherever it is necessary to distinguish which table is being referred to. Some examples of this are shown in the following:
SELECT P.%ID As PersonID,
AVG(P.Age) AS AvgAge,
Z.%TABLENAME||'=' AS Tablename,
Z.*
FROM Sample.Person AS P, Sample.USZipCode AS Z
WHERE P.Home_City = Z.City
GROUP BY P.Home_City
ORDER BY Z.City
WHERE Clause
The WHERE clause qualifies or disqualifies specific rows from the query selection. The rows that qualify are those for which the condition-expression is true. The condition-expression is a list of logical tests (predicates) which can be linked by the AND and OR logical operators. These predicates may be inverted using the NOT unary logical operator.
The SQL predicates fall into the following categories:
For further details on these logical predicates, see the WHERE clause reference page. The condition-expression cannot contain aggregate functions. If you wish to specify a selection condition using a value returned by an aggregate function, use a HAVING clause.
A WHERE clause can specify an explicit join between two tables using the = (inner join), =* (left outer join), and *= (right outer join) symbolic join operators. For further details, refer to the JOIN page of this manual.
A WHERE clause can specify an implicit join between the base table and a field from another table using the arrow syntax (–>) operator. For further details, refer to Implicit Joins in Using Caché SQL.
GROUP BY Clause
The GROUP BY clause takes the resulting rows of a query and breaks them up into individual groups according to one or more database columns. When you use SELECT in conjunction with GROUP BY, one row is retrieved for each distinct value of the GROUP BY fields. The GROUP BY clause is conceptually similar to the Caché extension %FOREACH, but GROUP BY operates on an entire query, while %FOREACH allows selection of aggregates on sub-populations without restricting the entire query population. For instance:
SELECT Home_State, COUNT(Home_State) AS Population
FROM Sample.Person
GROUP BY Home_State
This query returns one row for each distinct Home_State.
For further details, see the GROUP BY clause reference page.
HAVING Clause
The HAVING clause is like a WHERE clause that operates on groups. It is typically used in combination with the GROUP BY clause, or with the %AFTERHAVING keyword. The HAVING clause qualifies or disqualifies specific rows from the query selection. The rows that qualify are those for which the condition-expression is true. The condition-expression is a list of logical tests (predicates) which can be linked by the AND and OR logical operators. The condition-expression can contain aggregate functions. For further details, see the HAVING clause reference page.
ORDER BY Clause
An ORDER BY clause consists of the ORDER BY keywords followed by a select-item or a comma-separated list of items that specify the order in which rows are displayed. Each item can have an optional ASC (ascending order) or DESC (descending order). The default is ascending order. An ORDER BY clause is applied to the results of a query, and is frequently paired with a TOP clause. For further details, see the ORDER BY clause reference page.
The following example returns the selected fields for all rows in the database, and orders these rows in ascending order by age:
SELECT Home_State, Name, Age
FROM Sample.Person
ORDER BY Age
SELECT and Transaction Processing
A transaction performing a query is defined as either READ COMMITTED or READ UNCOMMITTED. The default is READ UNCOMMITTED. A query that is not in a transaction is defined as READ UNCOMMITTED.
-
If READ UNCOMMITTED, a SELECT returns the current state of the data, including changes made to the data by transactions in progress which have not been committed. These changes may be subsequently rolled back.
-
If READ COMMITTED, the behavior depends on the contents of the SELECT statement. Normally, a SELECT statement in read committed mode would only return insert and update changes to data that has been committed. Data rows that have been deleted by a transaction in progress are not returned, even though these deletes have not been committed and may be rolled back.
However, if the SELECT statement contains a %NOLOCK keyword, a DISTINCT clause, or a GROUP BY clause, the SELECT returns the current state of the data, including changes made to data during the current transaction which have not been committed. An aggregate function in a SELECT also returns the current state of the data for the specified column(s), including uncommitted changes.
For further details, refer to SET TRANSACTION and START TRANSACTION.
Query Metadata
You can use Dynamic SQL to return metadata about the query, such as the number of columns specified in the query, the name (or alias) of a column specified in the query, and the data type of a column specified in the query. For further details, refer to the Dynamic SQL chapter of Using Caché SQL, and the %SQL.StatementOpens in a new tab class in the InterSystems Class Reference.
Examples
The following four examples perform similar queries, using different combinations of SELECT clauses. Note that these clauses must be specified in the correct order. In all four examples, three fields are selected from the Sample.Person table: Name, Home_State, and Age, and two fields (AvgAge and AvgMiddleAge) are computed.
HAVING/ORDER BY
In the following example, the AvgAge computed field is computed on all records in Sample.Person. The HAVING clause governs the AvgMiddleAge computed field, calculating the average age of those over 40 from all records in Sample.Person. Thus, every row has the same value for AvgAge and AvgMiddleAge. The ORDER BY clause sequences the display of the rows alphabetically by the Home_State field value.
SELECT Name,Home_State,Age,AVG(Age) AS AvgAge,
AVG(Age %AFTERHAVING) AS AvgMiddleAge
FROM Sample.Person
HAVING Age > 40
ORDER BY Home_State
WHERE/HAVING/ORDER BY
In the following example, the WHERE clause limits the selection to the seven specified northeastern states. The AvgAge computed field is computed on the records from those Home_States. The HAVING clause governs the AvgMiddleAge computed field, calculating the average age of those over 40 from the records from the specified Home_States. Thus, every row has the same value for AvgAge and AvgMiddleAge. The ORDER BY clause sequences the display of the rows alphabetically by the Home_State field value.
SELECT Name,Home_State,Age,AVG(Age) AS AvgAge,
AVG(Age %AFTERHAVING) AS AvgMiddleAge
FROM Sample.Person
WHERE Home_State IN ('ME','NH','VT','MA','RI','CT','NY')
HAVING Age > 40
ORDER BY Home_State
GROUP BY/HAVING/ORDER BY
The GROUP BY clause causes the AvgAge computed field to be separately computed for each Home_State group. The GROUP BY clause also limits the output display to the first record encountered from each Home_State. The HAVING clause governs the AvgMiddleAge computed field, calculating the average age of those over 40 in each Home_State group. The ORDER BY clause sequences the display of the rows alphabetically by the Home_State field value.
SELECT Name,Home_State,Age,AVG(Age) AS AvgAge,
AVG(Age %AFTERHAVING) AS AvgMiddleAge
FROM Sample.Person
GROUP BY Home_State
HAVING Age > 40
ORDER BY Home_State
WHERE/GROUP BY/HAVING/ORDER BY
The WHERE clause limits the selection to the seven specified northeastern states. The GROUP BY clause causes the AvgAge computed field to be separately computed for each of these seven Home_State groups. The GROUP BY clause also limits the output display to the first record encountered from each specified Home_State. The HAVING clause governs the AvgMiddleAge computed field, calculating the average age of those over 40 in each of the seven Home_State groups. The ORDER BY clause sequences the display of the rows alphabetically by the Home_State field value.
SELECT Name,Home_State,Age,AVG(Age) AS AvgAge,
AVG(Age %AFTERHAVING) AS AvgMiddleAge
FROM Sample.Person
WHERE Home_State IN ('ME','NH','VT','MA','RI','CT','NY')
GROUP BY Home_State
HAVING Age > 40
ORDER BY Home_State
Embedded SQL and Dynamic SQL Examples
Embedded SQL and Dynamic SQL can be used to issue a SELECT query from within a program in another language. Embedded SQL can be included in ObjectScript code. Dynamic SQL can be included in either ObjectScript code or Caché Basic code.
The following embedded SQL program retrieves data values from one record and places them in the output host variables specified in the INTO clause.
NEW SQLCODE,%ROWCOUNT
&sql(SELECT Home_State,Name,Age
INTO :a, :b, :c
FROM Sample.Person)
IF SQLCODE=0 {
WRITE !," Name=",b
WRITE !," Age=",c
WRITE !," Home State=",a
WRITE !,"Row count is: ",%ROWCOUNT }
ELSE {
WRITE !,"SELECT failed, SQLCODE=",SQLCODE }
This program retrieves (at most) one row, so the %ROWCOUNT variable is set to either 0 or 1. To retrieve multiple rows, you must declare a cursor and use the FETCH command. For further details, refer to the Embedded SQL chapter in Using Caché SQL.
The following Dynamic SQL example first tests whether the desired table exists and checks the current user’s SELECT privilege for that table. It then executes the query and returns a result set. It uses the WHILE loop to repeatedly invoke the %Next method for the first 10 records of the result set. It displays three field values using %GetData methods that specify the field position as specified in the SELECT statement:
ZNSPACE "Samples"
SET tname="Sample.Person"
IF $SYSTEM.SQL.TableExists(tname)
& $SYSTEM.SQL.CheckPriv($USERNAME,"1,"_tname,"s")
{GOTO SpecifyQuery}
ELSE {WRITE "Table unavailable" QUIT}
SpecifyQuery
SET myquery = 3
SET myquery(1) = "SELECT Home_State,Name,SSN,Age"
SET myquery(2) = "FROM "_tname
SET myquery(3) = "ORDER BY Name"
SET tStatement = ##class(%SQL.Statement).%New()
SET qStatus = tStatement.%Prepare(.myquery)
IF qStatus'=1 {WRITE "%Prepare failed:" DO $System.Status.DisplayError(qStatus) QUIT}
SET rset = tStatement.%Execute()
IF rset.%SQLCODE=0 {
SET x=0
WHILE x < 10 {
SET x=x+1
SET status=rset.%Next()
WRITE rset.%GetData(2)," " /* Name field */
WRITE rset.%GetData(1)," " /* Home_State field */
WRITE rset.%GetData(4),! /* Age field */
}
WRITE !,"End of Data"
WRITE !,"SQLCODE=",rset.%SQLCODE," Row Count=",rset.%ROWCOUNT
}
ELSE {
WRITE !,"SELECT failed, SQLCODE=",rset.%SQLCODE }
For further details, refer to the Dynamic SQL chapter in Using Caché SQL.
See Also
-
SELECT clauses: DISTINCT, FROM, GROUP BY, HAVING, INTO, ORDER BY, TOP, WHERE
-
“Querying the Database” chapter in Using Caché SQL
-
SQL configuration settings described in Caché Advanced Configuration Settings Reference.
-
SQLCODE error messages listed in the Caché Error Reference