Using a SELECT Statement
Using a SELECT Statement
A SELECT statement selects one or more rows of data from one or more tables or views. A simple SELECT is shown in the following example:
SELECT Name,DOB FROM Sample.Person WHERE Name %STARTSWITH 'A' ORDER BY DOB
In this example, Name and DOB are columns (data fields) in the Sample.Person table.
The order that clauses must be specified in a SELECT statement is: SELECT DISTINCT TOP ... selectItems INTO ... FROM ... WHERE ... GROUP BY ... HAVING ... ORDER BY. This is the command syntax order. All of these clauses are optional, except SELECT selectItems. (The optional FROM clause is required to perform any operations on stored data, and therefore is almost always required in a query.) Refer to the SELECT statement syntax for details on the required order for specifying SELECT clauses.
SELECT Clause Order of Execution
The operation of a SELECT statement can be understood by noting its semantic processing order (which is not the same as the SELECT syntax order). The clauses of a SELECT are processed in the following order:
WHERE clause — restricts what data is selected using various criteria.
GROUP BY clause — organizes the selected data into subsets with matching values; only one record is returned for each value.
HAVING clause — restricts what data is selected from groups using various criteria.
selectItem — selects a data fields from the specified table or view. A selectItem can also be an expression which may or may not reference a specific data field.
DISTINCT clause — applied to the SELECT result set, it limits the rows returned to those that contain a distinct (non-duplicate) value.
ORDER BY clause — applied to the SELECT result set, it sorts the rows returned in collation order by the specified field(s).
This semantic order shows that a table alias (which is defined in the FROM clause) can be recognized by all clauses, but a column alias (which is defined in the SELECT selectItems) can only be recognized by the ORDER BY clause.
To use a column alias in other SELECT clauses you can use a subquery, as shown in the following example:
SELECT Interns FROM (SELECT Name AS Interns FROM Sample.Employee WHERE Age<21) WHERE Interns %STARTSWITH 'A'
In this example, Name and Age are columns (data fields) in the Sample.Person table, and Interns is a column alias for Name.
When you issue a SELECT, InterSystems SQL attempts to match each specified selectItem field name to a property defined in the class corresponding to the specified table. Each class property has both a property name and a SqlFieldName. If you defined the table using SQL, the field name specified in the CREATE TABLE command is the SqlFieldName, and InterSystems IRIS generated the property name from the SqlFieldName.
Field names, class property names, and SqlFieldName names have different naming conventions:
Field names in a SELECT statement are not case-sensitive. SqlFieldName names and property names are case-sensitive.
Field names in a SELECT statement and SqlFieldName names can contain certain non-alphanumeric characters following identifier naming conventions. Property names can only contain alphanumeric characters. When generating a property name, InterSystems IRIS strips out non-alphanumeric characters. InterSystems IRIS may have to append a character to create a unique property name.
The translation between these three names for a field determine several aspects of query behavior. You can specify a selectItem field name using any combination of letter case and InterSystems SQL will identify the appropriate corresponding property. The data column header name in the result set display is the SqlFieldName, not the field name specified in the selectItem. This is why the letter case of the data column header may differ from the selectItem field name.
You can specify a column alias for a selectItem field. A column alias can be in any mix of letter case, and can contain non-alphanumeric characters, following identifier naming conventions. A column alias can be referenced using any combination of letter case (for example, in the ORDER BY clause) and InterSystems SQL resolves to the letter case specified in the selectItem field. InterSystems IRIS always attempts to match to the list of column aliases before attempting to match to the list of properties corresponding to defined fields. If you have defined a column alias, the data column header name in the result set display is the column alias in the specified letter case, not the SqlFieldName.
When a SELECT query completes successfully, InterSystems SQL generates a result set class for that query. The result set class contains a property corresponding to each selected field. If a SELECT query contains duplicate field names, the system generates unique property names for each instance of the field in the query by appending a character. For this reason, you cannot include more than 36 instances of the same field in a query.
The generated result set class for a query also contains properties for column aliases. To avoid the performance cost of letter case resolution, you should use the same letter case when referencing a column alias as the letter case used when specifying the column alias in the SELECT statement.
In addition to user-specified column aliases, InterSystems SQL also automatically generates up to three aliases for each field name, aliases which correspond to common letter case variants of the field name. These generated aliases are invisible to the user. They are provided for performance reasons, because accessing a property through an alias is faster than resolving letter case through letter case translation. For example, if SELECT specifies FAMILYNAME and the corresponding property is familyname, InterSystems SQL resolves letter case using a generated alias (FAMILYNAME AS familyname). However, if SELECT specifies fAmILyNaMe and the corresponding property is familyname, InterSystems SQL must resolves letter case using the slower letter case translation process.
A selectItem item can also be an expression, an aggregate function, a subquery, a user-defined function, as asterisk, or some other value. For further details on selectItem items other than field names, refer to the selectItem argument of the SELECT command reference page.
The JOIN Operation
A JOIN provides a way to link data in one table with data in another table and are frequently used in defining reports and queries. Within SQL, a JOIN is an operation that combines data from two tables to produce a third, subject to a restrictive condition. Every row of the resulting table must satisfy the restrictive condition.
InterSystems SQL supports five types of joins (some with multiple syntactic forms): CROSS JOIN, INNER JOIN, LEFT OUTER JOIN, RIGHT OUTER JOIN, and FULL OUTER JOIN. Outer joins support the ON clause with a full range of conditional expression predicates and logical operators. There is partial support for NATURAL outer joins and outer joins with a USING clause. For definitions of these join types and further details, refer to the JOIN page in the InterSystems SQL Reference.
If a query contains a join, all of the field references within that query must have an appended table alias. Because InterSystems IRIS does not include the table alias in the data column header name, you may wish to provide column aliases for selectItem fields to clarify which table is the source of the data.
The following example uses a join operation to match the “fake” (randomly-assigned) zip codes in Sample.Person with the real zip codes and city names in Sample.USZipCode. A WHERE clause is provided because USZipCode does not include all possible 5-digit zip codes:
SELECT P.Home_City,P.Home_Zip AS FakeZip,Z.ZipCode,Z.City AS ZipCity,Z.State FROM Sample.Person AS P LEFT OUTER JOIN Sample.USZipCode AS Z ON P.Home_Zip=Z.ZipCode WHERE Z.ZipCode IS NOT NULL ORDER BY P.Home_City
Queries Selecting Large Numbers of Fields
A query cannot select more than 1,000 selectItem fields.
A query selecting more than 150 selectItem fields may have the following performance consideration. InterSystems IRIS automatically generates result set column aliases. These generated aliases are provided for field names without user-defined aliases to enable rapid resolution of letter case variations. Letter case resolution using an alias is significantly faster than letter case resolution by letter case translation. However, the number of generated result set column aliases is limited to 500. Because commonly InterSystems IRIS generates three of these aliases (for the three most common letter case variations) for each field, the system generates aliases for roughly the first 150 specified fields in the query. Therefore, a query referencing less than 150 fields commonly has better result set performance than a query referencing significantly more fields. This performance issue can be avoided by specifying an exact column alias for each field selectItem in a very large query (for example, SELECT FamilyName AS FamilyName) and then making sure that you use the same letter case when referencing the result set item by column alias.