Skip to main content
Previous sectionNext section

TOP

A SELECT clause that specifies how many rows to return.

Synopsis

SELECT  [DISTINCT clause] 
  [TOP {[((]int[))] | ALL}]
  select-item {,select-item2}

Arguments

Argument Description
int
Limits the number of rows returned to the specified integer number. The int argument can be either a positive integer, a Dynamic SQL input parameter (?) or an Embedded SQL host variable (:var) that resolve to a positive integer.
In Dynamic SQL, the int value can optionally be enclosed with single parentheses or double parentheses (double parentheses are the preferred syntax); these parentheses suppress literal substitution of the int value in the corresponding cached query.
ALL
TOP ALL is only meaningful in a subquery or in a CREATE VIEW statement. It is used to support the use of an ORDER BY clause in these situations, fulfilling the requirement that an ORDER BY clause must be paired with a TOP clause in a subquery or a query used in a CREATE VIEW. TOP ALL does not restrict the number of rows returned.

Description

The optional TOP clause appears after the SELECT keyword and the optional DISTINCT clause, and before the first select-item.

The TOP keyword is used in Dynamic SQL and in cursor-based Embedded SQL. In non-cursor Embedded SQL the only meaningful use of the TOP keyword is TOP 0. Any other TOP int (where int is any non-zero integer) is valid but not meaningful because a SELECT in non-cursor Embedded SQL always returns at most one row of data.

The TOP clause of a SELECT statement limits the number of rows returned to the number specified in int. If no TOP clause is specified, the default is to display all the rows that meet the SELECT criteria. If a TOP clause is specified, the number or rows displayed is either int or all of the rows that fulfill the query predicate requirements, whichever is smaller. If you specify ALL, SELECT returns all the rows in the table that fulfill the query predicate requirements.

If no ORDER BY clause is specified in the query, which records are returned as the “top” rows is unpredictable. If an ORDER BY clause is specified, the top rows accord to the order specified in that clause.

The DISTINCT clause (if specified) is applied before TOP, specifying that (at most) int number of unique values are to be returned.

TOP short circuits when all rows have been delivered. Thus, if you select until you get SQLCODE 100, the FETCH that sets SQLCODE 100 is instant.

When accessing data through a view, or through a FROM clause subquery, you can limit the number of rows returned by using the %vid view ID, rather than (or in addition to) the TOP clause. For further details on using %vid, refer to the Defining and Using Views chapter of Using InterSystems SQL.

The TOP int Value

The int numeric value can be an integer, or a numeric string, a Dynamic SQL input parameter (?), or an input host variable (:var) that resolve to an integer value.

The int value specifies the number of rows to return. Permitted values are 0 and positive numbers. You cannot specify the int value as an arithmetic expression, field name, subquery column alias, scalar function, or aggregate function. A fractional number or a numeric string is parsed as its integer value. Zero (0) is a valid int value. TOP 0 executes the query but returns no data.

TOP ALL must be specified as a keyword in the query. You cannot specify ALL as a ? input parameter or :var host variable value. The query parser interprets the string “ALL” supplied in this way as a numeric string with a value of 0.

Note that the TOP argument metadata is returned as xDBC data type 12 (VARCHAR) rather than 4 (INTEGER) because it is possible to specify TOP int as a numeric string or an integer.

TOP and Cached Queries

An int value can be specified with or without enclosing parentheses. These parentheses affect how a Dynamic SQL query is cached (non-cursor Embedded SQL queries are not cached). An int value without parentheses is converted to a ? parameter variable in the cached query. This means that repeatedly invoking the same query with different TOP int values invokes the same cached query, rather than preparing and optimizing the query each time.

Enclosing parentheses suppress literal substitution. For example, TOP ((7)). When int is enclosed in parentheses, the cached query preserves the specific int value. Re-invoking the query with the same TOP int value uses the cached query; invoking the query with a different TOP int value causes SQL to prepare, optimize, and cache this new version of the query.

TOP ALL is not cached as a ? parameter variable. ALL is parsed as a keyword, not a literal. Therefore, the same query with TOP 7 and with TOP ALL will generate two different cached queries.

TOP and ORDER BY

TOP is generally used in a SELECT with an ORDER BY clause. Note that the default ascending ORDER BY collation sequence considers NULL to be the lowest (“top”) value, followed by the empty string ('').

TOP is required in a subquery SELECT or a CREATE VIEW SELECT when specifying an ORDER BY clause. In these cases you can specify either TOP int (to limit the number of rows to return) or TOP ALL.

TOP ALL is only used in a subquery or in a CREATE VIEW statement. It is used to support the use of an ORDER BY clause in these situations, fulfilling the requirement that an ORDER BY clause must be paired with a TOP clause in a subquery or a CREATE VIEW query. TOP ALL does not restrict the number of rows returned. TOP ALL ... ORDER BY does not change default SELECT optimization. The ALL keyword cannot be enclosed in parentheses.

TOP Optimization

By default, a SELECT optimizes for fastest time to return all data. Adding both a TOP int clause and an ORDER BY clause optimizes for fastest time to return first row. (Note that both clauses are required to change the optimization.) You can use the %SYS.PTools.StatsSQL class TotalTimeToFirstRow property to return the time required to return the first row.

The following are special case optimizations:

  • You may wish to use the TOP and ORDER BY optimization strategy without limiting the number of rows returned; for example, if you are returning data that is displayed in page units. In such a case, you may want to issue a TOP clause with an int value larger than the total number of rows.

  • You may wish to limit the number of rows returned and specify their order without changing the default SELECT optimization. In this case, specify a TOP clause, an ORDER BY clause, and the %NOTOPOPT keyword to preserve fastest time to return all data optimization. See the FROM clause for more details.

TOP with Aggregates and Functions

An aggregate function or a scalar function can only return a single value. If the query select-item list contains only aggregates and functions, the application of the TOP clause is as follows:

  • If the select-item list contains an aggregate function, for example COUNT(*) or AVG(Age), and does not contain any field references, no more than one row is returned, regardless of the TOP int value or the presence of an ORDER BY clause. These clauses are validated, but ignored. This is shown in the following examples:

    SELECT TOP 5 AVG(Age),CURRENT_TIMESTAMP(3) FROM Sample.Person
      /* returns 1 row */
    Copy code to clipboard
    SELECT TOP 1 AVG(Age),CURRENT_TIMESTAMP(3) FROM Sample.Person ORDER BY Age
      /* returns 1 row */
    Copy code to clipboard
  • If the select-item list contains one or more scalar functions, expressions, literals (such as %TABLENAME), subqueries, or host variables, and does not contain any field references or aggregates, the TOP clause is applied. This is shown in the following example:

    SELECT TOP 5 ROUND(678.987,2),CURRENT_TIMESTAMP(3) FROM Sample.Person
      /* returns 5 identical rows */
    Copy code to clipboard

    The actual number of rows returned depends on the number of rows in the table, even when table fields are not referenced. For example:

    SELECT TOP 300 CURRENT_TIMESTAMP(3) FROM Sample.Person
      /* returns either the number of rows in Sample.Person
         or 300 rows, whichever is smaller */
    Copy code to clipboard

    When the query is restricted by a predicate condition, the number of rows returned is restricted by that condition, even when table fields are not referenced in the select-item list. For example:

    SELECT TOP 300 CURRENT_TIMESTAMP(3) FROM Sample.Person WHERE Home_State = 'MA'
      /* returns either the number of rows in Sample.Person
         where Home_State = 'MA'
         or 300 rows, whichever is smaller */
    Copy code to clipboard
  • If the SELECT statement does not contain a FROM clause, at most one row is returned, regardless of the TOP value. For example:

    SELECT TOP 5 ROUND(678.987,2),CURRENT_TIMESTAMP(3)
      /* returns 1 row */
    Copy code to clipboard
  • The DISTINCT clause further limits the TOP clause. If there are fewer distinct values than the TOP value, only the rows with distinct values are returned. When only scalar functions are referenced, only one row is returned. For example:

    SELECT DISTINCT TOP 15 CURRENT_TIMESTAMP(3) FROM Sample.Person
      /* returns 1 row */
    Copy code to clipboard
  • TOP 0 always returns no rows, regardless of the contents of the select-item list, or whether the SELECT statement contains a FROM clause or a DISTINCT clause.

    In non-cursor Embedded SQL, a query with TOP 0 returns no rows and sets SQLCODE=100; a non-cursor Embedded SQL query with TOP 1 (or any other TOP int value) returns one row and sets SQLCODE=0. In cursor-based Embedded SQL, completion of the fetch loop always sets SQLCODE=100, regardless of the TOP int value.

Examples

The following query returns the first 20 rows retrieved from Sample.Person in the order that they are stored in the database. This record order is generally not predictable.

SELECT TOP 20 Home_State,Name FROM Sample.Person
Copy code to clipboard

The following query returns the first 20 distinct Home_State values retrieved from Sample.Person in ascending collation sequence order.

SELECT DISTINCT TOP 20 Home_State FROM Sample.Person ORDER BY Home_State
Copy code to clipboard

The following query returns the first 40 distinct FavoriteColor values. The “top” rows reflect the ORDER BY clause sequencing of all of the rows in Sample.Person in descending (DESC) collation sequence. Descending collation sequence is used rather than the default ascending collation sequence because the FavoriteColors field is known to have NULLs, which would appear at the top of the ascending collation sequence.

SELECT DISTINCT TOP 40 FavoriteColors FROM Sample.Person 
      ORDER BY FavoriteColors DESC
Copy code to clipboard

Also note in the preceding example that because FavoriteColors is a list field, the collation sequence includes the element length byte. Thus six-letter elements (YELLOW, PURPLE, ORANGE) collate together, listed before five-letter elements (WHITE, GREEN, etc.).

Dynamic SQL can specify the int value as an input parameter (indicated by “?”). In the following example, the TOP ? input parameter is set to 10 by the %Execute method:

  SET myquery = "SELECT TOP ? Name,Age FROM Sample.Person"
  SET tStatement = ##class(%SQL.Statement).%New()
  SET qStatus = tStatement.%Prepare(myquery)
    IF qStatus'=1 {WRITE "%Prepare failed:" DO $System.Status.DisplayError(qStatus) QUIT}
  SET rset = tStatement.%Execute(10)
  DO rset.%Display()
Copy code to clipboard

The following cursor-based Embedded SQL example performs the same operation:

  SET topnum=10
  &sql(DECLARE pCursor CURSOR FOR
       SELECT TOP :topnum Name,Age INTO :name,:years FROM Sample.Person
      )
  &sql(OPEN pCursor)
       QUIT:(SQLCODE'=0)
  FOR { &sql(FETCH pCursor)
        QUIT:SQLCODE
        WRITE "Name=",name," Age=",years,!
      }
  &sql(CLOSE pCursor)
Copy code to clipboard

See Also