InterSystems SQL Optimization Guide
A bitmap index is a special type of index that uses a series of bitstrings to represent the set of ID values that correspond to a given indexed data value.
Bitmap indices have the following important features:
Bitmaps are highly compressed: bitmap indices can be significantly smaller than standard indices. This reduces disk and cache usage considerably.
Bitmaps operations are optimized for transaction processing: you can use bitmap indices within tables with no performance penalty as compared with using standard indices.
Logical operations on bitmaps (counting, AND, and OR) are optimized for high performance.
The SQL Engine includes a number of special optimizations that can take advantage of bitmap indices.
The creation of bitmap indices depends upon the nature of the table’s unique identity field(s):
If the table’s ID field is defined as a single field with positive integer values, you can define a bitmap index for a field using this ID field. This type of table either uses a system-assigned unique positive integer ID, or uses an IdKey to define custom ID values where the IdKey is based on a single property with type %Integer and MINVAL > 0, or type %Numeric with SCALE = 0 and MINVAL > 0.
If the table’s ID field is not defined as single field with positive integer values (for example, a child table), you can define a %BID (bitmap ID) field
that takes positive integers which acts as a surrogate ID field; this allows you to create bitmap indices for fields in this table.
Subject to the restrictions
listed below, bitmap indices operate in the same manner as standard indices. Indexed values are collated
and you can index on combinations of multiple fields.
This chapter addresses the following topics related to bitmap indices:
Bitmap indices work in the following way. Suppose you have a Person
table containing a number of columns:
Each row in this table has a system-assigned RowID
number (a set of increasing integer values). A bitmap index uses a set of bitstrings (a string containing 1 and 0 values). Within a bitstring, the ordinal position of a bit corresponds to the RowID of the indexed table. For a given value, say where State
is “NY”, there is a string of bits with a 1 for every position that corresponds to a row containing “NY” and a 0 in every other position.
For example, a bitmap index on State might look like this:
State Bitmap Index
While an index on Age might look like this:
Age Bitmap Index
The Age field shown here can be an ordinary data field or a field whose value can be reliably derived
(Calculated and SQLComputed).
In addition to using bitmap indices for standard operations, the SQL engine can use bitmap indices to efficiently perform special set-based operations using combinations of multiple indices. For example, to find all instances of Person
that are 24 years old and live in New York, the SQL Engine can simply perform the logical AND of the Age and State indices:
Using Multiple Indices
The resulting bitmap contains the set of all rows that match the search criteria. The SQL Engine uses this to return data from these rows.
The SQL Engine can use bitmap indices for the following operations:
ANDing of multiple conditions on a given table.
ORing of multiple conditions on a given table.
RANGE conditions on a given table.
COUNT operations on a given table.
If the table’s ID is a field with values restricted to unique positive integers, you can add bitmap index definitions to a class definition using either the New Index Wizard or by editing the text of the class definition in the same way that you would create a standard index. The only difference is that you need to specify the index Type
as being “bitmap”:
Class MyApp.SalesPerson Extends %Persistent [DdlAllowed]
Property Name As %String;
Property Region As %Integer;
Index RegionIDX On Region [Type = bitmap];
If the table’s ID is not restricted to positive integers, you can create a %BID property to use to create bitmap index definitions. You can use this option for a table with an ID field of any datatype, as well as an IDKEY consisting of multiple fields (which includes child tables). A %BID bitmap can be created for either data storage type
: a default structure table or a %Storage.SQL table. This feature is referred to as “Bitmaps for Any Table,” or BAT.
To enable use of bitmap indices on such a table, you must do the following:
Define a %BID property/field for the class. This could be an existing property of the class, or a new property. It can have any name. If this is a new property, you will have to populate this property/field for all existing rows in the table. This %BID field must be defined with a data type that restricts the field data values to unique positive integers. For example, Property MyBID As %Counter;
Define a new class parameter to define which property is the %BID field. This parameter is named BIDField. This parameter is set to the SQLFieldName of the %BID property. For example, Parameter BIDField = "MyBID";
Define an index for the %BID. For example, Index BIDIdx On MyBID [ Type = key, Unique ];
Define the %BID locator index. This ties the %BID index to the table’s ID key field(s). The following example is for a table with a compound IDKey consisting of two fields:
Index IDIdx On (IDfield1, IDfield2) [ IdKey, Unique ];
Index BIDLocIdx On (IDfield1, IDfield2, MyBID) [ Data = IdKey, Unique ];
This table now supports bitmap indices. You can define bitmap indices as needed using standard syntax. For example: Index RegionIDX On Region [Type = bitmap];
This table now also supports bitslice indices
. You can define bitslice indices using standard syntax.
To build or rebuild a %BID bitmap index you must use %BuildIndices()
. The %ConstructIndicesParallel()
method is not supported for %BID bitmap indices.
If you are using DDL statements to define tables, you can also use the following DDL commands to create and remove bitmap indices for a table with a positive integer ID:
This is identical to creating standard indices, except that you must add the BITMAP keyword to the CREATE INDEX statement:
CREATE BITMAP INDEX RegionIDX ON TABLE MyApp.SalesPerson (Region)
When compiling a class that contains a bitmap index, the class compiler generates a bitmap extent index if there are any bitmap indices present in the class and no bitmap extent index is defined for that class. The class inherits the bitmap extent index from the primary superclass if it exists, either defined or generated. When building indices for a class, the bitmap extent index is built either if it is asked to be built or if another bitmap index is being built and the bitmap extent index structure is empty.
InterSystems IRIS does not generate a bitmap extent index unless there are bitmap indices present. A bitmap extent index is defined as: type = bitmap, extent = true. That means a bitmap extent index inherited from a primary superclass is considered to be a bitmap index and will trigger a bitmap extent index to be generated in the subclass, if no bitmap extent index is explicitly defined in that subclass.
InterSystems IRIS does not generate a bitmap extent index in a superclass based on future possibility. This means that InterSystems IRIS does not ever generate a bitmap extent index in a persistent class unless an index whose type = bitmap is present. A presumption that some future subclass might introduce an index with type = bitmap is not sufficient.
Special care is required during the process of adding a bitmap index to a class on a production system (where users are actively using a particular class, compiling said class, and subsequently building the bitmap index structure for it). On such a system, the bitmap extent index may be populated in the interim between the compile completing and the index build proceeding. This can cause the index build procedure to not implicitly build the bitmap extent index, which leads to a partially complete bitmap extent index.
The following is a general guideline for choosing between bitmap and standard indices. In general, use standard indices for indexing on all types of keys and references:
Simple object references
Otherwise, bitmap indices are generally preferable (assuming that the table uses system-assigned numeric ID numbers).
Separate bitmap indices on each property usually have better performance than a bitmap index on multiple properties. This is because the SQL engine can efficiently combine separate bitmap indices using AND and OR operations.
If a property (or a set of properties that you really need to index together) has more than 10,000-20,000 distinct values (or value combinations), consider standard indices. If, however, these values are very unevenly distributed so that a small number of values accounts for a substantial fraction of rows, then a bitmap index could be much better. In general, the goal is to reduce the overall size required by the index.
All bitmap indices have the following restrictions:
You cannot define a bitmap index on a UNIQUE column.
You cannot store data values within a bitmap index.
For a table containing more than 1 million records, a bitmap index is less efficient than a standard index when the number of unique values exceeds 10,000. Therefore, for a large table it is recommended that you avoid using a bitmap index for any field that contains (or is likely to contain) more than 10,000 unique values; for a table of any size, avoid using a bitmap index for any field that is likely to contain more than 20,000 unique values. These are general approximations, not exact numbers.
You must create a %BID property to support bitmap indices on a table that:
Uses a non-integer field as the unique ID key.
Uses a multi-field ID key.
Is a child table within a parent-child relationship.
You can use the $SYSTEM.SQL.SetBitmapFriendlyCheck()
method to set a system-wide configuration parameter to check at compile time for this restriction, determining whether a defined bitmap index is allowed in a %Storage.SQL class. This check only applies to classes that use %Storage.SQL. You can use $SYSTEM.SQL.GetBitmapFriendlyCheck()
to determine the current configuration of this option.
Application Logic Restrictions
A bitmap structure can be represented by an array of bit strings, where each element of the array represents a "chunk" with a fixed number of bits. Because undefined is equivalent to a chunk with all 0 bits, the array can be sparse. An array element that represents a chunk of all 0 bits need not exist at all. For this reason, application logic should avoid depending on the $BITCOUNT(str,0)
count of 0-valued bits.
Because a bit string
contains internal formatting, application logic should never depend upon the physical length of a bit string or upon equating two bit strings that have the same bit values. Following a rollback operation, a bit string is restored to its bit values prior to the transaction. However, because of internal formatting, the rolled back bit string may not equate to or have the same physical length as the bit string prior to the transaction.
In a volatile table (one that undergoes many INSERT
operations) the storage for a bitmap index can gradually become less efficient. To maintain bitmap indices, you can run the %SYS.Maint.Bitmap
utility methods to compress the bitmap indices, restoring them to optimal efficiency. You can use the OneClass()
method to compress the bitmap indices for a single class. Or you can use the Namespace()
method to compress the bitmap indices for an entire namespace. These maintenance methods can be run on a live system.
InterSystems SQL provides the following extension to directly manipulating bitmap indices:
%BITMAP aggregate function
%BITMAPCHUNK aggregate function
%SETINCHUNK predicate condition
All of these extensions follow the InterSystems SQL conventions for bitmap representation, representing a set of positive integers as a sequence of bitmap chunks, of up to 64,000 integers each.
These extensions enable easier and more efficient manipulation of certain conditions and filters, both within a query and in embedded SQL. In embedded SQL they enable simple input and output of bitmaps, especially at the single chunk level. They support the processing of complete bitmaps, which are handled by %BITMAP() and the %SQL.Bitmap
class. They also enable bitmap processing for non-RowID values, such as foreign key values, parent-reference of a child table, either column of an association, etc.
For example, to output the bitmap for a specified chunk:
SELECT %BITMAPCHUNK(Home_Zip) FROM Sample.Person
To output all the chunks for the whole table:
SELECT %CHUNK(Home_Zip),%BITMAPCHUNK(Home_Zip) FROM Sample.Person
GROUP BY %CHUNK(Home_Zip) ORDER BY 1
%CHUNK(f) returns the chunk assignment for a bitmap indexed field f value. This is calculated as f\64000+1. %CHUNK(f) for any field or value f that is not a bitmap indexed field always returns 1.
%BITPOS(f) returns the bit position assigned to a bitmap indexed field f value within its chunk. This is calculated as f#64000+1 . %BITPOS(f) for any field or value f that is not a bitmap indexed field returns 1 more than its integer value. A string has an integer value of 0.
%BITMAP aggregate function
The aggregate function %BITMAP(f)
combines many f
values into a %SQL.Bitmap
object, in which the bit corresponding to f
in the proper chunk is set to 1 for each value f
in the result set. f
in all of the above would normally be a positive integer field (or expression), usually (but not necessarily) the RowID.
%BITMAPCHUNK aggregate function
The aggregate function %BITMAPCHUNK(f) combines many values of the field f into an InterSystems SQL standard bitmap string of 64,000 bits, in which bit f#64000+1=%BITPOS(f) is set to 1 for each value f in the set. Note that the bit is set in the result regardless of the value of %CHUNK(f) . %BITMAPCHUNK() yields NULL for the empty set, and like any other aggregate it ignores NULL values in the input.
%SETINCHUNK predicate condition
The condition (f %SETINCHUNK bm) is true if and only if ($BIT(bm,%BITPOS(f))=1) . bm could be any bitmap expression string, e.g. an input host variable :bm, or the result of a %BITMAPCHUNK() aggregate function, etc. Note that the <bm> bit is checked regardless of the value of %CHUNK(f) . If <bm> is not a bitmap or is NULL, the condition returns FALSE. (f %SETINCHUNK NULL) yields FALSE (not UNKNOWN).
Content Date/Time: 2019-09-19 06:44:29