Global Structure
In a table with columnar storage, each column of a dataset is stored in a separate global. Within each column global, all row value elements are of the same data type and “chunked” into separate subscripts per 64,000 rows, similar to how $BIT values are stored. For example, if a table has 100,000 rows, then each column global has two subscripts. The first subscript contains the first 64,000 row values. The second subscript contains the remaining 36,000 row values. InterSystems IRIS uses a specialized vector encoding to efficiently store data of the same data type.
Suppose the TransactionHistory table defined in the previous section contains these records.
SELECT AccountNumber,TransactionDate,Description,Amount,Type FROM Sample.TransactionHistory
AccountNumber |
TransactionDate |
Description |
Amount |
Type |
10001234 |
02/22/2022 |
Deposit to Savings |
40.00 |
Deposit |
10001234 |
03/14/2022 |
Payment to Vendor |
-20.00 |
Withdrawal |
10002345 |
07/30/2022 |
Transfer to Checking |
-25.00 |
Transfer |
10002345 |
08/13/2022 |
Deposit to Savings |
30.00 |
Deposit |
You can optionally examine the global storage structure from the Management Portal by clicking System Explorer and then Globals. In the namespace containing the table, you can then select Show SQL Table Name and find the globals corresponding to your table. For more details, see Managing Globals.
The Data/Master global contains a subscript for each row and is used to reference data involving row operations. Each subscript row is empty, because the data is stored by column in separate globals. This code shows a sample Data/Master global.
^THist = 4
^THist is the name of the global. The name shown here is for illustrative purposes. In tables created using DDL, or in a persistent class with the USEEXTENTSET=1 parameter specified, InterSystems IRIS generates more efficient, hashed globals with names such as ^EW3K.B3vA.1. If a persistent class table does not specify USEEXTENTSET=1, then the global has a name of the format ^TableNameD. In the projected persistent class for an SQL table, the global is stored in the <DataLocation> element on the Storage class member. For example:
Storage Default
{
...
<DataLocation>^THist</DataLocation>
...
}
The table includes five additional globals, one per column, with names of the form, ^THist.V1, ^THist.V2, and so on. Each global stores a column of row values in a vector encoding, an internal data type designed to work with values of the same type and efficiently encode sparse data. The actual encoding is internal, but the Globals page and informational commands such as ZWRITE present a more readable format that describes:
Because this table has fewer than 64,000 rows, each column global contains only a single subscript. The data in the globals shown here have been truncated for readability.
^THist.V1(1) = {"type":"integer", "count":4, "length":5, "vector":[,10001234,...]}
^THist.V2(1) = {"type":"integer", "count":4, "length":5, "vector":[,66162,...]}
^THist.V3(1) = {"type":"string", "count":4, "length":5, "vector":[,"Deposit to Savings",...]}
^THist.V4(1) = {"type":"decimal", "count":4, "length":5, "vector":[,40,...]}
^THist.V5(1) = {"type":"string", "count":4, "length":5, "vector":[,"Deposit",...]}
In this column global for a table with 200,000 rows, the data is spread across four global subscripts containing 64,000 + 64,000 + 64,000 + 8,000 elements. The count of elements is lower than the length, because the column includes NULL values.
^MyCol.V1(1) = {"type":"integer", "count":63867, "length":64000, "vector":[,1,1,1,,...]}
^MyCol.V1(2) = {"type":"integer", "count":63880, "length":64000, "vector":[1,1,1,,1,...]}
^MyCol.V1(3) = {"type":"integer", "count":63937, "length":64000, "vector":[1,1,1,2,2,...]}
^MyCol.V1(4) = {"type":"integer", "count":7906, "length":8000, "vector":[1,1,1,,2,...]}