Defining Models for InterSystems Business Intelligence
Details for the Fact and Dimension Tables
Internally, the system uses a star schema
, which consists of the fact table and dimension tables, which the fact table refers. Note that dimension tables are also called star tables
. You can directly query these tables, which the system populates when you build a cube, and which the system updates when changes occur. This chapter discusses the following:
When you compile a cube definition class, the system also generates a table name Listing
, in the same package as the fact table. Do not use the Listing
table, which is for internal use only.
When you compile a cube definition class, the system generates the corresponding fact table with the name Package_CubeClass.Fact
corresponds to the package and class of the cube definition, following the usual rules for translating package and class name to table names.
The system updates the fact table whenever you build the cube or update it incrementally.
It is useful to examine this table, as well as the class definition that uses it.
The fact table contains the following fields:
ID of this row, assigned when the row is created.
The ID of the record in the base class on which this record is based. This field is a pointer to the source table.
This field stores the actual measure values. The fact table class defines bitslice indices on these values. For example:
/// Index for measure M1.
Index MxAge On MxAge [ Type = bitslice ];
/// Measure: MxAge <br/>
/// Source: Age
Property MxAge As %Integer;
For levels in data dimensions, the fact table stores the ID of the level member, which is a pointer to a table that defines the members.
The fact table class defines bitmap indices on these values. For example:
/// Index for fact 1. Index DxGender On DxGender [ Type = bitmap ];
/// Dimension: DxGender <br/>
/// Source: Gender
Property DxGender As Test.TestCube.DxGender;
For a time- or age-type level, the fact table stores an integer.
The fact table class defines the corresponding property as a computed value and defines a bitmap index for it. For example:
/// Index for fact 4. Index DxBirthDateFxYear On DxBirthDateFxYear [ Type = bitmap ];
/// Dimension: DxBirthDateFxMonthYear<br/>
/// Source: BirthDate Property DxBirthDateFxMonthYear As %Integer [ details omitted ];
Property DxBirthDateFxYear As %Integer [ Calculated, SqlComputeCode = ... , SqlComputed ];
For a cube-to-cube <relationship>
, if this cube specifies the sourceProperty
for the relationship, the fact table stores the ID of the corresponding row in the other fact table. That ID is a pointer to the other fact table.
The class does not include properties for any NLP levels or NLP measures; those are handled differently.
If you specified the dependsOn
attribute for any levels, the fact table contains additional indices on the level combinations:
Index DxPostalCodeViaHomeCityANDDx2642257510 On (DxPostalCodeViaHomeCity, Dx2642257510) [ Type = bitmap ];
If you defined any <index>
elements in the <cube>
, the fact table contains additional, custom indices. For example:
Index %UserIndexName On (MxAge, DxGender) [ Type = bitmap ];
These custom indices are for your own use; the system does not use them.
The following table summarizes how the system determines the names for the measure, level, and dimension fields (in the case where you do not specify the factName
If a cube contains multiple measures that use the same property (via sourceProperty
), the fact table contains a field for only one of those measures (the last of those measures found in the cube definition). For example, suppose that the cube contained the following measure definitions:
<measure name="Age" sourceProperty="Age" aggregate="SUM" factName="AgeFact"/>
<measure name="Avg Age" sourceProperty="Age" aggregate="AVG" factName="AvgAgeFact"/>
These two measures are different only by how they aggregate across multiple
records; the fact table would contain the same value for these measures, for any given record. For efficiency, the fact table would include the field AvgAgeFact
, but not the field AgeFact
The same logic applies when a cube contains multiple levels that use the same property. The logic also applies when a cube contains multiple age or time dimensions that use the same property.
For a given measure, level, or dimension, if you use sourceExpression
and access the property via %source.propertyname
, the system always generates a separate field for that value.
When you compile a cube definition class, the system also generates a table for each level, other than the age- and time-type levels. These tables are in the same package as the fact table.
The system updates the dimension tables (also called star tables
) whenever you build the cube or update it incrementally.
The dimension table for a level contains one row for each member of that level. The dimension tables are created dynamically as the system processes records in the base table. For a given level, each time a new unique value is discovered, that value is added as a new row to the appropriate dimension table. This means that the system automatically adds rows to the dimension tables when needed; no intervention is required.
If the cube definition specifies the factName
attribute for the corresponding level, the applicable dimension table uses that value as its name.
Otherwise, the name of the dimension table has the following form:
The columns in this row are as follows:
ID of this row, assigned when the row is created.
One column to contain the key for this member. The field name is the same as the column name field in the fact table that corresponds to this level; see the previous section.
One column for each property of this level, to contain the actual property value for this member.
The field name starts with Dx
and is based either on the source property name or is generated as a unique number, as described earlier.
If you use linkClass
to define both the property and the level to which it belongs, the field names for the property and the level would be identical. In this scenario, the system appends _Link
to the end of the field name for the property.
One column for the parent level of this level, to contain the ID of the parent of this member.
The field name is the same as the column name in the fact table that corresponds to the parent level.
Depending on how the level is defined, you can find the member names as follows:
By default, the key is used as the name, and the name is not stored separately.
If the level includes a property that is defined with isName="true"
, then the member names are stored in the column that contains that property (with one exception). The exception is when the property is also defined with isReference="true"
; in this case, the field is computed at runtime.