Performance Tips
This page contains performance tips for InterSystems IRIS® data platform Business Intelligence, which you should review as part of the implementation process. Also see Placing the Business Intelligence Globals in a Separate Database.
For more information on performance and troubleshooting options, see the InterSystems Developer CommunityOpens in a new tab.
Result Caching and Cube Updates
For any cube that uses more than 64,000 records (by default), the system maintains and uses a result cache. When you update a cube by synchronizing or rebuilding it, or when you explicitly invoke %SetCubeDSTime() after a manual update, parts of the result cache are considered invalid and are cleared. The details depend upon options in the cube definition (see Cache Buckets and Fact Order, later in this page). Therefore, it is not generally desirable to update the cubes constantly.
The result cache works as follows: Each time a user executes a query (via the Analyzer for example), the system caches the results for that query. The next time any user runs that query, the system checks to see if the cache is still valid. If so, the system then uses the cached values. Otherwise, the system re-executes the query, uses the new values, and caches the new values. The net effect is that performance improves over time as more users run more queries.
Cache Buckets and Fact Order
As noted earlier, for large data sets, the system maintains and uses a result cache. In this case, it can be useful to control the order of rows in the fact table, because this affects how the system creates and uses the cache. To do this, you can specify the Initial build order option for the cube; see Other Cube Options.
When users evaluate pivot tables, the system computes and caches aggregate values that it later reuses whenever possible. To determine whether the system can reuse a cache, the system uses the following logic:
-
It examines the IDs of the records used in a given scenario (for example, for a given pivot table cell).
-
It checks the buckets to which those IDs belong. A bucket is a large number of contiguous records in the fact table (details given later).
-
If the bucket has been updated (because there was a change for at least one ID in the bucket), the system discards any corresponding cache associated with that bucket and regenerates the result.
-
If the bucket has not been updated, the system reuses the appropriate cache (if available) or generates the result (if not).
-
In some scenarios, changes to the source records (and the corresponding updates to any cubes) occur primarily in the most recent source records. In such scenarios, it is useful to make sure that you build the fact table in order by age of the records, with the oldest records first. This approach means that the caches for the older rows would not be made invalid by changes to the data. (In contrast, if the older rows and newer rows were mixed throughout the fact table, all the caches would potentially become invalid when changes occurred to newer records.)
For more information, see How the Analytics Engine Works.
Removing Inactive Cache Buckets
When a cache bucket is invalidated (as described in the previous section), it is marked as inactive but is not removed. To remove the inactive cache buckets, call the %PurgeObsoleteCache() method of %DeepSee.UtilsOpens in a new tab. For example:
d ##class(%DeepSee.Utils).%PurgeObsoleteCache("patients")
Precomputing Cube Cells
As noted earlier, when users evaluate pivot tables, the system computes and caches aggregate values that it later reuses whenever possible. This caching means that the more users work with Business Intelligence, the more quickly the system runs. (For details, see How the Analytics Engine Works.)
To speed up initial performance as well, you can precompute and cache specific aggregate values that are used in your pivot tables, especially wherever performance is a concern. The feature works as follows:
-
Within the cube class, you specify an additional XData block (CellCache) that specifies cube cells that should be precomputed and cached. For details, see the first subsection.
-
You programmatically precompute these cube cells by using a utility method. See the second subsection.
You must do this after building the cube.
A simpler option is to simply run any queries ahead of time (that is, before any users work with them).
Defining the Cell Cache
Your cube class can contain an additional XData block (CellCache) that specifies cube cells that can be precomputed and cached, which speeds up the initial performance of Business Intelligence. The following shows an example:
/// This xml document defines aggregates to be precomputed.
XData CellCache [ XMLNamespace = " http://www.intersystems.com/deepsee/cellCache" ]
{
<cellCache xmlns= "http://www.intersystems.com/deepsee/cellCache" >
<group name= "BS">
<item>
<element >[Measures].[Big Sale Count]</element >
</item>
</group>
<group name= "G1">
<item>
<element >[UnitsPerTransaction].[H1].[UnitsSold]</ element>
<element >[Measures].[Amount Sold]</element >
</item>
<item>
<fact >DxUnitsSold</fact >
<element >[Measures].[Amount Sold]</element >
</item>
</group>
</cellCache >
}
The <cellCache> element is as follows:
-
It must be in the namespace "http://www.intersystems.com/deepsee/cellCache"
-
It contains zero or more <group> elements.
Each <group> element is as follows:
-
It has a name attribute, which you use later when specifying which groups of cells to precompute.
-
It contains one or more <item> elements.
Each <item> element represents a combination of cube indexes and corresponds to the information returned by %SHOWPLAN. An <item> element consists of one or more <element> elements.
An <element> can include one or more of either of the following structures, in any combination:
<fact>fact_table_field_name</fact>
Or:
<element>mdx_member_expression</element >
Where:
-
fact_table_field_name is the field name in the fact table for a level or measure, as given by the factName attribute for that level or measure.
-
mdx_member_expression is an MDX expression that evaluates to a member. This can be either a member of a level or it can be a measure name (each measure is a member of the special MEASURES dimension).
This expression cannot be a calculated member.
Each group defines a set of intersections. The number of intersections in a group affects the processing speed when you precompute the cube cells.
Precomputing the Cube Cells
To precompute the aggregate values specified by a <group>, use the %ComputeAggregateGroup() method of %DeepSee.UtilsOpens in a new tab. This method is as follows:
classmethod %ComputeAggregateGroup(pCubeName As %String,
pGroupName As %String,
pVerbose As %Boolean = 1) as %Status
Where pCubeName is the name of the cube, pGroupName is the name of the <group>, and pVerbose specifies whether to write progress information while the method is running. For pGroupName, you can use "*" to precompute all groups for this cube.
If you use this method, you must first build the cube.
The method processes each group by looping over the fact table and computing the intersections defined by the items within the group. Processing is faster with fewer intersections in a group. The processing is single-threaded, which allows querying in the foreground.
Using the Index Compression Utility
When a cube is frequently updated via synchronization, its need for storage capacity for indexes will grow significantly. In order to minimize index storage requirements, InterSystems provides a %CompressIndices method as part of the %DeepSee.Utils class. This method is as follows:
classmethod %CompressIndices(pCubeName As %String,
pVerbose As %Boolean = 0) As %Status
Where pCubeName is the name of the cube, and pVerbose specifies whether to write information while the method is running.
Limiting Worker Assignment for Background Tasks
Users may limit the number of %SYSTEM.WorkMgr agents assigned to particular groupings of background tasks via the %SetAgentCount method. This method is as follows:
classmethod %SetAgentCount(pNumAgents As %Integer = "", pType = "build", Output pStatus As %Status) As %Integer
Where pNumAgents is the number of agents which can be assigned to a given type of background task, and pType is the category of background task to which the limit is being applied. pType defaults to build tasks, but can also be set to runTime. Each type's limit is stored separately and can be retrieved by running the following command:
write %DeepSee.Utils:%GetAgentCount(type)
Where type is the category of task for which you want to see the limit of assignable agents.