Defining DeepSee Models
Compiling and Building Cubes
This chapter describes how to compile and build cubes. It includes the following topics:
Note:
During the build process, users cannot execute queries. (However, if a query is currently running, you can build the cube.)
-
-
-
The cube version feature, which enables you to modify a cube definition, build it, and provide it to users with only a short disruption of running queries. See the appendix "Using Cube Versions" in the above listed book.
If you make any change to a cube class or a subject area class, you must recompile that class before those changes take effect. For many changes to a cube, you must also rebuild the cube before those changes take effect.
The following table lists the required actions after changes:
-
The system starts to compile the class and displays a dialog box that shows progress.
If you have made changes that you have not yet saved, the system saves them before compiling the cube.
-
Or open the cube class in Studio and compile it in the same way that you compile other classes.
When you compile a cube class, the system automatically generates the fact table and all related classes if needed. If the fact table already exists, the system regenerates it only if it is necessary to make a structural change.
If there are any cached results for this cube, the system purges them.
The phrase "building a cube" refers to two tasks: adding data to the fact table and other tables, and building the indices used to access this data.
-
The system displays a dialog box.
-
By default, DeepSee iterates through all records in the source table and writes the same number of records to the fact table. You can override this behavior when you build the cube. If you specify the Maximum Number of Records to Build option, DeepSee iterates through only that number of records. The result is a smaller fact table that DeepSee builds more quickly.
-
DeepSee starts to build the cube and displays progress as it does so.
-
classmethod %BuildCube(pCubeName As %String, pAsync As %Boolean = 1, pVerbose As %Boolean = 1, pIndexOnly As %Boolean = 0, pMaxFacts As %Integer = 0, pTracking As %Boolean = 0, ByRef pBuildStatistics As %String = "") as %Status
-
pCubeName is the logical name of the cube as given in its XData block; this is not case-sensitive.
-
pAsync controls whether DeepSee performs the build in multiple background processes. If this argument is true, the system uses multiple processes and does not return until they are all complete. If this argument is false, the system uses a single process and does not return until it is complete.
-
pVerbose controls whether the method writes status information. If this argument is 1, the system writes status updates to the command line. (This argument does not affect whether the method writes build errors or other logging information.)
-
pIndexOnly controls whether the method only updates the indices. If this argument is 1, the system only updates the indices of the fact table.
-
pMaxFacts specifies the maximum number of rows in the fact table. This determines the number of rows of the base table that the system uses when building the cube.
If pMaxFacts is 0, the default, all rows of the base table are processed.
-
-
pBuildStatistics returns an array of information about the cube build. This array has the following nodes:
This method returns a status. If errors occur during the cube build, the status code indicates the number of build errors.
set status = ##class(%DeepSee.Utils).%BuildCube("patients")
This method writes output that indicates the following information:
-
Number of processors used.
-
Total elapsed time taken by the build process.
-
Total amount of time spent evaluating source expressions, summed across all processors.
Building cube [patients]
Existing cube deleted.
Fact table built: 1,000 fact(s) (2 core(s) used)
Fact indices built: 1,000 fact(s) (2 core(s) used)
Complete
Elapsed time: 1.791514s
Source expression time: 0.798949s
If Source expression time seems too high, re-examine your source expressions to be sure that they are as efficient as possible. In particular, if the expressions use SQL queries, double-check that you have the appropriate indices on the tables that the queries use.
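The following is a minimal sketch of a call that passes every argument explicitly and checks the returned status. It assumes the sample cube name patients used above and the argument order shown in the method signature. The variable names status and stats are illustrative only. Because the nodes of the statistics array are not listed here, the sketch simply dumps whatever the array contains.

```objectscript
 // Sketch only: "patients" is the sample cube name used in this chapter.
 // Arguments in order: pCubeName, pAsync, pVerbose, pIndexOnly, pMaxFacts, pTracking, pBuildStatistics.
 set status = ##class(%DeepSee.Utils).%BuildCube("patients", 1, 1, 0, 0, 0, .stats)
 // If the build returned an error status, display it.
 if $system.Status.IsError(status) do $system.Status.DisplayError(status)
 // Dump the build statistics array, whatever nodes it holds.
 zwrite stats
```

The leading space on each line is needed in a routine. At the Terminal prompt, enter the commands without the comment lines.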
While you are developing a cube, you typically recompile and rebuild it frequently. If you are using a large data set, you might want to limit the number of facts in the fact table, in order to force the cube to be rebuilt more quickly. To do this, do one of the following:
-
-
If you do so, be sure to remove this attribute before deployment.
-
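From the Terminal, one way to limit the fact count is the pMaxFacts argument of %BuildCube() described earlier. The following is a minimal sketch, assuming the sample cube name patients and an arbitrary limit of 10000 rows chosen only for illustration.

```objectscript
 // Development-only build: process at most 10000 rows of the base table.
 // pMaxFacts is the fifth argument; pAsync, pVerbose, and pIndexOnly keep their usual values.
 set status = ##class(%DeepSee.Utils).%BuildCube("patients", 1, 1, 0, 10000)
 if $system.Status.IsError(status) do $system.Status.DisplayError(status)
```

Because the limit is passed at build time rather than stored in the cube definition, nothing needs to be removed from the cube class afterward.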
If all the following items are true, DeepSee uses multiple cores to perform the build:
-
-
-
The persistent class is bitmap-friendly.
-
When you build a cube asynchronously, DeepSee sets up a pool of agents to do the work, if it is possible to use parallel processing. This pool consists of a set of agents with high priority and the same number of agents with low priority. By default, the total number of high priority agents (or low priority agents) is four times the number of cores detected on the machine where DeepSee is running.
Note:
These agents are also used to execute queries.
-
The desired number of high priority agents, that is, half of the desired total agent count.
-
Null to restore the default behavior.
The method returns the current number of high priority agents, that is, half of the total agent count. Or it can return null, which means that the default behavior is in effect.
For example, the following command sets the total agent count to 10:
d ##class(%DeepSee.Utils).%SetAgentCount(5)
To see the current number of agents, use the %GetAgentCount() method of the same class. This method returns the current number of high priority agents. Or it can return null, which means that the default behavior is in effect.
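Taken together, the two methods might be used as in the following sketch. It assumes that null is passed as the empty string, as is usual in ObjectScript. The value 5 is the same illustrative value used above.

```objectscript
 // Request 5 high priority agents, for a total agent count of 10.
 do ##class(%DeepSee.Utils).%SetAgentCount(5)
 // Display the current number of high priority agents.
 write ##class(%DeepSee.Utils).%GetAgentCount(),!
 // Pass null (the empty string) to restore the default behavior.
 do ##class(%DeepSee.Utils).%SetAgentCount("")
```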
On rare occasions, you might need to reset these agents. To do so, use the %Reset() method of %DeepSee.Utils. This method also clears any pending tasks and clears the result cache for the current namespace, which would have an immediate impact on any users. This method is intended for use only during development.
When you build a cube, pay attention to any error messages and to the number of facts that it builds and indexes. This section discusses the following topics:
-
-
Fact count, which is a useful indicator of build problems in all scenarios
-
-
When you build a cube in the Architect or in the Terminal, DeepSee indicates if there are any build errors. For example:
This user interface does not display all the build errors (the same is true if you build a cube in the Terminal).
To see all the recorded build errors, do either of the following:
-
The time stamp in this file uses $NOW to write the local date and time, ignoring daylight saving time.
-
do ##class(%DeepSee.Utils).%PrintBuildErrors(cubename)
Where cubename is the logical name of the cube, in quotes.
This method displays information about all build errors. For example (with added line breaks):
SAMPLES>do ##class(%DeepSee.Utils).%PrintBuildErrors("patients")
1 Source ID: 13
ERROR #5001: Error inserting/updating fact: (Source ID:'13')
Field 'DeepSee_Model_PatientsCube.Fact.Dx3295243289' (value 'abc') failed validation
2 Source ID: 22
ERROR #5001: Error inserting/updating fact: (Source ID:'22')
Field 'DeepSee_Model_PatientsCube.Fact.Dx3295243289' (value 'abc') failed validation
3 Source ID: 37
ERROR #5001: Error inserting/updating fact: (Source ID:'37')
Field 'DeepSee_Model_PatientsCube.Fact.Dx3295243289' (value 'abc') failed validation
...
81 build error(s) for 'patients'
Important:
In some cases, DeepSee might not generate an error, so it is important to also check the fact count as discussed in the next section.
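As a sketch of how a build and the error report can be combined in the Terminal, the following commands build the sample cube patients and print the recorded build errors if the returned status indicates a problem. The variable name status is illustrative.

```objectscript
 // Build the cube; the returned status reflects any build errors.
 set status = ##class(%DeepSee.Utils).%BuildCube("patients")
 // If the status is an error, list all recorded build errors for the cube.
 if $system.Status.IsError(status) do ##class(%DeepSee.Utils).%PrintBuildErrors("patients")
```

A successful status does not replace the fact count check, because some problems do not generate an error.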
When you build a cube, DeepSee reports the number of facts that it builds and indexes. For example, in the Architect:
Each fact is a record in the fact table. The fact table should have the same number of records as the base table, except in the following cases:
Also, when DeepSee builds the indices, the index count should equal the number of records in the fact table. For example, the Architect should show the same number for Building facts and for Building indices. If there is a discrepancy between these numbers, check the log files.
-
An error of this kind affects the index count but not the fact count.
-
Try disabling selected dimensions or measures. Then recompile and rebuild to isolate the dimension or measure that is causing the problem.
In some cases, the build log might include errors like the following:
ERROR #5002: Cache error: <STORE>%ConstructIndices+44^Cube.cube_name.Fact.1
This error can occur when a level has a very large number of members. By default, when DeepSee builds the indices, it uses local memory to store the indices in chunks and then writes these to disk. If a level has a very large number of members, it is possible to run out of local memory, which causes the <STORE> errors.
To avoid such errors, try either of the following:
-
Build the cube with a single process. To do so, use %BuildCube() in the Terminal, and use 0 for its second argument.
-
In the <cube> element, specify bitmapChunkInMemory="false" (this is the default). When this cube is built using background processes, the system will use process-private globals instead of local variables (and will not be limited by local memory).
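For the first option, a single-process build from the Terminal can be sketched as follows, assuming the sample cube name patients:

```objectscript
 // The second argument (pAsync) is 0, so the build runs in a single process.
 set status = ##class(%DeepSee.Utils).%BuildCube("patients", 0)
 if $system.Status.IsError(status) do $system.Status.DisplayError(status)
```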
If your cubes have relationships to other cubes, the build log might include errors like the following:
ERROR #5001: Missing relationship reference in RelatedCubes/Patients: source ID 1 missing reference to RxHomeCity 4
If you are certain that you have built the cubes in the correct order, see the next section for information on recovering from the errors.
DeepSee provides a way to rebuild only the records that previously generated build errors, rather than rebuilding the entire cube. To do this:
-
Correct the issues that cause these errors.
-
set sc=##class(%DeepSee.Utils).%FixBuildErrors(cubename)
Where cubename is the logical name of the cube, in quotes. This method accepts a second argument, which specifies whether to display progress messages; for this argument, the default is true.
Fact '100' corrected
Fact '500' corrected
Fact '700' corrected
3 fact(s) corrected for 'patients'
0 error(s) remaining for 'patients'
Or rebuild the entire cube.
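The recovery steps can be sketched as the following Terminal sequence, assuming the sample cube name patients and that the underlying data issues have already been corrected. The variable name sc follows the command shown above.

```objectscript
 // Rebuild only the records that previously generated build errors.
 set sc = ##class(%DeepSee.Utils).%FixBuildErrors("patients")
 // Display the status if the method itself reports an error.
 if $system.Status.IsError(sc) do $system.Status.DisplayError(sc)
 // List any build errors that are still recorded for the cube.
 do ##class(%DeepSee.Utils).%PrintBuildErrors("patients")
```

If errors remain after this sequence, rebuilding the entire cube with %BuildCube() is the fallback.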
DeepSee creates an additional log file (apart from the previously described build logs). After it builds the cube or tries to build the cube, DeepSee also writes the DeepSeeTasks_NAMESPACE.log file to the directory install-dir/mgr. This file contains information about the background agents that DeepSee used during the build process. For example:
2009-11-04 16:53:35.648 2312 TaskMaster Background agents killed
2009-11-04 16:53:35.663 2312 TaskMaster Create background agents..
2009-11-04 16:53:35.739 6900 TaskMaster Agent started:1
...
2009-11-04 16:54:19.561 2312 TaskMaster Background agents killed
Tip:
This file also contains information about runtime errors of various kinds such as listing errors and KPI errors.
The time stamps in this file use the local date and time (taking daylight saving time into account).