Tuples and Cubes
This page discusses two additional key concepts in Business Intelligence MDX: tuples and cubes.
Also see Accessing the BI Samples.
Introduction to Tuples
A tuple is a combination of members from different dimensions. Each tuple has a single value (possibly null).
Every data cell in a result set is a tuple. For example, consider the following query:
SELECT MEASURES.[%COUNT] ON 0, homed.city.members ON 1 FROM demomdx
%COUNT
1 Cedar Falls 110
2 Centerville 99
3 Cypress 112
4 Elm Heights 118
5 Juniper 122
6 Magnolia 114
7 Pine 121
8 Redwood 111
9 Spruce 93
This query returns a set of nine tuples. For example, the first tuple is a combination of Cedar Falls (from the City dimension) and %COUNT (from the Measures dimension).
Creating Tuples
You can create a tuple directly, via the following syntax:
(member_expr1, member_expr2, member_expr3, ...)
Where member_expr1, member_expr2, member_expr3, and so on are member expressions.
In other implementations of MDX, each of these member expressions must be associated with a different dimension. This means that a tuple cannot include more than one member from the same dimension.
In InterSystems MDX, a tuple expression can include more than one member expression from the same dimension. In most cases, the result is null, because in most cases, a record belongs to only one member. However, in InterSystems IRIS Business Intelligence, a level can be based on a list value, which means that a given record can belong to multiple members. For example, the tuple (allerd.soy,allerd.wheat) represents all patients who are allergic to both soy and wheat.
Fully and Partially Qualified Tuples
A tuple is either fully qualified or partially qualified:
-
If the tuple expression refers to each dimension in the cube, the tuple is fully qualified. A fully qualified tuple refers to a very small number of records and is too granular to be commonly used.
-
If the tuple expression does not refer to each dimension in the cube, the tuple is partially qualified. A partially qualified tuple can be very useful, especially when used to filter the data used by the query.
If a tuple refers to only one member, the tuple is equivalent to that member. For example, the following expressions both access the same data:
(colord.red) colord.red
The expression (colord.red) is a tuple expression uses the Red member of the ColorD dimension.
The expression colord.red is a member expression that refers to the Red member of the ColorD dimension.
Each expression accesses only the patients whose favorite color is red.
Sets of Tuples
You can create sets of tuples, by enclosing a comma-separated list of tuple expressions within curly braces:
{tuple_expression1, tuple_expression2, ...}
(Note that in other implementations of MDX, for any tuple in a set, you must construct each tuple in the same way. For example, if the first tuple uses dimension A in its first list item, all the other tuples must do so as well. InterSystems MDX does not have this restriction.)
You can also create sets of tuples by using the CROSSJOIN or NONEMPTYCROSSJOIN functions. For example:
SELECT MEASURES.[%COUNT] ON 0, CROSSJOIN(gend.gender.MEMBERS,homed.city.members) ON 1 FROM demomdx
%COUNT
1 Female->Cedar Falls 58
2 Female->Centerville 41
3 Female->Cypress 51
4 Female->Elm Heights 53
5 Female->Juniper 58
6 Female->Magnolia 58
7 Female->Pine 64
8 Female->Redwood 58
9 Female->Spruce 47
10 Male->Cedar Falls 52
11 Male->Centerville 58
12 Male->Cypress 61
13 Male->Elm Heights 65
14 Male->Juniper 64
15 Male->Magnolia 56
16 Male->Pine 57
17 Male->Redwood 53
18 Male->Spruce 46
You can use these set expressions in all the places where set expressions are permitted:
-
As axes of a query
-
In the WITH clause
-
As an argument to an MDX function that uses a set
Tuple Values
Every tuple has a value (which might be null).
The value of a tuple is determined as follows:
-
The system finds the rows in the fact table that correspond to all the non-measure members used in the tuple expression.
-
The system then finds values for those rows as follows:
-
If the tuple expression includes a specific measure, the system finds the value of that measure for each relevant row of the fact table.
-
If the tuple expression does not include a specific measure, the system uses the default measure (typically, %COUNT).
-
-
The system aggregates those values together, using the aggregation function specified for the measure.
For example, consider the following tuple:
(homed.32006,colord.red,allerd.[dairy products],MEASURES.[avg test score])
To determine the value of this tuple, the system finds all the patients in the fact table that belong to the 32006 ZIP code, and whose favorite color is red, and who are allergic to dairy products. The system then accesses the values for the Test Score measure for those patients and averages those values.
For another example, consider the following tuple (permitted in InterSystems MDX):
(allerd.soy,allerd.wheat)
To determine the value of this tuple, the system counts the patients who are allergic to both soy and to wheat.
Finally, consider the following tuple:
(homed.juniper,homed.centerville)
To determine the value of this tuple, the system counts the patients whose home city is Juniper and whose home city is Centerville. The value of this tuple is null, because each patient has one home city.
Example Tuple Expressions
A tuple expression can refer to a member at any level in any hierarchy of a dimension. Consider the following dimension (from the Patients cube) which includes one hierarchy with four levels:
You can create tuples that use members of any of these levels. For example, you can use any of the following tuple expressions:
(aged.[all patients])
(aged.[0 to 29])
(aged.5)
For another example, let us create variations of the preceding expressions. In this case, let us include members of other dimensions in the tuple expressions:
(aged.[all patients],gend.male)
(aged.[0 to 29],diagd.asthma)
(aged.5,allerd.soy,colord.red)
Using Sets of Tuples as Axes of a Query
You can use sets of tuples as axes of a query. The following example shows the simplest case, a set that consists of one tuple:
SELECT MEASURES.[%COUNT] ON 0, (homed.juniper,allerd.wheat,aged.[20 to 29]) ON 1 FROM demomdx
%COUNT
Juniper->wheat->20 to 29 1
The following example shows a set of tuples used as a query axis:
WITH SET myset AS
'{(homed.[cedar falls],allerd.soy,colord.red),(homed.magnolia,allerd.soy,colord.green),
(homed.34577,allerd.eggs,colord.green)}'
SELECT MEASURES.[%COUNT] ON 0, myset ON 1 FROM demomdx
%COUNT
1 Cedar Falls->soy->Red *
2 Magnolia->soy->Green 1
3 34577->eggs->Green *
For another example, the following is a valid query in InterSystems MDX:
WITH SET myset AS
'{(homed.[cedar falls],allerd.soy,colord.green),(colord.red,allerd.soy,homed.pine,gend.male)}'
SELECT MEASURES.[%COUNT] ON 0, myset ON 1 FROM demomdx
%COUNT
1 Cedar Falls->soy->Green *
2 Red->soy->Pine->Male *
Finally, the following example uses tuples that refer multiple times to a single dimension:
SELECT MEASURES.[%COUNT] ON 0,
{(allerd.soy,allerd.wheat),(homed.juniper,homed.centerville)} ON 1 FROM demomdx
%COUNT
1 soy->wheat 4
2 Juniper->Centerville *
There are 4 patients who are allergic to both soy and wheat.
There are no patients with two home cities.
Introduction to Cubes
A cube is an n-dimensional structure that contains one axis (or edge) for each dimension. The cells of this cube are tuples. An MDX query retrieves specific tuples from the cube.
It is useful to visualize this cube, at least in simple cases. The DemoMDX cube has 10 dimensions (including the Measures dimension). For the sake of simplicity, the following figure shows three of those dimensions (HomeD, GenD, and Measures). Note that only three measures are actually shown.
Each axis of the cube is divided into segments, with one segment for each of the lowest-level members of the corresponding dimension. For the HomeD axis, these segments are the members of the City level.
Each cell in the cube is a fully qualified tuple. Each tuple has a value, as shown in the figure.
An MDX query is a request for a set of tuples, each of which has a value. Consider the following query:
SELECT CROSSJOIN(MEASURES.[%COUNT],gend.gender.MEMBERS) ON 0, homed.city.MEMBERS ON 1 FROM demomdx
Female Male
1 Cedar Falls 569 571
2 Centerville 625 560
3 Cypress 575 543
4 Elm Heights 545 560
5 Juniper 570 546
6 Magnolia 566 503
7 Pine 562 563
8 Redwood 536 540
9 Spruce 550 516
For this query, the system finds the relevant tuples in the cube and obtains their values. For example, the first tuple is (homed.[cedar falls],gend.female,measures.[%COUNT]). The value of this tuple is 569.
Each measure that is aggregated by addition (such as Age) is contained directly in the cube. For other measures, MDX uses values from the cube and aggregates them as specified in the measure definition.
For example, the Avg Age measure is not contained directly in the cube, but the Age measure is; the Age measure contains the cumulative age of all the patients represented in a tuple. To calculate the Avg Age measure, MDX divides Age by %COUNT. Consider the following query:
SELECT CROSSJOIN(MEASURES.[avg age],gend.gender.MEMBERS) ON 0, homed.city.members ON 1 FROM demomdx
Female Male
1 Cedar Falls 36.90 34.56
2 Centerville 35.98 34.68
3 Cypress 37.02 33.55
4 Elm Heights 36.87 34.05
5 Juniper 38.09 34.26
6 Magnolia 35.64 35.03
7 Pine 36.64 33.38
8 Redwood 36.70 36.52
9 Spruce 37.90 32.93
In this example, the second tuple is (homed.[cedar falls],gend.male,measures.[avg age]). To obtain this value, MDX divides the value of (homed.[cedar falls],gend.male,measures.[age]) by the value of (homed.[cedar falls],gend.male,measures.[%COUNT]) — 19734 divided by 571 is 34.56, as shown in the preceding results.
Higher Levels and a Cube Dimension
For now, we consider only dimensions that contain a single hierarchy.
For any dimension, only the lowest level is represented directly on the corresponding cube axis.
For example, the following figure shows all the levels of the HomeD dimension:
Notice that the HomeD axis includes only the leaf members of this dimension — that is, only the members of its lowest level. The higher levels consist of combinations of lower members. For example, each member of the ZIP level consists of one or more members of the City dimension.
Now consider the following query:
SELECT CROSSJOIN(MEASURES.[%COUNT],gend.gender.MEMBERS) ON 0, homed.zip.members ON 1 FROM demomdx
Female Male
1 32006 1,120 1,062
2 32007 536 540
3 34577 1,703 1,609
4 36711 625 560
5 38928 1,114 1,131
For this query, the system finds the relevant tuples of the cube and obtains their values.
For example, the first tuple is (homed.[32006],gend.female,measures.[%COUNT]). The member 32006 consists of the cities Juniper and Spruce. This means that the tuple (homed.[32006],gend.female,measures.[%COUNT]) consists of the combination of the following tuples:
-
(homed.[juniper],gend.female,measures.[%COUNT])
-
(homed.[spruce],gend.female,measures.[%COUNT])
These tuples have the values 570 and 550, respectively. The %COUNT measure is aggregated by adding, so the value for (homed.[32006],gend.female,measures.[%COUNT]) is 1120.
Multiple Hierarchies in a Cube Dimension
A dimension can have multiple hierarchies. For a dimension that includes multiple hierarchies, the corresponding axis of the cube contains one segment for each member of the lowest level in each hierarchy.
Consider the following theoretical cube:
Members of cube 'theoretical':
-----------------------------------------
...
Dimensions
...
Sales Date
H1
Sales Year
Sales Period
Sales Date
H2
Sales Quarter
...
The Sales Date dimension contains two hierarchies. The H1 hierarchy has three levels:
-
The Sales Year level. For example, a members of this level is 1990.
-
The Sales Period level. For example, a members of this level is Jan–1990.
-
The Sales Date level. For example, a members of this level is Jan 3 1990.
The other hierarchy contains only one level.
In this case, the Sales Date axis contains one segment for each member of Sales Date and one segment for each member of Sales Quarter. For example:
(For reasons of space, the picture of the cube is not divided into tuples.)
When a query uses, for example, the Sales Quarter level, the system uses the appropriate part of this axis and accesses the requested tuples.