Using MDX with DeepSee
Introduction to MDX Queries
[Back] [Next]
   
Server:docs2
Instance:LATEST
User:UnknownUser
 
-
Go to:
Search:    

This chapter introduces MDX queries, and it covers the following topics:

Contents of the DemoMDX Cube
When you create SQL queries in an unfamiliar database, you start by becoming acquainted with the tables and their columns. Similarly, when you create MDX queries, you start by becoming acquainted with the available cubes and their contents.
  1. Start the Terminal.
  2. Switch to the SAMPLES namespace.
  3. To access the MDX shell, enter the following command:
     Do ##class(%DeepSee.Utils).%Shell()
  4. To see the available cubes, enter the following command (note that it is not case-sensitive):
    CUBE
    The Terminal then displays a list of cubes.
  5. To see the available contents of a cube, enter the following command:
    CUBE cubename
    For example:
    CUBE demomdx
    The shell ignores the case of the command and of the cube name.
    The Terminal displays the following:
    Measures
      %COUNT
      Age
      Avg Age
      Allergy Count
      Avg Allergy Count
      Test Score
      Avg Test Score
    AgeD
        All Patients
      H1
        All Patients
        Age Group
        Age Bucket
    AllerD
      H1
        Allergies
    BirthD
      H1
        Year
        Quarter Year
    BirthQD
      H1
        Quarter
    DiagD
      H1
        Diagnoses
    GenD
      H1
        Gender
    ColorD
      H1
        Favorite Color
    HomeD
      H1
        ZIP
        City
    DocD
      H1
        Doctor
The DemoMDX cube represents patients. The contents of this cube are as follows:
In many MDX applications, the same name is used for a dimension, a hierarchy in it, and a level in that hierarchy. This practice can be confusing for someone who is learning MDX, so this sample cube uses the following arbitrary naming conventions:
As you will see later, in DeepSee MDX, you can omit parts of identifiers. The naming conventions in this sample make it clear which parts can be omitted.
Note:
A cube can also contain calculated members and named sets. The CUBE command in the MDX shell does not display these elements, although you can use them in the shell and elsewhere.
The Simplest Query
In the MDX shell, enter the following MDX query (this is not case-sensitive):
SELECT FROM demomdx
The shell displays the results as follows:
Result:              1,000
This query simply counts patients.
MDX is not case-sensitive except for member keys, which are discussed in the chapter Working with Levels.”
Members
A key component of an MDX query is the member. Each level contains one or more members. For example, the City level contains multiple members, one for each city in the data. A level enables you to select records; specifically, each member of the level allows you to access a subset of the records.
In the DemoMDX cube, each member of each level in this cube allows you to select some group of patients.
In this section, we execute a simple query to see members of a level in the DemoMDX cube:
  1. In the MDX shell, enter the following MDX query (this is not case-sensitive):
    SELECT homed.h1.city.MEMBERS ON ROWS FROM demomdx
    The shell displays the members of the City level, as follows:
    1 Cedar Falls                           110
    2 Centerville                            99
    3 Cypress                               112
    4 Elm Heights                           118
    5 Juniper                               122
    6 Magnolia                              114
    7 Pine                                  121
    8 Redwood                               111
    9 Spruce                                 93
    For now, let us discuss only the member names, which are shown in the second column.
The City level contains the members Cedar Falls, Centerville, Cypress, and so on. Each member of this level represents the set of patients with that home city. For example, the Centerville member represents all patients whose home city is Centerville.
Measures
Another key component of an MDX query is the measure. All DeepSee queries use at least one measure. If you do not specify a measure, DeepSee uses the default measure defined in the cube. For most cubes, the default measure is %COUNT, which is a count of the records. Let us examine some of the measures in the sample cube:
  1. In the MDX shell, enter the following simple query:
    SELECT MEASURES.[%COUNT] ON COLUMNS FROM demomdx
    This query returns a result set that contains one column of data — the aggregate value for the %COUNT measure — across the entire data set that the cube represents. Depending on the data in your sample, the shell displays something like the following:
                                        %COUNT
                                          1,000
    
    In this example, there are 1000 patients.
  2. In the MDX shell, enter the following query:
    SELECT MEASURES.[avg test score] ON COLUMNS FROM demomdx
    This query returns a result set that shows the aggregate value for the Avg Test Score measure across the entire data set.
    Depending on the data in your sample, the shell displays something like the following:
                                Avg Test Score
                                          74.75
    
    This number is the average test score across all patients.
Referring to Members and Measures
In the preceding sections, you explored the elements of the DemoMDX cube, in particular its measures and levels, and you should have some sense of the data contained in it. You also wrote simple MDX queries. The next step is to learn the syntax that you use to refer to members and measures:
Note the following variations:
You cannot omit the dimension name.
The following examples are all equivalent in DeepSee MDX:
[GenD].[H1].[GENDER].Female
[GenD].Female
GenD.H1.GENDER.Female
GenD.H1.Female
GenD.Female 
Simple MDX Queries with %COUNT
This section presents simple forms of MDX queries, which do not refer to a measure and thus use the default measure defined in the cube (which is usually %COUNT).
You can use 0 instead of COLUMNS, and you can use 1 instead of ROWS. (For reasons of space, this book uses 0 and 1 rather than COLUMNS and ROWS.)
In all cases, the SELECT statement returns a result set, which the MDX shell displays in tabular form.
Let us try queries that use these variations:
  1. Enter the following MDX query:
    SELECT gend.h1.gender.MEMBERS ON 0 FROM demomdx
    The shell executes the query and displays the results like the following (yours will be slightly different):
                                  Female                 Male
                                      488                  512
    
    Notice the following:
  2. Try a shorter version of the same query:
    SELECT gend.gender.MEMBERS ON 0 FROM demomdx
    This query returns the same data as the previous query.
  3. Now enter the following variation:
    SELECT gend.gender.female ON 0 FROM demomdx
    The result might be like the following:
                                        Female
                                            488
    
    In this example, the query selected a specific member rather than both members of this dimension.
  4. Try this variation (with the member name in a different case):
    SELECT gend.gender.FEMALE ON 0 FROM demomdx
    This returns the same result as the preceding query.
  5. Enter a slightly more complex query:
    SELECT gend.h1.gender.MEMBERS ON 0,homed.h1.zip.MEMBERS ON 1 FROM demomdx
    The shell executes the query and displays the results like the following:
                                  Female                 Male
    1 32006                           105                  110
    2 32007                            58                   53
    3 34577                           173                  174
    4 36711                            41                   58
    5 38928                           111                  117
    
    In this case, the results contain multiple rows, one row for each patient ZIP code. The counts are shown for each ZIP code, by gender.
    If there are multiple rows of results, the MDX shell displays a column that indicates the row numbers of the results.
Axis Skipping
In other implementations of MDX, you cannot omit an axis if you use a higher-numbered axis. That is, you cannot use ROWS unless you also use COLUMNS.
In DeepSee MDX, however, if you omit COLUMNS, DeepSee uses %COUNT, as follows:
SELECT gend.h1.gender.MEMBERS ON ROWS FROM demomdx
 
1 Female                                488
2 Male                                  512
Sets
In MDX, the columns and the rows are axes of the query and of the result set. The following result set, for example, has gender on the column axis and home ZIP codes on the row axis:
                              Female                 Male
1 32006                           105                  110
2 32007                            58                   53
3 34577                           173                  174
4 36711                            41                   58
5 38928                           111                  117
An axis uses a set. The general syntax for a set expression is as follows:
{expression1, expression2, ...}
This list can include any number of items. In DeepSee MDX, if the list includes only one item, you can omit the curly braces. Also, a set can be empty, but if so cannot be used on a query axis.
Within the set, each expression can be one of the following:
There are other forms of expressions and other kinds of set elements; see the chapter Working With Sets and the DeepSee MDX Reference.
You can use any non-null set expression within a SELECT statement. In general, SELECT has the following basic syntax for a query that uses one axis:
SELECT set_expression ON COLUMNS FROM cubename
Or:
SELECT set_expression ON 0 FROM cubename
The following form is a query that uses two axes:
SELECT set_expression ON COLUMNS,set_expression ON ROWS FROM cubename
Or:
SELECT set_expression ON 0,set_expression ON 1 FROM cubename
A SELECT statement can use additional axes, but the shell does not display their results in a readable form.
Examples
Now try some query variations that use different kinds of sets, as shown in the preceding section.
  1. The following example uses a set created by a comma-separated list:
    SELECT {gend.h1.gender.MEMBERS,homed.h1.city.MEMBERS} ON 0 FROM demomdx
     
                       Female      Male   Cedar F   Centerv   Cypress   Elm Hei ...
                           488       512       110        99       112       118...
    
    As you can see, the results have too many columns to be shown in full.
  2. Try a variation that uses the same set as rows instead of columns:
    SELECT {gend.h1.gender.MEMBERS,homed.h1.city.MEMBERS} ON 1 FROM demomdx
     
     1 Female                               488
     2 Male                                 512
     3 Cedar Falls                          110
     4 Centerville                           99
     5 Cypress                              112
     6 Elm Heights                          118
     7 Juniper                              122
     8 Magnolia                             114
     9 Pine                                 121
    10 Redwood                              111
    11 Spruce                                93
    
  3. Let us expand the preceding by moving gender to the columns and adding home ZIP codes as another set of rows:
    SELECT gend.h1.gender.MEMBERS ON 0,{homed.h1.city.MEMBERS,homed.h1.zip.MEMBERS} ON 1 FROM demomdx
     
                                  Female                 Male
     1 Cedar Falls                     58                   52
     2 Centerville                     41                   58
     3 Cypress                         51                   61
     4 Elm Heights                     53                   65
     5 Juniper                         58                   64
     6 Magnolia                        58                   56
     7 Pine                            64                   57
     8 Redwood                         58                   53
     9 Spruce                          47                   46
    10 32006                          105                  110
    11 32007                           58                   53
    12 34577                          173                  174
    13 36711                           41                   58
    14 38928                          111                  117
    
  4. Try using a member multiple times within a set:
    SELECT gend.h1.gender.MEMBERS ON 0,{homed.h1.[36711],homed.h1.[36711]} ON 1 FROM demomdx
     
                                  Female                 Male
    1 36711                            41                   58
    2 36711                            41                   58
    
Displaying Measures
Any MDX query uses at least one measure. If you do not indicate the measure to use, DeepSee uses the default measure defined in the cube, usually %COUNT, which is a count of the records. There are multiple ways to display other measures. This section introduces a couple of them.
To use measures in queries, you can do the following:
Including a Simple Filter in the Query
An MDX query can also include a filter, which reduces the number of rows of the fact table that the query could potentially use. To add a filter to a query, add a clause like the following to the end of your SELECT statement:
WHERE filter_details
For filter_details, the simplest form is as follows:
[dim_name].[hier_name].[level_name].[member_name]
You can use the same variations here as described in Referring to Members and Measures,” earlier in this chapter.
This expression filters the query so that DeepSee accesses only the records associated with this member. For example, the following query uses only patients who have osteoporosis:
SELECT MEASURES.[%COUNT] ON 0,aged.[age bucket].MEMBERS ON 1 FROM demomdx WHERE diagd.osteoporosis

                                    %COUNT
1 0 to 9                                  *
2 10 to 19                                *
3 20 to 29                                *
4 30 to 39                                *
5 40 to 49                                *
6 50 to 59                                *
7 60 to 69                                7
8 70 to 79                                7
9 80+                                     8
The MDX shell uses an asterisk (*) to indicate that a value is null.
The chapter Filtering a Query discusses WHERE in more detail.
Understanding the Contents of the MDX Results
Now that you have seen a variety of MDX queries and their results, it is time to review the results more formally. The MDX shell presents the results for an MDX query in the following general form:
The following rules determine the results:
Notes on Independence of Query Axes
DeepSee considers each query axis independently of the others. Sometimes the result is counter-intuitive. This section shows two examples.
Set Order Is Unaffected by Other Sets in the Query
In all cases, it is important to remember that the order of the set returned is independent of any other sets used in the query, and sometimes the result is counter-intuitive. For example, consider the following query:
SELECT MEASURES.[%COUNT] ON 0, 
TOPCOUNT(homed.city.MEMBERS,100,MEASURES.[%COUNT]) ON 1 FROM demomdx
 
                                    %COUNT
1 Juniper                               122
2 Pine                                  121
3 Elm Heights                           118
4 Magnolia                              114
5 Cypress                               112
6 Redwood                               111
7 Cedar Falls                           110
8 Centerville                            99
9 Spruce                                 93
This query shows the sort order that you obtain when you sort cities by patient count. (In this example, the number of members to select is 100, which is greater than the number of members; therefore all members are shown.)
If you modify the preceding query to return the top three members, you see the following:
SELECT MEASURES.[%COUNT] ON 0, TOPCOUNT(homed.city.MEMBERS,3,MEASURES.[%COUNT]) ON 1 FROM demomdx
 
                                    %COUNT
1 Juniper                               122
2 Pine                                  121
3 Elm Heights                           118
Now consider the results when you break out the patients by gender:
SELECT CROSSJOIN(MEASURES.[%COUNT],gend.gender.MEMBERS) ON 0,
TOPCOUNT(homed.city.MEMBERS,3,Measures.[%COUNT]) ON 1 FROM demomdx

                              Female                 Male
1 Juniper                          58                   64
2 Pine                             64                   57
3 Elm Heights                      53                   65
The cities are listed in the same order in this query as in the preceding query, which did not specify a breakout for the columns. In this example, Juniper is the top-rated city by total patient count and so appears first. That is, the sorting is controlled by the total patient count in a city, not by any of the displayed values.
Set Membership Is Unaffected by Other Sets in the Query
It is also important to remember that the members of the returned set are independent of any sets used in the query, and sometimes the result is counter-intuitive. For example, consider the following query:
SELECT MEASURES.[%COUNT] ON 0, TAIL(birthd.year.MEMBERS,10) ON 1 FROM demomdx

                                    %COUNT
 1 1912                                   3
 2 1918                                   1
 3 1919                                   1
 4 1920                                   4
 5 1921                                   2
 6 1922                                   1
 7 1923                                   2
 8 1924                                   1
 9 1925                                   2
10 1927                                   5
Now consider the results when we show only a single gender:
SELECT CROSSJOIN(gend.male,MEASURES.[%COUNT]) ON 0, HEAD(birthd.year.MEMBERS,10) ON 1 FROM demomdx
 
                                    %COUNT
 1 1912                                   *
 2 1918                                   *
 3 1919                                   *
 4 1920                                   1
 5 1921                                   *
 6 1922                                   *
 7 1923                                   2
 8 1924                                   *
 9 1925                                   1
10 1927                                   1
The birth years are the same as in the preceding query, which shows data aggregated across genders.
DeepSee Name Resolution
In some cases, it is possible for multiple entities of the same type to have the same name. For example, an MDX cube can have two levels with the same name, as long as they are in different hierarchies (or possibly different dimensions). Suppose that the cube command showed the contents of a cube as follows:
...
Dimensions
  Geography
    ShipToHierarchy
      State
      City
    OrderByHierarchy
      State
      City
In DeepSee, you can omit the hierarchy name when you refer to a level. If the dimension contains multiple levels with the same name, DeepSee uses the first level with the given name. To refer to a level unambiguously, include the hierarchy name as well.
For another example, a level can have multiple members with the same name. Different states could have cities that have the same name, and those cities are different members. Or if your cube has a fine-grained level such as doctor name, that level could contain multiple members with the same name. In DeepSee, if you refer to the member by name, you access the first member of that name within the level. To refer to a member unambiguously, use its key. See Member Keys in the next chapter.
Nonexistent Members
In most cases, for a nonexistent member, DeepSee returns null, which the shell represents as an asterisk (*). For example, consider the following query:
SELECT colord.h1.color.pink ON 0 FROM demomdx
 
                                 No Member
                                          *
The exception is for measures, which are members of the MEASURES dimension. For a nonexistent measure, the system returns an error. For example:
SELECT MEASURES.[pat count] ON 0 FROM demomdx
 
ERROR #5001: Measure not found: pat count
Typographical Errors
In most situations, DeepSee treats typographical errors in the same way that it treats nonexistent members. For example:
SELECT colord.h1.color.MEMBERSSS ON 0 FROM demomdx
 
                                 No Member
                                          *
For another example:
SELECT colord.MEMBERSSS ON 0 FROM demomdx
 
                                 No Member
                                          *
When you refer to a dimension or an element within a dimension, however, the dimension name is required. If you mistype the dimension name, DeepSee treats that as an error:
SELECT colorddd.h1.color.MEMBERS ON 0 FROM demomdx
 
ERROR #5001: Dimension not found: colorddd
If you mistype the name of the cube or subject area, DeepSee treats that as an error:
SELECT colord.h1.color.MEMBERS ON 0 FROM demo

ERROR #5001: Cannot find Subject Area: 'DEMO'
Conventions Used in Remainder of the Book
For reasons of space, the rest of this book uses the following conventions:
Also, to help you quickly scan query examples: