DeepSee End User Guide
Using the Pivot Analysis Window

This chapter describes how to use the Pivot Analysis window, which you can access via the Analysis button on pivot table widgets. It covers the window itself and the cluster, distribution, regression, iKnow content, and iKnow entity analysis options.

Using the Pivot Analysis Window
DeepSee provides an analysis window that you can use for several specialized kinds of analysis. In each case, you first select one or more cells, and the analysis considers the lowest-level data associated with those cells. To access this window:
  1. Select the data cells in the row or rows.
    To select multiple cells, hold the Shift key down while selecting the cells.
    To select an entire row, select the row label on the left. To select an entire column, select the column header.
    The analysis option is not available for cells in a total row or a total column.
  2. Select the Analysis button.
    Depending on how the widget is configured, the system either provides a choice of analysis options or displays one of them without any choice.
  3. (If applicable) For Analysis Option, select Cluster Analysis, Distribution Analysis, or Regression Analysis.
    Or select iKnow Plugins and then select Content Analysis or Entity Analysis.
    The iKnow options are applicable only if your cube includes iKnow measures.
The following sections provide the details.
Note:
This window is also available in the Analyzer.
Cluster Analysis
For background on the Pivot Analysis window, see “Using the Pivot Analysis Window.”
Cluster analysis or clustering is the assignment of a set of observations into subsets (called clusters) so that observations in the same cluster are similar in some sense. Clustering is a method of unsupervised learning, and is a common technique for statistical data analysis used in many fields, including machine learning, data mining, pattern recognition, image analysis, information retrieval, and bioinformatics.
For a cluster analysis, the top area of the page looks something like the following:
To use this page:
  1. Specify details of the analysis in the top area of the page.
    Details are beyond the scope of this book; it is assumed that the reader is familiar with cluster analysis.
  2. Select Run.
The bottom area of the page displays the results. For example:
For general information on cluster analysis, see the Wikipedia page (http://en.wikipedia.org/wiki/Cluster_analysis). There are also many books available on this topic.
Also see the DataMining package in the SAMPLES database.
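As a rough illustration of what assigning observations to clusters means, the following minimal Python sketch runs k-means on a handful of made-up age values. It is not the algorithm or code that DeepSee uses, which this book does not describe; the function name k_means_1d, the sample values, and the starting centers are all assumptions chosen for illustration.

```python
def k_means_1d(values, centers, iterations=10):
    """Group numbers around the given starting centers (basic k-means)."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster.
        new_centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # Centers stopped moving, so the assignment is stable.
        centers = new_centers
    return centers, clusters


# Hypothetical measure values for the records behind the selected cells.
ages = [21, 23, 25, 44, 46, 48, 70, 72]
centers, clusters = k_means_1d(ages, centers=[20, 45, 75])
for center, members in zip(centers, clusters):
    print(f"center {center:.1f}: {members}")
```

Values that end up in the same cluster are closer to that cluster's center than to any other center, which is one concrete sense of "similar" for numeric data.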
Distribution Analysis
For background on the Pivot Analysis window, see “Using the Pivot Analysis Window.”
A distribution analysis shows the number of occurrences of each value of a specific measure across a set of records.
For a distribution analysis, the system displays something like the following:
To use this page, select a measure (such as Age in this example).
The horizontal axis then shows all the values of this measure for the selected cells. For each measure value, the vertical axis shows the number of source records (for the selected cells) that have that value.
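The counting behind such a chart can be sketched in a few lines of Python. This is only an illustration of the idea, not DeepSee code; the record layout, the PatientID field, and the distribution function are assumptions made for the example.

```python
from collections import Counter

# Hypothetical source records behind the selected cells.
records = [
    {"PatientID": 1, "Age": 34},
    {"PatientID": 2, "Age": 41},
    {"PatientID": 3, "Age": 34},
    {"PatientID": 4, "Age": 29},
    {"PatientID": 5, "Age": 41},
    {"PatientID": 6, "Age": 34},
    {"PatientID": 7, "Age": None},  # No value for the measure, so not counted.
]


def distribution(records, measure):
    """Count how many records have each distinct value of the measure."""
    counts = Counter(
        r[measure] for r in records if r.get(measure) is not None
    )
    return dict(sorted(counts.items()))


age_distribution = distribution(records, "Age")
# Horizontal axis: measure value. Vertical axis: number of records.
for value, count in age_distribution.items():
    print(f"{value:>3} | {'#' * count} ({count})")
```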
The top of the page shows the following values:
Regression Analysis
For background on the Pivot Analysis window, see “Using the Pivot Analysis Window.”
A regression analysis attempts to determine the relationship between independent variables and a dependent variable. (The DeepSee regression analysis considers only one independent variable.)
For a regression analysis, the system displays something like the following:
To use this page, specify the following details:
In each case, the system determines the details of the equation that predicts Y as a function of X, and it plots the predicted X-Y curve on the page. The area above the chart displays details of the fit. For example, for a linear regression:
For another example, the page displays the following details for an exponential regression:
For general information on regression analysis, see the Wikipedia page (http://en.wikipedia.org/wiki/Regression_analysis). There are also many books available on this topic.
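To make the idea of fitting an equation that predicts Y from a single X more concrete, the following Python sketch fits a line by ordinary least squares and fits an exponential curve by applying the same linear fit to ln(Y). It is a textbook approach shown under stated assumptions, not necessarily the method DeepSee uses; the function names and the sample data are invented for the example.

```python
import math


def linear_fit(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(ys, predicted):
    """Fraction of the variance in ys explained by the predictions."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def exponential_fit(xs, ys):
    """Fit y = a * exp(b * x) via a linear fit of ln(y) on x (needs y > 0)."""
    b, ln_a = linear_fit(xs, [math.log(y) for y in ys])
    return math.exp(ln_a), b


# Hypothetical (X, Y) pairs taken from the selected records.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = linear_fit(xs, ys)
r2 = r_squared(ys, [slope * x + intercept for x in xs])
print(f"linear: y = {slope:.2f} * x + {intercept:.2f}, R^2 = {r2:.4f}")

# Data generated from a known exponential curve, to check the second fit.
a, b = exponential_fit(xs, [3 * math.exp(0.5 * x) for x in xs])
print(f"exponential: y = {a:.2f} * exp({b:.2f} * x)")
```

The log transform keeps the exponential case as simple as the linear one, at the cost of requiring strictly positive Y values.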
iKnow Content Analysis
For background on the Pivot Analysis window, see “Using the Pivot Analysis Window.”
The iKnow content analysis option displays information about the most typical and least typical unstructured text values. This analysis option is applicable only if your cube includes iKnow measures.
For the iKnow content analysis, the system displays something like the following:
The Most typical facts section lists the records that have the most typical content, as determined by the iKnow engine. The Most breaking facts section lists the records that have the most breaking content. Breaking content is content from a source whose dominant entities are least similar to the dominant elements of the group of sources. For more information, see the chapter “Dominance and Proximity” in Using iKnow.
Each table lists the records and displays the fields listed in the selected detail listing.
On this page, you can do the following:
This example uses the Aviation Events demo in the SAMPLES namespace. For reasons of space, this demo is not initialized when you install Caché. To set up the demo, open the Terminal and enter the following commands:
 zn "SAMPLES"
 do ##class(Aviation.Utils).Setup()
iKnow Entity Analysis
For background on the Pivot Analysis window, see “Using the Pivot Analysis Window.”
The iKnow entity analysis option displays information about the entities in your unstructured text values. This analysis option is applicable only if your cube includes iKnow measures.
For the iKnow entity analysis, the system displays something like the following:
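This page provides three tabs, which display information about the records associated with the pivot table cells that you selected before you selected the Analysis button. These tabs work together: the Overview tab shows the top entities, selecting a rectangle there opens the Cell breakdown tab for that entity, and selecting a bar there opens the Entities tab with the associated details.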
This example uses the Aviation Events cube demo in the SAMPLES namespace. For reasons of space, this demo is not initialized when you install Caché. For information on setting it up, see the previous topic.
Overview Tab
The Overview tab displays information about the top entities among the records associated with the pivot table cells that you selected before you selected the Analysis button. The chart on this tab displays one rectangle for each of the top 20 entities, according to your choice of metric.
Selecting the Top Entities
To determine how the system selects the top entities, select a metric from the Select by drop-down list. The options are as follows:
Note:
The BM25 and TFIDF options are computation-intensive and can take some time to complete.
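To give a sense of why such metrics take longer to compute, the following Python sketch calculates the textbook TF-IDF score, which needs counts across every source before any single score is known. It is an assumption-laden illustration, not the iKnow engine's implementation: the tf_idf function, the exact formula variant, and the tiny single-word "entities" are all invented for the example.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return, per document, a dict of term -> TF-IDF score (textbook form)."""
    n_docs = len(documents)
    # Document frequency: the number of documents in which each term occurs.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    scores = []
    for doc in documents:
        term_counts = Counter(doc)
        scores.append({
            # Term frequency in this document times inverse document frequency.
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in term_counts.items()
        })
    return scores


# Hypothetical sources, each already reduced to a list of entities.
documents = [
    ["airplane", "runway", "pilot"],
    ["helicopter", "ground", "pilot"],
    ["airplane", "runway", "pilot", "runway"],
]
scores = tf_idf(documents)
for i, doc_scores in enumerate(scores):
    top = max(doc_scores, key=doc_scores.get)
    print(f"source {i}: highest-scoring entity is {top!r}")
```

In this sketch an entity that appears in every source, such as pilot, scores zero, because it does nothing to distinguish one source from another.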
Color Coding
The colors in this chart indicate how well each entity serves as an indicator for a given pivot table cell, based on the Naive Bayes probability for that entity, as follows:
Green generally denotes good indicators, and red denotes bad indicators. Solid green means that the term is a very good indicator, pale green means that the term is a good indicator, pale red means that it is a bad indicator, and solid red means that it is a very bad indicator. Note that in some cases, an entity is a bad indicator because it is common to all sources and thus does not enable you to discriminate categories of sources.
For example, suppose that we use a pivot table that displays aircraft type as rows. As the starting point for analysis, we select the pivot table cells Airplane and Helicopter. If we use Airplane to color-code, this chart looks as follows:
Notice that the entities airplane and runway are good indicators for the pivot table cell Airplane. The other entities (especially ground) are bad indicators.
In contrast, if we use Helicopter to color-code, the chart would be colored as follows:
This shows that the entities airplane and runway are bad indicators for the pivot table cell Helicopter. The other entities (especially ground) are good indicators.
Other Options
On the Overview tab, you can also do the following:
Cell Breakdown Tab
The Cell breakdown tab is useful only if you started by selecting multiple cells of a pivot table.
This tab shows how an entity is distributed among the pivot table cells from which you started. When you select a rectangle in the Overview tab, the system displays the Cell breakdown tab with information for that entity.
The Cell breakdown tab looks like this:
For the given entity, this tab shows how that entity is distributed among the pivot table cells that you selected before you selected the Analysis button. For each cell, the chart shows five color-coded bars, which display the following series:
Important:
These series are scaled separately so the chart can show them all. The scale shown on the chart applies only to the entities series. To see the actual numeric value for any bar, hover the cursor over the bar. For example:
On this tab, you can do the following:
Entities Tab
The Entities tab displays related entities in the given pivot table cell or cells. When you select a bar in the Cell breakdown tab, the system displays the Entities tab with the associated details.
The Entities tab looks like this:
On this tab, you can do the following: