Using PMML Models with InterSystems Products
This article discusses how to use the InterSystems IRIS Data Platform™ runtime support for PMML (Predictive Model Markup Language).
PMML (Predictive Model Markup Language) is an XML-based standard that expresses analytics models. It provides a way for applications to define statistical and data mining models so that they can be easily reused and shared. The standard is particularly helpful because the analytics tools used to generate models (tools such as R, KNIME, SAS, and SPSS) are very different in architecture from the tools used in an InterSystems IRIS or production environment.
In a typical scenario, data scientists use an analytical tool to produce a data mining model based on large amounts of historical data, which is then exported to PMML. The model can then be deployed in a runtime environment and executed on incoming observations, predicting values for the model’s target metrics.
InterSystems IRIS provides runtime support for PMML 4.1 as follows:
You define the PMML models in a class that extends %DeepSee.PMML.Definition. When that class is compiled, the system generates the code needed to execute the model or models described in it.
InterSystems IRIS provides an API for executing the models, based on the data input that you provide.
InterSystems IRIS provides a sample test page that uses the API.
InterSystems IRIS supports PMML 4.1 and the following PMML models:
This sample includes a copy of the Iris data set, a well-known sample used in predictive analytics. The Iris data set provides petal and sepal measurements for approximately 50 flowers from each of three iris species. These measurements are strongly predictive of the species.
Once you have set up the sample, you can use the PMML models in the DataMining.PMML.Iris class. This class contains a PMML definition that includes the following models:
A tree model that predicts the iris species, based on petal and sepal measurements
A general regression model that predicts the sepal length, based on the sepal width, petal measurements, and species
To create a class that contains PMML models:
For Class name, type a fully qualified class name.
XMLNamespace = "http://www.intersystems.com/deepsee/pmml"
Class DataMining.PMML.Iris Extends %DeepSee.PMML.Definition
XData PMML [ XMLNamespace = "http://www.intersystems.com/deepsee/pmml" ]
<X-SQLDataSource name="Analysis dataset">
<X-FieldMap fieldName="PetalLength" spec="PetalLength" />
<X-FieldMap fieldName="PetalWidth" spec="PetalWidth" />
<X-FieldMap fieldName="SepalLength" spec="SepalLength" />
<X-FieldMap fieldName="SepalWidth" spec="SepalWidth" />
<X-FieldMap fieldName="Species" spec="Species" />
<X-SQL>SELECT PetalLength, PetalWidth, SepalLength, SepalWidth, UPPER(Species) Species
<DataField name="PetalLength" displayName="PetalLength" optype="continuous" dataType="double" />
<DataField name="PetalWidth" displayName="PetalWidth" optype="continuous" dataType="double" />
<DataField name="SepalLength" displayName="SepalLength" optype="continuous" dataType="double" />
<DataField name="SepalWidth" displayName="SepalWidth" optype="continuous" dataType="double" />
<DataField name="Species" displayName="Species" optype="categorical" dataType="string">
<Value value="IRIS-SETOSA" property="valid" />
<Value value="IRIS-VERSICOLOR" property="valid" />
<Value value="IRIS-VIRGINICA" property="valid" />
InterSystems IRIS uses these classes to execute the model or models.
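To get a feel for how the data dictionary in a definition like the one above is structured, it can help to read it with a generic XML parser. The following Python sketch is an illustration only and runs outside InterSystems IRIS. It uses a shortened, hand-completed copy of the data fields: the enclosing <DataDictionary> element and the closing tags are assumptions (the listing above is abbreviated), XML namespaces are omitted for brevity, and the helper name list_data_fields is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-completed fragment for illustration. The enclosing <DataDictionary>
# element and the closing tags are assumptions, not copied from the class.
FRAGMENT = """
<DataDictionary>
  <DataField name="PetalLength" displayName="PetalLength" optype="continuous" dataType="double" />
  <DataField name="Species" displayName="Species" optype="categorical" dataType="string">
    <Value value="IRIS-SETOSA" property="valid" />
    <Value value="IRIS-VERSICOLOR" property="valid" />
    <Value value="IRIS-VIRGINICA" property="valid" />
  </DataField>
</DataDictionary>
"""


def list_data_fields(xml_text):
    """Return {field name: (optype, dataType, [valid values])}."""
    root = ET.fromstring(xml_text)
    fields = {}
    for field in root.iter("DataField"):
        # In PMML, a <Value> with no property attribute is treated as valid.
        valid = [v.get("value") for v in field.findall("Value")
                 if v.get("property", "valid") == "valid"]
        fields[field.get("name")] = (field.get("optype"), field.get("dataType"), valid)
    return fields


if __name__ == "__main__":
    for name, info in list_data_fields(FRAGMENT).items():
        print(name, info)
```

Continuous fields such as PetalLength carry no enumerated values, while the categorical Species field lists the values that the models accept.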
Open the Management Portal.
Switch to the appropriate namespace.
The system then displays a page like the following (partially shown):
, select a model class, and then click OK
Click a model from the Model drop-down list.
Click an option from the Data source drop-down list. Options include:
InterSystems IRIS iterates through the records and then displays a summary of the results. The details depend upon the model. The following shows an example:
You can also test the model with a single input record. To do so, click Test, which displays a dialog box like the following:
The fields listed in Data object correspond to the data fields in your PMML definition.
To use this page, select a model from the Model drop-down list. The model determines which fields are input fields and which are output fields. Then enter values into the input fields. When you have entered all the input values, the page displays the predicted value for the output field for the given model. For example:
InterSystems IRIS also provides an API that you can use to execute PMML models.
To run a predictive model for a single record:
Create an instance of the generated class PackageName.ClassName.Data and set its properties. The purpose of this instance is to contain the input values.
Method %ExecuteModel(ByRef pData As %DeepSee.PMML.Data,
Output pOutput As %DeepSee.PMML.ModelOutput) as %Status
For pData, use the data object that you created in step 2.
To see the details for the output, use ZWRITE. The pOutput object includes one property for each <OutputField> in the <Output> element of the model definition. If there is no <Output> element, pOutput includes a single field named after the predicted <MiningField> element.
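The rule for which properties appear on pOutput can be pictured as a small function over the model's XML. The Python sketch below is an illustration only, not InterSystems code. The helper name output_field_names, the sample fragments, and the output field names PredictedSpecies and Probability are hypothetical; the fragments omit XML namespaces for brevity; and the sketch assumes that the predicted field is marked with usageType="predicted", as in PMML 4.1.

```python
import xml.etree.ElementTree as ET


def output_field_names(model_xml):
    """Return the field names expected on the output object for one model."""
    model = ET.fromstring(model_xml)
    output = model.find("Output")
    if output is not None:
        # One entry per <OutputField> in the <Output> element.
        return [f.get("name") for f in output.findall("OutputField")]
    # No <Output> element: fall back to the predicted <MiningField>.
    return [m.get("name") for m in model.iter("MiningField")
            if m.get("usageType") == "predicted"]


# Hypothetical model fragments, written for this example.
WITH_OUTPUT = """
<TreeModel functionName="classification">
  <MiningSchema>
    <MiningField name="PetalLength" />
    <MiningField name="Species" usageType="predicted" />
  </MiningSchema>
  <Output>
    <OutputField name="PredictedSpecies" feature="predictedValue" />
    <OutputField name="Probability" feature="probability" />
  </Output>
</TreeModel>
"""

WITHOUT_OUTPUT = """
<TreeModel functionName="classification">
  <MiningSchema>
    <MiningField name="PetalLength" />
    <MiningField name="Species" usageType="predicted" />
  </MiningSchema>
</TreeModel>
"""

if __name__ == "__main__":
    print(output_field_names(WITH_OUTPUT))
    print(output_field_names(WITHOUT_OUTPUT))
```

With the first fragment, the names come from the <OutputField> elements. With the second, the only name is that of the predicted mining field.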
is the quoted name of an output field of an InterSystems IRIS PMML model class.
is the optional number of a series (row) in the plug-in. Specify 1 or omit this argument.
is the quoted name of an InterSystems IRIS PMML model class.
Specifies the cube on which this KPI is executed.
Specifies the name of the model to execute. If specified, this must be a model in the given model class. If left blank, the first model in the class will be executed.
Note that not all aggregations might make sense for each output field.
Specifies whether or not to include null predictions when aggregating results. Available values are "ignore" (the default) and "count".
The order in which you list the parameters does not affect the results.
You can specify up to 16 parameters and their values.
You can include the special %CONTEXT parameter to cause the plug-in to consider the context of the query, which is otherwise ignored. For details, see the reference for the %KPI function in the InterSystems MDX Reference.
For example, use the following syntax to get the average value for the output field MyField for a PMML model class named Test.MyModel, which contains only a single model:
To include record-level predictions in a detail listing, you can use the $$$PMML token in the listing query. This token takes the PMML definition class name and the model name as its primary parameters. As an optional third argument, you can pass the name of the predicted feature you wish to include in the query (this argument defaults to "predictedValue").
The following shows the definition of a listing query that uses this token:
UserID, TotalWagered, PercentLost "Lost %" , $$$PMML[MyPMML.Poker,PercentLost] "Predicted Loss %"
After you run a predictive model with a batch of input records, you can export the results to a cube. This option enables you to visualize the results in a different way. The cube contains two levels: ActualValue and PredictedValue.
To export the results to a cube, use the PMML test page and click Export. InterSystems IRIS prompts you for the following information:
Result class name
Specify the name of the persistent class to which the results are written. This is used as the source class for the cube.
Link to source class
Specify the class that contains the source records. The result class includes a property named Record that points to this class.
Select this if you want to empty the result class (Result class name) before performing the export. Clear this if you want to append the newly exported data to the end of the result class table.
Specify the logical name of the cube.
Select this if you have performed this export earlier and now want to overwrite the classes with new data and definitions.
The system then displays the Build Cube dialog box, where you can build the given cube. To build it now, click Build. You can also access this cube later via the Architect and build it there.
After you build the cube, use the Analyzer to examine it. The following shows an example. The ActualValue level is used as rows and the PredictedValue level is used as columns:
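A pivot with ActualValue on rows and PredictedValue on columns is, in effect, a confusion matrix: each cell counts the records that have that combination of actual and predicted value. As a rough illustration outside InterSystems IRIS, the following Python sketch builds such a table from (actual, predicted) pairs. The sample pairs and the helper name cross_tabulate are made up for this example and are not results from the Iris models.

```python
from collections import Counter


def cross_tabulate(pairs):
    """Count records per (actual, predicted) combination.

    Returns a nested dict: table[actual][predicted] -> record count.
    """
    counts = Counter(pairs)
    actual_values = sorted({actual for actual, _ in pairs})
    predicted_values = sorted({predicted for _, predicted in pairs})
    # Rows are actual values, columns are predicted values.
    return {a: {p: counts[(a, p)] for p in predicted_values}
            for a in actual_values}


# Made-up (actual, predicted) pairs, for illustration only.
SAMPLE = [
    ("IRIS-SETOSA", "IRIS-SETOSA"),
    ("IRIS-SETOSA", "IRIS-SETOSA"),
    ("IRIS-VERSICOLOR", "IRIS-VERSICOLOR"),
    ("IRIS-VERSICOLOR", "IRIS-VIRGINICA"),
    ("IRIS-VIRGINICA", "IRIS-VIRGINICA"),
]

if __name__ == "__main__":
    for actual, row in cross_tabulate(SAMPLE).items():
        print(actual, row)
```

Cells on the diagonal (actual equals predicted) count correct predictions; off-diagonal cells show which values the model confuses with each other.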