Skip to main content
************* PRERELEASE CONTENT *************
Previous section

PROBABILITY

A function that applies a specified trained model to return the probability that the specified value is true for each input value provided.

Synopsis

PROBABILITY ( model-name FOR label-value)

Or

PROBABILITY ( model-name USE  trained-model-name FOR label-value )

Or

PROBABILITY ( model-name FOR label-value WITH  feature-columns-clause )

Or

PROBABILITY ( model-name USE  trained-model-name FOR label-value  WITH feature-columns-clause ] )

Arguments

model-name The name of the trained model.
FOR label-value The output value. See details below.
USE trained-model-name Optional — The name of a non-default trained model. See details below.
WITH feature-columns-clause Optional — The specific columns to provide as input for your trained model. See details below.

Description

The PROBABILITY function applies a given model to a given table, returning the probability that, for each row in the table, the model would predict the specified value. This probability is returned as a value from 0 to 1. This function can only be used with classification models (not regression models).

FOR

FOR provides the output value that PROBABILITY finds the probability of.

For example:

SELECT * FROM flower_dataset WHERE PROBABILITY(iris_flower FOR 'iris-setosa') > 0.6
Copy code to clipboard

Uses the iris_flower model to return each row in flower_dataset where the probability of the result being “iris-setosa” is greater than 0.6.

Omitting FOR implies a value of 1. For example:

SELECT PROBABILITY(IsSpam) FROM email_data
Copy code to clipboard

Implicitly forms this query:

SELECT PROBABILITY(IsSpam FOR 1) FROM email_data
Copy code to clipboard

When the value provided for FOR is invalid for the specified trained model, there is a SQLCODE –400 error with the following message:

[%msg: <PREDICT execution error: ERROR #2853: Specified positive label value not found in the dataset.>]

USE

If a trained model is not explicitly named by USE, PROBABILITY uses the default trained model for the specified model definition.

For example, if multiple models are trained:

CREATE MODEL MyModel PREDICTING( label ) FROM data
TRAIN MODEL MyModel AS FirstModel
TRAIN MODEL MyModel AS SecondModel NOT DEFAULT
Copy code to clipboard

FirstModel is the default model for MyModel. This means that PROBABILITY queries would use FirstModel for predictions. To specify use of SecondModel:

PROBABILITY( MyModel FOR label-value USE SecondModel)
Copy code to clipboard

WITH

PROBABILITY is a smart function, mapping the feature columns of the specified dataset to those in the model implicitly when there is no WITH clause. You can use a WITH clause to specify the mapping of columns between the dataset and your model. For example:

SELECT PROBABILITY(iris_flower FOR 'iris-setosa' WITH petal_length = length_petal) FROM flower_dataset
Copy code to clipboard

This query matches the petal_length column from the iris_flower model to the length_petal column from flower_dataset.

Required Security Privileges

Calling PROBABILITY requires %USE_MODEL privileges; otherwise, there is a SQLCODE –99 error (Privilege Violation). To assign %USE_MODEL privileges, use the GRANT command.

Examples

CREATE MODEL PatientReadmission PREDICTING ( IsReadmitted ) FROM patient_data
TRAIN MODEL PatientReadmission
SELECT * FROM new_patient_data WHERE PROBABILITY( PatientReadmission FOR 1) > 0.8
Copy code to clipboard

See Also