Skip to main content

This documentation is for an older version of this product. See the latest version of this content.Opens in a new tab

CREATE MODEL (SQL)

Creates a model definition.

Synopsis

CREATE MODEL [ IF NOT EXISTS ] model-name PREDICTING ( label-column )
      FROM model-source [ USING json-object ]
CREATE MODEL [ IF NOT EXISTS ] model-name PREDICTING ( label-column )
      WITH feature-column-clause [ USING json-object ]
CREATE MODEL [ IF NOT EXISTS ] model-name PREDICTING ( label-column )
      WITH feature-column-clause FROM model-source [ USING json-object ]

Arguments

This synopsis shows the valid forms of CREATE MODEL. The CREATE MODEL command must have either a FROM or WITH clause (or both).

model-name The name for the model definition being created. A valid identifier, subject to the same additional naming restrictions as a table name. A model name is unqualified (modelname). An unqualified model name takes the default schema name.
PREDICTING ( label-column ) The name of the column being predicted, aka, the label column. A standard identifier. See details below.
WITH feature-column-clause Inputs to the model, aka the feature columns, as either the name of a column and it’s datatype or as a comma-separated list of the names of columns and datatypes. Each column name is a standard identifier.
FROM model-source The table or view from which the model is being built. This can be a table, view, or results of a join.
USING json-object-string Optional — A JSON string specifying one or more key-value pairs. See more details below.

Description

The CREATE MODEL command creates a model definition of the structure specified. This includes, at a minimum:

  • The model name

  • The label column

  • The feature column(s)

Predicting

You must specify the output column (or label column) that your model predicts, given the input columns (or feature columns). For example, if you are designing a SpamFilter model which identifies emails that are spam mail, you may have a label column named IsSpam, which is a boolean value designating whether a given email is spam or not. You can also specify the data type of this column; otherwise, IntegratedML infers the type:

CREATE MODEL SpamFilter PREDICTING (IsSpam) FROM EmailData
CREATE MODEL SpamFilter PREDICTING (IsSpam binary) FROM EmailData

WITH and FROM

A model definition must contain a WITH and/or FROM to specify the schema characteristics of the model.

WITH

Using WITH, you can specify which input columns (features) to include in your model definition. Note that you must specify the data type of each column, even when using a FROM clause in your statement:

CREATE MODEL SpamFilter PREDICTING (IsSpam) WITH (email_length int, subject_title varchar)
CREATE MODEL SpamFilter PREDICTING (IsSpam) WITH (email_length int, subject_title varchar) FROM EmailData

FROM

FROM allows you to use every single column from a specified table or view, without having to identify each column individually:

CREATE MODEL SpamFilter PREDICTING (IsSpam) FROM EmailData

This clause is fully general, and can specify any subquery expression. IntegratedML infers the data types of each column. By using FROM, you supply a default data set for future TRAIN MODEL statements using this model definition. You can use FROM along with WITH to both supply a default data set and to explicitly name feature columns.

Without a WITH clause, IntegratedML infers the data types of each column, and implicitly uses the result of the FROM clause as if it were the following query:

SELECT * FROM model-source

USING

You can specify a default USING clause for your model definition. This clause accepts a JSON string with one or more key-value pairs. When TRAIN MODEL is executed, by default the USING clause of the model definition is used. All parameters specified in the USING clause of your ML configuration overwrite those same parameters in the USING clause of your model definition.

You must make sure that the parameters you specify are recognized by the provider you select. Failing to do so may result in an error when training.

Required Security Privileges

Calling CREATE MODEL requires %MANAGE_MODEL privileges; otherwise, there is a SQLCODE –99 error (Privilege Violation). To assign %MANAGE_MODEL privileges, use the GRANT command.

Model Naming Conventions

Model names follow identifier conventions, subject to the restrictions below. By default, model names are simple identifiers. A model name should not exceed 256 characters. Model names are not case-sensitive.

InterSystems IRIS uses the model name to generate a corresponding class name. A class name contains only alphanumeric characters (letters and numbers) and must be unique within the first 96 characters. To generate this class name, InterSystems IRIS first strips punctuation characters from the model name, and then generates an identifier that is unique within the first 96 characters, substituting an integer (beginning with 0) for the final character when needed to create a unique class name. InterSystems IRIS generates a unique class name from a valid model name, but this name generation imposes the following restrictions on the naming of models:

  • A model name must include at least one letter. Either the first character of the view name or the first character after initial punctuation characters must be a letter

  • InterSystems IRIS supports 16-bit (wide) characters for model names. A character is a valid letter if it passes the $ZNAME test.

  • If the first character of the model name is a punctuation character, the second character cannot be a number. This results in an SQLCODE -400 error, with a %msg value of “ERROR #5053: Class name 'schema.name' is invalid” (without the punctuation character). For example, specifying the model name %7A generates the %msg “ERROR #5053: Class name 'User.7A' is invalid”.

  • Because generated class names do not include punctuation characters, it is not advisable (though possible) to create a model name that differs from an existing model name only in its punctuation characters. In this case, InterSystems IRIS substitutes an integer (beginning with 0) for the final character of the name to create a unique class name.

  • A model name may be much longer than 96 characters, but model names that differ in their first 96 alphanumeric characters are much easier to work with.

A model name can only be unqualified. An unqualified model name (viewname ) takes the system-wide default schema name.

Examples

CREATE MODEL PatientReadmit PREDICTING (IsReadmitted) FROM patient_table USING {"seed": 3}
CREATE MODEL PatientReadmit PREDICTING (IsReadmitted) WITH (age, gender, encounter_type, admit_reason, starttime, endtime, prior_visits, diagnosis, comorbitities)

See Also

FeedbackOpens in a new tab