This is documentation for Caché & Ensemble.



abstract class %DeepSee.PMML.Dataset extends %Library.RegisteredObject

A Dataset is a wrapper for a collection of records that can be analyzed in order to build or run a model. Implementations abstracting different data sources can be found in the %DeepSee.PMML.Dataset package.

property Fields as array of %DeepSee.PMML.Dataset.Field;
property IdField as %DeepSee.PMML.Dataset.Field;
property Name as %String (MAXLEN = 200);
method Clear() as %Status
Clears all temporary structures created by this object. The dataset should remain usable after this method is called.
abstract method Get1DDistribution(pField As %String, Output pDistribution, ByRef pFilters) as %Status
Returns an array describing the distribution of values for the categorical field pField.
Accepts:
    pFilters(n) = $lb(field, operator, key)
Returns:
    pDistribution("total") = tTotalCount
    pDistribution(n) = $lb(value, count)
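As a minimal sketch of how the pDistribution array described above might be consumed, the following class method walks the numbered entries and prints each value with its count. The method name PrintDistribution and its host class are hypothetical, pDataset is assumed to be an instance of some concrete subclass, and no filters are applied because the set of supported operators is not listed here.

```objectscript
/// Hypothetical helper, not part of %DeepSee.PMML.Dataset.
ClassMethod PrintDistribution(pDataset As %DeepSee.PMML.Dataset, pField As %String) As %Status
{
    // tFilters is left undefined, so no filtering is requested
    set tSC = pDataset.Get1DDistribution(pField, .tDistribution, .tFilters)
    quit:$$$ISERR(tSC) tSC

    set tTotal = $get(tDistribution("total"))
    set n = ""
    for {
        // three-argument $order also fetches the node value into tEntry
        set n = $order(tDistribution(n), 1, tEntry)
        quit:n=""
        continue:n="total"   // skip the summary node
        // tEntry = $lb(value, count); $listget tolerates an empty element
        write !, $listget(tEntry, 1), ": ", $listget(tEntry, 2), " of ", tTotal
    }
    quit $$$OK
}
```

$listget is used instead of $list so that an entry whose value element is empty does not raise an error.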
abstract method GetAggregatesByCategory(pContField As %String, pCatField As %String, Output pAggregates, ByRef pFilters) as %Status
Returns an array listing aggregate values for a continuous field pContField for each value of a categorical field pCatField.
Accepts:
    pFilters(n) = $lb(field, operator, key)
Returns:
    pAggregates("total") = tTotalCount
    pAggregates(n) = $lb(category value, count, average, sum, max, min, countNonNull)
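The seven-element list in each pAggregates(n) node is easy to misread by position, so a short sketch of unpacking it may help. PrintAggregates is a hypothetical helper name, pDataset is assumed to be an instance of a concrete subclass, and the element positions are taken from the description above.

```objectscript
/// Hypothetical helper, not part of %DeepSee.PMML.Dataset.
ClassMethod PrintAggregates(pDataset As %DeepSee.PMML.Dataset, pContField As %String, pCatField As %String) As %Status
{
    set tSC = pDataset.GetAggregatesByCategory(pContField, pCatField, .tAggregates, .tFilters)
    quit:$$$ISERR(tSC) tSC

    set n = ""
    for {
        set n = $order(tAggregates(n), 1, tEntry)
        quit:n=""
        continue:n="total"   // skip the summary node
        // Positions: 1 category value, 2 count, 3 average, 4 sum,
        //            5 max, 6 min, 7 countNonNull
        write !, $listget(tEntry, 1), ": avg=", $listget(tEntry, 3)
        write " min=", $listget(tEntry, 6), " max=", $listget(tEntry, 5)
        write " (", $listget(tEntry, 7), " non-null of ", $listget(tEntry, 2), ")"
    }
    write !, "Total records: ", $get(tAggregates("total"))
    quit $$$OK
}
```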
method GetFieldBySpec(pFieldSpec As %String) as %DeepSee.PMML.Dataset.Field
abstract method GetRecordIds(Output pIds, ByRef pFilters) as %Status
Returns the IDs of the records in this dataset.
Accepts:
    pFilters(n) = $lb(field, operator, key)
Returns:
    pIds(n) = rowid
abstract method GetValueCount(pField As %String, pIncludeNull As %Boolean = 1, ByRef pFilters, Output pSC As %Status) as %Integer
Returns the number of distinct values for the categorical field pField.
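Unlike most methods of this class, GetValueCount() returns its result directly and reports errors through the pSC output argument, so the status has to be checked separately. The sketch below shows one way to do that. CountDistinct is a hypothetical helper name, and the reading of pIncludeNull = 0 as "do not count the null value" is an assumption based on the parameter name.

```objectscript
/// Hypothetical helper, not part of %DeepSee.PMML.Dataset.
ClassMethod CountDistinct(pDataset As %DeepSee.PMML.Dataset, pField As %String, Output pSC As %Status) As %Integer
{
    set pSC = $$$OK
    // Second argument is pIncludeNull; 0 is assumed to exclude the null value
    set tCount = pDataset.GetValueCount(pField, 0, .tFilters, .pSC)
    // Return an empty string when the call reported an error
    quit:$$$ISERR(pSC) ""
    quit tCount
}
```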
abstract method GetXDDistribution(pFields As %List, Output pDistribution, ByRef pFilters) as %Status
Returns an array describing the joint distribution of values across the fields listed in pFields.
Accepts:
    pFilters(n) = $lb(field, operator, key)
Returns:
    pDistribution = $lb(dim1Count, dim2Count, ...)
    pDistribution("value", dim, i) = value
    pDistribution(i, j, ...) = tCount
    pDistribution("total", dim, i) = tDimTotal
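For the two-field case, the structure above amounts to a cross-tabulation, which the following sketch prints row by row. PrintCrossTab is a hypothetical helper name. The sketch assumes that pFields holds field names, that the indices i and j run from 1 to the corresponding dimension count, and that a missing pDistribution(i, j) node means a count of zero.

```objectscript
/// Hypothetical helper, not part of %DeepSee.PMML.Dataset.
ClassMethod PrintCrossTab(pDataset As %DeepSee.PMML.Dataset, pRowField As %String, pColField As %String) As %Status
{
    set tSC = pDataset.GetXDDistribution($lb(pRowField, pColField), .tDist, .tFilters)
    quit:$$$ISERR(tSC) tSC

    // Top node holds the number of distinct values per dimension
    set tRows = $listget(tDist, 1), tCols = $listget(tDist, 2)
    for i = 1:1:tRows {
        write !, $get(tDist("value", 1, i)), " (total ", $get(tDist("total", 1, i)), "):"
        for j = 1:1:tCols {
            // Assumption: an absent cell is treated as zero
            write " ", $get(tDist("value", 2, j)), "=", $get(tDist(i, j), 0)
        }
    }
    quit $$$OK
}
```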
method HasField(pFieldName As %String, Output pSC As %String) as %Boolean

