This is documentation for Caché & Ensemble.

class %iKnow.Source.Converter.TextTransformation extends %iKnow.Source.Converter

This %iKnow.Source.Converter implementation wraps a Text Transformation model and extracts sections and key-value pairs as defined in the model. Selected sections are concatenated and used as text input for indexing by the iKnow engine, while selected key-value pairs can be saved as metadata values.

Converter parameters:

  1. Model class name (%String): name of the %iKnow.TextTransformation.Definition class containing the TT model definition. This parameter is required.
  2. Section headers to index (%String, default = ""): comma-separated list of section headers whose contents are to be indexed. Leaving this parameter blank (default) causes all sections to be indexed. Header names are case-insensitive.
  3. Include headers in sections (%Boolean, default = 0): whether to index each section's header along with its contents. Setting this value to 1 ensures that section contents are always prepended with the header (title).
  4. Keys to extract for metadata (%String, default = ""): comma-separated list of keys extracted by the model that need to be saved as metadata values. Leaving this parameter blank (default) results in no key-value pairs being saved as metadata. Key names are case-insensitive.
  5. Metadata field names (%String, default = ""): comma-separated list of metadata field names corresponding to the key names in the fourth parameter. If left blank, the key names themselves are assumed to be valid metadata field names.
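The way parameters 2 through 5 interact can be sketched in a few lines. The following is a language-neutral illustration in Python, not the class's ObjectScript implementation: the function names, the (header, content) pair representation of sections, and the newline separators used when concatenating are all assumptions made for the example.

```python
def split_list(raw):
    """Split a comma-separated parameter into trimmed, non-empty items."""
    return [item.strip() for item in raw.split(",") if item.strip()]


def select_text(sections, headers_to_index="", include_headers=False):
    """Concatenate the sections chosen by parameters 2 and 3.

    `sections` is a hypothetical list of (header, content) pairs standing in
    for what the Text Transformation model extracts.
    """
    wanted = {h.lower() for h in split_list(headers_to_index)}
    parts = []
    for header, content in sections:
        # A blank parameter 2 means every section is indexed.
        if wanted and header.lower() not in wanted:
            continue
        # Separator between header and content is an assumption.
        parts.append(f"{header}\n{content}" if include_headers else content)
    return "\n\n".join(parts)


def map_metadata(key_values, keys_to_extract="", field_names=""):
    """Pair the keys of parameter 4 with the field names of parameter 5."""
    keys = split_list(keys_to_extract)
    # A blank parameter 5 means the key names double as field names.
    fields = split_list(field_names) or keys
    # Key names are matched case-insensitively.
    lookup = {k.lower(): v for k, v in key_values.items()}
    return {
        field: lookup[key.lower()]
        for key, field in zip(keys, fields)
        if key.lower() in lookup
    }
```

For instance, under these assumptions `select_text(sections, "plan", True)` would keep only a section headed "Plan" and prepend its header, while `map_metadata(pairs, "author", "DocAuthor")` would store the value of the "Author" key under the field name "DocAuthor".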

Properties (Including Private)

property Buffer [ Private , MultiDimensional ];
Property methods: BufferDisplayToLogical(), BufferGet(), BufferIsValid(), BufferLogicalToDisplay(), BufferLogicalToOdbc(), BufferNormalize(), BufferSet(), BufferString()
property OutputText [ Private , MultiDimensional ];
Property methods: OutputTextDisplayToLogical(), OutputTextGet(), OutputTextIsValid(), OutputTextLogicalToDisplay(), OutputTextLogicalToOdbc(), OutputTextNormalize(), OutputTextSet()

Methods (Including Private)

method BufferString(data As %String) as %Status
Inherited description:

This method takes the raw input text and buffers it internally in the converter. The text is provided in chunks of 32k. Every custom converter will need to implement this method so that it can take in the raw data.

method Convert() as %Status

This method is called after all data has been buffered. In this method the converter parses the raw data and extracts or converts it into plain text. If the document contains metadata, the converter can extract it here and report it to the system by calling the SetCurrentMetadataValues() method.

classmethod GetMetadataKeys(params As %String) as %List
Inherited description: If the Converter extracts metadata, this method should return a list of keys of the metadata fields that are extracted from the contents. The values will be exposed in the Convert() method in the same order as they are reported here.

method NextConvertedPart() as %String

When conversion is done, this method is called to fetch the converted data back from the converter. The method should return the converted text in chunks of at most 32k in size. When no more data is available, the method should return the empty string ("") to signal that all data has been transferred.
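The buffer, convert, and fetch sequence described for BufferString(), Convert() and NextConvertedPart() can be illustrated with a small sketch. It is written in Python rather than ObjectScript and is not the class's implementation: the class and function names are hypothetical, "32k" is read as 32,768 characters, and the conversion step is reduced to trimming whitespace.

```python
MAX_CHUNK = 32 * 1024  # assumption: "32k" read as 32,768 characters


class SketchConverter:
    """Hypothetical stand-in for a converter; not the %iKnow class itself."""

    def __init__(self):
        self._buffer = []    # raw chunks received so far
        self._output = ""    # converted plain text
        self._position = 0   # read offset into the converted text

    def buffer_string(self, data):
        """Collect one chunk of raw input."""
        self._buffer.append(data)

    def convert(self):
        """Turn the buffered input into plain text (here: trim whitespace)."""
        self._output = "".join(self._buffer).strip()
        self._position = 0

    def next_converted_part(self):
        """Return the next chunk, or "" once everything has been returned."""
        part = self._output[self._position:self._position + MAX_CHUNK]
        self._position += len(part)
        return part


def run(chunks):
    """Drive the converter the way the descriptions imply a caller would."""
    converter = SketchConverter()
    for chunk in chunks:
        converter.buffer_string(chunk)
    converter.convert()
    parts = []
    while True:
        part = converter.next_converted_part()
        if part == "":  # empty string signals that all data was transferred
            break
        parts.append(part)
    return parts
```

Returning the empty string as the end marker keeps the caller's loop simple: it keeps calling until a part comes back empty, without needing a separate "has more data" check.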