This is documentation for Caché & Ensemble.



%iKnow.Source.Converter.Html

class %iKnow.Source.Converter.Html extends %iKnow.Source.Converter

This is a sample implementation of %iKnow.Source.Converter, designed to strip HTML tags from text input. Data is first buffered into a process-private global (PPG); the HTML is then stripped from the buffered data in the Convert() call.

Converter parameters:

  1. Unescape As %Boolean: set to 1 to unescape HTML special characters, such as converting "&amp;" to "&" (default = 1)
  2. SkipTags As %String: comma-separated list of tags whose content (text nested between the start and end tag) is to be left out (default = "script,style")
  3. BreakLines As %Boolean: whether to insert double line breaks for non-inline tags (such as p, br, td, ...), so that the iKnow engine splits sentences at those positions (default = 1)
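
The effect of the three parameters above can be illustrated with a small sketch. This is not the InterSystems implementation: it is a Python approximation built on the standard html.parser module, and the function name strip_html, the BLOCK_TAGS set, and the collapsing of repeated line breaks are all assumptions made for illustration (the class documentation only names p, br and td as examples of non-inline tags).

```python
import re
from html.parser import HTMLParser

# Assumed set of non-inline tags; the class documentation only lists p, br, td as examples.
BLOCK_TAGS = {"p", "br", "td", "tr", "div", "li", "h1", "h2", "h3"}


class _Stripper(HTMLParser):
    def __init__(self, unescape, skip_tags, break_lines):
        # With convert_charrefs=True the parser decodes entities inside text itself.
        super().__init__(convert_charrefs=unescape)
        self.skip_tags = skip_tags
        self.break_lines = break_lines
        self.skip_depth = 0  # > 0 while inside a skipped tag such as <script>
        self.parts = []

    def _maybe_break(self, tag):
        if self.break_lines and tag in BLOCK_TAGS and not self.skip_depth:
            self.parts.append("\n\n")

    def handle_starttag(self, tag, attrs):
        if tag in self.skip_tags:
            self.skip_depth += 1
        else:
            self._maybe_break(tag)

    def handle_endtag(self, tag):
        if tag in self.skip_tags:
            self.skip_depth = max(0, self.skip_depth - 1)
        else:
            self._maybe_break(tag)

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)

    # The two handlers below are only called when unescaping is switched off;
    # they put the entity back exactly as it was written.
    def handle_entityref(self, name):
        if not self.skip_depth:
            self.parts.append(f"&{name};")

    def handle_charref(self, name):
        if not self.skip_depth:
            self.parts.append(f"&#{name};")


def strip_html(text, unescape=True, skip_tags="script,style", break_lines=True):
    """Illustrative analogue of the Unescape, SkipTags and BreakLines parameters."""
    skip = {t.strip().lower() for t in skip_tags.split(",") if t.strip()}
    parser = _Stripper(unescape, skip, break_lines)
    parser.feed(text)
    parser.close()  # flush any text the parser is still holding
    # Collapse runs of line breaks produced by adjacent block tags.
    return re.sub(r"\n{3,}", "\n\n", "".join(parser.parts)).strip()


sample = "<p>Fish &amp; chips</p><script>var x = 1;</script><p>Done</p>"
print(repr(strip_html(sample)))
print(repr(strip_html(sample, unescape=False)))
print(repr(strip_html(sample, break_lines=False)))
```

In this sketch the script content never reaches the output under the default skip list, and the double line break between the two paragraphs is what would give a sentence splitter a boundary to work with.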

Properties (Including Private)

property BreakLines as %Boolean [ InitialExpression = 1 ];
Property methods: BreakLinesDisplayToLogical(), BreakLinesGet(), BreakLinesIsValid(), BreakLinesLogicalToDisplay(), BreakLinesNormalize(), BreakLinesSet()
property SkipTags as %String [ InitialExpression = ",script,style," ];
Property methods: SkipTagsDisplayToLogical(), SkipTagsGet(), SkipTagsIsValid(), SkipTagsLogicalToDisplay(), SkipTagsLogicalToOdbc(), SkipTagsNormalize(), SkipTagsSet()
property Unescape as %Boolean [ InitialExpression = 1 ];
Property methods: UnescapeDisplayToLogical(), UnescapeGet(), UnescapeIsValid(), UnescapeLogicalToDisplay(), UnescapeNormalize(), UnescapeSet()

Methods (Including Private)

private method %OnClose() as %Status
Inherited description: This callback method is invoked by the %Close() method to provide notification that the current object is being closed.

The return value of this method is ignored.

private method %OnNew(params As %String) as %Status
Ensures that the process-private global (PPG) is empty.
method BufferString(data As %String) as %Status
Buffers data in the process-private global (PPG).
method Convert() as %Status

Loop through the buffered data and strip off HTML tags. At the end, reset the pointer in the root PPG node so that NextConvertedPart() knows where to start.

method NextConvertedPart() as %String
Loop through the PPG again and return processed strings.
method SetParams(params As %String) as %Status

Utility method called by the %iKnow.Source.Processor and %iKnow.Source.Loader logic to register any new or changed parameter values.

classmethod StripHTML(ByRef pText As %String, pUnescape As %Boolean = 1, pSkipTags As %String = "script,style", pBreakLines As %Boolean = 1, Output pSC As %Status) as %String
Utility method to strip HTML tags from the supplied string. See the class documentation for more details on the available parameters.
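
The BufferString(), Convert() and NextConvertedPart() descriptions above suggest a three-phase lifecycle: collect input, process it in one pass, then read the result back piece by piece from a reset pointer. A minimal Python sketch of that pattern follows. Everything in it is hypothetical: the class name BufferedConverter, the part_size argument, the plain list standing in for the process-private global, the regex-based tag removal, and the choice to return an empty string once all parts have been read are assumptions, not documented behavior of this class.

```python
import re


class BufferedConverter:
    """Hypothetical stand-in for the buffer / convert / iterate lifecycle."""

    def __init__(self, part_size=32):
        self._buffer = []            # stands in for the process-private global
        self._pointer = 0            # stands in for the pointer in the root node
        self._part_size = part_size  # assumed size of each returned part

    def buffer_string(self, data):
        # Phase 1: only collect the raw input.
        self._buffer.append(data)

    def convert(self):
        # Phase 2: join first, so a tag split across two buffered chunks is still removed.
        text = re.sub(r"<[^>]*>", " ", "".join(self._buffer))
        text = re.sub(r"\s+", " ", text).strip()
        step = self._part_size
        self._buffer = [text[i:i + step] for i in range(0, len(text), step)]
        self._pointer = 0  # reset so reading starts at the first converted part

    def next_converted_part(self):
        # Phase 3: hand back one part per call; "" signals the end (an assumption).
        if self._pointer >= len(self._buffer):
            return ""
        part = self._buffer[self._pointer]
        self._pointer += 1
        return part


conv = BufferedConverter()
conv.buffer_string("<p>Hello <b>wor")
conv.buffer_string("ld</b></p>")
conv.convert()

parts = []
while part := conv.next_converted_part():
    parts.append(part)
print("".join(parts))
```

Buffering before converting is what lets the sketch cope with markup that straddles two BufferString-style calls, which is a plausible reason for separating the phases.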
