This is documentation for Caché & Ensemble. See the InterSystems IRIS version of this content.

For information on migrating to InterSystems IRIS, see Why Migrate to InterSystems IRIS?

%iKnow.Queries.SentenceAPI

class %iKnow.Queries.SentenceAPI extends %iKnow.Queries.AbstractAPI

Main Query API class to retrieve sentence information.

Parameters

parameter GetAttributesRT = attTypeId:%Integer,attType:%String,start:%Integer,span:%Integer,wordPositions:%String,properties:%String,level:%Integer;

parameter GetByCrcIdsRT = srcId:%Integer,externalId:%String,sentId:%Integer,sentenceValue:%String;

parameter GetByCrcMaskRT = srcId:%Integer,externalId:%String,sentId:%Integer,sentenceValue:%String;

parameter GetByCrcsRT = srcId:%Integer,externalId:%String,sentId:%Integer,sentenceValue:%String;

parameter GetByEntitiesRT = srcId:%Integer,externalId:%String,sentId:%Integer,sentenceValue:%String;

parameter GetByEntityIdsRT = srcId:%Integer,externalId:%String,sentId:%Integer,sentenceValue:%String;

parameter GetByPathIdsRT = srcId:%Integer,externalId:%String,sentId:%Integer,sentenceValue:%String;

parameter GetBySourceRT = sentId:%Integer,sentenceValue:%String,sentenceIsTruncated:%Boolean;

parameter GetNewBySourceRT = sentId:%Integer,sentenceValue:%String,score:%Numeric;

parameter GetPartsRT = entOccId:%Integer,entUniId:%Integer,literal:%String,role:%Integer,stemUniId:%Integer;

Methods

classmethod GetAttributes(ByRef pResult, pDomainId As %Integer, pSentId As %Integer, vSrcId As %Integer = 0, pIncludePathAttributes As %Boolean = 0) as %Status

Returns all attributes for a given sentence. By default, only entity-level attributes are returned, with the wordPositions result column indicating which words within the affected entities are actually attributed. Using pIncludePathAttributes, path-level attributes (such as implied negation) can also be returned, but these will have no values for the wordPositions column. Also note that the start and span columns for path-level results refer to positions within those paths and not to entity positions within the sentence. See also GetAttributes() in %iKnow.Queries.PathAPI and GetOccurrenceAttributes() in %iKnow.Queries.EntityAPI.

Any named attribute properties are also included through sub-nodes (not available through SQL or SOAP):

pResult(rowNumber, propertyName) = propertyValue

The returned wordPositions apply to the entities starting from start up to offset and only extend to the last attributed word position (there might be more words within the entity).
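As a rough illustration of reading this result, the sketch below walks the rows and their property sub-nodes. The class name Demo.SentenceAttributes and the method name are hypothetical, and the assumption that each row is a $list in the column order of GetAttributesRT is inferred from that parameter rather than stated by this page.

```objectscript
/// Hypothetical demo class; not part of the product.
Class Demo.SentenceAttributes
{

/// Prints one line per attribute of sentence pSentId in domain pDomainId.
ClassMethod ListAttributes(pDomainId As %Integer, pSentId As %Integer) As %Status
{
    // Fourth argument: no virtual source. Fifth argument = 1 also requests path-level attributes.
    set tSC = ##class(%iKnow.Queries.SentenceAPI).GetAttributes(.tResult, pDomainId, pSentId, 0, 1)
    quit:$$$ISERR(tSC) tSC

    set tRow = ""
    for {
        set tRow = $order(tResult(tRow), 1, tData)
        quit:tRow=""
        // Assumed column order (from GetAttributesRT):
        // attTypeId, attType, start, span, wordPositions, properties, level
        write !, $listget(tData, 2), ": start=", $listget(tData, 3)
        write " span=", $listget(tData, 4), " words=", $listget(tData, 5)

        // Named attribute properties, if any, are sub-nodes keyed by property name
        set tProp = ""
        for {
            set tProp = $order(tResult(tRow, tProp), 1, tValue)
            quit:tProp=""
            write !, "   ", tProp, " = ", tValue
        }
    }
    quit $$$OK
}

}
```

For path-level rows, expect an empty wordPositions value, as described above.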

classmethod GetByCrcIds(ByRef result, domainid As %Integer, crcidlist As %List, filter As %iKnow.Filters.Filter = "", page As %Integer = 1, pagesize As %Integer = 10, setop As %Integer = $$$UNION) as %Status

Retrieves all sentences containing the given CRC ids, optionally limited to all sentences in records satisfying filter. For querying Virtual Sources, set filter to a single, negative integer. In this case, crcidlist is expected to contain virtual Entity IDs.

See also GetByEntities() for a description of the parameters.

classmethod GetByCrcMask(ByRef result, domainid As %Integer, master As %String = $$$WILDCARD, relation As %String = $$$WILDCARD, slave As %String = $$$WILDCARD, filter As %iKnow.Filters.Filter = "", page As %Integer = 1, pagesize As %Integer = 10, setop As %Integer = $$$UNION, pActualFormOnly As %Boolean = 0) as %Status

Retrieves all sentences containing a CRC satisfying the given CRC Mask, optionally limited to all sentences in records satisfying filter. For querying Virtual Sources, set filter to a single, negative integer.

See also GetByEntities() for a description of the parameters.

classmethod GetByCrcs(ByRef result, domainid As %Integer, crclist As %List, filter As %iKnow.Filters.Filter = "", page As %Integer = 1, pagesize As %Integer = 10, setop As %Integer = $$$UNION) as %Status

Retrieves all sentences containing the given CRCs, optionally limited to all sentences in records satisfying filter. For querying Virtual Sources, set filter to a single, negative integer.

See also GetByEntities() for a description of the parameters.

classmethod GetByEntities(ByRef result, domainid As %Integer, entitylist As %List, filter As %iKnow.Filters.Filter = "", page As %Integer = 1, pagesize As %Integer = 10, setop As %Integer = $$$UNION, pActualFormOnly As %Boolean = 0) as %Status

This method will retrieve all sentences containing any (if setop = $$$UNION) or all (if setop = $$$INTERSECT) of the entities supplied through entitylist, optionally limited to all sentences in records satisfying filter. For querying Virtual Sources, set filter to a single, negative integer.

If stemming is enabled for this domain through $$$IKPSTEMMING, sentences containing any actual form of the entities in entitylist will be returned. Use pActualFormOnly=1 to retrieve only those sentences containing the actual forms in entitylist. This argument is ignored if stemming is not enabled.
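A minimal sketch of a paged call follows. The class name Demo.SentenceSearch and the entity values are hypothetical, the macros are assumed to come from the %IKPublic include file, and the row layout is assumed to follow GetByEntitiesRT.

```objectscript
Include %IKPublic

/// Hypothetical demo class; not part of the product.
Class Demo.SentenceSearch
{

/// Lists sentences of domain pDomainId that contain every entity in the list.
ClassMethod FindSentences(pDomainId As %Integer) As %Status
{
    // Hypothetical entity values
    set tEntities = $listbuild("newspaper", "article")

    // No filter, first page, up to 20 rows, all entities required ($$$INTERSECT)
    set tSC = ##class(%iKnow.Queries.SentenceAPI).GetByEntities(.tResult, pDomainId, tEntities, "", 1, 20, $$$INTERSECT)
    quit:$$$ISERR(tSC) tSC

    set tRow = ""
    for {
        set tRow = $order(tResult(tRow), 1, tData)
        quit:tRow=""
        // Assumed column order: srcId, externalId, sentId, sentenceValue
        write !, "source ", $listget(tData, 1), ", sentence ", $listget(tData, 3)
        write ": ", $listget(tData, 4)
    }
    quit $$$OK
}

}
```

Passing $$$UNION (the default) instead would return sentences containing any of the listed entities.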

classmethod GetByEntityIds(ByRef result, domainid As %Integer, entityidlist As %List, filter As %iKnow.Filters.Filter = "", page As %Integer = 1, pagesize As %Integer = 10, setop As %Integer = $$$UNION, pActualFormOnly As %Boolean = 0) as %Status

Retrieves all sentences containing the given entity IDs, optionally limited to all sentences in records satisfying filter. For querying Virtual Sources, set filter to a single, negative integer. In this case, entityidlist is expected to contain virtual Entity IDs.

See also GetByEntities() for a description of the parameters.

classmethod GetByPathIds(ByRef result, domainid As %Integer, pathidlist As %List, sourceidlist As %List, page As %Integer = 1, pagesize As %Integer = 10) as %Status

Retrieves all sentences containing the given path IDs.

See also GetByEntities() for a description of the parameters.

classmethod GetBySource(ByRef result, domainid As %Integer, sourceid As %Integer, page As %Integer = 1, pagesize As %Integer = 10) as %Status

Returns the sentences for the given source. A negative source ID is interpreted as a Virtual Source.

classmethod GetCountByCrcIds(domainid As %Integer, crcidlist As %List, filter As %iKnow.Filters.Filter = "", setop As %Integer = $$$UNION, Output sc As %Status = $$$OK) as %Integer

Retrieves the number of sentences containing the given CRC ids, optionally limited to all sentences in records satisfying filter. For querying Virtual Sources, set filter to a single, negative integer. In this case, crcidlist is expected to contain virtual Entity IDs.

See also GetByEntities() for a description of the parameters.

classmethod GetCountByCrcMask(domainid As %Integer, master As %String = $$$WILDCARD, relation As %String = $$$WILDCARD, slave As %String = $$$WILDCARD, filter As %iKnow.Filters.Filter = "", setop As %Integer = $$$UNION, Output sc As %Status = $$$OK, pActualFormOnly As %Boolean = 0) as %Integer

Retrieves the number of sentences containing a CRC satisfying the given CRC Mask, optionally limited to all sentences in records satisfying filter. For querying Virtual Sources, set filter to a single, negative integer.

See also GetByEntities() for a description of the parameters.

classmethod GetCountByCrcs(domainid As %Integer, crclist As %List, filter As %iKnow.Filters.Filter = "", setop As %Integer = $$$UNION, Output sc As %Status = $$$OK) as %Integer

Retrieves the number of sentences containing the given CRCs, optionally limited to all sentences in records satisfying filter. For querying Virtual Sources, set filter to a single, negative integer.

See also GetByEntities() for a description of the parameters.

classmethod GetCountByDomain(domainid As %Integer, filter As %iKnow.Filters.Filter = "", Output sc As %Status = $$$OK) as %Integer

Returns the total number of sentences for a given domain, optionally filtered to those sources satisfying a %iKnow.Filters.Filter object passed in through filter.

classmethod GetCountByEntities(domainid As %Integer, entitylist As %List, filter As %iKnow.Filters.Filter = "", setop As %Integer = $$$UNION, Output sc As %Status = $$$OK, pActualFormOnly As %Boolean = 0) as %Integer

Retrieves the number of sentences containing the given entities, optionally limited to all sentences in records satisfying filter. For querying Virtual Sources, set filter to a single, negative integer.

See also GetByEntities() for a description of the parameters.
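The count methods return the number directly and report errors through the Output sc argument, so the status has to be checked separately. The sketch below shows that pattern; the class name Demo.SentenceCounts and the entity values are hypothetical, and the macros are assumed to come from %IKPublic.

```objectscript
Include %IKPublic

/// Hypothetical demo class; not part of the product.
Class Demo.SentenceCounts
{

/// Compares the number of matching sentences with the domain total.
ClassMethod ShowCounts(pDomainId As %Integer) As %Status
{
    // Total number of sentences in the domain, no filter
    set tTotal = ##class(%iKnow.Queries.SentenceAPI).GetCountByDomain(pDomainId, "", .tSC)
    quit:$$$ISERR(tSC) tSC

    // Sentences containing any of these (hypothetical) entities
    set tEntities = $listbuild("newspaper", "article")
    set tCount = ##class(%iKnow.Queries.SentenceAPI).GetCountByEntities(pDomainId, tEntities, "", $$$UNION, .tSC)
    quit:$$$ISERR(tSC) tSC

    write !, tCount, " of ", tTotal, " sentences match"
    quit $$$OK
}

}
```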

classmethod GetCountByEntityIds(domainid As %Integer, entityidlist As %List, filter As %iKnow.Filters.Filter = "", setop As %Integer = $$$UNION, Output sc As %Status = $$$OK, pActualFormOnly As %Boolean = 0) as %Integer

Retrieves the number of sentences containing the given entity IDs. For querying Virtual Sources, set filter to a single, negative integer. In this case, entityidlist is expected to contain virtual Entity IDs.

See also GetByEntities() for a description of the parameters.

If stemming is enabled for this domain through $$$IKPSTEMMING, sentences containing any actual form of the entities in entityidlist will be counted. Use pActualFormOnly=1 to count only those sentences containing the actual forms in entityidlist. This argument is ignored if stemming is not enabled.

classmethod GetCountByPathIds(domainid As %Integer, pathidlist As %List, sourceidlist As %List, Output sc As %Status = $$$OK) as %Integer

Retrieves the number of sentences containing the given path IDs.

See also GetByEntities() for a description of the parameters.

classmethod GetCountBySource(domainid As %Integer, sourceidlist As %List, Output sc As %Status = $$$OK) as %Integer

Returns the total number of sentences for the given sources. Negative Source IDs are interpreted as referring to Virtual Sources.

classmethod GetHighlighted(pDomainId As %Integer, pSentenceId As %Integer, ByRef pHighlight="", vSrcId As %Integer = 0, Output pFullSentence="", Output pSC As %Status = $$$OK, pEscapeHTML As %Boolean = 1) as %String

Highlighting

This is a flexible method to highlight specific elements within a sentence using user-supplied markup passed in through the pHighlight argument (by reference) in a multidimensional form:

 set pHighlight("FLAG") = "markup"
 set pHighlight("FLAG", id) = "markup"

The first option will highlight any element of the type identified by "FLAG", the second option allows refining this to a particular instance, identified by id, overriding any eventual definitions at the generic "FLAG" level.

Note: unless explicitly stated otherwise, all highlighting is based on the entity level.

Markup options

Any single (opening) HTML tag can be specified on the value side of pHighlight and will automatically be wrapped around every entity. The closing tag will be automatically derived from the opening tag supplied through pHighlight.

HTML markup supplied this way supports a basic means of annotating with metadata about the particular thing being highlighted. Any occurrences of "$$$ID" in the HTML tag will be substituted with the relevant identifier of what's being highlighted, such as entity IDs for entity markup, CRC IDs for CRC markup or match IDs for dictionary matching markup. Most entity-level markup also supports the $$$LITERAL tag to replace with the original text string for that entity.
For example, the following highlight spec would add links to an info page that takes entity IDs as a URL parameter:

 set tHighlight("ROLE", "concept") = "<a href=""Example.MyEntityViewer.cls?entity=$$$ID"">"

Note that in some cases, such as dictionary matches, there may be multiple IDs associated with the same highlighted entity. These will be provided as a comma-separated list replacing the $$$ID placeholder.

As an alternative to HTML markup, you can also supply two-character strings that will be used to wrap entities that need highlighting. For example, this array will put square brackets around all concepts and curly braces around relationships:

 set tHighlight("ROLE", "concept") = "[]"
 set tHighlight("ROLE", "relation") = "{}"

Highlighting specific entities, CRCs and paths

To highlight all occurrences of a particular entity, stem, CRC, CC or path, use the corresponding flag. For entities, you can also supply the string value (except when the string value is an integer number itself).

 set tHighlight("ENTITY", 123) = "<b>"
 set tHighlight("ENTITY", "snow storm") = "<b>"
 set tHighlight("STEM", 234) = "<strong title=""$$$LITERAL"">"
 set tHighlight("CRC", 345) = "<u>"
 set tHighlight("PATH", 456) = "<span style='border: 1px solid blue;'>"

Highlighting based on role

The "ROLE" flag can be used to mark concepts, relations and non-relevants, either by using the corresponding integer code (e.g. $$$ENTTYPECONCEPT) or a simple string value. Note that in some cases, some words inside a relationship entity may be marked as non-relevant. These will be highlighted at the word level (only if there is a specific highlighting spec for non-relevants) and are an exception to the general rule that all highlighting happens at the entity level.

 set tHighlight("ROLE", "concept") = "<c>"
 set tHighlight("ROLE", "relation") = "<r>"
 set tHighlight("ROLE", "non-relevant") = "()"
 set tString = "The newspaper published the article and it sold very well."
 write $system.iKnow.Highlight(tString, .tHighlight)

The above example would print:

(The) <c>newspaper</c> <r>published</r> (the) <c>article</c> <r>and (it) sold very well</r>.

Highlighting based on attributes

Attributes can be highlighted at two levels. Using the regular "ATTRIBUTE" flag will highlight all entities affected by the attribute specified by attribute ID (such as $$$IKATTNEGATION). However, some attributes support more fine-grained annotation at the word level, marking those words that actually caused the attribute to apply to an entity or part of a path. These can be highlighted individually through the "ATTRIBUTEWORDS" flag and are an exception to the general rule that highlighting happens per-entity.

 set tHighlight("ATTRIBUTE", $$$IKATTNEGATION) = "<span style='color: red;'>"
 set tHighlight("ATTRIBUTEWORDS", $$$IKATTNEGATION) = "<u>"
 set tString = "The landlord doesn't accept late payments, but makes exceptions for students."
 write $system.iKnow.Highlight(tString, .tHighlight)

The above example would display as:

The landlord doesn't accept late payments, but makes exceptions for students.

Highlighting based on matching results

Dictionary matches can be highlighted using the "MATCH" flag, optionally restricted to a particular dictionary ID. To refine to a particular dictionary item, use the "MATCHITEM" flag. Highlighting can further be refined to distinguish based on full or partial matches using the "FULL" and "PARTIAL" flags as an additional subscript. Please note this is a refinement and the parent node (ID-specific or generic) should contain a value:

Additional information about the matches themselves is available through the metadata rewrite mechanism: $$$TERM, $$$TERMID, $$$ITEM, $$$ITEMID, $$$ITEMURI, $$$DICT, $$$DICTID. Note that the regular $$$ID markers will be replaced with dictionary match IDs, not the IDs of the Dictionary or Dictionary Items.

 set tHighlight("MATCH") = "<a href=""Example.MoreInfo.zen?uri=$$$ITEMURI"" style='border: 1px solid Tomato;'>"
 set tHighlight("MATCH", "FULL") = "<a href=""Example.MoreInfo.zen?uri=$$$ITEMURI"" style='background-color: Tomato'>"
 set tHighlight("MATCHITEM", 123) = "<a href=""Example.MoreInfo.zen?uri=$$$ITEMURI"" style='border: 1px solid Lime;'>"

Highlighting based on character position

If external tooling provided annotations based on character positions, use the "CHARS" flag to highlight those annotations by providing the start and end positions as second and third subscripts of the highlight spec array. This will highlight the entities "covering" these start and end positions, starting with the entity which includes the character at the designated start position and ending with the entity including the character at the designated end position.

 set tHighlight("CHARS", 13, 21) = "<a href=""www.imdb.com/title/tt1636826/"">"
 set tHighlight("CHARS", 71, 75) = "<a href=""http://www.haren.nl/"">"
 set tString = "The instant Project X party was not well-received by the community of Haren in the Netherlands."
 write $system.iKnow.Highlight(tString, .tHighlight)

The above example will annotate the entire entities "instant Project X party" and "Haren".

Note that in certain cases the iKnow indexing engine may modify the input text while processing it, so character-position-based information from external sources that worked from the original text may no longer point to the expected positions. The two most important cases where this can happen are when User Dictionaries are used to rewrite the input explicitly and when duplicate whitespace is normalized by the engine. To work around this issue, present the output of the iKnow engine (as retrieved through GetValue()) to these external tools so that the same normalizations are applied.

In cases where the externally provided character positions span more than a single sentence, you can pass an offset as the data element of the main "CHARS" node to mark the character position that corresponds to the start of this sentence. This should be easier than recalculating all character positions and allows you to reuse the entire array for successive calls to GetHighlighted().

Style precedence

For the purpose of HTML styling precedence, this is the order in which tags are wrapped around entities, from innermost to outermost:

ATTRIBUTEWORDS (wrapped around individual words)
ATTRIBUTE - ID-specific (attribute type ID)
ATTRIBUTE - generic
ENTITY - ID-specific
STEM - ID-specific
CRC - ID-specific
CC - ID-specific
MATCHITEM - ID-specific (dictionary item ID)
MATCH - ID-specific (dictionary ID)
MATCHITEM - generic
MATCH - generic
PATH - ID-specific
ROLE - ID-specific (role)
CHARS
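The examples above highlight ad hoc strings through $system.iKnow.Highlight(). For a sentence that is already indexed, a call to GetHighlighted() might look like the sketch below. The class name Demo.SentenceHighlight is hypothetical, and the macros are assumed to come from %IKPublic.

```objectscript
Include %IKPublic

/// Hypothetical demo class; not part of the product.
Class Demo.SentenceHighlight
{

/// Writes sentence pSentId of domain pDomainId with concepts in bold
/// and negated entities in red.
ClassMethod Show(pDomainId As %Integer, pSentId As %Integer) As %Status
{
    set tHighlight("ROLE", "concept") = "<b>"
    set tHighlight("ATTRIBUTE", $$$IKATTNEGATION) = "<span style='color: red;'>"

    // vSrcId = 0: a regular (non-virtual) sentence ID.
    // tFull and tSC are the Output arguments pFullSentence and pSC.
    set tHTML = ##class(%iKnow.Queries.SentenceAPI).GetHighlighted(pDomainId, pSentId, .tHighlight, 0, .tFull, .tSC)
    quit:$$$ISERR(tSC) tSC

    write !, tHTML
    quit $$$OK
}

}
```

Following the precedence list, the ATTRIBUTE tag would be the inner one and the ROLE tag the outer one wherever both apply to the same entity.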

classmethod GetLanguage(domainid As %Integer, sentenceid As %Integer, Output confidence As %Numeric = "", vSrcId As %Integer = 0) as %String

Retrieves the language of the given sentence, as derived by the Automatic Language Identification algorithm or, if ALI was disabled, the language specified when indexing this sentence.

The confidence level is returned as well through an output parameter. If the confidence level is 0, this means ALI was not used and the language was defined by the user loading the source.

If a Virtual Source ID is specified, the sentence ID is treated as a virtual one, in the context of the supplied vSrcId.

classmethod GetNewBySource(ByRef result, domainid As %Integer, sourceid As %Integer, length As %Integer = 5, filter As %iKnow.Filters.Filter = "", algorithm As %String = $$$NEWENTSIMPLE, algorithmParams As %List = "", newEntitiesWindow As %Integer = 100, blackListIds As %List = "") as %Status

Retrieves the sentences with the most significant concepts compared to the rest of the domain (or optionally a subset thereof as filtered through filter). This array of sentences is based on results of the GetNewBySource query in %iKnow.Queries.EntityAPI, using the supplied algorithm and parameter values. The scores of the first [newEntitiesWindow] concepts are aggregated across sentences to produce the result of this query.

Please refer to the documentation of the GetNewBySource query in %iKnow.Queries.EntityAPI for more details on the parameters and available algorithms.

classmethod GetPartLiteral(domainId As %Integer, sentenceId As %Integer, position As %Integer, vSrcId As %Integer = 0) as %String

Returns the literal of the entity or nonrelevant at the specified position.

classmethod GetParts(ByRef result, domainid As %Integer, sentenceid As %Integer, includeCRCMarkers As %Boolean = 0, includePathMarkers As %Boolean = 0, vSrcId As %Integer = 0) as %Status

Returns the elements (concepts, relations and nonrelevants) that make up the sentence, optionally including markers for the beginning and end of any CRCs or Paths in the sentence. This information can be used to display the sentence value (see also GetValue()) and/or highlight specific elements of interest.

Output structure: 
result(pos) = $lb(entOccId, entUniId, entity, role)
when includeCRCMarkers = 1, adds 
result(pos, [CRCMASTER | CRCRELATION | CRCSLAVE]) = $lb(crcOccId, crcUniId)
when includePathMarkers = 1, adds 
result(pos, [PATHBEGIN | PATHEND]) = $lb(pathId)

Note: the subscript levels for CRC and Path markers are not available in the QAPI and WSAPI versions of this query.

If a Virtual Source ID is specified, the sentence ID is treated as a virtual one, in the context of the supplied vSrcId.
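As a sketch of how the first-level output could be consumed, the method below rebuilds a simplified version of the sentence and puts square brackets around concepts. The class name Demo.SentenceParts is hypothetical, $$$ENTTYPECONCEPT is assumed to come from %IKPublic, and joining parts with single spaces is a simplification that ignores punctuation.

```objectscript
Include %IKPublic

/// Hypothetical demo class; not part of the product.
Class Demo.SentenceParts
{

/// Writes the parts of sentence pSentId in order, with concepts in brackets.
ClassMethod Show(pDomainId As %Integer, pSentId As %Integer) As %Status
{
    // CRC and path markers are left at their defaults (off)
    set tSC = ##class(%iKnow.Queries.SentenceAPI).GetParts(.tParts, pDomainId, pSentId)
    quit:$$$ISERR(tSC) tSC

    set tText = "", tPos = ""
    for {
        set tPos = $order(tParts(tPos), 1, tData)
        quit:tPos=""
        // Per the output structure above: entOccId, entUniId, entity, role
        set tLiteral = $listget(tData, 3), tRole = $listget(tData, 4)
        set:tRole=$$$ENTTYPECONCEPT tLiteral = "[" _ tLiteral _ "]"
        // Join parts with a single space (simplification)
        set tText = tText _ $select(tText="":"", 1:" ") _ tLiteral
    }
    write !, tText
    quit $$$OK
}

}
```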

classmethod GetPosition(domainId As %Integer, sentenceId As %Integer, vSrcId As %Integer = 0) as %Integer

Returns the position within the source this sentence occurs at (1-based).

classmethod GetSourceId(domainId As %Integer, sentenceId As %Integer) as %Integer

Returns the source ID in which the supplied sentence ID occurs.

classmethod GetValue(domainid As %Integer, sentenceid As %Integer, Output fullSentence As %Boolean = 1, vSrcId As %Integer = 0) as %String

This method rebuilds a sentence based on the literals and entities it is composed of.

The returned string is the first part of the sentence, up to the maximum string length. The output parameter fullSentence is an array containing all the parts in the right order, with a %Boolean value at the top level indicating whether the returned string is the full sentence (1) or whether (0) the caller needs to read this array to obtain the full sentence.

If a Virtual Source ID is specified, the sentence ID is treated as a virtual one, in the context of the supplied vSrcId.
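The single-value lookups in this class can be combined to describe one sentence, as in the following sketch. The class name Demo.SentenceInfo is hypothetical, and the layout of the sub-nodes of the fullSentence array is not assumed here; only its top-level flag is read.

```objectscript
/// Hypothetical demo class; not part of the product.
Class Demo.SentenceInfo
{

/// Writes source, position, language and text of sentence pSentId.
ClassMethod Describe(pDomainId As %Integer, pSentId As %Integer)
{
    set tSrcId = ##class(%iKnow.Queries.SentenceAPI).GetSourceId(pDomainId, pSentId)
    set tPos = ##class(%iKnow.Queries.SentenceAPI).GetPosition(pDomainId, pSentId)
    set tLang = ##class(%iKnow.Queries.SentenceAPI).GetLanguage(pDomainId, pSentId, .tConfidence)
    set tValue = ##class(%iKnow.Queries.SentenceAPI).GetValue(pDomainId, pSentId, .tFull)

    write !, "Source ", tSrcId, ", sentence position ", tPos
    write !, "Language: ", tLang
    if +tConfidence = 0 {
        // A confidence of 0 means the language was set when loading the source
        write " (specified at load time)"
    } else {
        write " (ALI confidence ", tConfidence, ")"
    }

    write !, tValue
    // The top-level node of tFull tells whether tValue is the whole sentence
    write:'tFull !, "(truncated; the remaining parts are in the output array)"
}

}
```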