Using iKnow
Filtering Sources
[Back] [Next]
   
Server:docs1
Instance:LATEST
User:UnknownUser
 
-
Go to:
Search:    

You can use filters to include or exclude sources supplied to an iKnow query. Often you do not wish to perform a query on all of the loaded sources in the domain. A filter allows you to limit the scope of the query to only those sources that meet the criteria of the filter. A filter selects sources based either on which entities are found in the source, or on some information associated with the source itself (metadata). A filter always includes or excludes an entire source and is specified in each query that uses that filter.

Supported Filters
iKnow supplies a number of predefined filters, and provides facilities to allow users to easily define their own filters.
You can use GroupFilter to logically combine the results of the filters you define.
Filtering by the ID of the Source
The most basic source filter is used to limit the sources supplied to a query by providing the Source Id or the External Id of each source that you wish to include in the filtered result set.
By External Id
The ExternalIdFilter includes those sources whose external Ids are listed in a %List structure. Any element in this list that is not a valid External Id, or is a duplicate external Id is silently passed over.
The following example filters sources by external Id. The external Id for Aviation.Event sources includes either the word “Accident” or “Incident”; this filter include only the sources whose external Id includes the word “Incident”. It then lists the details of the filtered sources:
DomainCreateOrOpen
  SET dname="mydomain"
  IF (##class(%iKnow.Domain).Exists(dname))
      { SET domoref=##class(%iKnow.Domain).Open(dname)
        GOTO DeleteOldData }
  ELSE { SET domoref=##class(%iKnow.Domain).%New(dname)
         DO domoref.%Save()
         GOTO SetEnvironment }
DeleteOldData
  SET stat=domoref.DropData()
  IF stat { GOTO SetEnvironment }
  ELSE    { WRITE "DropData error ",$System.Status.DisplayError(stat)
            QUIT}
SetEnvironment
  SET domId=domoref.Id
  IF ##class(%iKnow.Configuration).Exists("myconfig") {
         SET cfg=##class(%iKnow.Configuration).Open("myconfig") }
  ELSE { SET cfg=##class(%iKnow.Configuration).%New("myconfig",0,$LISTBUILD("en"),"",1)
         DO cfg.%Save() }
ListerAndLoader
  SET domId=domoref.Id
  SET flister=##class(%iKnow.Source.SQL.Lister).%New(domId)
  SET myloader=##class(%iKnow.Source.Loader).%New(domId)
QueryBuild
   SET myquery="SELECT TOP 100 ID AS UniqueVal,Type,NarrativeFull FROM Aviation.Event"
   SET idfld="UniqueVal"
   SET grpfld="Type"
   SET dataflds=$LB("NarrativeFull")
UseListerAndLoader
  SET stat=flister.AddListToBatch(myquery,idfld,grpfld,dataflds)
      IF stat '= 1 {WRITE "The lister failed: ",$System.Status.DisplayError(stat) QUIT }
  SET stat=myloader.ProcessBatch()
      IF stat '= 1 {WRITE "The loader failed: ",$System.Status.DisplayError(stat) QUIT }
DefineExtIdFilter
  DO ##class(%iKnow.Queries.SourceAPI).GetByDomain(.result,domId,1,100)
  SET i=1
  SET extlist=$LB("")
  WHILE $DATA(result(i)) {
        SET extId = $LISTGET(result(i),2)
        IF $PIECE(extId,":",3)="Incident" {
        SET extlist=extlist_$LB(extId) }
        SET i=i+1
  }
  SET filt=##class(%iKnow.Filters.ExternalIdFilter).%New(domId,extlist) 
SourceCountQuery
  SET numSrcD=##class(%iKnow.Queries.SourceAPI).GetCountByDomain(domId)
  WRITE "The ",dname," domain contains ",numSrcD," sources",!
ApplyExtIdFilter
  SET numSrcFD=##class(%iKnow.Queries.SourceAPI).GetCountByDomain(domId,filt)
  WRITE "The Id filter includes ",numSrcFD," sources:",!
  DO ##class(%iKnow.Queries.SourceAPI).GetByDomain(.result,domId,1,100,filt)
       SET j=1
       WHILE $DATA(result(j)) {
         SET intId = $LISTGET(result(j),1)
         SET extId = $LISTGET(result(j),2)
         WRITE intId," ",extId,!
         SET j=j+1
      }
      WRITE "End of list"
 
By Source Id
The SourceIdFilter includes those sources whose source Ids are listed in a %List structure. Source Ids are integers. They can be listed in any order. Any element in this list that is not a valid source Id, or is a duplicate source Id is silently passed over.
The following example takes SQL records as sources and filters in several sources by Source Id. Source Ids in this data set are numbered 1 through 100. The SourceIdFilter specifies five source Ids for inclusion, but only three of these source Ids correspond to records in the table. Therefore, the total filtered source count is 3:
DefineAFilter
  SET srclist=$LB(10,14,74,110,2799)
  SET filt=##class(%iKnow.Filters.SourceIdFilter).%New(domId,srclist)
SourceCounts
  SET numsrc = ##class(%iKnow.Queries.SourceAPI).GetCountByDomain(domId)
     WRITE "The ",dname," domain contains ",numsrc," sources",!
 SET numfsrc = ##class(%iKnow.Queries.SourceAPI).GetCountByDomain(domId,filt)
     WRITE "Source count after source Id filtering: ",numfsrc
The iKnow.Filters.Filter class includes the %iKnow.Filters.SourceIdRangeFilter subclass. The following example filters a range of sources by source Id. It returns sources with source Ids 5 through 10, inclusive:
  SET filt=##class(%iKnow.Filters.SourceIdRangeFilter).%New(domId,5,10)
Filtering a Random Selection of Sources
You can use the %iKnow.Filters.RandomFilter to select a random sample of your sources. A random sample allows you to perform tests on a manageable subset of your sources. It also allows you to divide your sources (or a subset of them) into “training” and “test” sets. You would use the “training” set to define iKnow analytics (dictionary matches, source categories, etc.), then would use the “test” set to determine how well these analytics apply to another set of data. In this way you can avoid “overfitting” the analytics to a particular set of data.
You can specify the size of the random subset in two ways:
The following example randomly selects 33% of 50 sources, returning 17 sources. You can run this example repeatedly to demonstrate that different sources are randomly sampled:
DomainCreateOrOpen
  SET dname="mydomain"
  IF (##class(%iKnow.Domain).Exists(dname))
      { SET domoref=##class(%iKnow.Domain).Open(dname)
        GOTO DeleteOldData }
  ELSE { SET domoref=##class(%iKnow.Domain).%New(dname)
         DO domoref.%Save()
         GOTO SetEnvironment }
DeleteOldData
  SET stat=domoref.DropData()
  IF stat { GOTO SetEnvironment }
  ELSE    { WRITE "DropData error ",$System.Status.DisplayError(stat)
            QUIT}
SetEnvironment
  SET domId=domoref.Id
  IF ##class(%iKnow.Configuration).Exists("myconfig") {
         SET cfg=##class(%iKnow.Configuration).Open("myconfig") }
  ELSE { SET cfg=##class(%iKnow.Configuration).%New("myconfig",0,$LISTBUILD("en"),"",1)
         DO cfg.%Save() }
ListerAndLoader
  SET domId=domoref.Id
  SET flister=##class(%iKnow.Source.SQL.Lister).%New(domId)
  SET myloader=##class(%iKnow.Source.Loader).%New(domId)
QueryBuild
   SET myquery="SELECT TOP 50 ID AS UniqueVal,Type,NarrativeFull FROM Aviation.Event"
   SET idfld="UniqueVal"
   SET grpfld="Type"
   SET dataflds=$LB("NarrativeFull")
UseListerAndLoader
  SET stat=flister.AddListToBatch(myquery,idfld,grpfld,dataflds)
      IF stat '= 1 {WRITE "The lister failed: ",$System.Status.DisplayError(stat) QUIT }
  SET stat=myloader.ProcessBatch()
      IF stat '= 1 {WRITE "The loader failed: ",$System.Status.DisplayError(stat) QUIT }
DefineAFilter
  SET filt=##class(%iKnow.Filters.RandomFilter).%New(domId,.33)
SampledSourceQueries
  SET numSrcD=##class(%iKnow.Queries.SourceAPI).GetCountByDomain(domId)
  WRITE "The ",dname," domain contains ",numSrcD," sources",!
  SET numSrcFD=##class(%iKnow.Queries.SourceAPI).GetCountByDomain(domId,filt)
  WRITE "Of these ",numSrcD," sources ",numSrcFD," were sampled:",!
  DO ##class(%iKnow.Queries.SourceAPI).GetByDomain(.result,domId,1,20,filt)
  SET i=1
  WHILE $DATA(result(i)) {
     SET intId = $LISTGET(result(i),1)
     SET extId = $LISTGET(result(i),2)
     WRITE "sample #",i," is source ",intId," ",extId,!
     SET i=i+1 } 
     WRITE "End of list"
 
The following example filters the sources in the domain by source Id, returning 11 sources. It then supplies this source Id filter when defining a random filter. Thus, the random filter returns 3 of these source-Id-filtered sources. You can run this example repeatedly to demonstrate that different sources are randomly sampled:
DomainCreateOrOpen
  SET dname="mydomain"
  IF (##class(%iKnow.Domain).Exists(dname))
      { SET domoref=##class(%iKnow.Domain).Open(dname)
        GOTO DeleteOldData }
  ELSE { SET domoref=##class(%iKnow.Domain).%New(dname)
         DO domoref.%Save()
         GOTO SetEnvironment }
DeleteOldData
  SET stat=domoref.DropData()
  IF stat { GOTO SetEnvironment }
  ELSE    { WRITE "DropData error ",$System.Status.DisplayError(stat)
            QUIT}
SetEnvironment
  SET domId=domoref.Id
  IF ##class(%iKnow.Configuration).Exists("myconfig") {
         SET cfg=##class(%iKnow.Configuration).Open("myconfig") }
  ELSE { SET cfg=##class(%iKnow.Configuration).%New("myconfig",0,$LISTBUILD("en"),"",1)
         DO cfg.%Save() }
ListerAndLoader
  SET domId=domoref.Id
  SET flister=##class(%iKnow.Source.SQL.Lister).%New(domId)
  SET myloader=##class(%iKnow.Source.Loader).%New(domId)
QueryBuild
   SET myquery="SELECT TOP 50 ID AS UniqueVal,Type,NarrativeFull FROM Aviation.Event"
   SET idfld="UniqueVal"
   SET grpfld="Type"
   SET dataflds=$LB("NarrativeFull")
UseListerAndLoader
  SET stat=flister.AddListToBatch(myquery,idfld,grpfld,dataflds)
      IF stat '= 1 {WRITE "The lister failed: ",$System.Status.DisplayError(stat) QUIT }
  SET stat=myloader.ProcessBatch()
      IF stat '= 1 {WRITE "The loader failed: ",$System.Status.DisplayError(stat) QUIT }
DefineSourceIdFilter
  SET srclist=$LB(1,3,5,7,9,11,13,15,17,21,23)
  SET idfilt=##class(%iKnow.Filters.SourceIdFilter).%New(domId,srclist)
SourceCounts
  SET numsrc = ##class(%iKnow.Queries.SourceAPI).GetCountByDomain(domId)
     WRITE "The ",dname," domain contains ",numsrc," sources",!
 SET numfsrc = ##class(%iKnow.Queries.SourceAPI).GetCountByDomain(domId,idfilt)
     WRITE "Source count after source Id filtering: ",numfsrc,!
  DO ##class(%iKnow.Queries.SourceAPI).GetByDomain(.result,domId,1,20,idfilt)
  SET i=1
  WHILE $DATA(result(i)) {
     SET intId = $LISTGET(result(i),1)
     WRITE intId," "
     SET i=i+1 } 
     WRITE !,"End of list",! 
DefineRandomFilter
  SET rfilt=##class(%iKnow.Filters.RandomFilter).%New(domId,3,idfilt)
RandomSample
  SET numrsrc=##class(%iKnow.Queries.SourceAPI).GetCountByDomain(domId,rfilt)
  WRITE "From ",numfsrc," sources ",numrsrc," are randomly sampled:",!
  DO ##class(%iKnow.Queries.SourceAPI).GetByDomain(.result,domId,1,20,rfilt)
  SET j=1
  WHILE $DATA(result(j)) {
     SET intId = $LISTGET(result(j),1)
     WRITE intId," "
     SET j=j+1 } 
     WRITE !,"End of list",! 
 
Filtering by Number of Sentences
iKnow divides a source text into sentences. The following example filters out sources that contain less than 75 sentences. It returns the total number of sources, then uses the filter to return the total number of sources containing 75 or more sentences, then uses the filter again to return the number of sentences in each of these sources:
DomainCreateOrOpen
  SET dname="mydomain"
  IF (##class(%iKnow.Domain).Exists(dname))
      { SET domoref=##class(%iKnow.Domain).Open(dname)
        GOTO DeleteOldData }
  ELSE { SET domoref=##class(%iKnow.Domain).%New(dname)
         DO domoref.%Save()
         GOTO SetEnvironment }
DeleteOldData
  SET stat=domoref.DropData()
  IF stat { GOTO SetEnvironment }
  ELSE    { WRITE "DropData error ",$System.Status.DisplayError(stat)
            QUIT}
SetEnvironment
  SET domId=domoref.Id
  IF ##class(%iKnow.Configuration).Exists("myconfig") {
         SET cfg=##class(%iKnow.Configuration).Open("myconfig") }
  ELSE { SET cfg=##class(%iKnow.Configuration).%New("myconfig",0,$LISTBUILD("en"),"",1)
         DO cfg.%Save() }
ListerAndLoader
  SET domId=domoref.Id
  SET flister=##class(%iKnow.Source.SQL.Lister).%New(domId)
  SET myloader=##class(%iKnow.Source.Loader).%New(domId)
QueryBuild
   SET myquery="SELECT TOP 50 ID AS UniqueVal,Type,NarrativeFull FROM Aviation.Event"
   SET idfld="UniqueVal"
   SET grpfld="Type"
   SET dataflds=$LB("NarrativeFull")
UseListerAndLoader
  SET stat=flister.AddListToBatch(myquery,idfld,grpfld,dataflds)
      IF stat '= 1 {WRITE "The lister failed: ",$System.Status.DisplayError(stat) QUIT }
  SET stat=myloader.ProcessBatch()
      IF stat '= 1 {WRITE "The loader failed: ",$System.Status.DisplayError(stat) QUIT }
DefineAFilter
  SET filt=##class(%iKnow.Filters.SentenceCountFilter).%New(domId)
  SET nsent=75
  DO filt.MinSentenceCountSet(nsent)
SourceSentenceQueries
  SET numSrcD=##class(%iKnow.Queries.SourceAPI).GetCountByDomain(domId)
  WRITE "The domain contains ",numSrcD," sources",!
  SET numSrcFD=##class(%iKnow.Queries.SourceAPI).GetCountByDomain(domId,filt)
  WRITE "Of these ",numSrcD," sources ",numSrcFD," contain ",nsent," or more sentences:",!
  DO ##class(%iKnow.Queries.SourceAPI).GetByDomain(.result,domId,1,50,filt)
  SET i=1
  WHILE $DATA(result(i)) {
     SET numSentS = ##class(%iKnow.Queries.SentenceAPI).GetCountBySource(domId,result(i))
     SET intId = $LISTGET(result(i),1)
     SET extId = $LISTGET(result(i),2)
     WRITE "source ",intId," ",extId
     WRITE " has ",numSentS," sentences",!
     SET i=i+1 } 
     WRITE i-1," sources listed"
 
To filter using both minimum and maximum number of sentences, invoke both instance methods. The following filter selects those sources containing between 10 and 25 sentences (inclusive):
MinMaxFilter
  SET min=10
  SET filt=##class(%iKnow.Filters.SentenceCountFilter).%New(domId)
  DO filt.MinSentenceCountSet(min)
  DO filt.MaxSentenceCountSet(min+15)
Filtering by Entity Match
The following entity filters are provided:
Filtering by Dictionary Match
The %iKnow.Filters.DictionaryMatchFilter class allows you to select sources based on the contents of one or more user-defined dictionaries. iKnow also supports filtering by matching to a list of dictionary terms (%iKnow.Filters.DictionaryTermMatchFilter) or to a list of dictionary items (%iKnow.Filters.DictionaryItemMatchFilter).
In the following simple example, the second filter parameter specifies that only one dictionary is applied; multiple dictionaries can be specified as elements of a %List. The third parameter is set to the default of 1, which means a single match of any dictionary item selects the source for inclusion. You might wish to set this higher to avoid selecting sources where a single dictionary match may be coincidental rather than significant, due to either a large number of items in the dictionary and/or querying sources that contain a large amount of text. The fourth parameter takes the default (-1), which puts no maximum limit on number of matches. If the max parameter is smaller than the min parameter, all sources are selected by the filter, regardless of dictionary matches. The fifth parameter also takes its default: matching based on count of matches, not match score. The sixth parameter is the ensureMatched flag; here ensureMatched=2, so that instantiating the filter generates static match results which are then used each time the filter in invoked. This is the preferred usage. You would need to set ensureMatched to 1 if a dictionary is modified after the filter is instantiated; ensureMatched=1 takes into account changing dictionary contents by matching before every invocation of the filter. However, use of ensureMatched=1 can result is significantly slower performance.
DomainCreateOrOpen
  SET dname="mydomain"
  IF (##class(%iKnow.Domain).Exists(dname))
      { SET domoref=##class(%iKnow.Domain).Open(dname)
        GOTO DeleteOldData }
  ELSE { SET domoref=##class(%iKnow.Domain).%New(dname)
         DO domoref.%Save()
         GOTO SetEnvironment }
DeleteOldData
  SET stat=domoref.DropData()
  IF stat { GOTO SetEnvironment }
  ELSE    { WRITE "DropData error ",$System.Status.DisplayError(stat)
            QUIT}
SetEnvironment
  SET domId=domoref.Id
ListerAndLoader
  SET domId=domoref.Id
  SET flister=##class(%iKnow.Source.SQL.Lister).%New(domId)
  SET myloader=##class(%iKnow.Source.Loader).%New(domId)
QueryBuild
   SET myquery="SELECT TOP 50 ID AS UniqueVal,Type,NarrativeFull FROM Aviation.Event"
   SET idfld="UniqueVal"
   SET grpfld="Type"
   SET dataflds=$LB("NarrativeFull")
UseListerAndLoader
  SET stat=flister.AddListToBatch(myquery,idfld,grpfld,dataflds)
      IF stat '= 1 {WRITE "The lister failed: ",$System.Status.DisplayError(stat) QUIT }
  SET stat=myloader.ProcessBatch()
      IF stat '= 1 {WRITE "The loader failed: ",$System.Status.DisplayError(stat) QUIT }
CreateDictionary
  SET dictname="EngineTerms"
  SET dictdesc="A dictionary of aviation engine terms"
  SET dictId=##class(%iKnow.Matching.DictionaryAPI).CreateDictionary(domId,dictname,dictdesc)
  IF dictId=-1 {WRITE "Dictionary ",dictname," already exists",!
                GOTO ResetForNextTime }
  ELSE {WRITE "created dictionary ",dictId,!}
PopulateDictionaryItem1
   SET itemId=##class(%iKnow.Matching.DictionaryAPI).CreateDictionaryItem(domId,dictId,
       "engine parts",domId_dictId_1)
     SET term1Id=##class(%iKnow.Matching.DictionaryAPI).CreateDictionaryTerm(domId,itemId,
         "piston")
     SET term2Id=##class(%iKnow.Matching.DictionaryAPI).CreateDictionaryTerm(domId,itemId,
         "cylinder")
     SET term2Id=##class(%iKnow.Matching.DictionaryAPI).CreateDictionaryTerm(domId,itemId,
         "crankshaft")
     SET term2Id=##class(%iKnow.Matching.DictionaryAPI).CreateDictionaryTerm(domId,itemId,
         "camshaft")
DefineAFilter
  SET filt=##class(%iKnow.Filters.DictionaryMatchFilter).%New(domId,$LB(dictId),1,,,2)
SourceCountQuery
  SET numSrcD=##class(%iKnow.Queries.SourceAPI).GetCountByDomain(domId)
  WRITE "The ",dname," domain contains ",numSrcD," sources",!
SourcesFilteredByDictionaryMatch
  SET numSrcFD=##class(%iKnow.Queries.SourceAPI).GetCountByDomain(domId,filt)
  WRITE "Of these ",numSrcD,", ",numSrcFD," match the ",dictname," dictionary:",!!
  DO ##class(%iKnow.Queries.SourceAPI).GetByDomain(.result,domId,1,50,filt)
  SET i=1
  WHILE $DATA(result(i)) {
     SET intId = $LISTGET(result(i),1)
     SET extId = $LISTGET(result(i),2)
     WRITE "dictionary matches ",intId," ",extId,!
     SET i=i+1 }
  WRITE !,i-1," sources included by dictionary match",!
ResetForNextTime
  IF dictId = -1 {
     SET dictId=##class(%iKnow.Matching.DictionaryAPI).GetDictionaryId(domId,dictname)}
  SET stat=##class(%iKnow.Matching.DictionaryAPI).DropDictionary(domId,dictId)
  IF stat {WRITE "deleted dictionary ",dictId,! }
  ELSE    { WRITE "DropDictionary error ",$System.Status.DisplayError(stat) }  
 
Filtering by Indexing Date Metadata
Every iKnow source is assigned the DateIndexed metadata field. The value of this field is the date and time that a source was indexed by iKnow, in Coordinated Universal Time format (UTC) represented in $HOROLOG format. This is the same as the $ZTIMESTAMP time, except that DateIndexed does not include fractional seconds.
You can create a filter using DateIndexed to include or exclude sources based on when iKnow loaded the source. You can filter using a specific date and time, or filter for a specific date, which encompass all time values within that date. You can use BETWEEN logic to filter for a range of dates.
The following example filters for sources loaded today:
DomainCreateOrOpen
  SET dname="mydomain"
  IF (##class(%iKnow.Domain).Exists(dname))
      { SET domoref=##class(%iKnow.Domain).Open(dname)
        GOTO DeleteOldData }
  ELSE { SET domoref=##class(%iKnow.Domain).%New(dname)
         DO domoref.%Save()
         GOTO SetEnvironment }
DeleteOldData
  SET stat=domoref.DropData()
  IF stat { GOTO SetEnvironment }
  ELSE    { WRITE "DropData error ",$System.Status.DisplayError(stat)
            QUIT}
SetEnvironment
  SET domId=domoref.Id
ListerAndLoader
  SET domId=domoref.Id
  SET flister=##class(%iKnow.Source.SQL.Lister).%New(domId)
  SET myloader=##class(%iKnow.Source.Loader).%New(domId)
QueryBuild
   SET myquery="SELECT TOP 50 ID AS UniqueVal,Type,NarrativeFull FROM Aviation.Event"
   SET idfld="UniqueVal"
   SET grpfld="Type"
   SET dataflds=$LB("NarrativeFull")
UseListerAndLoader
  SET stat=flister.AddListToBatch(myquery,idfld,grpfld,dataflds)
      IF stat '= 1 {WRITE "The lister failed: ",$System.Status.DisplayError(stat) QUIT }
  SET stat=myloader.ProcessBatch()
      IF stat '= 1 {WRITE "The loader failed: ",$System.Status.DisplayError(stat) QUIT }
DefineAFilter
  SET tday = $PIECE($ZTIMESTAMP,",",1)
  SET filt=##class(%iKnow.Filters.SimpleMetadataFilter).%New(domId,"DateIndexed","=",tday)
DateIndexedValue
  WRITE "Today is ",$PIECE($ZTIMESTAMP,",",1),!
  DO ##class(%iKnow.Queries.SourceAPI).GetByDomain(.result,domId,1,50)
  SET i=1
  WHILE $DATA(result(i)) {
     SET srcId = $LISTGET(result(i),1)
     SET extId = $LISTGET(result(i),2)
     SET idate = ##class(%iKnow.Queries.MetadataAPI).GetValue(domId,"DateIndexed",extId)
     WRITE "Source ",srcId," was indexed ",idate,!
     SET i=i+1 }
SourceSentenceQueries
  SET numSrcD=##class(%iKnow.Queries.SourceAPI).GetCountByDomain(domId)
  WRITE "The ",dname," domain contains ",numSrcD," sources",!
  SET numSrcFD=##class(%iKnow.Queries.SourceAPI).GetCountByDomain(domId,filt)
  WRITE "Of these sources ",numSrcFD," were indexed today"
 
Filtering by User-defined Metadata
In iKnow, data is the contents of a source that iKnow processes and indexes. In iKnow, metadata can be any data associated with a source that is not iKnow indexed data. You use iKnow metadata to identify iKnow data. A metadata filter uses the value of a metadata field to determine which sources to supply to a query.
Note:
The iKnow definition of “metadata” describes how data is used, not the intrinsic nature of the data. This concept differs somewhat from the way this word is used elsewhere in Caché software.
iKnow provides a default metadata management system that is independent of the query APIs. The %iKnow.Queries.MetadataAPI class and accompanying %iKnow.Filters.SimpleMetadataFilter provide implementations for basic metadata filtering. If you wish to implement a custom Metadata API, you should implement (at least) the %iKnow.Queries.MetadataI interface and register your class as the "MetadataAPI" domain parameter: DO domain.SetParameter("MetadataAPI","Your.Metadata.Class"). The example that follows uses the %iKnow.Filters.SimpleMetadataFilter class.
In Caché SQL, each record of an SQL table constitutes an iKnow source. Through the ProcessList() method (for small numbers of records) or AddListToBatch() method (for large numbers of records), you define the Lister parameters:
Note that it is possible to specify the same field as both one of the data fields and as a metadata field. You can optionally also define metakey fields that correspond to the metadata fields.
This is shown in the following example. The Aviation.Event table contains various fields in addition to the NarrativeFull text field. In this example, InjuriesTotal is used as a metadata field. This metadata field is used in three filters: two equality filters, which filter for InjuriesTotal>2 and InjuriesTotal=3, and a BETWEEN filter that filters for InjuriesTotal between 3 and 5 (inclusive). This example uses the DropData(1) method, because DropData() with no argument does not delete metadata. Also note that the AddField() method must be invoked before listing and loading the data.
#Include %IKPublic
DomainCreateOrOpen
  SET dname="mydomain"
  IF (##class(%iKnow.Domain).Exists(dname))
      { SET domoref=##class(%iKnow.Domain).Open(dname)
        GOTO DeleteOldData }
  ELSE { SET domoref=##class(%iKnow.Domain).%New(dname)
         DO domoref.%Save()
         GOTO SetEnvironment }
DeleteOldData
  SET stat=domoref.DropData(1)
  IF stat { GOTO SetEnvironment }
  ELSE    { WRITE "DropData error ",$System.Status.DisplayError(stat)
            QUIT}
SetEnvironment
  SET domId=domoref.Id
ListerAndLoader
  SET domId=domoref.Id
  SET flister=##class(%iKnow.Source.SQL.Lister).%New(domId)
  SET myloader=##class(%iKnow.Source.Loader).%New(domId)
QueryBuild
   SET myquery="SELECT TOP 100 ID AS UniqueVal,Type,NarrativeFull,InjuriesTotal,InjuriesTotalFatal FROM Aviation.Event"
   SET idfld="UniqueVal"
   SET grpfld="Type"
   SET dataflds=$LB("NarrativeFull")
   SET metaflds=$LB("InjuriesTotal","InjuriesTotalFatal")
AddMetaFields
  SET val=##class(%iKnow.Queries.MetadataAPI).AddField(domId,"InjuriesTotal",
                   $LB("=","<",">","BETWEEN"),$$$MDDTNUMBER)
  SET val=##class(%iKnow.Queries.MetadataAPI).AddField(domId,"InjuriesTotalFatal",
                   $LB("=","<",">","BETWEEN"),$$$MDDTNUMBER)
UseListerAndLoader
  SET stat=flister.AddListToBatch(myquery,idfld,grpfld,dataflds,metaflds)
      IF stat '= 1 {WRITE "The lister failed: ",$System.Status.DisplayError(stat) QUIT }
  SET stat=myloader.ProcessBatch()
      IF stat '= 1 {WRITE "The loader failed: ",$System.Status.DisplayError(stat) QUIT }
CountSources
   SET numsrc=##class(%iKnow.Queries.SourceAPI).GetCountByDomain(domId)
ApplyFilter
  SET filt2=##class(%iKnow.Filters.SimpleMetadataFilter).%New(domId,"InjuriesTotal",
                    ">",2)
  SET numSrcF2=##class(%iKnow.Queries.SourceAPI).GetCountByDomain(domId,filt2)
  WRITE "Of these ",numsrc," sources ",numSrcF2," had three or more injuries",!
  SET filt3=##class(%iKnow.Filters.SimpleMetadataFilter).%New(domId,"InjuriesTotal",
                    "=",3)
  SET numSrcF3=##class(%iKnow.Queries.SourceAPI).GetCountByDomain(domId,filt3)
  WRITE "Of these ",numsrc," sources ",numSrcF3," had three injuries",!
  SET filtb=##class(%iKnow.Filters.SimpleMetadataFilter).%New(domId,"InjuriesTotal",
                    "BETWEEN","3;5")
  SET numSrcFb=##class(%iKnow.Queries.SourceAPI).GetCountByDomain(domId,filtb)
  WRITE "Of these ",numsrc," sources ",numSrcFb," had between 3 and 5 injuries",!
 
Metadata Filter Operators
You assign to each filter one or more equality operators. If the filter is matching against a string value, use the “=” equality operator. If the filter is matching against a numeric value, you can use one or more of the following operators: “=”, “<”, “<=”, “>”, “>=”. Equality operators are always specified as quoted string elements in a list structure. Equality operators are matched against a single value. This is shown in the following example:
  SET filt=##class(%iKnow.Filters.SimpleMetadataFilter).%New(domId,metafldname,"=",today)
The BETWEEN operator is matched against a parameter string containing a pair of values that are separated by $$$MDVALSEPARATOR (the semicolon character). This is shown in the following example:
  SET filt=##class(%iKnow.Filters.SimpleMetadataFilter).%New(domId,metafldname,
                   "BETWEEN","yesterday;tomorrow")
Filtering by SQL Query
The %iKnow.Filters.SqlFilter class allows you to select SQL sources based on the results of an SQL query. This query can select on any of the following fields:
Note that these result column names are case sensitive.
For example, the following filter selects for the SourceId of the 6th source retrieved (in this case SourceId 45):
DomainCreateOrOpen
  SET dname="mydomain"
  IF (##class(%iKnow.Domain).Exists(dname))
      { SET domoref=##class(%iKnow.Domain).Open(dname)
        GOTO DeleteOldData }
  ELSE { SET domoref=##class(%iKnow.Domain).%New(dname)
         DO domoref.%Save()
         GOTO SetEnvironment }
DeleteOldData
  SET stat=domoref.DropData()
  IF stat { GOTO SetEnvironment }
  ELSE    { WRITE "DropData error ",$System.Status.DisplayError(stat)
            QUIT}
SetEnvironment
  SET domId=domoref.Id
  IF ##class(%iKnow.Configuration).Exists("myconfig") {
         SET cfg=##class(%iKnow.Configuration).Open("myconfig") }
  ELSE { SET cfg=##class(%iKnow.Configuration).%New("myconfig",0,$LISTBUILD("en"),"",1)
         DO cfg.%Save() }
ListerAndLoader
  SET domId=domoref.Id
  SET flister=##class(%iKnow.Source.SQL.Lister).%New(domId)
  SET myloader=##class(%iKnow.Source.Loader).%New(domId)
QueryBuild
   SET myquery="SELECT TOP 50 ID AS UniqueVal,Type,NarrativeFull FROM Aviation.Event"
   SET idfld="UniqueVal"
   SET grpfld="Type"
   SET dataflds=$LB("NarrativeFull")
UseListerAndLoader
  SET stat=flister.AddListToBatch(myquery,idfld,grpfld,dataflds)
      IF stat '= 1 {WRITE "The lister failed: ",$System.Status.DisplayError(stat) QUIT }
  SET stat=myloader.ProcessBatch()
      IF stat '= 1 {WRITE "The loader failed: ",$System.Status.DisplayError(stat) QUIT }
   DO ##class(%iKnow.Queries.SourceAPI).GetByDomain(.result,domId,1,50)
FilterSources
  SET filter=##class(%iKnow.Filters.SqlFilter).%New(domId,
       "SELECT '"_$LIST(result(6),1)_"' AS SourceId")
  DO ##class(%iKnow.Queries.SourceAPI).GetByDomain(.fresult,domId,0,0,filter)
  WRITE !,"Filtered results:",!
  SET j=1
  WHILE $DATA(fresult(j)) {
    WRITE $LISTTOSTRING(fresult(j)),!
    SET j=j+1 }
 
The following filter selects sources by ExternalId:
  SET filter=##class(%iKnow.Filters.SqlFilter).%New(domId,
          "SELECT '"_$LIST(result(1),2)_"' AS ExternalId")
  DO ##class(%iKnow.Queries.SourceAPI).GetByDomain(domId,0,0,filter)
  ZWRITE result  
Filter Modes
The FilterMode argument specifies what statistical reprocessing should be performed after applying a filter. If no reprocessing is performed, the filter is applied but the frequency and spread statistics are the values calculated for the item before filtering and the sort sequence remains unchanged. The following are the available filter modes:
FilterMode Integer code Filter? Recalculate Frequency? Recalculate Spread? Re-sort Results?
$$$FILTERONLY 1 YES NO NO NO
$$$FILTERFREQ 3 YES YES NO NO
$$$FILTERSPREAD 5 YES NO YES NO
$$$FILTERALL 7 YES YES YES NO
$$$FILTERFREQANDSORT 11 YES YES NO YES
$$$FILTERSPREADANDSORT 13 YES NO YES YES
$$$FILTERALLANDSORT 15 YES YES YES YES
The default is $$$FILTERONLY.
Refer to Constants in the “iKnow Implementation” chapter for use of $$$ macros.
Using GroupFilter to Combine Multiple Filters
iKnow supplies a GroupFilter class that allows you to supply logic to combine the results of other filters. You must first create a GroupFilter instance that provides a defined logic, and then use the AddSubFilter() method to assign it one or more subfilter objects that are combined according to the GroupFilter logic. In this way you can combine multiple existing filters to select which sources are supplied to an iKnow query.
The simplest GroupFilter logic returns the inverse of a single filter. In the following example, the RandFilt selects 33% of the sources. The GroupFilter defines AND logic with a Negated boolean operator. When applied to a single filter, this logic returns all of the sources not selected by the filter:
DomainCreateOrOpen
  SET dname="mydomain"
  IF (##class(%iKnow.Domain).Exists(dname))
      { SET domoref=##class(%iKnow.Domain).Open(dname)
        GOTO DeleteOldData }
  ELSE { SET domoref=##class(%iKnow.Domain).%New(dname)
         DO domoref.%Save()
         GOTO SetEnvironment }
DeleteOldData
  SET stat=domoref.DropData()
  IF stat { GOTO SetEnvironment }
  ELSE    { WRITE "DropData error ",$System.Status.DisplayError(stat)
            QUIT}
SetEnvironment
  SET domId=domoref.Id
QueryBuild
   SET myquery="SELECT TOP 50 ID AS UniqueVal,Type,NarrativeFull FROM Aviation.Event"
   SET idfld="UniqueVal"
   SET grpfld="Type"
   SET dataflds=$LB("NarrativeFull")
ListerAndLoader
  SET domId=domoref.Id
  SET flister=##class(%iKnow.Source.SQL.Lister).%New(domId)
  SET myloader=##class(%iKnow.Source.Loader).%New(domId)
  SET stat=flister.AddListToBatch(myquery,idfld,grpfld,dataflds)
      IF stat '= 1 {WRITE "The lister failed: ",$System.Status.DisplayError(stat) QUIT }
  SET stat=myloader.ProcessBatch()
      IF stat '= 1 {WRITE "The loader failed: ",$System.Status.DisplayError(stat) QUIT }
DefineARandomFilter
  SET randfilt=##class(%iKnow.Filters.RandomFilter).%New(domId,.33)
SampledSourceQueries
  SET numSrcD=##class(%iKnow.Queries.SourceAPI).GetCountByDomain(domId)
  WRITE "The ",dname," domain contains ",numSrcD," sources",!
  SET numSrcFD=##class(%iKnow.Queries.SourceAPI).GetCountByDomain(domId,randfilt)
  WRITE "From ",numSrcD," sources randfilt sampled ",numSrcFD,!
GroupFilter
    SET grpfilt=##class(%iKnow.Filters.GroupFilter).%New(domId,"AND",1)
    DO grpfilt.AddSubFilter(randfilt)
  SET numSrcGrp=##class(%iKnow.Queries.SourceAPI).GetCountByDomain(domId,grpfilt)
  WRITE "From ",numSrcD," sources grpfilt sampled ",numSrcGrp
 
The following example uses a GroupFilter to combine the results of two other filters (in this case, both are random filters). Because the GroupFilter logic is AND, and the Negated=0, the GroupFilter results are those sources that are found in both RandomFilter sets. Because the results of these filters are random, the number of GroupFilter AND results will likely differ each time this example is executed:
DomainCreateOrOpen
  SET dname="mydomain"
  IF (##class(%iKnow.Domain).Exists(dname))
      { SET domoref=##class(%iKnow.Domain).Open(dname)
        GOTO DeleteOldData }
  ELSE { SET domoref=##class(%iKnow.Domain).%New(dname)
         DO domoref.%Save()
         GOTO SetEnvironment }
DeleteOldData
  SET stat=domoref.DropData()
  IF stat { GOTO SetEnvironment }
  ELSE    { WRITE "DropData error ",$System.Status.DisplayError(stat)
            QUIT}
SetEnvironment
  SET domId=domoref.Id
QueryBuild
   SET myquery="SELECT TOP 50 ID AS UniqueVal,Type,NarrativeFull FROM Aviation.Event"
   SET idfld="UniqueVal"
   SET grpfld="Type"
   SET dataflds=$LB("NarrativeFull")
ListerAndLoader
  SET domId=domoref.Id
  SET flister=##class(%iKnow.Source.SQL.Lister).%New(domId)
  SET myloader=##class(%iKnow.Source.Loader).%New(domId)
  SET stat=flister.AddListToBatch(myquery,idfld,grpfld,dataflds)
      IF stat '= 1 {WRITE "The lister failed: ",$System.Status.DisplayError(stat) QUIT }
  SET stat=myloader.ProcessBatch()
      IF stat '= 1 {WRITE "The loader failed: ",$System.Status.DisplayError(stat) QUIT }
DefineTwoRandomFilters
  SET randfilt1=##class(%iKnow.Filters.RandomFilter).%New(domId,.33)
  SET randfilt2=##class(%iKnow.Filters.RandomFilter).%New(domId,.25)
  SET numSrcD=##class(%iKnow.Queries.SourceAPI).GetCountByDomain(domId)
  WRITE "The ",dname," domain contains ",numSrcD," sources",!
  SET numSrcFD=##class(%iKnow.Queries.SourceAPI).GetCountByDomain(domId,randfilt1)
  WRITE "From ",numSrcD," sources randfilt1 sampled ",numSrcFD,!
  SET numSrcFD=##class(%iKnow.Queries.SourceAPI).GetCountByDomain(domId,randfilt2)
  WRITE "From ",numSrcD," sources randfilt2 sampled ",numSrcFD,!
GroupFilter
    SET grpfilt=##class(%iKnow.Filters.GroupFilter).%New(domId,"AND",0)
    DO grpfilt.AddSubFilter(randfilt1)
    DO grpfilt.AddSubFilter(randfilt2)
  SET numSrcGrp=##class(%iKnow.Queries.SourceAPI).GetCountByDomain(domId,grpfilt)
  WRITE "From ",numSrcD," sources grpfilt sampled ",numSrcGrp
 
You assign each GroupFilter either the AND ($$$GROUPFILTERAND) or the OR ($$$GROUPFILTEROR) logical operator. Therefore, to create a compound filter involving both AND and OR logic, you must create a GroupFilter with AND logic and a GroupFilter with OR logic.
The following example corresponds to the Boolean expression "(filter1 AND !(filter2 OR filter3))":
#Include %IKPublic
   SET domoref=##class(%iKnow.Domain).%New("MyDomain")
   DO domoref.%Save()
   SET domId=domoref.Id
   /* . . . */
Create3Filters
    /* . . . */
GroupFilters
  SET group1=##class(%iKnow.Filters.GroupFilter).%New(domId,$$$GROUPFILTERAND,0)
  SET group2=##class(iKnow.Filters.GroupFilter).%New(domId,$$$GROUPFILTEROR,1)
  DO group1.AddSubFilter(filter1)
  DO group2.AddSubFilter(filter2)
  DO group2.AddSubFilter(filter3)
  DO group1.AddSubFilter(group2)