Using iKnow
Alternatives for Creating an iKnow Environment

Before using iKnow on source data, you must create an instance of the objects that define the iKnow environment. You then load source texts into this environment. The recommended way to do this is to use iKnow Architect. This chapter describes the iKnow environment objects in greater depth, and provides alternative ways to create and extend them.

The iKnow environment comprises three kinds of objects: Domains, Configurations, and UserDictionaries. You can create multiple instances of each. These environment objects are independent of one another, and are independent of any specified set of source data.
iKnow Domains
All iKnow operations occur within a Domain. A domain is an iKnow-defined unit within a Caché namespace. All source data to be used by iKnow is listed and loaded into a domain. A Caché namespace can contain multiple iKnow domains.
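You can define, modify, and delete an iKnow domain in three ways: by using iKnow Architect, by defining a subclass of %iKnow.DomainDefinition, or by using the %iKnow.Domain class methods. The latter two are described below.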
Defining a Domain as a Subclass
When a user creates and compiles a class inheriting from %iKnow.DomainDefinition, the compiler will automatically create an iKnow domain corresponding to the settings specified in XML representation in the class’s Domain XData block. The user can specify static elements, such as domain parameters, metadata field definitions, and an assigned Configuration, all of which are created automatically at compile time. In addition, the user can specify sources of text data to be loaded into the domain. Caché uses this source information to generate a dedicated %Build() method in a new class named [classname].Domain. This %Build() method can then be used to load the specified data into the domain.
Use the following steps to define a domain by inheriting from %iKnow.DomainDefinition:
  1. In Caché Studio, select the desired namespace (File->Change Namespace), then create a new class definition (File->New, select the Caché Class Definition icon from the General tab). This invokes the New Class Wizard.
  2. In the New Class Wizard specify a Package Name and Class Name of your choice. Press Next. In the Class Type box, select Extends and specify %iKnow.DomainDefinition as the name of the superclass. Press Finish.
  3. In Caché Studio, select Class->Refactor->Override. Select the XData tab. Select the Domain icon. Press OK. This creates an XData block.
  4. Place the cursor within the XData block curly braces and type “<”. Studio Assist immediately begins offering XML code options. Specify a domain name of your choice. You must specify at minimum the following:
     <domain name="AviationEvents">
     </domain>
    Using XML syntax and Studio Assist options you can add other properties to the XData block (as described below).
  5. In Caché Studio, save the file (in this case as “MyDomain”), then select Build->Compile. This creates the domain.
The following is an example of this kind of domain definition:
Class Aviation.MyDomain Extends %iKnow.DomainDefinition
{

/// An XML representation of the domain this class defines.
XData Domain [ XMLNamespace = "" ]
{
<domain name="AviationEvents">
  <parameter name="Status" value="1" />
  <configuration name="MyConfig" detectLanguage="1" languages="en,es" />
  <parameter name="DefaultConfig" value="MyConfig" />
  <metadata>
    <field name="EventDate" dataType="DATE" />
    <field name="Type" dataType="STRING" />
  </metadata>
  <data dropBeforeBuild="true">
    <files path="C:\MyDocs\" encoding="utf-8" recursive="1" extensions="txt"
           configuration="MyConfig" />
    <query sql="SELECT %ID,EventDate,Type,NarrativeFull
                FROM Aviation.Event"
           idField="ID" groupField="ID" dataFields="NarrativeFull"
           metadataFields="EventDate,Type" metadataColumns="EventDate,Type" />
  </data>
</domain>
}

}
At compile-time, this definition creates a domain "AviationEvents", with a Status parameter set to 1, and two metadata fields. It defines and assigns to the domain a Configuration "MyConfig" for processing English (en) and Spanish (es) texts.
This definition specifies the files to be loaded into this domain. It will load text (.txt) files from the C:\MyDocs\ directory, and it will load Caché SQL data from the Aviation.Event table. Refer to the %iKnow.Model.listFiles and %iKnow.Model.listQuery class properties for details.
Caché generates a %Build() method in a dependent class named Aviation.MyDomain.Domain that contains the logic to load data from the C:\MyDocs directory and the Aviation.Event table.
To load the specified text data sources into this domain:
  SET stat=##class(Aviation.MyDomain).%Build()
After using %Build(), you can check for errors:
  DO $SYSTEM.iKnow.ListErrors("AviationEvents",0)
This lists three types of issues: errors, failed sources, and warnings.
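Because the %Build() call above assigns its return value to a variable, it can be useful to inspect that value before listing errors. The following is a minimal sketch, assuming that %Build() returns a %Status value and that the Aviation.MyDomain class shown above has been compiled. It relies on the general-purpose $SYSTEM.Status methods, not on anything specific to iKnow:

```objectscript
  // Load the sources declared in the domain definition
  SET stat=##class(Aviation.MyDomain).%Build()
  IF $SYSTEM.Status.IsError(stat) {
     // Assumption: stat is a %Status value; display why the build failed
     DO $SYSTEM.Status.DisplayError(stat)
  }
  ELSE {
     WRITE "Build completed",!
  }
  // List errors, failed sources, and warnings recorded for the domain
  DO $SYSTEM.iKnow.ListErrors("AviationEvents",0)
```

Calling ListErrors() in both cases is deliberate: a build can complete while individual sources fail to load.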
To display the domains defined in the current namespace and the number of sources loaded for each domain:
  DO $SYSTEM.iKnow.ListDomains() 
To display the metadata fields defined for this domain:
  DO $SYSTEM.iKnow.ListMetadata("AviationEvents")
iKnow assigns every domain one metadata field: DateIndexed. You can define additional metadata fields.
You can specify a <matching> element in the domain definition. The <matching> element describes dictionary information and specifies whether or not to automatically match loaded sources to these dictionaries. Caché performs basic validation on these objects during class compilation, but because they are loaded as part of the %Build() method, some name conflicts might only arise at runtime. Refer to the %iKnow.Model.matching class properties for details.
You can specify a <metrics> element in the domain definition. The <metrics> element adds custom metrics to the domain. No call to %iKnow.Metrics.MetricDefinition.Register() is required, because this is automatically performed by the Domain Definition code at compile time. Refer to the %iKnow.Model.metrics class properties for details.
Defining a Domain Programmatically
To define a new domain using class methods, invoke the %iKnow.Domain.%New() persistent method, supplying the domain name as the method parameter. A domain name can be any valid string; domain names are not case-sensitive. The name you assign to this domain must be unique for the current namespace. This method returns a domain object reference (oref) which is unique for all namespaces of the Caché instance. You must then save this instance using the %Save() method to make it persistent. The domain Id property (an integer value) is not defined until you save the instance as a persistent object, as shown in the following example:
   SET domOref=##class(%iKnow.Domain).%New("FirstExampleDomain")
   WRITE "Id before save: ",domOref.Id,!
   DO domOref.%Save()
   WRITE "Id after save: ",domOref.Id,!
   DO ##class(%iKnow.Domain).%DeleteId(domOref.Id)
   WRITE "All done"
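There are two ways to create a domain if it doesn’t exist, or open the domain if it does exist: test for the domain with Exists() and then call either Open() or %New(), or call the GetOrCreateId() shorthand method. Both are shown below.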
The following example checks whether a domain exists. If the domain doesn’t exist, the program creates it. If the domain does exist, the program opens it. For the purpose of demonstration, this program then randomly either deletes or doesn’t delete the domain.
  SET domn="mydomain"
  IF (##class(%iKnow.Domain).Exists(domn)) {
     WRITE "The ",domn," domain already exists",!
     SET domo=##class(%iKnow.Domain).Open(domn)
     SET domId=domo.Id
  }
  ELSE {
     SET domo=##class(%iKnow.Domain).%New(domn)
     DO domo.%Save()
     SET domId=domo.Id
     WRITE "Created the ",domn," domain",!
     WRITE "with domain ID ",domId,!
  }
  /* Report whether any data has been loaded */
  SET x=domo.IsEmpty()
  IF x=1 {WRITE "Domain ",domn," contains no data",!}
  ELSE {WRITE "Domain ",domn," contains data",!}
  /* For demonstration only: randomly delete the domain */
  SET rnd=$RANDOM(2)
  IF rnd {
     SET stat=##class(%iKnow.Domain).%DeleteId(domId)
     IF stat {WRITE "Deleted the ",domn," domain" }
     ELSE { WRITE "Domain delete error:",stat }
  }
  ELSE {WRITE "No delete this time" }
The following example uses the GetOrCreateId() shorthand method to create a domain if it doesn’t exist, or open the domain if it does exist. For the purpose of demonstration, this program then randomly either deletes or doesn’t delete the domain.
  SET domn="mydomain"
  SET domId=##class(%iKnow.Domain).GetOrCreateId(domn)
  SET domoref=##class(%iKnow.Domain).%OpenId(domId)
  WRITE "The ",domn," domain with domain ID ",domId,!
  /* Report whether any data has been loaded */
  SET x=domoref.IsEmpty()
  IF x=1 {WRITE "Domain ",domn," contains no data",!}
  ELSE {WRITE "Domain ",domn," contains data",!}
  /* For demonstration only: randomly delete the domain */
  SET rnd=$RANDOM(2)
  IF rnd {
     SET stat=##class(%iKnow.Domain).%DeleteId(domId)
     IF stat {WRITE "Deleted the ",domn," domain" }
     ELSE { WRITE "Domain delete error:",stat }
  }
  ELSE {WRITE "No delete this time" }
The %iKnow.Domain class methods that create or open a domain are provided with an output %Status parameter. This parameter is set when the current system does not have license access to iKnow, and thus cannot create or open an iKnow domain.
Setting Domain Parameters
Domain parameters govern the behavior of a wide variety of iKnow operations. The specific parameters are described where applicable throughout this manual. For a list of available domain parameters, refer to the Appendix Domain Parameters.
In the examples that follow, domain parameters are referenced by their macro equivalent (for example, $$$IKPFULLMATCHONLY), not their parameter name (for example, FullMatchOnly). The recommended programming practice is to use these %IKPublic macros rather than the parameter names.
All domain parameters take a default value. Commonly, iKnow will give optimal results without specifically setting any domain parameters. iKnow determines the value for each parameter as follows:
  1. If you have specified a parameter value for the current domain, that value is used. Note that some parameters can only be set before loading data into a domain, while others can be set at any time. You can use the IsEmpty() method to determine if any data has been loaded into the current domain.
  2. If you have specified a systemwide parameter value, that value is used as a default for all domains, except for a domain where a domain-specific value has been set.
  3. If you have not specified a value for a parameter at either the domain level or the system level, iKnow uses its default value for that parameter.
Setting Parameters for the Current Domain
Once you have created a domain, you can set domain parameters for this specific domain using the SetParameter() instance method. SetParameter() returns a status indicating whether the parameter specified is valid and was set. GetParameter() returns the parameter value and the level at which the parameter was set (DEFAULT, DOMAIN, or SYSTEM). Note that GetParameter() does not check the validity of a parameter name; it returns DEFAULT for any parameter name it cannot identify as being set at the domain or system level.
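The difference in validation between the two methods can be seen by passing a name that is not a domain parameter. The following is a minimal sketch, not a definitive test; "NoSuchParameter" and the domain name "paramcheckdomain" are hypothetical values chosen for illustration:

```objectscript
  SET domo=##class(%iKnow.Domain).%New("paramcheckdomain")
  DO domo.%Save()
  // GetParameter() does not validate the name; per the text above,
  // the level reported for an unrecognized name is DEFAULT
  SET val=domo.GetParameter("NoSuchParameter",.level)
  WRITE "GetParameter level: ",level,!
  // SetParameter() does validate the name and reports it in its status
  SET stat=domo.SetParameter("NoSuchParameter",1)
  IF $SYSTEM.Status.IsError(stat) {WRITE "SetParameter rejected the name",!}
  ELSE {WRITE "SetParameter accepted the name",!}
  // Clean up the demonstration domain
  DO ##class(%iKnow.Domain).%DeleteId(domo.Id)
```

In practice this means a misspelled parameter name is caught when setting a value, but silently reported as DEFAULT when reading one.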
The following example gets the default for the SortField domain parameter, sets this parameter for the current domain, then gets the value you set and the level at which it was set (DOMAIN):
#Include %IKPublic
  SET domn="paramdomain"
     SET domo=##class(%iKnow.Domain).%New(domn)
     WRITE "Created the ",domn," domain",!
     DO domo.%Save()
  SET sfval=domo.GetParameter($$$IKPSORTFIELD,.sf)
     WRITE "SortField before SET=",sfval," ",sf,!
  IF sfval=0 {WRITE "changing SortByFrequency to SortBySpread",!
           SET stat=domo.SetParameter($$$IKPSORTFIELD,1) 
           IF stat=0 {WRITE "SetParameter failed"  QUIT} }
  WRITE "SortField after SET=",domo.GetParameter($$$IKPSORTFIELD,.str)," ",str,!!
  SET stat=##class(%iKnow.Domain).%DeleteId(domo.Id)
  IF stat {WRITE "Deleted the ",domn," domain" }
  ELSE { WRITE "Domain delete error:",stat }
Setting Parameters Systemwide
You can set domain parameters for all domains systemwide using the SetSystemParameter() method. A parameter set using this method immediately becomes the default parameter value for all existing and subsequently created domains in all namespaces. This systemwide default is overridden for an individual domain using the SetParameter() instance method.
The SortField and Jobs domain parameters are exceptions. Setting these parameters at the system level has no effect on the domain settings.
You can determine if a domain parameter has been established as the system default using the GetSystemParameter() method. The initial value for a systemwide parameter is always the null string (no default).
If you wish to remove a systemwide default setting for a domain parameter, use the UnsetSystemParameter() method. Once a systemwide parameter setting has been established, you must unset it before you can set it to a new value. UnsetSystemParameter() returns a status of 1 (success) even when there was no parameter default value to unset.
The following example establishes a FullMatchOnly systemwide parameter value. If no systemwide default has been established, the program sets this systemwide parameter. If a systemwide default has been established, the program unsets this systemwide parameter, then sets it.
#Include %IKPublic
  /* Initial set */
  SET stat=##class(%iKnow.Domain).SetSystemParameter($$$IKPFULLMATCHONLY,1)
  IF stat=1 { 
    WRITE "FullMatchOnly set systemwide to: "
    WRITE ##class(%iKnow.Domain).GetSystemParameter($$$IKPFULLMATCHONLY),!
    QUIT }
  ELSE {
  /* Unset and Reset */
    SET stat=##class(%iKnow.Domain).UnsetSystemParameter($$$IKPFULLMATCHONLY)
    IF stat=1 {
      SET stat=##class(%iKnow.Domain).SetSystemParameter($$$IKPFULLMATCHONLY,1)
      IF stat=1 {
        WRITE "FullMatchOnly was unset systemwide",!,"then set to: "
        WRITE ##class(%iKnow.Domain).GetSystemParameter($$$IKPFULLMATCHONLY),!!
        GOTO CleanUpForNextTime }
      ELSE {WRITE "System Parameter set error",stat,!}
    }
    ELSE {WRITE "System Parameter unset error",stat,!}
  }
CleanUpForNextTime
  SET stat=##class(%iKnow.Domain).UnsetSystemParameter($$$IKPFULLMATCHONLY)
  IF stat '=1 {WRITE "   Unset error status:",stat}
The following example shows that setting a systemwide parameter value immediately sets the parameter value for all domains. After setting a systemwide parameter value, you can override this value for individual domains:
#Include %IKPublic
  SET stat=##class(%iKnow.Domain).UnsetSystemParameter($$$IKPFULLMATCHONLY)
  WRITE "Systemwide setting FullMatchOnly=",##class(%iKnow.Domain).GetSystemParameter($$$IKPFULLMATCHONLY),!!
  SET domn1="mysysdomain1"
     SET domo1=##class(%iKnow.Domain).%New(domn1)
     DO domo1.%Save()
     SET dom1Id=domo1.Id
     WRITE "Created the ",domn1," domain ",dom1Id,!
     WRITE "FullMatchOnly=",domo1.GetParameter($$$IKPFULLMATCHONLY,.str)," ",str,!!
  SET stat=##class(%iKnow.Domain).SetSystemParameter($$$IKPFULLMATCHONLY,1)
       IF stat=0 {WRITE "SetSystemParameter failed"  QUIT}
  WRITE "Set systemwide FullMatchOnly=",##class(%iKnow.Domain).GetSystemParameter($$$IKPFULLMATCHONLY),!!
  SET domn2="mysysdomain2"
     SET domo2=##class(%iKnow.Domain).%New(domn2)
     DO domo2.%Save()
     SET dom2Id=domo2.Id
     WRITE "Created the ",domn2," domain ",dom2Id,!
     WRITE "Domain setting FullMatchOnly=",domo2.GetParameter($$$IKPFULLMATCHONLY,.str)," ",str,!!
  WRITE "New domain ",dom2Id," FullMatchOnly=",domo2.GetParameter($$$IKPFULLMATCHONLY,.str)," ",str,!
  WRITE "Existing domain ",dom1Id," FullMatchOnly=",domo1.GetParameter($$$IKPFULLMATCHONLY,.str)," ",str,!!
  SET stat=domo1.SetParameter($$$IKPFULLMATCHONLY,0)
      IF stat=0 {WRITE "SetParameter failed"  QUIT}
  WRITE "Domain override FullMatchOnly=",domo1.GetParameter($$$IKPFULLMATCHONLY,.str)," ",str,!
  SET stat=##class(%iKnow.Domain).%DeleteId(dom1Id)
  SET stat=##class(%iKnow.Domain).%DeleteId(dom2Id)
  SET stat=##class(%iKnow.Domain).UnsetSystemParameter($$$IKPFULLMATCHONLY)
Assigning to a Domain
Once you have created a domain and (optionally) specified its domain parameters, you can assign various components to that domain, including metadata fields, sources, filters, blacklists, and dictionaries.
These components are defined using various iKnow classes and methods. You can also use the Caché iKnow Architect to define metadata fields, load sources, and define blacklists.
Metadata fields must be defined before loading sources. Filters, blacklists, and dictionaries can be defined or modified at any time.
Deleting All Data from a Domain
Deleting or changing an original source text has no effect on the source data listed and loaded from that text into an iKnow domain. You must explicitly add a source to, or delete a source from, the set of indexed sources.
The %DeleteId() persistent method deletes a domain and all source data that has been listed and loaded in that domain. You can use the DropData() method to delete all source data that has been loaded into a domain without deleting the domain itself. Either method deletes all indexed source data, allowing you to start over with a new set of data sources.
When deleting a domain that contains a significant number of sources, use DropData() to delete the data before using %DeleteId() to delete the domain. If you use %DeleteId() to delete a domain while it still has data in it, Caché will delete the data, but it will journal each data deletion, even if journaling has been disabled. Deleting the data, and then deleting the domain prevents the generation of these large journal files.
You can use the IsEmpty() method to determine if any data has been loaded into a domain.
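The recommended order for removing a large domain (drop the data, then delete the domain) can be sketched as follows. This is an illustration under stated assumptions rather than a definitive procedure; "mybigdomain" is a hypothetical domain name:

```objectscript
  SET dname="mybigdomain"   // hypothetical domain name
  IF ##class(%iKnow.Domain).Exists(dname) {
     SET domoref=##class(%iKnow.Domain).Open(dname)
     SET domId=domoref.Id
     // Drop the indexed data first, unless the domain is already empty
     SET stat=1
     IF 'domoref.IsEmpty() { SET stat=domoref.DropData() }
     // Then delete the (now empty) domain
     IF stat { SET stat=##class(%iKnow.Domain).%DeleteId(domId) }
     IF stat { WRITE "Deleted the ",dname," domain",! }
     ELSE { DO $SYSTEM.Status.DisplayError(stat) }
  }
  ELSE { WRITE "The ",dname," domain does not exist",! }
```

The domain is deleted only if DropData() succeeded, so a failure leaves the domain in place for inspection.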
The following example demonstrates deleting the data from a domain. If the named domain doesn’t exist, the program creates the domain. If the named domain does exist, the program tests for the presence of data. If there is data in the domain, the program opens the domain and deletes the data.
  SET dname="mytestdomain"
  IF (##class(%iKnow.Domain).Exists(dname)) {
     WRITE "The ",dname," domain already exists",!
     SET domoref=##class(%iKnow.Domain).Open(dname)
     IF domoref.IsEmpty() {GOTO RestOfProgram}
     ELSE {GOTO DeleteData }
  }
  ELSE {
     WRITE "The ",dname," domain does not exist",!
     SET domoref=##class(%iKnow.Domain).%New(dname)
     DO domoref.%Save()
     WRITE "Created the ",dname," domain with domain ID ",domoref.Id,!
     GOTO RestOfProgram }
DeleteData
  SET stat=domoref.DropData()
  IF stat { WRITE "Deleted the data from the ",dname," domain",!
            GOTO RestOfProgram }
  ELSE    { WRITE "DropData error",!
            QUIT }
RestOfProgram
  WRITE "The ",dname," domain contains no data"
Listing All Domains
You can use the GetAllDomains query to list all current domains. This is shown in the following example:
  DO ##class(%ResultSet).RunQuery("%iKnow.Domain","GetAllDomains")
  WRITE !,"Domains in all namespaces"
Each domain is listed on a separate line, using the following format: domainId:domainName:namespace:version.
The Version property is an integer that shows what version of iKnow data structure was used when the domain was created. The iKnow system version number changes when a release contains a change to the iKnow data structures. Therefore, a new version of Caché or the introduction of new iKnow features may not change the iKnow system version number. If the Version property value for a domain is not the current iKnow system version, you may wish to upgrade the domain to take advantage of the latest features of iKnow. See Upgrading iKnow Data in the “iKnow Implementation” chapter.
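If you need the individual values rather than the printed listing, the same query can be run through a %ResultSet object. The following is a minimal sketch; it assumes that the query columns follow the order of the printed format (domain Id, domain name, namespace, version), which should be confirmed against the class reference:

```objectscript
  // Run the GetAllDomains query through a result set object
  SET rs=##class(%ResultSet).%New("%iKnow.Domain:GetAllDomains")
  SET stat=rs.Execute()
  IF $SYSTEM.Status.IsError(stat) { DO $SYSTEM.Status.DisplayError(stat) QUIT }
  WHILE rs.Next() {
     // Assumption: columns are domainId, domainName, namespace, version
     WRITE "Domain ",rs.GetData(2)," (Id ",rs.GetData(1),") in ",rs.GetData(3)
     WRITE ", data structure version ",rs.GetData(4),!
  }
  DO rs.Close()
```

Reading the columns individually makes it straightforward to, for example, flag domains whose version differs from the current iKnow system version.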
By default, GetAllDomains lists all the current domains for all namespaces. You can specify a boolean argument to limit the listing of domains to the current namespace, as shown in the following example:
  ZNSPACE "USER"
  DO ##class(%ResultSet).RunQuery("%iKnow.Domain","GetAllDomains",1)
  WRITE !,"Domains in the USER namespace",!!
  ZNSPACE "SAMPLES"
  DO ##class(%ResultSet).RunQuery("%iKnow.Domain","GetAllDomains",1)
  WRITE !,"Domains in the SAMPLES namespace",!
A boolean value of 1 limits listing to domains in the current namespace. A boolean value of 0 (the default) lists all domains in all namespaces. (Note: listed Version property values may not be correct for domains other than the current domain.)
You can also list all domains in the current namespace using:
  DO ##class(%SYSTEM.iKnow).ListDomains()
This method lists the domain Ids, domain names, number of sources, and the domain version number.
Renaming a Domain
You can use the Rename() class method to change the name of an existing domain within the current namespace, as shown in the following example:
   SET stat=##class(%iKnow.Domain).Rename(oldname,newname)
   IF stat=1 {WRITE "renamed ",oldname," to ",newname,!}
   ELSE {WRITE "no rename: ",oldname," is unchanged",! }
Renaming a domain changes the name used to open the domain, assigning the existing Domain Id to the new name. Rename() does not change the name of a current instance of the domain. For a rename to occur, the old domain name must exist and the new domain name must not exist.
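The two preconditions for a rename can be checked explicitly before calling Rename(), which gives a more specific message than a failed status alone. The following is a minimal sketch; both domain names are hypothetical:

```objectscript
  SET oldname="mydomain"          // hypothetical existing domain
  SET newname="myrenameddomain"   // hypothetical new name
  IF '##class(%iKnow.Domain).Exists(oldname) {
     WRITE "Cannot rename: ",oldname," does not exist",!
  }
  ELSEIF ##class(%iKnow.Domain).Exists(newname) {
     WRITE "Cannot rename: ",newname," already exists",!
  }
  ELSE {
     SET stat=##class(%iKnow.Domain).Rename(oldname,newname)
     IF stat=1 {WRITE "renamed ",oldname," to ",newname,!}
     ELSE {WRITE "rename failed",!}
  }
```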
Copying a Domain
You can copy an existing domain to a new domain in the current namespace by using the CopyDomain() method of the %iKnow.Utils.CopyUtils class. The CopyDomain() method copies a domain definition to a new domain, assigning a unique domain name and domain Id; the existing domain is unchanged. If the new domain does not exist, this method creates a new domain. By default, this method copies the domain parameter settings and assigned domain components from the existing domain to the copy, if these components are present.
By default, the CopyDomain() method copies the source data from the existing domain to the copy. However, if source data copying is requested and no source data is present in the existing domain, the CopyDomain() operation fails.
The following example copies the domain named “mydomain” and its parameter settings to a new domain named “mydupdomain”. Because “mydomain” contains no source data, the 3rd argument (which specifies whether to copy source data) is set to 0:
  SET olddom="mydomain"_$PIECE($H,",",2)
  SET domo=##class(%iKnow.Domain).%New(olddom)
     DO domo.%Save()
  IF (##class(%iKnow.Domain).Exists(olddom))
     {WRITE "Old domain exists, proceed with copy",!!}
  ELSE {WRITE "Old domain does not exist" QUIT}
  SET newdom="mydupdomain"
  IF (##class(%iKnow.Domain).Exists(newdom))
     {WRITE "Domain copy overwriting domain ",newdom,!}
  ELSE {WRITE "Domain copy creating domain ",newdom,!}
  SET stat=##class(%iKnow.Utils.CopyUtils).CopyDomain(olddom,newdom,0)
  IF stat=1 {WRITE !!,"Copied ",olddom," to ",newdom," copying all assignments",!!}
  ELSE {WRITE "Domain copy failed with status ",stat,!}
  SET stat=##class(%iKnow.Domain).%DeleteId(domo.Id)
  WRITE "Deleted the old domain",!
  IF $RANDOM(2) {
       SET newId=##class(%iKnow.Domain).GetOrCreateId("mydupdomain")
       SET stat=##class(%iKnow.Domain).%DeleteId(newId)
       WRITE "Deleted the new domain" }
  ELSE {WRITE "No new domain delete this time" }
The CopyDomain() method allows you to quickly copy all of the domain settings, source data, and assigned components of an existing domain to a new domain. It provides boolean options for all-or-nothing copying of assigned components. Other methods in the %iKnow.Utils.CopyUtils class provide greater control in specifying which assigned components to copy from one existing domain to another.
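Because CopyDomain() fails when source data copying is requested for a domain that contains no source data, one option is to derive the third argument from IsEmpty(). The following is a minimal sketch of that approach, using the same hypothetical domain names as the example above:

```objectscript
  SET olddom="mydomain"
  SET newdom="mydupdomain"
  IF ##class(%iKnow.Domain).Exists(olddom) {
     SET domo=##class(%iKnow.Domain).Open(olddom)
     // Request source data copying only when there is data to copy
     SET copydata='domo.IsEmpty()
     SET stat=##class(%iKnow.Utils.CopyUtils).CopyDomain(olddom,newdom,copydata)
     IF stat=1 {WRITE "Copied ",olddom," to ",newdom,!}
     ELSE {WRITE "Domain copy failed",!}
  }
  ELSE {WRITE olddom," does not exist",!}
```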
iKnow Configurations
An iKnow configuration specifies behavior for handling source documents. It is only used during the source data loading operation. A configuration is specific to its namespace; you can create multiple configurations within a namespace. iKnow assigns each configuration in a namespace a configuration Id, a unique integer. Configuration Id values are not reused. You can apply the same configuration to different domains and source text loads. Defining or using an iKnow configuration is optional; if you don’t specify a configuration, iKnow uses the property defaults.
You can define an iKnow configuration in two ways:
Defining a Configuration
You can define a configuration using the %New() persistent method of the %iKnow.Configuration class.
You can determine if an iKnow configuration with that name already exists by invoking the Exists() method. If the configuration exists, you can open it using the Open() method, as shown in the following example:
 IF ##class(%iKnow.Configuration).Exists("EnFr") {
       SET cfg=##class(%iKnow.Configuration).Open("EnFr") }
  ELSE { SET cfg=##class(%iKnow.Configuration).%New("EnFr",1,$LB("en","fr"))
         DO cfg.%Save() }
Setting Configuration Properties
A configuration defines the following properties:
All configuration properties (except the Name) are assigned default values. You can get or set a configuration property by using property dispatch:
   IF cfgOref.DetectLanguage=0 {
     SET cfgOref.DetectLanguage=1
     DO cfgOref.%Save() }
Note that you must first %Save() the newly created configuration before you can change its properties using property dispatch, and then you must %Save() the configuration after changing the property values.
The following example creates a configuration that supports English and French with automatic language identification. It then changes the configuration to support English and Spanish:
  SET myconfig="Bilingual"
  IF ##class(%iKnow.Configuration).Exists(myconfig) {
       SET cfg=##class(%iKnow.Configuration).Open(myconfig)
       WRITE "Opened existing configuration ",myconfig,! }
  ELSE { SET cfg=##class(%iKnow.Configuration).%New(myconfig,1,$LB("en","fr"))
         DO cfg.%Save()
       WRITE "Created new configuration ",myconfig,! }
     WRITE "that supports ",$LISTTOSTRING(cfg.Languages),!
     SET cfg.Languages=$LISTBUILD("en","es")
     DO cfg.%Save()
     WRITE "changed ",myconfig," to support ",$LISTTOSTRING(cfg.Languages),!
  SET rnd=$RANDOM(2)
  IF rnd {
       SET stat=##class(%iKnow.Configuration).%DeleteId(cfg.Id)
       IF stat {WRITE "Deleted the ",myconfig," configuration" }
  }
  ELSE {WRITE "No delete this time",! }
For a description of using multiple languages and automatic language identification, refer to the Language Identification chapter of this manual.
Using a Configuration
You can apply a defined configuration in any of the following ways:
Listing All Configurations
You can use the GetAllConfigurations query to list all defined configurations in the current namespace. This is shown in the following example:
  WRITE "The current namespace is: ",$NAMESPACE,!
  WRITE "It contains the following configurations: ",!
  DO ##class(%ResultSet).RunQuery("%iKnow.Configuration","GetAllConfigurations")
Each configuration is listed on a separate line, listing the configuration Id followed by the configuration parameter values. Listed values are separated by colons. If the configuration is defined with a list of supported languages, GetAllConfigurations displays these language abbreviations separated by commas.
You can also list all configurations in the current namespace using:
  DO ##class(%SYSTEM.iKnow).ListConfigurations()
Using a Configuration to Normalize a String
Using a defined iKnow configuration, you can perform iKnow text normalization on a string using the Normalize() method. This method both normalizes the string characters and (optionally) applies a UserDictionary, as shown in the following example:
  SET time=$PIECE($H,",",2)
  SET udname="Abbrev"_time
  SET udict=##class(%iKnow.UserDictionary).%New(udname) 
  DO udict.%Save() 
  DO udict.AddEntry("Dr.","Doctor")
  DO udict.AddEntry("Mr.","Mister")
  DO udict.AddEntry("\&\","and")
  DO udict.GetEntries(.dictlist)
  SET i=1
  WHILE $DATA(dictlist(i)) {
    WRITE $LISTTOSTRING(dictlist(i),",",1),!
    SET i=i+1 }
  WRITE "End of UserDictionary",!!
   SET cfg=##class(%iKnow.Configuration).%New("EnUDict"_time,0,$LB("en"),udname)
   DO cfg.%Save()
   SET mystring="...The Strange Case  of Dr. Jekyll      & Mr. Hyde"
   SET normstring=cfg.Normalize(mystring)
   WRITE normstring
   DO ##class(%iKnow.UserDictionary).%DeleteId(udict.Id)
   DO ##class(%iKnow.Configuration).%DeleteId(cfg.Id)
You can perform iKnow text normalization on a string independent of a configuration using the NormalizeWithParams() method.
These methods perform these operations, in the following order:
  1. Apply a UserDictionary, if one is specified
  2. Perform iKnow language model preprocessing
  3. Convert all text to lowercase letters
  4. Replace multiple whitespace characters with a single space
iKnow UserDictionary
A UserDictionary specifies a set of user-defined paired terms applied to the source texts. iKnow substitutes each occurrence of the first term of the pair with the second term as part of source text listing. This operation changes the source text used by iKnow; all subsequent iKnow operations see only the substituted term. For example, if UserDictionary replaces the abbreviation “Dr.” with “Doctor”, every occurrence of “Dr.” is replaced by the word “Doctor” in the data indexed by iKnow. The original source file is not changed, but all representations of the source text within iKnow contain this substitution. Unlike all other components of iKnow, UserDictionary changes the source content before listing and loading.
You can use the UserDictionary to substitute one term for another, to expand acronyms and abbreviations (or the reverse), or to avoid or cause a sentence break.
Substitution pairs are applied before iKnow text normalization, which converts the iKnow internal text representation to lowercase letters. For this reason, substitution pairs are case sensitive. Thus, to replace all instances of “physician” with “doctor” you will need the substitution pairs "physician","doctor", "Physician","Doctor", and perhaps "PHYSICIAN","DOCTOR".
Defining a UserDictionary is optional. A UserDictionary exists independent of any specific configuration or domain. A defined UserDictionary can be assigned as a Configuration property. Only one UserDictionary can be assigned to a Configuration. The same UserDictionary can be assigned to multiple Configurations.
A defined UserDictionary can also be specified to the NormalizeWithParams() method, independent of any Configuration.
You cannot modify an existing configuration; a %New() does not delete/replace an existing configuration. Therefore, to add a UserDictionary to an existing configuration you must explicitly delete then re-create the named configuration. Alternatively, you can create a new configuration with a new configuration name.
The UserDictionary is applied to sources when the sources are listed; already indexed sources are not affected by changes to UserDictionary.
UserDictionary Format
UserDictionary pairs often perform the simple substitution of a term for an equivalent term, for example, replacing every occurrence of “physician” with “doctor”. Using the backslash character provides additional formatting options:
Format    Meaning
\         Only perform substitution if a blank space occurs here.
\noend    Do not issue a sentence break.
\end      Issue a sentence break.
These are shown in the following sample UserDictionary pairs:
\UK,United Kingdom
Defining a UserDictionary as an Object Instance
You must first create a UserDictionary object, then populate that instance.
  SET udict=##class(%iKnow.UserDictionary).%New("MyUserDict") 
  DO udict.%Save()
  DO udict.AddEntry("Dr.","Doctor")
  DO udict.AddEntry("physician","doctor")
  DO udict.AddEntry("Physician","Doctor")
To populate a UserDictionary object, you use the AddEntry() method to specify substitution pairs. Each substitution pair requires a separate AddEntry() with the following format: AddEntry(oldstring,newstring). Note that substitution is string substitution, and that pairs are case sensitive. You can, optionally, specify the position at which to add the UserDictionary entry (the position default is to add the entry at the end of the UserDictionary). Because iKnow applies substitution pairs in UserDictionary order, you can use position to perform additive substitutions. For example, first replace “PA” with “physician’s assistant”, then replace “physician” with “doctor”.
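The ordering described above can be sketched as follows. This is an illustration only: it assumes that the optional position is the third argument of AddEntry() and that positions are counted from 1, neither of which is stated in this chapter, and "OrderedDict" is a hypothetical dictionary name:

```objectscript
  SET udict=##class(%iKnow.UserDictionary).%New("OrderedDict")   // hypothetical name
  DO udict.%Save()
  DO udict.AddEntry("physician","doctor")
  // Assumption: the optional third argument is the position, counted from 1,
  // so this entry is placed ahead of the "physician" entry
  DO udict.AddEntry("PA","physician's assistant",1)
  // Display the entries in the order in which they are stored
  DO udict.GetEntries(.dictlist)
  SET i=1
  WHILE $DATA(dictlist(i)) {
     WRITE $LISTTOSTRING(dictlist(i),",",1),!
     SET i=i+1 }
  // Clean up the demonstration dictionary
  DO ##class(%iKnow.UserDictionary).%DeleteId(udict.Id)
```

Listing the entries with GetEntries() after inserting is a simple way to confirm that the position argument behaved as expected before relying on the order.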
To assign a UserDictionary object, you supply the UserDictionary name as the 4th argument in the Configuration %New() method:
  SET cfg=##class(%iKnow.Configuration).%New("MyConfig",0,$LISTBUILD("en"),"MyUserDict",1)
  DO cfg.%Save()
Defining a UserDictionary as a File
You must first create a UserDictionary file, populate it, then assign this UserDictionary file to a Configuration.
A UserDictionary file must be a text file in UTF-8 format encoding.
To populate a UserDictionary file, you specify substitution pairs in a text file. Each substitution pair is a separate line with the following format: oldstring,newstring. Note that substitution is string substitution, and that pairs are case sensitive. The following is a sample UserDictionary file:
\UK,United Kingdom
To assign a UserDictionary file, you supply the full pathname as the 4th argument in the Configuration %New() method:
  SET cfg=##class(%iKnow.Configuration).%New(myconfig,0,$LISTBUILD("en"),"C:\temp\udict.txt",1)
  DO cfg.%Save()
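A UserDictionary file can be written by hand or generated from code. The following is a minimal sketch of the latter, using the general-purpose %Stream.FileCharacter class to write a UTF-8 file before assigning it to a configuration. The file path is taken from the example above, "FileDictConfig" is a hypothetical configuration name, and the substitution pairs are the ones used earlier in this section:

```objectscript
  SET path="C:\temp\udict.txt"
  SET file=##class(%Stream.FileCharacter).%New()
  SET file.Filename=path
  SET file.TranslateTable="UTF8"     // write the file as UTF-8
  // One substitution pair per line: oldstring,newstring
  DO file.WriteLine("Dr.,Doctor")
  DO file.WriteLine("physician,doctor")
  DO file.WriteLine("Physician,Doctor")
  SET stat=file.%Save()
  IF $SYSTEM.Status.IsError(stat) { DO $SYSTEM.Status.DisplayError(stat) QUIT }
  // "FileDictConfig" is a hypothetical configuration name
  SET cfg=##class(%iKnow.Configuration).%New("FileDictConfig",0,$LISTBUILD("en"),path,1)
  DO cfg.%Save()
```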