Blacklists
A Blacklist is a list of entities that you do not want a query to return. For example, if your source texts include greetings and salutations, you might want to put “many thanks”, “best regards”, and other stock phrases with no real information content in your Blacklist. A Blacklist might also be used to suppress top concepts that are too general and widespread to be of interest when analyzing query results.
When displaying results of a query that applied a Blacklist, it is important to note that use of a Blacklist silently changes the query results by suppressing some information. Make sure the Blacklists used are relevant for the data contents, for the user looking at the query results, and for the context in which query results are displayed.
Creating a Blacklist
You can define a Blacklist that is assigned to a specific domain, or define a Blacklist that is domain-independent (cross-domain) and can be used by any domain in the current namespace. You can define a Blacklist in two ways:
- Using the Caché iKnow Architect. This interface can be used to define a Blacklist within a domain, add and delete Blacklist entities, delete a Blacklist, or list all Blacklists defined for a domain. This interface supports populating a Blacklist by specifying entities as strings.
- Using the %iKnow.Utils.MaintenanceAPI class methods to define, populate, and maintain Blacklists. This class allows you to create both domain-specific and cross-domain Blacklists. It provides methods for populating a Blacklist either by specifying entities as strings or by specifying entities by entity Id. Use of some of the Blacklist methods of %iKnow.Utils.MaintenanceAPI is shown in this chapter.
You can use the CopyBlackLists() method of the %iKnow.Utils.CopyUtils class to copy all defined Blacklists in a domain to another domain.
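As a rough sketch, copying Blacklists between two domains might look like the following. The target domain name "myotherdomain" is hypothetical, both domains are assumed to exist already, and the sketch assumes that the first two arguments of CopyBlackLists() are the source and target domain Ids; check the class reference for the full signature before relying on it.

```objectscript
CopyAllBlacklists
  // Hypothetical domain names; both domains are assumed to exist
  SET fromId=##class(%iKnow.Domain).NameIndexOpen("mydomainwithbl").Id
  SET toId=##class(%iKnow.Domain).NameIndexOpen("myotherdomain").Id
  // Assumption: arguments are (source domain Id, target domain Id)
  SET stat=##class(%iKnow.Utils.CopyUtils).CopyBlackLists(fromId,toId)
  IF $SYSTEM.Status.IsError(stat) {
    DO $SYSTEM.Status.DisplayError(stat)
    QUIT }
  // List the domain-specific Blacklists now defined in the target domain
  SET stat=##class(%iKnow.Utils.MaintenanceAPI).GetBlackLists(.bl,toId,0)
  SET i=1
  WHILE $DATA(bl(i)) {
    WRITE $LISTTOSTRING(bl(i),",",1),!
    SET i=i+1 }
```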
The Knowledge Portal and the Basic Portal user interfaces support the use of Blacklists.
The following example creates a domain-specific Blacklist and populates it with elements. It then lists all Blacklists for the domain, and all of the elements in this Blacklist. Finally, it deletes the Blacklist.
DomainCreateOrOpen
  SET domn="mydomainwithbl"
  IF (##class(%iKnow.Domain).NameIndexExists(domn))
     { SET domo=##class(%iKnow.Domain).NameIndexOpen(domn)
       SET domId=domo.Id }
  ELSE { SET domo=##class(%iKnow.Domain).%New(domn)
         DO domo.%Save()
         SET domId=domo.Id }
CreateBlackList
  SET blname="AviationBlacklist"
  SET blId=##class(%iKnow.Utils.MaintenanceAPI).CreateBlackList(domId,blname,
      "Aviation non-mechanical terms Blacklist")
PopulateBlackList
  SET black=$LB("aircraft","airplane","flight","accident","event","incident","pilot",
      "student pilot","flight instructor","runway","accident site","ground","visibility","faa")
  // $LISTNEXT returns 0 once the end of the list is reached
  SET ptr=0,count=0
  WHILE $LISTNEXT(black,ptr,val) {
    SET stat=##class(%iKnow.Utils.MaintenanceAPI).AddStringToBlackList(domId,blId,val)
    SET count=count+1 }
  WRITE count," entities in Blacklist",!!
ListBlacklist
  // third argument 0: list domain-specific Blacklists only
  SET stat=##class(%iKnow.Utils.MaintenanceAPI).GetBlackLists(.bl,domId,0)
  SET i=1
  WHILE $DATA(bl(i)) {
    WRITE $LISTTOSTRING(bl(i),",",1),!
    SET i=i+1 }
  WRITE "Printed the ",i-1," Blacklists",!
  SET stat=##class(%iKnow.Utils.MaintenanceAPI).GetBlackListElements(.ble,domId,blId)
  /* IF stat=1 {WRITE "success",!}
     ELSE {WRITE "GetBlackListElements failed" QUIT } */
  SET j=1
  WHILE $DATA(ble(j)) {
    WRITE $LISTTOSTRING(ble(j),",",1),!
    SET j=j+1 }
  WRITE "Printed the ",j-1," Blacklist elements",!
CleanUp
  SET stat=##class(%iKnow.Utils.MaintenanceAPI).DropBlackList(domId,blId)
  IF stat=1 {WRITE "Blacklist deleted",!}
  ELSE {WRITE "DropBlackList failed" QUIT }
The CreateBlackList() method allows you to specify both a name and a description for your Blacklist. A Blacklist name can be any valid string of any length; Blacklist names are case-sensitive. The name you assign to a Blacklist must be unique: for a domain-specific Blacklist it must be unique within the domain; for a cross-domain Blacklist it must be unique within the namespace. Specifying a duplicate Blacklist name generates ERROR #8091. The Blacklist description is optional; it can be a string of any length.
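The naming rules can be illustrated with a short sketch. It assumes domId identifies an existing domain, and it assumes that CreateBlackList() accepts a fourth, output %Status argument (named sc here) through which the duplicate-name error is reported; that argument does not appear in the examples in this chapter, so verify it against the class reference. The Blacklist names and descriptions are illustrative only.

```objectscript
NamingRules
  // Assumption: the 4th argument is an output %Status (sc)
  SET blname="AviationBlacklist"
  SET blId=##class(%iKnow.Utils.MaintenanceAPI).CreateBlackList(domId,blname,"first attempt",.sc)
  // Same name in the same domain: expected to fail with ERROR #8091
  SET dupId=##class(%iKnow.Utils.MaintenanceAPI).CreateBlackList(domId,blname,"same name again",.sc)
  IF $SYSTEM.Status.IsError(sc) {
    WRITE "Duplicate name rejected:",!
    DO $SYSTEM.Status.DisplayError(sc)
    WRITE ! }
  // Names are case-sensitive, so a differently cased name is a distinct Blacklist
  SET lcId=##class(%iKnow.Utils.MaintenanceAPI).CreateBlackList(domId,"aviationblacklist","lowercase variant",.sc)
  IF '$SYSTEM.Status.IsError(sc) { WRITE "Lowercase variant created",! }
CleanUpNaming
  SET stat=##class(%iKnow.Utils.MaintenanceAPI).DropBlackList(domId,blId)
  SET stat=##class(%iKnow.Utils.MaintenanceAPI).DropBlackList(domId,lcId)
```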
Blacklists and Domains
Each Blacklist you create can either be specific to a domain, or can be cross-domain (domain-independent) and usable by any domain in the current namespace:
- A domain-specific Blacklist is assigned to a domain by specifying a domainId in the CreateBlackList() method. This method returns a blacklistId as a sequential positive integer. Query methods that use this Blacklist reference it by this blacklistId. A domain-specific Blacklist can support stemming.
- A cross-domain Blacklist is not assigned to a domain. Instead, you specify a domainId of 0 in the CreateBlackList() method. This method returns a blacklistId as a sequential positive integer. Query methods that use this Blacklist reference it by a negative blacklistId; for example, the Blacklist identified by blacklistId 8 is referenced by the blacklistId value -8.
Note: Cross-domain Blacklists are only available to domains with version 4 (or higher). Domains created prior to Caché 2014.1 must be upgraded to version 4 to use cross-domain Blacklists.
To populate a domain-specific Blacklist, you can use either AddEntityToBlackList() or AddStringToBlackList(). To populate a cross-domain Blacklist, you can only use AddStringToBlackList().
GetBlackListElements() returns the empty string for the entUniId value for a cross-domain Blacklist.
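The two population methods for a domain-specific Blacklist can be sketched as follows. The fragment assumes domId and blId identify an existing domain and one of its domain-specific Blacklists. It also assumes that %iKnow.Queries.EntityAPI.GetId() returns the entity Id for a string (and a non-positive or empty value when the entity does not occur in the domain), and that AddEntityToBlackList() takes the domain Id, Blacklist Id, and entity Id in that order; neither call is shown elsewhere in this chapter, so treat both as assumptions.

```objectscript
AddByIdOrString
  SET term="runway"
  // Assumption: GetId() looks up the entity Id for an entity string
  SET entId=##class(%iKnow.Queries.EntityAPI).GetId(domId,term)
  IF +entId>0 {
    // The entity occurs in the domain: add it by entity Id
    SET stat=##class(%iKnow.Utils.MaintenanceAPI).AddEntityToBlackList(domId,blId,entId)
    WRITE "Added ",term," by entity Id ",entId,! }
  ELSE {
    // No Id available: fall back to adding the string
    SET stat=##class(%iKnow.Utils.MaintenanceAPI).AddStringToBlackList(domId,blId,term)
    WRITE "Added ",term," as a string",! }
```

For a cross-domain Blacklist only the string branch applies, because entity Ids belong to a single domain.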
The following example creates and populates two Blacklists, a domain-specific Blacklist (AviationTermsBlacklist) and a cross-domain Blacklist (JobTitleBlacklist). The GetBlackLists() method returns both Blacklists, because the pIncludeCrossDomain boolean is set to 1. Note that GetBlackLists() returns the Blacklist Id for the cross-domain Blacklist as a negative integer.
DomainCreateOrOpen
  SET domn="mydomainwithbl"
  IF (##class(%iKnow.Domain).NameIndexExists(domn))
     { SET domo=##class(%iKnow.Domain).NameIndexOpen(domn)
       SET domId=domo.Id }
  ELSE { SET domo=##class(%iKnow.Domain).%New(domn)
         DO domo.%Save()
         SET domId=domo.Id }
CreateBlackList1
  SET blname="AviationTermsBlacklist"
  SET blId=##class(%iKnow.Utils.MaintenanceAPI).CreateBlackList(domId,blname,
      "Common aviation terms Blacklist")
PopulateBlackList1
  SET black=$LB("aircraft","airplane","flight","accident","event","incident","airport","runway")
  SET ptr=0
  WHILE $LISTNEXT(black,ptr,val) {
    SET stat=##class(%iKnow.Utils.MaintenanceAPI).AddStringToBlackList(domId,blId,val) }
  WRITE "Blacklist ",blname," populated",!
CreateBlackList2
  // a domainId of 0 creates a cross-domain Blacklist
  SET bl2name="JobTitleBlacklist"
  SET bl2Id=##class(%iKnow.Utils.MaintenanceAPI).CreateBlackList(0,bl2name,
      "Aviation personnel Blacklist")
PopulateBlackList2
  SET jobblack=$LB("pilot","copilot","student pilot","flight instructor","passenger")
  SET ptr=0
  WHILE $LISTNEXT(jobblack,ptr,val) {
    SET stat=##class(%iKnow.Utils.MaintenanceAPI).AddStringToBlackList(0,bl2Id,val) }
  WRITE "Blacklist ",bl2name," populated",!!
ListBlacklists
  SET pIncludeCrossDomain=1
  SET stat=##class(%iKnow.Utils.MaintenanceAPI).GetBlackLists(.bl,domId,pIncludeCrossDomain)
  SET i=1
  WHILE $DATA(bl(i)) {
    // a negative Blacklist Id marks a cross-domain Blacklist
    IF $LIST(bl(i),1)<0 {
      WRITE "cross-domain:",!,$LISTTOSTRING(bl(i),",",1),! }
    ELSE { WRITE "domain-specific:",!,$LISTTOSTRING(bl(i),",",1),! }
    SET i=i+1 }
  WRITE "Printed the ",i-1," Blacklists",!!
CleanUp
  SET stat=##class(%iKnow.Utils.MaintenanceAPI).DropBlackList(domId,blId)
  IF stat=1 {WRITE "domain Blacklist deleted",!}
  ELSE {WRITE "first DropBlackList failed",! }
  SET stat=##class(%iKnow.Utils.MaintenanceAPI).DropBlackList(0,bl2Id)
  IF stat=1 {WRITE "cross-domain Blacklist deleted",!}
  ELSE {WRITE "second DropBlackList failed",! }
Queries that Support Blacklists
The following query methods provide a parameter to specify Blacklists. You can specify multiple Blacklists to any of these methods by specifying the Blacklist Ids as elements of a %List structure, using the $LISTBUILD function. You specify a domain-specific blacklist as a positive integer blacklistId value; you specify a cross-domain blacklist as a negative integer blacklistId value.
Entity Queries:
- %iKnow.Queries.EntityAPI.GetCountByDomain()
- %iKnow.Queries.EntityAPI.GetCountBySource()
- %iKnow.Queries.EntityAPI.GetSimilarCounts()
Sentence Queries:
Source Queries:
- %iKnow.Queries.SourceAPI.GetSimilar(): iKnow ignores blacklisted entities both when selecting similar entities and when calculating similarity scores.
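As a sketch of how several Blacklists might be supplied to one query, the fragment below passes a domain-specific and a cross-domain Blacklist to GetTop() in a single %List. It assumes that domId, blId, and bl2Id hold the Ids of an existing domain, a domain-specific Blacklist, and a cross-domain Blacklist (bl2Id being the positive value returned by CreateBlackList()), and it reuses the GetTop() argument pattern from this chapter's query example.

```objectscript
TopWithTwoBlacklists
  // Cross-domain Blacklists are referenced by the negated blacklistId
  SET blacklists=$LISTBUILD(blId,-bl2Id)
  DO ##class(%iKnow.Queries.EntityAPI).GetTop(.result,domId,1,20,"",0,0,0,0,blacklists)
  SET i=1
  WHILE $DATA(result(i)) {
    WRITE $LISTTOSTRING(result(i),",",1),!
    SET i=i+1 }
  WRITE "Printed the top ",i-1," entities with both Blacklists applied",!
```

Building the %List once and reusing it keeps the sign convention in a single place when the same set of Blacklists is applied to several queries.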
Blacklist Query Example
The following example suppresses non-mechanical aviation terms that are too general to be of interest. It uses CreateBlackList() to create a Blacklist, uses AddStringToBlackList() to add entities to the Blacklist, then supplies the Blacklist to the GetTop() method:
#include %IKPublic
  ZNSPACE "Samples"
DomainCreateOrOpen
  SET dname="mydomain"
  IF (##class(%iKnow.Domain).NameIndexExists(dname))
     { WRITE "The ",dname," domain already exists",!
       SET domoref=##class(%iKnow.Domain).NameIndexOpen(dname)
       GOTO DeleteOldData }
  ELSE
     { WRITE "The ",dname," domain does not exist",!
       SET domoref=##class(%iKnow.Domain).%New(dname)
       DO domoref.%Save()
       WRITE "Created the ",dname," domain with domain ID ",domoref.Id,!
       GOTO ListerAndLoader }
DeleteOldData
  SET stat=domoref.DropData()
  IF stat { WRITE "Deleted the data from the ",dname," domain",!!
            GOTO ListerAndLoader }
  ELSE { WRITE "DropData error",!
         DO $SYSTEM.Status.DisplayError(stat)
         QUIT }
ListerAndLoader
  SET domId=domoref.Id
  SET flister=##class(%iKnow.Source.SQL.Lister).%New(domId)
  SET myloader=##class(%iKnow.Source.Loader).%New(domId)
CreateBlackList1
  SET blname="AviationTermsBlacklist"
  SET blId=##class(%iKnow.Utils.MaintenanceAPI).CreateBlackList(domId,blname,
      "Common aviation terms Blacklist")
PopulateBlackList
  SET black=$LB("aircraft","airplane","flight","accident","event","incident","pilot","airport",
      "student pilot","flight instructor","runway","accident site","ground","visibility","faa")
  SET ptr=0
  WHILE $LISTNEXT(black,ptr,val) {
    SET stat=##class(%iKnow.Utils.MaintenanceAPI).AddStringToBlackList(domId,blId,val) }
  WRITE "Blacklist ",blname," populated",!
QueryBuild
  SET myquery="SELECT TOP 100 ID AS UniqueVal,Type,NarrativeFull FROM Aviation.Event"
  SET idfld="UniqueVal"
  SET grpfld="Type"
  SET dataflds=$LB("NarrativeFull")
UseLister
  SET stat=flister.AddListToBatch(myquery,idfld,grpfld,dataflds)
  IF stat '= 1 { WRITE "The lister failed:",!
                 DO $SYSTEM.Status.DisplayError(stat)
                 QUIT }
UseLoader
  SET stat=myloader.ProcessBatch()
  IF stat '= 1 { WRITE "The loader failed:",!
                 DO $SYSTEM.Status.DisplayError(stat)
                 QUIT }
SourceCountQuery
  SET numSrcD=##class(%iKnow.Queries.SourceQAPI).GetCountByDomain(domId)
  WRITE "The domain contains ",numSrcD," sources",!
TopEntitiesQuery
  // the last argument supplies the Blacklist Ids as a %List
  DO ##class(%iKnow.Queries.EntityAPI).GetTop(.result,domId,1,20,"",0,0,0,0,$LB(blId))
  WRITE "NOTE: the ",blname," Blacklist",!,
        "has been applied to this list of top entities",!
  SET i=1
  WHILE $DATA(result(i)) {
    // read the list elements directly so that commas in an entity cannot shift the fields
    SET entity = $LISTGET(result(i),2)
    SET freq = $LISTGET(result(i),3)
    SET spread = $LISTGET(result(i),4)
    WRITE "[",entity,"] appears ",freq," times in ",spread," sources",!
    SET i=i+1 }
  WRITE "Printed the top ",i-1," entities"