docs.intersystems.com
Home  /  First Look: SQL Search with InterSystems IRIS


Articles
First Look: SQL Search with InterSystems IRIS
InterSystems: The power behind what matters   
Search:  


This First Look guide introduces you to InterSystems IRIS Data Platform™ support for SQL text search, which provides semantic context searching of unstructured text data in a variety of languages. It covers the following topics:
This First Look guide presents an introduction to SQL context-aware text searching and walks through some initial tasks associated with indexing text data for searching and performing SQL Search. Once you’ve completed this exploration, you will have indexed text in an SQL column for text searching and performed several types of searches. These activities are designed to use only the default settings and features, so that you can acquaint yourself with the fundamentals of the feature. For the full documentation on SQL Search, see the SQL Search Guide.
A related, but separate, tool for handling unstructured texts is Natural Language Processing (NLP). SQL Search presupposes that you know what you are looking for. NLP text analysis allows you to analyze the contents of texts with no prior knowledge of the text contents.
Why SQL Search Is Important
The ability to rapidly search unstructured text data is fundamental to accessing the content of the huge volume of text commonly stored by many companies and institutions. Any search facility for such data must have the following functionality:
How InterSystems IRIS Implements SQL Search
SQL Search can search text data found in a column in an SQL table. In order to do this, you must create an SQL Search index for the column containing the text data. InterSystems implements a table column as a property in a persistent class.
There are three levels of index available, each supporting additional features as well as all of the features of the lower levels: Basic, Semantic, and Analytic:
Populating the index. Like all SQL indices, you can either build the index directly after the table has been populated with data, or have SQL automatically build the index entries as you insert records into an empty table. In either case, SQL automatically updates this index as part of subsequent insert, update, or delete operations.
You perform an SQL search you write a SELECT query in which the WITH clause contains %ID %FIND search_index() syntax. The search_index() function parameters include the name of the SQL Search index and a search string. This search string can include wildcard, positional phrase, and entity syntax characters. The search string can also include AND, OR, and NOT logical operators.
Trying SQL Search for Yourself
It’s easy to use InterSystems IRIS SQL Search. This simple procedure walks you through the basic steps of searching text data stored as a string in an SQL table column.
  1. Preliminaries
    You need to have an InterSystems IRIS instance that is up and running and has an active license key. (You can view the licence key from the Management Portal: select System Administration > Licensing.)
    This documentation uses the Aviation.Event SQL table, which is available on GitHub at https://github.com/intersystems/Samples-Aviation. (You do not need to know anything about GitHub or have a GitHub account.) To install these samples, InterSystems recommends that you create a dedicated namespace called (for example) TESTSAMPLES and then load the samples into that namespace (or you can use an existing namespace; however, you cannot use the %SYS namespace). To create a namespace, use the Management Portal options System Administration > Configuration > System Configuration > Namespaces. For the general process of downloading from GitHub, see Downloading Samples for Use with InterSystems IRIS. After you download a sample, be sure to open the README file and follow the setup instructions.
  2. Create an index for SQL Search. The following class definition example creates a Basic SQL Search index:
       Class Aviation.TestSQLSrch Extends %Persistent [ ClassType=persistent,
          DdlAllowed,Owner={UnknownUser},ProcedureBlock,SqlRowIdPrivate,
          SqlTableName=TestSQLSrch ]
      { 
      Property UniqueNum As %Integer;
      Property Narrative As %String(MAXLEN=100000) [ SqlColumnNumber=3 ];
      Index NarrBasicIdx On (Narrative) As %iFind.Index.Basic(INDEXOPTION=0,
         LANGUAGE="en",LOWER=1);
      Index UniqueNumIdx On UniqueNum [ Type=index,Unique ];
      }
    Make sure to build this class definition in the TESTSAMPLES namespace. This example creates a persistent class (table) that contains a Narrative property (column). It defines an SQL Search Basic index for this property. Because this is a new class, you must populate this table with text data.
  3. Populate the table with text data and build the SQL Search index. An SQL Search index is built and maintained like any other SQL index.
    The following example populates the new table with text data from an existing SQL table. In this example, the SQL Search index is automatically built as each record is added:
      ZNSPACE "TESTSAMPLES"
      SET in1="INSERT OR UPDATE INTO Aviation.TestSQLSrch (UniqueNum,Narrative) "
      SET in2="SELECT %ID,NarrativeFull FROM Aviation.Event WHERE %ID < 100"
      SET myinsert=in1_in2
      SET tStatement=##class(%SQL.Statement).%New()
      SET qStatus=tStatement.%Prepare(myinsert)
        IF qStatus'=1 {WRITE "%Prepare failed:" DO $System.Status.DisplayError(qStatus) QUIT}
      SET rset=tStatement.%Execute()
      WRITE !,"Total rows inserted=",rset.%ROWCOUNT
    For performance reasons, you may wish to use the %NOINDEX option to defer building indices until the table is fully populated, and then build the SQL Search index (and any other defined indices) using the %Build() method.
    Alternatively, you could add an SQL Search index to an existing persistent class that already contains text data, and then populate the SQL Search index using the %Build() method.
  4. Use SQL Search as a WHERE clause condition of a SELECT query. The WHERE clause can contain other conditions associated by AND logic. Run the following SQL Query in the TESTSAMPLES namespace:
    SELECT %iFind.Highlight(Narrative,'"visibility [1-4] mile*" AND "temp* ? degrees"') 
    FROM Aviation.TestSQLSrch 
    WHERE %ID %FIND search_index(NarrBasicIdx,'"visibility [1-4] mile*" "temp* ? degrees"',0,'en')
This example also highlights the returned text by applying the same search_item to the returned records. This highlights every instance of either of these phrases by delimiting them with <b> and </b> tags.
This example is provided to give you some initial experience with InterSystems IRIS SQL Search. You should not use this example as the basis for developing a real application. To use SQL Search in a real situation you should fully research the available choices provided by the software, then develop your application to create robust and efficient code.
Learn More About SQL Search
InterSystems has other resources to help you learn more about SQL Search, including: