About This Book
This book describes how to use the iKnow semantic analysis engine to access and analyze unstructured data. Commonly, unstructured data consists of a large number of text sources, such as a collection of newspaper articles or a collection of doctors’ notes. You load this text data into iKnow, and then use iKnow to retrieve meaningful information. iKnow operates on data loaded from source texts; it does not modify source texts. iKnow can perform semantic analysis on texts in Dutch (nl), English (en), French (fr), German (de), Japanese (ja), Portuguese (pt), Russian (ru), Spanish (es), Swedish (sv), and Ukrainian (uk).
The book addresses a number of topics:
-
Introductory Chapters:
-
A Conceptual Overview, which describes the iKnow approach to unstructured data and the iKnow architecture. It describes both what iKnow is and what iKnow is not, so that users can determine whether iKnow is the best fit for their text access application.
-
iKnow Implementation, which describes the implementation of iKnow software in the ObjectScript environment and describes data source considerations. Data sources can be text files, SQL records, globals, RSS feeds, or any other type of text source.
-
iKnow Interfaces:
-
iKnow Architect, which describes the Management Portal interface for creating iKnow components and populating them with data.
-
REST Interface, which describes the REST API for performing many iKnow operations.
The operations performed through these interfaces are described in greater detail in the following chapters, each of which is specific to one type of operation.
-
Setting Up an iKnow Environment and Loading Text Data into an iKnow Domain, which describe how to write programs to get unstructured data into iKnow.
-
iKnow Queries describes how to write programs that can be run against data loaded into iKnow. Attributes allow iKnow queries to recognize negation and sentiment: terms flagged with the Negation or Sentiment attribute affect how a query interprets a path or sentence. Dominance and Proximity provide more sophisticated calculations for comparing text elements, and Custom Metrics gives the user the means to extend and customize the calculations used for comparing text elements in queries.
-
Filtering Sources describes how to use various filters to limit the scope of queries to a subset of the data sources loaded into iKnow. Text Categorization describes how to automatically assign data sources to categories based on an analysis of the contents of each source. Once assigned, this category metadata value can be used to filter or query sources.
-
The two Smart Matching chapters describe how to create a dictionary, and then how to use a dictionary to match against data sources loaded into iKnow.
-
The User Interfaces chapter describes several sample graphical user interfaces for retrieving information. These interfaces use the queries, filters, and dictionaries described in the prior chapters.
-
iKnow Tools describes the Terminal iKnow Shell interface for displaying iKnow components and data. This is a useful tool, but it provides no additional iKnow functionality. This chapter also describes the Data Upgrade utility for use on iKnow domains created under an earlier version of iKnow.
-
The iKnow KPIs and DeepSee Dashboards chapter describes how to use iKnow ObjectScript queries as data sources for KPIs (key performance indicators) and how to display these KPIs on DeepSee dashboards.
-
The Web Services chapter describes using iKnow with Internet data.
-
The final two chapters describe advanced topics. Customizing iKnow describes how to create additional iKnow text processing facilities to supplement those supplied with iKnow. Language Identification describes how to work with source texts in more than one language and with texts in languages that need special processing.
-
The Domain Parameters appendix provides a comprehensive list of available domain parameters. You can set these parameters to customize a domain, or set them for all domains systemwide.
-
The DeepSee Cube Integration (Deprecated Form) appendix describes the deprecated form of integration between iKnow and DeepSee cubes.
This book also describes iFind, an SQL facility for performing text search operations.
For a detailed outline, see the Table of Contents.
For information on using the newer form of iKnow/DeepSee cube integration, see “Using Unstructured Data in Cubes” in the Advanced DeepSee Modeling Guide.
For general information, see Using InterSystems Documentation.