Skip to main content

Introducing InterSystems IRIS Document Database

InterSystems IRIS® Document Database (%DocDB in the InterSystems Class Library) is a facility for storing and retrieving database data as collections of JSON documents. Data storage in JSON (JavaScript Object Notation) format provides support for web-based data exchange. It is compatible with, but separate from, traditional SQL table and field (class and property) data storage and retrieval.

InterSystems supports developing Document Databases and applications in the several languages and environments. See the following sections for more information:

The word “document” is used here as a specific industry-wide technical term, as a dynamic data storage structure. “Document”, as used in Document Database, should not be confused with a text document, or with documentation.

Features and Benefits

By its nature, InterSystems IRIS Document Database is a schema-less data structure. That means that each document has its own structure, which may differ from other documents in the same database. This has several benefits when compared with SQL, which requires a predefined data structure.

Some of the key features of Document Database include:

  • Application Flexibility: Documents do not require a predefined schema. This allows applications to rapidly set up their data environments, adapt to changes in data structure, and rapidly capture data in various formats. Document Database can begin capturing data immediately, without having to define a structure for that data. This is ideal for unpredictable data feeds, as are often found in web-based and social media data sources. If in capturing a body of data, structures within that data become evident or emerge as useful, your document data structure can evolve. Existing captured data can co-exist with this more-structured data representation. It is up to your application to determine the data structure for each document and process it appropriately. One way to do this is to establish a key:value pair representing the document structure version. Thus, conversion of data from one JSON structure to another can be performed gradually, without interrupting data capture or access.

  • Sparse Data Efficiency: Document databases are very efficient at storing sparse data because attributes with a particular key do not have to appear in all documents of a collection. A document may have one set of defined keys; another document in the same collection may have a very different set of defined keys. In contrast, SQL requires that every record contain every key; in sparse data many records have keys with NULL values. For example, an SQL patient medical record provides fields for many diagnoses, conditions, and test; for most patients most of these fields are NULL. The system allocates space for all of these unused fields. In a Document Database patient medical record only those keys that contain actual data are present.

  • Hierarchical Data Storage: Document Database is very efficient at storing hierarchically structured data. In a key:value pair, data can be nested within the data to an unlimited number of levels. This means that hierarchical data can be stored de-normalized. In the SQL relational model, hierarchical data is stored normalized by using multiple tables.

  • Dynamic Data Types: A key does not have a defined data type. The value assigned to the key has an associated data type. Therefore a key:value pair in one document may have one data type; a key:value pair for the same key in another document may have a different data type. Because data types are not fixed, you can change the data type of a key:value pair in a document at runtime by assigning a new value that has a different data type (see Data Types and Values later in this document).

These features of Document Database have important implications for application development. In a traditional SQL environment, the database design establishes data structure that is followed in developing applications. In Document Database, data structure is largely provided in the applications themselves.

JSON Structure

Document Database supports InterSystems JSON Dynamic Objects and JSON Dynamic Arrays (see “Creating and Modifying Dynamic Entities” in Using JSON). The examples in this section create JSON structures using the ObjectScript SET command.

The following example shows how hierarchical data can be stored using JSON. The first SET creates a dynamic abstract object containing nested JSON-structured key:value pairs and arrays. The example then converts the dynamic abstract object to a JSON string, then inserts that JSON string into an existing document database as a document.

  SET dynAbObj = {
   "FullName":"John Smith",
   "FirstName":"John",
   "Address":{
              "street":"101 Main Street",
              "city":"Mapleville",
              "state":"NY",
              "postal code":10234
             },
   "PhoneNumber":
              [
               {"type":"home","number":"212-456-9876"},
               {"type":"cell","number":"401-123-4567"},
               {"type":"work","number":"212-444-5000"}
              ]
  }
  SET jstring = dynAbObj.%ToJSON() // dynamic abstract object to JSON string
  DO personDB.%FromJSON(jstring)   // JSON string inserted into document database

In this example, FullName is stored as a simple key:value pair. Address has a substructure which is stored as an object consisting of key:value pairs. PhoneNumber has a substructure which is stored as an array.

De-Normalized Data Structure

The following is a JSON example of a traditional SQL normalized relational data structure. It consists of two documents, which might be contained in two different collections:

{
   "id":123,
   "Name":"John Smith",
   "DOB":"1990-11-23",
   "Address":555
}
{
   "id":555,
   "street":"101 Main Street",
   "city":"Mapleville",
   "state":"NY",
   "postal code":10234
 }

The following is the same data de-normalized, specified as a single document in a collection containing a nested data structure:

{
   "id":123,
   "Name":"John Smith",
   "DOB":"1990-11-23",
   "Address":{
              "street":"101 Main Street",
              "city":"Mapleville",
              "state":"NY",
              "postal code":10234
             }
 }

In SQL converting from the first data structure to the second would involve changing the table data definition then migrating the data.

In Document Database, because there is no fixed schema, these two data structures can co-exist as different representations of the same data. The application code can access either data structure as required. You can either migrate the data to the new data structure, or leave the data unchanged in the old data structure format, in which case Document Database migrates data each time it accesses it using the new data structure.

Data Types and Values

In InterSystems IRIS Document Database, a key does not have a data type. However, a data value imported to Document Database may have an associated data type. Because the data type is associated with the specific value, replacing the value with another value may result in changing the data type of the key:value pair for that record.

Document Database does not have any reserved words or any special naming conventions. In a key:value pair, any string can be used as a key; any string or number can be used as a value. The key name can be the same as the value: "name":"name". A key name can be the same as its index name.

Document Database represents data values as JSON values. The following representations are used:

String values

All string values are represented by String.

Numbers

Numbers are represented in canonical form, with the following exception: JSON fractional numbers between 1 and -1 are represented with a leading zero integer (for example, 0.007); the corresponding InterSystems IRIS numbers are represented without the leading zero integer (for example, .007).

Non-printing characters

JSON provides escape code representations of the following non-printing characters (represented in ObjectScript by calls to the $CHAR function):

  • $CHAR(8): "\b"

  • $CHAR(9): "\t"

  • $CHAR(10): "\n"

  • $CHAR(12): "\f"

  • $CHAR(13): "\r"

All other non-printable characters are represented by an escaped hexadecimal notation. For example, $CHAR(11) as "\u000b". Printable characters can also be represented using escaped hexadecimal (Unicode) notation. For example, the Greek lowercase letter alpha can be represented as "\u03b1".

Other escaped characters

JSON escapes two printable characters, the double quote character and the backslash character:

  • $CHAR(34): "\""

  • $CHAR(92): "\\"

JSON Special Values

JSON special values can only be used within JSON objects and JSON arrays. They are different from the corresponding ObjectScript special values. JSON special values are specified without quotation marks (the same values within quotation marks is an ordinary data value). They can be specified in any combination of uppercase and lowercase letters; they are stored as all lowercase letters.

The following examples demonstrate special values as used in ObjectScript code.

  • JSON represents the absence of a value by using the null special value. Because Document Database does not normally include a key:value pair unless there is an actual value, null is only used in special circumstances, such as a placeholder for an expected value. This use of null is shown in the following example:

      SET jsonobj = {"name":"Fred","spouse":null}
      WRITE jsonobj.%ToJSON()
  • JSON represents a boolean value by using the true and false special values. This use of boolean values is shown in the following example:

      SET jsonobj = {"name":"Fred","married":false}
      WRITE jsonobj.%ToJSON()

    ObjectScript specifies boolean values using 0 and 1. (Actually “true” can be represented by 1 or by any non-zero number.) These values are not supported as boolean values within JSON documents.

In a few special cases, JSON uses parentheses to clarify syntax:

  • If you define a local variable with the name null, true, or false, you must use parentheses within JSON to have it treated as a local variable rather than a JSON special value. This is shown in the following example:

      SET true=1
      SET jsonobj = {"bool":true,"notbool":(true)}
      WRITE jsonobj.%ToJSON()
  • If you use the ObjectScript Follows operator (]) within an expression, you must use parentheses within JSON to have it treated as this operator, rather than as a JSON array terminator. In the following example, the expression b]a tests whether b follows a in the collation sequence, and returns an ObjectScript boolean value. The Follows expression must be enclosed in parentheses:

      SET a="a",b="b"
      SET jsonarray=[(b]a)]
      WRITE jsonarray.%ToJSON()
FeedbackOpens in a new tab