Objects, SQL, and the Unified Data Architecture

A powerful and distinctive feature of Caché is its Unified Data Architecture, which provides simultaneous, high-performance object and relational access to data stored within Caché.

Unified Data Dictionary

Within Caché, you can model your application components as objects. Objects are organized into classes, which define the data (properties) and behavior (methods) of each object.

Figure: Unified Data Architecture

The meta-information, or definition, of each class is stored within a common repository referred to as the Caché class dictionary. The class dictionary is itself an object database, stored within Caché, whose contents can be accessed using objects. The class dictionary, by means of a class compiler, defines the storage structure needed by persistent objects and converts class definitions into parallel sets of executable code that provide both the object and relational access to this storage structure. By means of this architecture, the object and relational code paths are efficient and automatically synchronized with one another.

Class definitions can be added to the class dictionary in a number of ways:

Interactively, using the Studio development environment.
Relationally, using DDL. Caché accepts standard SQL DDL statements and automatically creates corresponding class and table definitions.
Textually, using XML. Caché supports an external, XML representation of class definitions. Typically this is used for source code management, deployment, automatic code generation, and interoperation with other tools.
Programmatically, using objects. Using the Caché set of class definition objects, you can create programs that communicate directly with the class dictionary and create new classes at application runtime.
From XML schemas, using the XML Schema Wizard. This wizard, included within Studio, can create class definitions from most XML schema files.

Flexible Storage

The Caché object model differs from those of programming languages in that, in addition to properties and methods, you can specify storage-related behavior such as indices, constraints, and storage structure.

The storage structure used by persistent objects is independent of the logical definition of a class and is quite flexible: developers can use the default structures provided by the class compiler or they can tune the structures for specific cases.

Objects

Caché includes a full-featured, next-generation object database specifically designed to meet the needs of complex, transaction-oriented applications. The Caché object model includes the following features:

Classes — You can define classes that represent the state (data) and behavior (code) of your application components. Classes are used to create instances of objects as both runtime components and as items stored within the database.
Properties — Classes can include properties, which specify the data associated with each object instance. Properties can be simple literals (such as strings or integers), user-defined types (defined using data type classes), complex (or embedded) objects, collections, or references to other objects.
Relationships — Classes can define how instances of objects are related to one another. The system automatically provides navigational methods for relationships as well as referential integrity within the database.
Methods — Classes can define behavior by means of methods: executable code associated with an object. Object methods run within a Caché server process (though they can be invoked from a remote client). Object methods can be scripted using ObjectScript, SQL, or they can be generated using method generators, which are code that automatically creates customized methods according to user-defined rules.
Object persistence — Persistent objects have the ability to automatically store and retrieve themselves to a database. The persistence support includes complete database functionality including automatic transaction management, concurrency control, index maintenance, and data validation. Persistent objects are automatically visible through SQL queries.
Inheritance — By deriving new classes from existing ones, you can reuse previously written code as well as create specialized versions of classes.
Polymorphism — Caché supports complete object polymorphism. This means that applications can use a well-defined interface (a set of methods and properties provided by a superclass) and the system will automatically invoke the correct interface implementation based on the type of each object. This makes it much easier to develop flexible database applications.
Swizzling (also known as “lazy loading”) — Caché automatically swizzles (brings into memory from disk) any related persistent objects when they are referenced from other objects. This greatly simplifies working with complex data models.
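The swizzling behavior in the last item can be illustrated outside Caché with a short Python sketch. It is an analogy only, not Caché's implementation: the FakeDatabase class, the LazyRef wrapper, and the sample id and values are hypothetical names invented for this illustration.

```python
# Analogy only: a toy "database" and lazy reference, not Caché's implementation.

class FakeDatabase:
    """Hypothetical in-memory store that counts how often it is read."""

    def __init__(self, rows):
        self.rows = rows      # maps object id -> dict of property values
        self.loads = 0        # number of simulated disk reads

    def load(self, oid):
        self.loads += 1
        return self.rows[oid]


class LazyRef:
    """Holds only an object id until the referenced object is first used."""

    def __init__(self, db, oid):
        self._db = db
        self._oid = oid
        self._obj = None      # not yet brought into memory

    def get(self):
        if self._obj is None:             # first access: load from the store
            self._obj = self._db.load(self._oid)
        return self._obj                  # later accesses reuse the loaded copy


db = FakeDatabase({7: {"City": "Cambridge", "State": "MA"}})
home = LazyRef(db, 7)

loads_before = db.loads          # nothing has been read yet
city = home.get()["City"]        # triggers the single load
state = home.get()["State"]      # served from memory
loads_after = db.loads
```

In this sketch the referenced object is read once, on first use, and every later access reuses the in-memory copy.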

The Caché object functionality is not a separate part of Caché; it is a central part of Caché programming and is fully integrated with relational access described elsewhere. However, for those who are interested specifically in object-oriented programming, the manual Using Caché Objects discusses Caché programming from this point of view.

Defining Classes

The simplest and most common way to define classes within Caché is to use the Studio development environment. The Studio lets you define classes either using a simple text format within a syntax-coloring editor or by using a graphical point-and-click interface. These two views are interchangeable and are automatically synchronized.

Here is the definition of an extremely simple persistent object, Component, as seen within Studio:

Class MyApp.Component Extends %Persistent
{
Property TheName As %String;
Property TheValue As %Integer;
}

This class is defined as a persistent class (that is, it can store itself within a database). In this case, the Caché-provided %Persistent class (system class names start with a “%” character to distinguish them from application classes) provides all the needed persistence code via inheritance. The class belongs to the package, “MyApp”. Packages group related classes together and greatly simplify development of large applications. The class defines two properties: TheName, which has a string value, and TheValue, which has an integer value.

From within ObjectScript code, such as within a method, you can use this object syntax to manipulate instances of the Component object:

 // Create a new component
 Set component = ##class(MyApp.Component).%New()
 Set component.TheName = "Widget"
 Set component.TheValue = 22

 // Save the new Component to the database
 Do component.%Save()

Using Basic, you can define a method to manipulate instances of the Component object:

' Create a new component
component = New Component()
component.TheName = "Widget"
component.TheValue = 22

' Save the new Component to the database
component.%Save()

At this point, a new instance of Component is stored within the database with a system-assigned unique object identifier. You can later retrieve this object by opening it (using its object identifier):

' Open an instance and double its value:
component = OpenId Component(id)

component.TheValue = component.TheValue * 2
component.%Save()

You can perform the exact same operations using native Java, C++, or other Caché client bindings. The class compiler can generate, and synchronize, any additional code required to access objects externally. For example, if you are using Caché with Java, you can specify that the class compiler automatically generate and maintain Java proxy classes that provide remote access to persistent database classes. Within a Java program you can use this object naturally:

// Get an instance of Component from the database
MyApp.Component component = (MyApp.Component) MyApp.Component._open(database, new Id(id));

// Inspect some properties of this object
System.out.println("Name: " + component.getTheName());
System.out.println("Value: " + component.getTheValue());

SQL

Caché SQL is a full-featured relational database engine that is fully integrated with the Caché object technology. In addition to standard SQL-92 features, Caché SQL offers:

Support for streams (known in SQL as Binary Large Objects, or BLOBs).
Support for stored procedures (implemented as object methods).
A set of object-based extensions.
User-definable data types.
Support for Transactional Bitmap Indices.
Bitmap indices, typically used in large data warehousing and OLAP systems, offer the ability to perform high-speed searches based on complex combinations of conditions. Conventional bitmap indices, however, cannot be updated in real time and are typically updated as a batch process. Caché SQL supports bitmap indices that combine high-performance searching with no loss in insert/update performance. This gives transaction processing applications the ability to perform data warehouse-style queries and gives data warehouse applications the ability to perform real-time updates. For more information, refer to the “Bitmap Indices” content in the Caché SQL Optimization Guide.
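To make the idea of combining conditions with a bitmap index concrete, here is a minimal Python sketch. It is a generic illustration of the technique, not a description of how Caché stores or maintains its bitmap indices; the sample rows and the helper names build_bitmap and matching_positions are assumptions made for this example.

```python
# Analogy only: a toy bitmap index built from Python integers used as bit sets.
# This is not how Caché stores or maintains its bitmap indices.

rows = [
    {"id": 0, "state": "MA", "active": True},
    {"id": 1, "state": "TX", "active": True},
    {"id": 2, "state": "MA", "active": False},
    {"id": 3, "state": "CA", "active": True},
]

def build_bitmap(rows, predicate):
    """Return an int whose bit i is set when row i satisfies the predicate."""
    bitmap = 0
    for position, row in enumerate(rows):
        if predicate(row):
            bitmap |= 1 << position
    return bitmap

def matching_positions(bitmap):
    """List the row positions whose bits are set."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]

in_ma = build_bitmap(rows, lambda r: r["state"] == "MA")   # rows 0 and 2
in_ca = build_bitmap(rows, lambda r: r["state"] == "CA")   # row 3
active = build_bitmap(rows, lambda r: r["active"])          # rows 0, 1, and 3

# (state = 'MA' OR state = 'CA') AND active, using two bitwise operations
result = matching_positions((in_ma | in_ca) & active)
```

Each condition is one bit string, so a compound search reduces to bitwise OR and AND over whole groups of rows at once.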

The Object/Relational Connection

All components within the Caché dictionary are defined as classes. The class compiler automatically projects persistent classes as relational tables. For every object feature, there is a corresponding relational equivalent, as illustrated in the following table:

Relational View of Object Features

Object Feature	Relational Equivalent
Package	Schema
Class	Table
Object instance	Row within a table
Property	Column
Relationship	Foreign key
Embedded object	Multiple columns
Method	Stored procedure
Index	Index

When Caché loads SQL DDL (Data Definition Language) statements, it uses the inverse of this projection to create classes that correspond to relational tables.

To demonstrate the object-to-relational projection, consider a simple example. Here is the definition of a simple, persistent Person class (part of a package called “MyApp”) containing two properties, Name and Home:

Class MyApp.Person Extends %Persistent
{
Property Name As %String(MAXLEN=100);
Property Home As Address;
}

The Person class gets its persistent behavior from the %Persistent superclass provided with Caché. The Name property is defined as a simple String of up to 100 characters.

The Home property illustrates the use of complex, user-defined data types, in this case the Address class, which is defined as:

Class MyApp.Address Extends %SerialObject
{
Property City As %String;
Property State As %String;
}

The Address class is derived from the %SerialObject superclass. This class provides the ability to serialize itself (convert itself to a single-string representation) and embed itself within another containing class (as with the Person class).

When viewed via SQL, the Person class has the following structure:

SQL View of the Person class: SELECT * FROM Person

ID	Name	Home_City	Home_State
1	Smith,John	Cambridge	MA
2	Doe,Jane	Dallas	TX

Note that the object identifier is visible as a column. In addition, the fields of the embedded Address object are projected as separate fields. These fields are given the synthetic names Home_City and Home_State and behave exactly as if they were defined as two individual fields.
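The way an embedded object expands into prefixed columns can be sketched in a few lines of Python. This is an analogy, not the class compiler's actual projection logic: the dataclasses mirror the Person and Address classes above, and the project function is a hypothetical helper written for this illustration.

```python
# Analogy only: flattening an embedded object into prefixed columns.
from dataclasses import dataclass, fields, is_dataclass

@dataclass
class Address:
    City: str
    State: str

@dataclass
class Person:
    Name: str
    Home: Address

def project(obj, prefix=""):
    """Flatten a dataclass instance into a dict of column name -> value."""
    row = {}
    for f in fields(obj):
        value = getattr(obj, f.name)
        if is_dataclass(value):
            # Embedded object: expand into <property>_<subproperty> columns
            row.update(project(value, prefix + f.name + "_"))
        else:
            row[prefix + f.name] = value
    return row

row = project(Person("Smith,John", Address("Cambridge", "MA")))
```

The resulting dictionary has the keys Name, Home_City, and Home_State, matching the column names in the table above (the ID column is omitted from this sketch).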

Inheritance and SQL

Inheritance is an important feature within object-based systems and is completely lacking within relational databases. Caché SQL makes it possible to use the power of inheritance using standard relational constructs. For example, we can derive a new Employee class from the Person class used in the previous example:

Class MyApp.Employee Extends Person
{
Property Salary As %Integer(MINVAL=0,MAXVAL=100000);
}

This new class extends the Person class by adding an additional property, Salary.

When viewed via SQL, the Employee class has the following structure:

SQL View of the Employee class: SELECT * FROM Employee

ID	Name	Home_City	Home_State	Salary
3	Divad, Gino	Irvine	CA	22000

Notice that all of the inherited properties are available as columns. Also note that only rows that are actual instances of Employee are included. If we again ask for all Person instances:

Revised SQL View of the Person class: SELECT * FROM Person

ID	Name	Home_City	Home_State
1	Smith,John	Cambridge	MA
2	Doe,Jane	Dallas	TX
3	Divad, Gino	Irvine	CA

Here, all the rows are returned because every Employee is, by definition, an instance of Person. However, only the properties defined by Person are displayed.
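The same visible behavior can be imitated in a plain relational database, which may help readers coming from SQL. The following Python sketch uses SQLite and is only one possible modeling, not a description of Caché's storage: the EmployeeData table and the Employee view are hypothetical constructs invented for this illustration.

```python
# Analogy only: imitating the Person/Employee results with SQLite.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person (ID INTEGER PRIMARY KEY, Name TEXT,
                         Home_City TEXT, Home_State TEXT);
    -- Subclass rows share the parent's ID and add only their own columns.
    CREATE TABLE EmployeeData (ID INTEGER PRIMARY KEY REFERENCES Person(ID),
                               Salary INTEGER);
    CREATE VIEW Employee AS
        SELECT p.ID, p.Name, p.Home_City, p.Home_State, e.Salary
        FROM Person p JOIN EmployeeData e ON e.ID = p.ID;

    INSERT INTO Person VALUES (1, 'Smith,John', 'Cambridge', 'MA');
    INSERT INTO Person VALUES (2, 'Doe,Jane', 'Dallas', 'TX');
    INSERT INTO Person VALUES (3, 'Divad, Gino', 'Irvine', 'CA');
    INSERT INTO EmployeeData VALUES (3, 22000);
""")

# Every employee is also a person, so Person returns all three rows.
people = conn.execute("SELECT ID, Name FROM Person ORDER BY ID").fetchall()
# Only actual Employee instances appear here, with inherited columns included.
employees = conn.execute("SELECT * FROM Employee").fetchall()
```

In Caché none of this scaffolding is written by hand; the sketch only reproduces the query results shown in the tables above.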

Object Extensions to SQL

To make it easier to use SQL within object applications, Caché includes a number of object extensions to SQL.

One of the most interesting of these extensions is the ability to follow object references using the reference (“->”) operator. For example, suppose you have a Vendor class that refers to two other classes: Contact and Region. You can refer to properties of the related classes using the reference operator:

SELECT ID,Name,ContactInfo->Name
FROM Vendor
WHERE Region->Name = 'Antarctica'

Of course, you can also express the same query using SQL JOIN syntax. The advantage of the reference operator syntax is that it is succinct and easy to understand at a glance.
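For comparison, here is roughly what the JOIN form of such a query could look like, run against SQLite from Python so that it is self-contained. The table layout is an assumption made for this sketch: Vendor is given ContactInfo and Region columns holding the IDs of the referenced rows, and the sample data is invented. LEFT JOIN is used for the contact on the assumption that a vendor without a contact should still be listed.

```python
# Sketch only: an explicit JOIN over hypothetical Vendor, Contact, and Region tables.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Contact (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Region  (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Vendor  (ID INTEGER PRIMARY KEY, Name TEXT,
                          ContactInfo INTEGER REFERENCES Contact(ID),
                          Region INTEGER REFERENCES Region(ID));

    INSERT INTO Contact VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO Region  VALUES (1, 'Antarctica'), (2, 'Europe');
    INSERT INTO Vendor  VALUES (10, 'IceCo', 1, 1), (11, 'EuroParts', 2, 2);
""")

rows = conn.execute("""
    SELECT v.ID, v.Name, c.Name
    FROM Vendor v
    LEFT JOIN Contact c ON c.ID = v.ContactInfo
    JOIN Region r ON r.ID = v.Region
    WHERE r.Name = 'Antarctica'
""").fetchall()
```

The explicit version needs two join clauses and three table aliases to say what the reference operator expresses inline, which is the succinctness the paragraph above refers to.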