Introduction to Caché
Objects, SQL, and the Unified Data Architecture
A distinctive and powerful feature of Caché is its Unified Data Architecture, which provides simultaneous, high-performance object and relational access to data stored within Caché.
Within Caché, you can model your application components as objects. Objects are organized by classes which define the data (properties) and behavior (methods) of the object.
Unified Data Architecture
The meta-information, or definition, of each class is stored within a common repository referred to as the Caché class dictionary. The class dictionary is itself an object database, stored within Caché, whose contents can be accessed using objects. The class dictionary, by means of a class compiler, defines the storage structure needed by persistent objects and converts class definitions into parallel sets of executable code that provide both the object and relational access to this storage structure. By means of this architecture, the object and relational code paths are efficient and automatically synchronized with one another.
Class definitions can be added to the class dictionary in a number of ways:
- Interactively, using the Studio development environment.
- Relationally, using DDL. Caché accepts standard SQL DDL statements and automatically creates corresponding class and table definitions.
- Textually, using XML. Caché supports an external, XML representation of class definitions. Typically this is used for source code management, deployment, automatic code generation, and interoperation with other tools.
- Programmatically, using objects. Using the Caché set of class definition objects, you can create programs that communicate directly with the class dictionary and create new classes at application runtime.
- Using an XML Schema Wizard, included within Studio, that can create class definitions from most XML schema files.
The Caché object model differs from those of programming languages in that in addition to properties and methods, you can specify storage-related behavior such as indices, constraints, and storage structure.
The storage structure used by persistent objects is independent of the logical definition of a class and is quite flexible: developers can use the default structures provided by the class compiler or they can tune the structures for specific cases.
Caché includes a full-featured object database specifically designed to meet the needs of complex, transaction-oriented applications. The Caché object model includes the following features:
- Classes: You can define classes that represent the state (data) and behavior (code) of your application components. Classes are used to create instances of objects both as runtime components and as items stored within the database.
- Properties: Classes can include properties, which specify the data associated with each object instance. Properties can be simple literals (such as strings or integers), user-defined types (defined using data type classes), complex (or embedded) objects, collections, or references to other objects.
- Relationships: Classes can define how instances of objects are related to one another. The system automatically provides navigational methods for relationships as well as referential integrity within the database.
- Methods: Classes can define behavior by means of methods: executable code associated with an object. Object methods run within a Caché server process (though they can be invoked from a remote client). Object methods can be written in ObjectScript or SQL, or they can be generated using method generators: code that automatically creates customized methods according to user-defined rules.
- Object persistence: Persistent objects can automatically store themselves in, and retrieve themselves from, a database. The persistence support includes complete database functionality, including automatic transaction management, concurrency control, index maintenance, and data validation. Persistent objects are automatically visible through SQL queries.
- Inheritance: By deriving new classes from existing ones, you can reuse previously written code as well as create specialized versions of classes.
- Polymorphism: Caché supports complete object polymorphism. This means that applications can use a well-defined interface (a set of methods and properties provided by a superclass) and the system will automatically invoke the correct interface implementation based on the type of each object. This makes it much easier to develop flexible database applications.
- Swizzling (also known as lazy loading): Caché automatically swizzles (brings into memory from disk) any related persistent objects when they are referenced from other objects. This greatly simplifies working with complex data models.
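The idea behind swizzling can be illustrated outside Caché with a small, language-neutral sketch. The Python below is only a model of the concept: `FakeStore`, `LazyRef`, and the dictionary standing in for the disk are hypothetical names invented for this example, not Caché APIs. In Caché itself this happens automatically when one persistent object references another.

```python
class FakeStore:
    """Hypothetical stand-in for on-disk storage; counts how often it is read."""

    def __init__(self, records):
        self._records = dict(records)
        self.loads = 0

    def load(self, object_id):
        self.loads += 1
        return dict(self._records[object_id])


class LazyRef:
    """Holds only an object ID until the referenced object is first needed."""

    def __init__(self, store, object_id):
        self._store = store
        self._object_id = object_id
        self._target = None

    def get(self):
        # Swizzle: bring the referenced object into memory on first access.
        if self._target is None:
            self._target = self._store.load(self._object_id)
        return self._target


store = FakeStore({7: {"City": "Cambridge", "State": "MA"}})
home = LazyRef(store, 7)
loads_before_access = store.loads  # nothing has been read yet
city = home.get()["City"]          # first access triggers the load
home.get()                         # second access reuses the in-memory copy
loads_after_access = store.loads
```

The reference costs nothing until it is followed, and following it twice reads the store only once.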
The Caché object functionality is not a separate part of Caché; it is a central part of Caché programming and is fully integrated with relational access described elsewhere. However, for those who are interested specifically in object-oriented programming, the manual Using Caché Objects discusses Caché programming from this point of view.
The simplest and most common way to define classes within Caché is to use the Studio development environment. Studio lets you define classes either using a simple text format within a syntax-coloring editor or by using a graphical point-and-click interface. These two views are interchangeable and are automatically synchronized.
Here is the definition of an extremely simple persistent class, Component, as seen within Studio:
Class MyApp.Component Extends %Persistent
{
Property TheName As %String;
Property TheValue As %Integer;
}
This class is defined as a persistent class (that is, it can store itself within a database). In this case, the Caché-provided %Persistent class provides all the needed persistence code via inheritance (system class names start with a % character to distinguish them from application classes). The class belongs to the package MyApp. Packages group related classes together and greatly simplify development of large applications. The class defines two properties: TheName, which has a string value, and TheValue, which has an integer value.
From within ObjectScript code, such as within a method, you can use this object syntax to manipulate instances of the Component class:
// Create a new component
Set component = ##class(MyApp.Component).%New()
Set component.TheName = "Widget"
Set component.TheValue = 22
// Save the new Component to the database
Do component.%Save()
Using Basic, you can define a method to manipulate instances of the Component object:
' Create a new component
component = New Component()
component.TheName = "Widget"
component.TheValue = 22
' Save the new Component to the database
component.%Save()
At this point, a new instance of Component is stored within the database with a system-assigned unique object identifier. You can later retrieve this object by opening it (using its object identifier):
' Open an instance and double its value:
component = OpenId Component(id)
component.TheValue = component.TheValue * 2
component.%Save()
You can perform exactly the same operations using the native Java, C++, or other Caché client bindings. The class compiler can generate, and keep synchronized, any additional code required to access objects externally. For example, if you are using Caché with Java, you can specify that the class compiler automatically generate and maintain Java proxy classes that provide remote access to persistent database classes. Within a Java program you can use this object naturally:
// Get an instance of Component from the database
component = (MyApp.Component)MyApp.Component._open(database, new Id(id));
// Inspect some properties of this object
System.out.println("Name: " + component.getTheName());
System.out.println("Value: " + component.getTheValue());
Caché SQL is a full-featured relational database engine that is fully integrated with the Caché object technology. In addition to standard SQL-92 features, Caché SQL offers:
- Support for streams (known in SQL as Binary Large Objects, or BLOBs).
- Support for stored procedures (implemented as object methods).
- A set of object-based extensions.
- User-definable data types.
- Support for Transactional Bitmap Indices.
Bitmap indices, typically used in large data warehousing and OLAP systems, offer the ability to perform high-speed searches based on complex combinations of conditions. Such bitmap indices, however, typically cannot be updated in real time and are instead updated as a batch process. Caché SQL supports bitmap indices that offer high-performance searching power combined with no loss in insert/update performance. This gives transaction processing applications the ability to perform data warehouse-style queries and gives data warehouse applications the ability to perform real-time updates. For more information, refer to the Bitmap Indices content in the Caché SQL Optimization Guide.
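To see why bitmap indices suit searches on combinations of conditions, consider the following toy model in Python. It illustrates the general technique only and says nothing about how Caché implements or maintains its bitmap indices; the `BitmapIndex` class, the `row_ids` helper, and the sample rows are all hypothetical.

```python
from collections import defaultdict


class BitmapIndex:
    """Toy bitmap index: one integer bitmap per distinct column value."""

    def __init__(self):
        self._bitmaps = defaultdict(int)

    def add(self, row_id, value):
        # Set the bit for this row in the bitmap of its value.
        self._bitmaps[value] |= 1 << row_id

    def rows_with(self, value):
        # Return the bitmap for a value, or an empty bitmap if unseen.
        return self._bitmaps.get(value, 0)


def row_ids(bitmap):
    """Expand a bitmap into a sorted list of row IDs."""
    ids = []
    position = 0
    while bitmap:
        if bitmap & 1:
            ids.append(position)
        bitmap >>= 1
        position += 1
    return ids


# Hypothetical rows: (row ID, state, customer tier).
rows = [
    (1, "MA", "Gold"),
    (2, "TX", "Gold"),
    (3, "MA", "Silver"),
    (4, "MA", "Gold"),
]
by_state = BitmapIndex()
by_tier = BitmapIndex()
for row_id, state, tier in rows:
    by_state.add(row_id, state)
    by_tier.add(row_id, tier)

# state = 'MA' AND tier = 'Gold' becomes a bitwise AND of two bitmaps.
ma_and_gold = row_ids(by_state.rows_with("MA") & by_tier.rows_with("Gold"))
# state = 'TX' OR tier = 'Silver' becomes a bitwise OR.
tx_or_silver = row_ids(by_state.rows_with("TX") | by_tier.rows_with("Silver"))
```

Each condition reduces to one bitmap, so combining conditions is a matter of bitwise operations rather than row-by-row comparison.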
All components within the Caché dictionary are defined as classes. The class compiler automatically projects persistent classes as relational tables. For every object feature, there is a corresponding relational equivalent, as illustrated in the following table:
Relational View of Object Features
Object Feature | Relational Equivalent
Package | Schema
Class | Table
Object instance | Row within a table
Property | Column
Relationship | Foreign key
Embedded object | Multiple columns
Method | Stored procedure
Index | Index
When Caché loads SQL DDL (Data Definition Language) statements, it uses the inverse of this projection to create classes that correspond to relational tables.
To demonstrate the object-to-relational projection, consider a simple example. Here is the definition of a simple, persistent Person class (part of a package called MyApp) containing two properties, Name and Home:
Class MyApp.Person Extends %Persistent
{
Property Name As %String(MAXLEN=100);
Property Home As Address;
}
The Person class gets its persistent behavior from the %Persistent superclass provided with Caché. The Name property is defined as a simple string of up to 100 characters.
The Home property illustrates the use of complex, user-defined data types, in this case the Address class, which is defined as:
Class MyApp.Address Extends %SerialObject
{
Property City As %String;
Property State As %String;
}
The Address class is derived from the %SerialObject superclass. This class provides the ability to serialize itself (convert itself to a single-string representation) and embed itself within another containing class (as with the Person class).
When viewed via SQL, the Person class has the following structure:
SQL View of the Person class: SELECT * FROM Person
ID | Name | Home_City | Home_State
1 | Smith,John | Cambridge | MA
2 | Doe,Jane | Dallas | TX
Note that the object identifier is visible as a column. In addition, the fields of the embedded Address object are projected as separate fields. These fields are given the synthetic names Home_City and Home_State and behave exactly as if they were defined as two individual fields.
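The naming rule for embedded objects can be modeled in a few lines. The Python below is a rough sketch of the projection described above, not the class compiler's actual algorithm: the dataclasses mirror the Person and Address definitions, and `project_row` is a hypothetical helper that flattens one level of embedding into Property_Field columns.

```python
from dataclasses import dataclass, fields, is_dataclass


@dataclass
class Address:
    City: str
    State: str


@dataclass
class Person:
    Name: str
    Home: Address


def project_row(object_id, obj):
    """Flatten an object into one row; embedded objects become Prefix_Field columns."""
    row = {"ID": object_id}
    for field in fields(obj):
        value = getattr(obj, field.name)
        if is_dataclass(value):
            # Embedded object: one column per field, named <property>_<field>.
            for sub in fields(value):
                row[f"{field.name}_{sub.name}"] = getattr(value, sub.name)
        else:
            row[field.name] = value
    return row


row = project_row(1, Person("Smith,John", Address("Cambridge", "MA")))
```

The resulting row has the columns ID, Name, Home_City, and Home_State, matching the first row of the table above.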
Inheritance is an important feature within object-based systems and is completely lacking within relational databases. Caché SQL makes it possible to use the power of inheritance using standard relational constructs. For example, we can derive a new Employee class from the Person class used in the previous example:
Class MyApp.Employee Extends Person
{
Property Salary As %Integer(MINVAL=0,MAXVAL=100000);
}
This new class extends the Person class by adding an additional property, Salary.
When viewed via SQL, the Employee class has the following structure:
SQL View of the Employee class: SELECT * FROM Employee
ID | Name | Home_City | Home_State | Salary
3 | Divad, Gino | Irvine | CA | 22000
Notice that all of the inherited properties are available as columns. Also note that only rows that are actual instances of Employee are included. If we again ask for all Person instances:
Revised SQL View of the Person class: SELECT * FROM Person
ID | Name | Home_City | Home_State
1 | Smith,John | Cambridge | MA
2 | Doe,Jane | Dallas | TX
3 | Divad, Gino | Irvine | CA
Here all the rows are returned, because every Employee is defined to be an instance of Person. Only the properties defined by Person are displayed, however.
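The two views can be modeled with ordinary class inheritance. The following Python sketch is an analogy only and makes no claim about how Caché stores or queries subclass data: the `extent` dictionary and the `select_all` helper are hypothetical, and the embedded address is pre-flattened into Home_City and Home_State for brevity.

```python
from dataclasses import dataclass, fields


@dataclass
class Person:
    Name: str
    Home_City: str
    Home_State: str


@dataclass
class Employee(Person):
    Salary: int


# Hypothetical in-memory "extent": object ID -> stored instance.
extent = {
    1: Person("Smith,John", "Cambridge", "MA"),
    2: Person("Doe,Jane", "Dallas", "TX"),
    3: Employee("Divad, Gino", "Irvine", "CA", 22000),
}


def select_all(cls):
    """Rough analogue of SELECT * FROM <cls>: matching instances, cls's columns only."""
    names = [f.name for f in fields(cls)]
    return [
        (object_id, *(getattr(obj, name) for name in names))
        for object_id, obj in sorted(extent.items())
        if isinstance(obj, cls)
    ]


employee_rows = select_all(Employee)  # only the Employee instance, with Salary
person_rows = select_all(Person)      # all three rows, Person columns only
```

Asking for Employee yields one row that includes Salary, while asking for Person yields all three rows without it, as in the two tables above.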
To make it easier to use SQL within object applications, Caché includes a number of object extensions to SQL.
One of the most interesting of these extensions is the ability to follow object references using the reference (->) operator. For example, suppose you have a Vendor class that refers to two other classes: Contact and Region. You can refer to properties of the related classes using the reference operator:
SELECT ID,Name,ContactInfo->Name
FROM Vendor
WHERE Vendor->Region->Name = 'Antarctica'
Of course, you can also express the same query using SQL JOIN syntax. The advantage of the reference operator syntax is that it is succinct and easy to understand at a glance.
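For comparison, an explicit JOIN formulation might look like the sketch below. It runs against SQLite through Python's standard sqlite3 module as a stand-in for Caché SQL, so it shows only the general shape of the query. The table layout (Vendor.ContactInfo and Vendor.Region holding the IDs of the related rows), the sample data, and the choice of inner joins are all assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Contact (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Region  (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Vendor  (
        ID INTEGER PRIMARY KEY,
        Name TEXT,
        ContactInfo INTEGER REFERENCES Contact(ID),
        Region INTEGER REFERENCES Region(ID)
    );
    -- Hypothetical sample data.
    INSERT INTO Contact VALUES (1, 'Amundsen,Roald'), (2, 'Doe,Jane');
    INSERT INTO Region  VALUES (1, 'Antarctica'), (2, 'Europe');
    INSERT INTO Vendor  VALUES (10, 'IceCo', 1, 1), (11, 'EuroParts', 2, 2);
    """
)

# Each reference followed with -> becomes one explicit join.
rows = conn.execute(
    """
    SELECT Vendor.ID, Vendor.Name, Contact.Name
    FROM Vendor
    JOIN Contact ON Contact.ID = Vendor.ContactInfo
    JOIN Region  ON Region.ID  = Vendor.Region
    WHERE Region.Name = 'Antarctica'
    """
).fetchall()
conn.close()
```

The JOIN version needs two extra clauses and fully qualified column names to say what the reference operator expresses in a single expression per reference.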