|Database Concepts and Database Files|
The Caché global database is the data storage structure that underlies Caché. This chapter discusses how globals are stored and accessed.
Caché Data Structures
The Global Database
Physical Structure of a Caché Database
|Caché Data Structures|
Caché categorizes data in terms of two contrasts: local versus global, and scalar versus array.
A local variable resides in the partition space of the process in which it is created. It is transient, existing only during the life of a process and available only to that process.
A global variable ("global") is permanent data and is potentially available to all processes. Globals are stored in Caché database files called CACHE.DAT (or CACHE.EXT in multi-extent databases).
A scalar variable stores a single value. An array stores more than one value.
A local variable can be a scalar or an array. A global variable can be a scalar or an array.
The two contrasts allow for four types of variables, as illustrated in the following table.
|Variable Type||Local||Global|
|Scalar||X, John||^X, ^John|
|Array||X(1), JOHN("ADDR","CITY")||^X(1), ^JOHN("ADDR","CITY")|
A local variable can be a scalar or an array. It exists only as long as the process in which it is created. The name of a local variable never begins with a caret ("^").
You can display all local variables on your current device (usually your terminal) by using an argumentless WRITE command. This command displays local variables according to their ASCII collation sequence.
address("John Jones")="12 Main Street"
In Caché, the partition is divided into blocks of memory. Block sizes are powers of two: 8, 16, 32, 64 bytes, and so on. A variable and its data are stored in the smallest block that will contain them.
For example, a variable equal to a 100-character string would be stored in a 128-byte block. This block design allows very fast access to space in the process's partition, making Caché execution extremely fast.
<STORE> errors occur when there is no block large enough to store a variable. For example, a new variable set equal to a 258-character string requires a 512-byte or larger block; if there are none to be had, the job will get a <STORE> error. This error can occur even when $SPACE, the Caché special variable indicating the number of free bytes remaining in a partition, is non-zero. This might indicate that several smaller blocks are available but none large enough for the particular variable.
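The block-sizing rule and the resulting <STORE> condition can be modeled in a few lines of Python. This is only a sketch under simplifying assumptions: one byte per character, no per-variable overhead, and a hypothetical list of free block sizes standing in for the partition. The names `block_size` and `allocate` are illustrative and are not part of Caché.

```python
def block_size(needed_bytes, smallest=8):
    """Return the smallest power-of-two block size that holds needed_bytes."""
    size = smallest
    while size < needed_bytes:
        size *= 2
    return size


def allocate(free_blocks, needed_bytes):
    """Take the smallest free block that fits, or fail if none is large enough.

    free_blocks is a hypothetical list of free block sizes in the partition.
    """
    candidates = [b for b in free_blocks if b >= needed_bytes]
    if not candidates:
        # Free space may remain, but no single block can hold the variable.
        raise MemoryError("<STORE>")
    chosen = min(candidates)
    free_blocks.remove(chosen)
    return chosen


print(block_size(100))   # 128
print(block_size(258))   # 512

free = [64, 128, 256]    # 448 bytes free in total, largest block is 256
try:
    allocate(free, 258)
except MemoryError as err:
    print(err, "with", sum(free), "bytes still free")
```

The last call fails even though 448 bytes are free, which mirrors the case where the free-space count is non-zero but no block is large enough.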
Some Caché ObjectScript commands work only on local variables. These include NEW, exclusive NEW, KILL with no arguments, and exclusive KILL.
Note: Most Caché ObjectScript commands work on both local and global variables.
A global variable can be a scalar or an array. It is potentially available to all processes. Global variables are stored in Caché database files called CACHE.DAT (or CACHE.EXT in multi-extent databases). The name of a global variable always begins with a caret ("^").
The characteristics and storage of global variables, as well as how to name and reference them, are discussed in depth later in this chapter.
The commands NEW, exclusive NEW, KILL with no arguments, and exclusive KILL do not work on globals, only on local variables, as mentioned in "Local Variables".
A scalar variable stores one value. It can be local or global. You reference a scalar variable by simply using its name, with no subscripts, as in the examples in Table 2-1, "Local and Global Variable Examples," on page 2-2.
An array stores a collection of related data values. It can be local or global. For instance, you would store the names of customers in an array. Individual elements of an array are scalars. You reference them by following the array name with one or more subscripts enclosed in parentheses, as in the examples in Table 2-1, "Local and Global Variable Examples," on page 2-2.
The subscript can be a number, as in ^PATREC(1), or a descriptive alphanumeric string, as in ^PATREC("NAME").
Note: You must enclose a string subscript in double quotes.
Although subscripts are most efficient as integers in local arrays, using descriptive strings can provide internal documentation. For instance, the following subscripts are much more informative than would be the integers 1, 2, and 3:
Multiple subscripts are allowed, separated by commas.
Using numeric subscripts:
Using descriptive subscripts:
Array nodes are arranged hierarchically by subscript. Nodes with the same number of subscripts are said to be at the same level. All nodes with one subscript are at the same level, those with two subscripts are at the same level, one level lower, and so on. A node with the same subscripts as another node together with one or more further subscripts is said to be a descendant of that node.
^PATREC(1) and ^PATREC(2) are at the same level, because each has one subscript.
^PATREC(1,1) and ^PATREC(1,2) are at the same level, one level down, because each has two subscripts.
Each of ^PATREC(1,1) and ^PATREC(1,2) is a descendant of ^PATREC(1) because each has the subscript 1, together with a further subscript 1 or 2.
^PATREC(1,2,1) is a descendant of ^PATREC(1,2) (but not of ^PATREC(1,1)), because it has the same subscripts 1 and 2 together with the further subscript 1.
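The level and descendant rules can be checked mechanically. The following Python sketch represents each node by a tuple of its subscripts, which is a modeling assumption and not how Caché stores nodes; `level` and `is_descendant` are hypothetical helper names.

```python
def level(node):
    """A node's level is determined by its number of subscripts."""
    return len(node)


def is_descendant(node, ancestor):
    """True if node has all of ancestor's subscripts plus at least one more."""
    return len(node) > len(ancestor) and node[:len(ancestor)] == ancestor


print(level((1,)) == level((2,)))        # True: ^PATREC(1) and ^PATREC(2)
print(is_descendant((1, 2), (1,)))       # True: ^PATREC(1,2) under ^PATREC(1)
print(is_descendant((1, 2, 1), (1, 2)))  # True
print(is_descendant((1, 2, 1), (1, 1)))  # False
```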
There are no constraints on the size of a global array, other than the amount of disk space available and free room within the Caché database file. In addition, the number of subscripts it can contain is dynamic; that is, its content and structure can change while in use.
A global reference is limited to 255 characters (length of global name + length of subscripts).
You can use the exclusive NEW, exclusive KILL, and argumentless KILL commands only with local variables. This includes local arrays; however, you must not use subscripted array variable names as arguments with the exclusive forms of these commands.
Caution: A KILL of an array node deletes that node and all of its descendant nodes. Use extreme care when using the KILL command on an array node to avoid undesirable results.
Suppose you have the variables in Table 2-1, "Local and Global Variable Examples," on page 2-2.
You can enter the following command to create new versions of all local variables except the scalar and array variables named X:
NEW (X)
However, you could NOT use the following command to create new versions of all variables except the array node X(1):
NEW (X(1))
Global and local arrays are sparse arrays; that is, Caché allocates space for elements only when data is defined for them. Caché never sets aside space for empty data fields or dummy values. This approach differs from many other languages, where space must be declared and set aside for all possible array nodes.
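As a rough illustration of sparse storage, the Python sketch below keeps only the nodes that were actually assigned, keyed by a tuple of subscripts. The class `SparseArray` is a hypothetical toy model, not Caché's implementation. Its `kill` method also shows the behavior noted in the earlier caution: killing a node removes that node and all of its descendants.

```python
class SparseArray:
    """Toy model: only nodes that were assigned a value take up space."""

    def __init__(self):
        self.nodes = {}  # maps a tuple of subscripts to a value

    def set(self, subscripts, value):
        self.nodes[tuple(subscripts)] = value

    def kill(self, subscripts):
        """Delete the node and every node whose subscripts start with it."""
        prefix = tuple(subscripts)
        doomed = [k for k in self.nodes if k[:len(prefix)] == prefix]
        for key in doomed:
            del self.nodes[key]


patrec = SparseArray()
patrec.set((1,), "a")
patrec.set((1, 2), "b")
patrec.set((1000000,), "c")   # no space is set aside for nodes 2..999999
print(len(patrec.nodes))      # 3

patrec.kill((1,))             # removes (1,) and its descendant (1, 2)
print(sorted(patrec.nodes))   # [(1000000,)]
```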
Array names sort according to ASCII collation sequence, just like other variables. However, within a single array, array nodes sort according to Caché collation sequence, by default.
Variable names are strings of alphanumeric characters. Only the first 31 characters are significant.
The first character must be an alphabetic character or a "%" character. Names are case sensitive. Thus, COUNT is a different variable from Count.
Global variables are distinguished from local variables by the addition of a leading "^" character. Thus COUNT is a local variable and ^COUNT is a global variable.
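The naming rules can be expressed as a small checker. This Python sketch makes several assumptions that the text does not state: it accepts ASCII letters and digits only, and it treats the leading caret as separate from the 31 significant characters. The function names are illustrative.

```python
import re

# Optional caret, then a letter or "%", then letters and digits.
NAME = re.compile(r"\^?[A-Za-z%][A-Za-z0-9]*")


def classify(name):
    """Return 'global', 'local', or 'invalid' for a variable name."""
    if not NAME.fullmatch(name):
        return "invalid"
    return "global" if name.startswith("^") else "local"


def significant(name):
    """Keep the caret, if any, plus the first 31 characters of the name."""
    caret = "^" if name.startswith("^") else ""
    return caret + name[len(caret):len(caret) + 31]


print(classify("COUNT"))                            # local
print(classify("^COUNT"))                           # global
print(classify("1COUNT"))                           # invalid
print(significant("COUNT") == significant("Count")) # False: case matters
```

Under these assumptions, two names that differ only after the 31st character map to the same significant name.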
Any global whose first character after the "^" is the "%" character is stored in the CACHE.DAT file in the system manager's namespace. You must have the proper access privileges to create % globals.
Caché stores globals without a "%" character in the current namespace, unless you explicitly specify another namespace. See the section "Referencing Globals".
The following are examples of variable names and where they are stored:
|Variable Name||Variable Type||Storage location|
||local variable (scalar or array)||process partition|
||local array node||process partition|
||global variable (scalar or array)||current namespace|
||global array node||current namespace|
||global variable (scalar or array)||system manager's namespace|
||global array node||system manager's namespace|
The length of data in a Caché variable can range from 1 to 32,767 characters. This means that string data can contain up to 32,767 characters and numeric data can have up to 17 significant digits.
Caché still works with numbers that have more significant digits than that maximum, but it replaces the low-order digits beyond the limit with zeros.
Caché has no data types. (Advanced Data Type elements used with Caché object technology or with Caché SQL are coded to preserve the data type when used with those interfaces. However, the underlying database has no data types.)
Caché local and global variables are never declared. When a variable is first assigned a value, it is created in memory if it is local, or in the global database if it is global. Caché interprets the data type of a variable based on the context in which it is used:
|Use $DATA to Determine Value and Descendants|
Three pieces of information are maintained for each local and global variable:
You use the function $DATA to determine whether a variable has a value and/or descendants. You call $DATA by entering:
%SYS> SET X=$DATA(variable_name)
%SYS> WRITE X
The following table summarizes the meaning of the output you obtain from $DATA.
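$DATA returns 0 when the variable is undefined, 1 when it has a value but no descendants, 10 when it has descendants but no value, and 11 when it has both. The Python sketch below mimics that result for a hypothetical dictionary that maps tuples of subscripts to values. It models only the return codes, not the ObjectScript function itself.

```python
def data(nodes, subscripts):
    """Return 0, 1, 10, or 11 in the manner of $DATA.

    nodes maps tuples of subscripts to values (a stand-in for an array).
    """
    target = tuple(subscripts)
    has_value = target in nodes
    has_descendants = any(
        len(key) > len(target) and key[:len(target)] == target
        for key in nodes
    )
    return (10 if has_descendants else 0) + (1 if has_value else 0)


nodes = {(1,): "a", (1, 2): "b", (2, 3): "c"}
print(data(nodes, (1,)))    # 11: value and descendants
print(data(nodes, (1, 2)))  # 1: value only
print(data(nodes, (2,)))    # 10: descendants only
print(data(nodes, (3,)))    # 0: undefined
```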
Local and global variables are created with the SET command and deleted with the explicit KILL command.
The syntax used to refer to a local variable is its name only. The syntax used to reference globals always includes a "^" (caret). However, other aspects of global reference syntax vary depending on where the global is located and whether it has been set up as a mapped or replicated global. See the final section in this chapter, "Referencing Globals".
Anywhere the ObjectScript language permits an unsubscripted variable, it also permits a reference to an array node, except in the exclusive KILL and exclusive NEW commands. You refer to individual array elements by specifying the array name and one or more subscripts.
|The Global Database|
Globals form the common database that is available to all Caché users with appropriate system privileges.
|Characteristics of Globals|
Four characteristics affect how users access a global and where modifications to a global are stored. The following table lists these characteristics, the default value that each characteristic is given when a global is created, and the utility you use to change the characteristic.
|Characteristic||Default Value||How to Change|
|Mapped||Not mapped||Configuration Manager > Namespace > Global Mapping|
|Replicated||Not replicated||Configuration Manager > Namespace > Global Mapping|
|Journaled||Not journaled, except cluster-mounted CACHE.DAT on VAX systems||Control Panel > Local Database. Double-click the global to bring up the properties menu.|
|Protection||Owner: RWD; Group: N; World: N; Network: RWD||Control Panel > Local Database. Double-click the global to bring up the properties menu.|
The journaled and protection characteristics apply to individual globals and are stored in code form along with the global's name in the global directory. The other two characteristics apply to global mapping and are stored in system tables rather than specific globals.
Mapping characteristics are associated with globals or sets of globals and may be specified either with a wildcard character (*) or as a range. See the Caché Basic System Management Guide for further details on global mapping.
You can display a global's characteristics from the Properties window in the Caché Explorer.
If you change any of a global's characteristics, the change takes effect immediately. A user who was accessing the global before or during the change continues to use the old global information, which may result in an error. After the error, that user's next access to the global uses the new information. If it is not feasible to identify such users, the best policy is to stop and restart Caché.
|Automatic I/O for Globals|
Caché ObjectScript does not treat its global database as a separate entity. You need never be concerned with opening, closing, declaring, or defining a global. Caché automatically handles these tasks as global data is stored.
You never need to explicitly read data from or write data to the database. Rather, you can view the database as an extension of your local variable data. You can modify or retrieve elements directly from the global database with the same command syntax used for local data in memory.
When global data is deleted, Caché automatically releases disk storage that is no longer needed.
|Database Cache Reduces I/O Time|
Caché maintains a database cache to store global data for use by all processes on your system. When a process needs to read or write global data, Caché first looks to see if another process has caused the requested global block to be read into a global buffer. If it has, there is no need to read the global block again from disk. This greatly increases the performance of Caché, since I/O is typically the slowest point in software applications.
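The idea of a shared buffer pool can be sketched as follows. This toy Python class is an assumption-laden illustration: it has unlimited capacity, never evicts or writes back blocks, and takes a caller-supplied `read_block` function in place of real disk I/O. It does not describe Caché's actual buffer management.

```python
class BufferCache:
    """Toy shared cache: a block is read from disk only on its first use."""

    def __init__(self, read_block):
        self.read_block = read_block  # function that performs the slow read
        self.buffers = {}             # block number -> block contents
        self.disk_reads = 0

    def get(self, block_no):
        if block_no not in self.buffers:
            # Cache miss: go to disk once and keep the block in a buffer.
            self.buffers[block_no] = self.read_block(block_no)
            self.disk_reads += 1
        return self.buffers[block_no]


cache = BufferCache(lambda n: f"contents of block {n}")
cache.get(7)             # first process: read from disk
cache.get(7)             # second process: served from the buffer
print(cache.disk_reads)  # 1
```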
The system manager can define the amount of database cache Caché can create. See the Caché Advanced System Management Guide for techniques to determine the minimum and optimal amount of database cache you need for your system.
|Logical Structure of Globals|
Global structures are based on a logical system of multi-dimensional, balanced trees. Each node in the database contains two pieces of information:
Nodes at lower subscript levels are pointed to by nodes at a higher subscript level. A node can exist to provide a path to a node at a lower level, even if it contains no data.
Suppose a program defines a node ^GLO(1,2) to contain the data value 215. It will create not only the node ^GLO(1,2), but also the top level node ^GLO to point to the next level node ^GLO(1), which in turn will point to ^GLO(1,2). ^GLO and ^GLO(1) will contain only pointers. ^GLO is the root node of the tree, which never contains data. No data storage space will be reserved for ^GLO(1) until you need it. ^GLO(1,2) will contain only data. This is illustrated in the following figure.
If later the program defines the node ^GLO(1,2,1) to be the string "JOHN DOE", ^GLO(1,2) will also contain a pointer to ^GLO(1,2,1) as well as its data value.
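The ^GLO example can be traced with a small tree model. In this Python sketch each node holds an optional value and a dictionary of children that plays the role of the pointers to the next level. The `Node` class and `set_node` function are hypothetical, and the sketch models only the logical structure, not the physical layout on disk.

```python
class Node:
    def __init__(self):
        self.has_value = False
        self.value = None
        self.children = {}   # subscript -> Node (pointers to the next level)


def set_node(root, subscripts, value):
    """Assign a value, creating pointer-only nodes along the path as needed."""
    node = root
    for sub in subscripts:
        node = node.children.setdefault(sub, Node())
    node.value = value
    node.has_value = True


glo = Node()                       # root node ^GLO
set_node(glo, (1, 2), 215)         # ^GLO(1,2)=215
glo_1 = glo.children[1]
glo_1_2 = glo_1.children[2]
print(glo_1.has_value, list(glo_1.children))   # False [2]
print(glo_1_2.value, list(glo_1_2.children))   # 215 []

set_node(glo, (1, 2, 1), "JOHN DOE")           # ^GLO(1,2,1)="JOHN DOE"
print(glo_1_2.value, list(glo_1_2.children))   # 215 [1]
```

After the second assignment, the model of ^GLO(1,2) holds both its data value and a pointer to ^GLO(1,2,1), as the text describes.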
Suppose a program defines the following nodes in a new global array:
Caché creates each of those nodes to hold the data the program assigns to them. It also creates whatever higher nodes it needs, up to the root node, to point to nodes at a lower level. It reserves storage only for the nodes that the program assigns data to.
Figure 2-2 shows the logical structure Caché creates for this global array. The global nodes with assigned values, such as the ^GLO(1,3,1) global, have storage reserved for them. The global nodes created to complete the tree structure, such as the ^GLO(2,6) global, do not have storage reserved for them.
|Physical Structure of a Caché Database|
|Globals Stored in CACHE.DAT or CACHE.EXT Files|
Globals are stored in a Caché database file called CACHE.DAT. In a multi-extent database, the first extent is called CACHE.DAT; additional extents are called CACHE.EXT. Under Caché, each host operating system directory can have a CACHE.DAT or CACHE.EXT file, so there can be more than one Caché database file available on a system. No two globals can have the same name unless they are in different Caché databases (i.e., a CACHE.DAT file and any CACHE.EXT files associated with it).
|Creating a Caché Database|
Before you can assign globals values in an application, you must create a Caché database in the directory in which your process will run. Use the Database panel of the Caché Configuration Manager, as described in the Caché Basic System Management Guide.
A Caché database file is identified by the directory in which it resides. On platforms which support multi-extent databases, this is the directory name of its primary extent. When you name a namespace, you are really referring to a Caché database. The current namespace is the namespace in which a process is running.
A global resides in a Caché database. A database resides in a particular directory on a particular system. When a global is created, it resides in the database mapped to the user's current namespace. The namespace and network configuration defines which database is mapped to the user's namespace and on which system and directory the database is located.
At any time, the system manager can make a different configuration active (e.g., if there's a hardware failure). The physical location of the global is then defined by the new configuration.
Note: Even when a user makes an extended global reference, specifying a directory and (optionally) a system name where the global resides, Caché understands this to refer to a namespace. The directory and system together are considered to comprise an implied namespace. This is a change from previous products, where setting up implicit references was a convenient option for the user.
|Simple and Extended References|
You access globals that are located in a namespace other than your current namespace in one of two ways:
To create a simple reference to the global ORDER in the namespace to which it currently has been mapped, use the following syntax:
^ORDER
The system or network manager uses the Caché Configuration Manager to define global mappings. See the Caché Basic System Management Guide for examples of global mappings to namespaces.
Typically, you use extended references when you want to specify the directory and system name where a global is located, because that global has not been mapped to a defined namespace. You can also include a defined namespace in an extended reference, when you need to override the current mapping for a global.
Caché understands the directory and system in an extended reference as an implied namespace. An implied namespace is one whose default directory maps to either:
Caché supports two forms of extended reference: bracket syntax and environment syntax.
Bracket syntax can take either of the following forms:
^[dir,sys]glob
where dir is a directory, sys is a system, and glob is a global.
^[nmsp]glob
where nmsp is a defined namespace that the global glob has not currently been mapped or replicated to, or an implied namespace.
You must include quotation marks around the directory and system names or the namespace name unless you specify them as variables. The directory and system together comprise an implied namespace.
There is a special case of the [dir,sys] syntax:
. It is interpreted to mean the directory "dir" on the local system. This has the same meaning as the namespace specification
. If dir is a variable containing the directory name, then
means the same as
The following examples use bracket syntax to access the global ORDER in the BUSINESS directory on your local machine. The directory name is an implied namespace.
Windows: The following examples use bracket syntax to access the global ORDER in the BUSINESS directory on a machine called SALES. The directory and machine name together are an implied namespace.
The following example uses bracket syntax to access the global ORDER in the defined namespace MARKETING, when ORDER is not currently mapped to MARKETING. This syntax is applicable on all systems.
The environment syntax is defined as:
env is used in one of four formats:
|empty string ("")||Current namespace on your local system|
|nmsp||Defined namespace that globalref is not currently mapped to|
|^^dir||Implied namespace whose default directory is the specified directory on your local system|
|^systemname^dir||Implied namespace whose default directory is the specified directory on the specified remote system|
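The four env formats in the table can be told apart by their leading carets. The Python sketch below classifies an env string on that basis. It is an illustration only: `classify_env` is a hypothetical helper, the directory and system names are placeholders taken from this chapter's examples, and the sketch does not validate that a directory or namespace actually exists.

```python
def classify_env(env):
    """Classify an env string into one of the four formats in the table."""
    if env == "":
        return {"kind": "current namespace"}
    if env.startswith("^^"):
        # ^^dir: implied namespace on the local system
        return {"kind": "implied namespace (local)", "directory": env[2:]}
    if env.startswith("^"):
        # ^systemname^dir: implied namespace on a remote system
        system, sep, directory = env[1:].partition("^")
        if not sep:
            raise ValueError("expected ^systemname^dir")
        return {
            "kind": "implied namespace (remote)",
            "system": system,
            "directory": directory,
        }
    return {"kind": "defined namespace", "namespace": env}


for env in ["", "MARKETING", "^^BUSINESS", "^SALES^BUSINESS"]:
    print(repr(env), classify_env(env))
```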
To access the global ORDER in your current namespace on your current system, when no mapping has been defined for ORDER, use the syntax:
This has the same effect as the simple reference ^ORDER.
The following examples use environment syntax to access the local global ORDER in the directory BUSINESS on your current system. The directory name is an implied namespace.
To access the global ORDER in the BUSINESS directory on the system pointed to by the SALES directory set, use the syntax:
The following example uses environment syntax to access the global ORDER in the defined namespace BUSINESS on your local system, when ORDER is not currently mapped to BUSINESS.
|Global Replication and Shadowing|
Global replication is the process of automatically duplicating changes to a source global. It is defined as part of the namespace. These changes are made to the same global in one or more destination, or target, directories. These destination directories can be on the local or a remote node. Any SETs or KILLs you perform on the source global replicate automatically to the same global in the replication location(s).
However, replication is not integrated with network transaction processing, nor does replication deal reliably with replication servers that are not running (listed as "Disabled" in the Caché Control Panel). InterSystems recommends that you use shadowing to replicate your data on another computer rather than the replication technique.
Shadow system journaling enables a secondary computer to maintain a "shadow" version of selected databases on a primary machine. By continually dejournaling from the primary machine to the secondary machine, shadow system journaling allows successful failover to a database which is within only a few transactions of the primary database.
For more information about shadow system journaling, see the Caché Networking Guide or the Caché Advanced System Management Guide.