Caché System Administration Guide
Caché Cluster Management
This chapter contains information about cluster management in Caché. Caché clusters can be configured on OpenVMS platforms, and this chapter covers cluster management for OpenVMS.
Overview of Caché Clusters
Caché systems may be configured as a cluster. Cluster configurations provide special benefits to their users:
Users can invisibly share disk storage and printers or maintain private access to these resources.
Cluster software can be configured to search for the least used resource, maximizing usage of resources while simultaneously increasing throughput.
In a cluster environment, each computer executes its own copy of software.
A Caché cluster is identified by its pre-image journal (PIJ) directory. Nodes that specify the same PIJ directory are all part of a cluster. A cluster session begins when the first cluster node starts and ends when the last cluster node shuts down.
You can customize the networking capabilities of Caché to allow for cluster failover: if one computer in the cluster goes down, the remaining members continue to function without database degradation. Cluster members can share databases; you can connect the computers in a cluster in the following ways:
Special purpose hardware, such as Memory Channels and Gigabit Ethernet, for high speed communication
Ethernet cables, for lower cost
A combination of the above
The functionality provided is the same, regardless of which connection mechanisms are used.
All instances of Caché in a cluster must have the same owner, security settings, and file ownership settings because they share databases and journals.
The following are system specifications for a cluster configuration:
Maximum number of cluster nodes in a cluster: 14
Maximum number of cluster-mounted databases: approximately 512
The first node running Caché that joins the cluster by attempting to mount a database in cluster mode becomes the cluster master. The cluster master performs the following functions:
Acts as a lock server to all cluster-mounted databases
Coordinates write image journaling cluster-wide
If the cluster master fails or shuts down, the next node that joined the cluster becomes the cluster master and assumes these functions.
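The succession rule described above can be sketched as a toy model. This is an illustration in Python, not Caché code: the Cluster class and the node names NODEA, NODEB, and NODEC are hypothetical, and the sketch assumes only what the text states, namely that the master is the earliest-joined node still in the cluster.

```python
class Cluster:
    """Toy model of cluster membership (hypothetical, not a Caché API)."""

    def __init__(self):
        # Nodes are kept in the order in which they joined the cluster.
        self.members = []

    def join(self, node):
        """A node joins when it first cluster-mounts a database."""
        if node not in self.members:
            self.members.append(node)

    def leave(self, node):
        """A node leaves when it fails or shuts down."""
        if node in self.members:
            self.members.remove(node)

    @property
    def master(self):
        """The earliest-joined surviving node acts as cluster master."""
        return self.members[0] if self.members else None


cluster = Cluster()
for name in ("NODEA", "NODEB", "NODEC"):
    cluster.join(name)

first_master = cluster.master   # NODEA joined first
cluster.leave("NODEA")          # the master fails or shuts down
next_master = cluster.master    # the next node that joined takes over
```

In this model the cluster session ends when the member list becomes empty, which matches the definition of a cluster session given earlier.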
A node joins a cluster when it starts its ENQ daemon system process (ENQDMN). Caché activates this process the first time a node attempts to cluster-mount a database. At the same time, it also creates the Recovery daemon (RECOVERY) on a node to manage cluster failover. Caché only creates the ENQDMN system processes on systems that join a cluster.
Cluster Master as Lock Server
The cluster master acts as a lock server by managing access to the cluster-mounted database (CACHE.DAT) files. Applications that run in a cluster must have mechanisms to coordinate access from multiple cluster nodes to cluster-mounted databases. Caché accomplishes this at two levels:
Caché manages block-level access to shared databases on disk for Caché applications running in a cluster environment. It prevents one node from reading or modifying a block from a disk which is simultaneously being changed in the memory of another node. Multiple nodes can read the same block, but only one can update it at a time.
Caché manages these simultaneous access requests at the block level with the Distributed Lock Manager (DLM), using the ENQ daemon (ENQDMN).
Caché ObjectScript Level Locks
While each cluster member can directly access clustered databases, no member can independently process Caché ObjectScript Lock commands for clustered databases. The cluster master acts as a lock server by coordinating all Caché ObjectScript Lock requests to maintain the logical integrity of the cluster-mounted database.
Caché servers communicate these requests to the cluster master via network connections. Thus, ECP must be running on each computer that is participating in the cluster. Even if an application issues a Lock command to a global using the extended bracket syntax of [dir_name,dirset_name], or via a namespace mapped to a cluster-mounted database, the cluster master processes the command.
If you need to coordinate multiple global updates, you must use the Lock command when updating globals in cluster-mounted databases. Caché journaling technology uses lock information to coordinate updates to these databases so that journal restores work correctly in the event of cluster failover or recovery after a cluster crash.
Configuring a Caché Cluster
Multiple Network Device Configuration
If your network configuration contains multiple network devices, you must make sure that each cluster node is identified to every other cluster node.
For communication within the cluster, enter the host name or private IP address in the CommIPAddress field; Caché converts the IP address and stores the machine name in the PIJ file. If you use the node name, it must resolve to the same network segment on all cluster nodes.
For communication with clients that are not part of the cluster, enter the public IP address of the node in the CliSysName field using the following procedure:
Enter a value for CliSysName, the node name for this Caché server. The name you enter here is recorded in the PIJ.
Restart Caché for this information to take effect.
Managing Cluster Databases
The following sections provide information on database management within a cluster-networked system:
Creating Caché Database Files
To share a database across a cluster, all CACHE.DAT database files must be created on disks that are cluster-accessible at the system level. When you create a new CACHE.DAT file, enter the device name and the directory name where you wish to create it.
On an OpenVMS cluster, the device portion contains a controller name, regardless of whether the CACHE.DAT directories are cluster mounted.
For disks that are physically connected to one node in the cluster, the controller name is the same as the System Communications Services (SCS) node name of the computer serving the disk, and the parts are separated by a dollar sign. For example: DKA100, if physically served by node TEST, is known as TEST$DKA100:. Caché expands DKA100:
If the disk is served by an independent controller array, it has a number and is both preceded and separated by dollar signs. For example: DKA100: on cluster controller 1 is $1$DKA100:
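The two naming rules above can be illustrated with a short sketch. The Python helper names served_by_node and served_by_controller are hypothetical and exist only for this example; they simply compose strings in the way the text describes and do not call any Caché or OpenVMS facility.

```python
def served_by_node(scs_node, device):
    """Cluster-wide name of a disk physically served by one node.

    The SCS node name and the device are separated by a dollar sign.
    """
    return f"{scs_node}${device.rstrip(':')}:"


def served_by_controller(controller_number, device):
    """Cluster-wide name of a disk served by an independent controller array.

    The controller number is both preceded and followed by a dollar sign.
    """
    return f"${controller_number}${device.rstrip(':')}:"


node_name = served_by_node("TEST", "DKA100")
controller_name = served_by_controller(1, "DKA100:")
```

With the values from the text, node_name is TEST$DKA100: and controller_name is $1$DKA100:.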
When a database is initially created, it is mounted privately on the system on which it is created.
Databases on clusters can be mounted either privately to the instance or cluster mounted so that other instances can share the data in it. If you wish to cluster mount a database after creating it, use the following procedure:
In the appropriate database row, if the database is mounted, click Dismount, then click Mount; if it is not mounted, click Mount.
In the Mount screen that is displayed, click the Clustered
InterSystems recommends that you set your system to mount clustered databases at system startup. To mark cluster databases to be mounted at startup:
in the appropriate database row.
You can add an existing cluster-mounted database that is mounted on a different system to your configuration, as follows:
Add the database definition to the [Databases] section of the CPF file; you can copy the database definition from the system where the database is already cluster mounted.
Load the updated CPF file and activate it, as follows:
Deleting a Cluster-Mounted Database
You cannot delete a cluster-mounted database (CACHE.DAT) file. If you attempt to delete it, you see the following message:
## ERROR while Deleting. Cannot delete a cluster-mounted database
You must dismount or privately mount the database before you can delete it.
Once the network configuration is determined, the Caché startup procedure does the following:
Performs network initialization operations, including activation of the network daemons.
If Caché detects critical errors during the startup procedure, such as problems with the Caché Parameter File, it logs the error in the console log (cconsole.log) and shuts down automatically.
Caché displays information about each database it mounts. For example:
Directory                      Mode
VMS1$DKA0:[SYSM.V6D1-9206A]    Pvt
VMS$DKA0:[DIR2]                Clu
If mount error conditions occur, they are reported to the terminal and the cconsole.log. If the ENQ daemon fails to start, see the cconsole.log.
The first node to activate its ENQ daemon by cluster-mounting a Caché database becomes the cluster master for each cluster member. Normally, you include all cluster-mounted databases in the Database Mount List and they are mounted at startup.
Startup pauses with a message if you attempt to join a cluster during cluster failover.
Write Image Journaling and Clusters
Caché write image journaling allows remaining cluster members to continue to function without database degradation or data loss if one cluster member goes down.
In the cluster environment, the Write daemon on the first node to cluster-mount a database becomes the master Write daemon for the cluster; it creates the cluster-wide journal file, named CACHE.PIJ. In addition, each node, including the master, has its own image journal file called CACHE.PIJxxx.
In a cluster environment, writes throughout the entire cluster freeze until the cause of the freeze is fixed.
For privately mounted databases in a cluster, backups and journaling are the daily operations that allow you to recreate your database. In the event of a system failure that renders your database inaccessible, you can restore the backups and apply the changes in the journal to recreate it.
Always run a backup for the cluster mounted databases from the same machine in a cluster so the backup history is complete. Caché stores this information in a global in the manager’s database.
If you are doing a full backup on a database that is mounted in cluster mode from multiple computers, always perform the backup from the same computer. This maintains an accurate backup history for the database.
The BACKUP utility permits you to back up and restore databases that are shared by multiple CPUs in a cluster environment.
For cluster-mounted databases, InterSystems recommends another backup strategy, such as volume shadowing. Concurrent backup also works with clusters.
All databases must be mounted before you can back them up. The backup utility mounts any databases needed for the backup. It first tries to mount them privately; if that action fails, it mounts them for clustered access. If a private mount fails and the system is not part of the cluster, or if the cluster mount fails, then you cannot back up the database. You receive an error message, and can choose whether to continue or stop.
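The mount fallback described above can be sketched as a small decision function. This is a Python illustration under stated assumptions, not the BACKUP utility's actual code: mount_for_backup, try_private, and try_cluster are hypothetical names, and the two callables stand in for whatever mount attempts the utility really makes.

```python
def mount_for_backup(try_private, try_cluster, in_cluster):
    """Return how a database ends up mounted for backup, or None.

    try_private and try_cluster are callables that return True when
    the corresponding mount attempt succeeds (hypothetical stand-ins).
    """
    # The private mount is always attempted first.
    if try_private():
        return "private"
    # A cluster mount is attempted only on a system that is part of the cluster.
    if in_cluster and try_cluster():
        return "cluster"
    # Neither mount worked: the database cannot be backed up.
    return None


# A database already cluster-mounted elsewhere: the private mount fails,
# so the sketch falls back to a cluster mount.
result = mount_for_backup(lambda: False, lambda: True, in_cluster=True)
```

A return value of None corresponds to the case in which you receive an error message and choose whether to continue or stop.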
When backing up cluster-mounted databases, BACKUP must wait for all activity in the cluster to cease before it continues. For this reason, clustered systems may be suspended slightly longer during the various passes than when you back up a single node.
The DBSIZE utility gives you the option of suspending the system while it makes its calculation. It also lets you suspend the cluster if any of the databases in the backup list is cluster-mounted when the calculation takes place.
The incremental backup software uses a Lock to prevent multiple backups from occurring at the same time. This method does not work across a cluster. You must ensure that only one backup at a time runs throughout an entire cluster whose members share the same database.
The DBSIZE utility uses the same internal structures as the BACKUP utility, and DBSIZE tests the lock used by BACKUP. However, the same restriction applies: do not run DBSIZE on one cluster member while another cluster member is running a backup. Otherwise, the backup will not be intact, and database degradation may result when you restore from that backup.
System Design Issues for Clusters
Please be aware of the following design issues when configuring your Caché cluster system.
Determining Database File Availability
To mount the database files so that they function most efficiently in the cluster, determine which CACHE.DAT files need to be available to all users in the cluster. Mount these in cluster mode from within Caché. All WIJ and journal files must be on cluster-mounted disks.
Also determine which CACHE.DAT files are needed only by users on one cluster node. Mount these privately or specify that they are automatic, in which case the system mounts them on reference.
Cluster Application Development Strategies
The key to performance in a cluster environment is to minimize disk contention among nodes for blocks in cluster-mounted directories.
If a directory is cluster-mounted, all computers can access data in it with a simple reference. More than one computer can access a given block in the database at a time to read its data.
However, if a computer wants to update a block, all other computers must first relinquish the block. If another computer wants to access that block prior to the completion of the Write daemon cycle, the computer that did the update must first write the changed block to disk (in such a way that the action can be reversed if that computer goes down). The other computers can again read that block until one of them wants to modify it.
If there is a great deal of modification done to a database from all cluster members, a significant amount of time-consuming I/O processing occurs to make sure each member sees the most recent copy of a block.
You can use various strategies to minimize the amount of disk I/O when a particular database is modified by multiple computers.
Mount Directories Privately
If a database is not used frequently by other nodes, mount the database privately from the node which uses it most frequently. When other nodes need to access it, they can use a remote network reference.
Use Local Storage of Counters
Contention for disk blocks is most common in the case of updating counters. To minimize this problem, code your applications so that groups of counters (for example, 10) are allocated per remote request to a local private directory. Thereafter, whenever a local process needs a new counter index number, it first checks the private directory to see if one of the ten is available. If not, it then goes to allocate a new set of ten counters from the clustered directory. You can use $INCREMENT to update counters and retrieve the value in a single operation.
This is also a good strategy for nonclustered networked systems.
In addition to reducing contention when accessing counters, this technique also enhances access of records that use those counters. Since a system obtains contiguous counters, block splitting combined with the Caché collating sequence causes records created by different nodes to be located in different areas of the database. Therefore, processes on different nodes perform their Set operations into different blocks and are no longer in contention, thus reducing disk I/O.
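The counter strategy in this section can be sketched as follows. This is a Python model rather than Caché ObjectScript: the CounterAllocator class is hypothetical, a plain dictionary stands in for the counter stored in the clustered directory, and a deque stands in for the local private directory. A real application would update the shared counter atomically (for example, with $INCREMENT), which this single-process model does not need to do.

```python
from collections import deque


class CounterAllocator:
    """Hands out counters from a locally cached batch (hypothetical model)."""

    def __init__(self, shared, batch_size=10):
        self.shared = shared          # stands in for the clustered directory
        self.batch_size = batch_size  # counters reserved per remote request
        self.local = deque()          # stands in for the local private directory
        self.remote_requests = 0      # how often the shared counter was touched

    def next_id(self):
        """Return the next counter, reserving a new batch only when needed."""
        if not self.local:
            # Reserve a contiguous batch from the shared counter.
            self.remote_requests += 1
            start = self.shared["next"]
            self.shared["next"] = start + self.batch_size
            self.local.extend(range(start, start + self.batch_size))
        return self.local.popleft()


shared = {"next": 1}
node_a = CounterAllocator(shared)
node_b = CounterAllocator(shared)
ids_a = [node_a.next_id() for _ in range(3)]
ids_b = [node_b.next_id() for _ in range(3)]
```

Each node touches the shared counter once per ten counters in this sketch, and the two nodes receive disjoint, contiguous ranges, which is what keeps their records in different areas of the database.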
Caché ObjectScript Language Features
The following sections provide information about Caché ObjectScript language features with implications for cluster mounted database systems. Caché ObjectScript is a superset of the ISO 11756-1999 standard M programming language. If you are an M programmer, you can run your existing M applications on Caché with no change.
Remote Caché ObjectScript Locks
The following information details remote locks under Caché ObjectScript, with respect to a cluster environment.
Information about remote locks is stored in two places:
In the Lock Table on the system requesting the lock (the client)
In the Lock Table on the system to which the lock request is directed (the server)
The server for items in a cluster-mounted database is always the Lock Server (Cluster Master).
When a process on the client system needs to acquire a remote lock, it first checks to see if an entry is already present in the client lock table, indicating that another process on that same computer already has a lock on the remote item.
If a lock already exists for the desired global, the process queues up for that lock, just as it would for a local lock. No network transmissions are required.
If the needed remote lock is not present in the client’s lock table, the client process creates an entry in the local lock table and sends a network request for it.
If the reference resolves to a careted lock in a cluster-mounted database, the lock request is automatically sent to the Cluster Master.
Once the client process receives the lock acknowledgment from the remote computer, two entries exist: an entry in the client lock table identifying the process that made the lock, and an entry in the server's lock table identifying the remote computer (but not the process) that made the lock.
If any of the network requests fail, the client process must remove all the locks from the local lock table. It must also send network unlock requests for any network locks it actually acquired when locking multiple items at one time.
When a process has completed its update it issues an UNLOCK command. If it is an incremental unlock, it is handled in the local lock table. If it is the last incremental unlock, or if it is not an incremental unlock, then an unlock request is sent to the server.
If another process on the local machine has queued for the lock, rather than releasing the lock on the server, Caché may grant it to the waiting process. This is called lock conversion.
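The client-side behavior described in this section can be sketched as a small model. This is a Python illustration, not Caché internals: the ClientLockTable class, the process IDs, and the global name ^ORDERS are hypothetical, the server is assumed to grant every request, incremental locks are not modeled, and the sketch always performs lock conversion although the text says only that Caché may do so.

```python
from collections import deque


class ClientLockTable:
    """Toy model of the client lock table for remote locks (hypothetical)."""

    def __init__(self):
        self.entries = {}          # lock name -> {"owner": pid, "waiters": deque}
        self.network_requests = 0  # lock and unlock messages sent to the server

    def lock(self, name, pid):
        """Return True if pid now holds the lock, False if it was queued."""
        entry = self.entries.get(name)
        if entry is None:
            # No local entry: create one and send a request to the server.
            self.network_requests += 1
            self.entries[name] = {"owner": pid, "waiters": deque()}
            return True
        # Another local process holds the lock: queue without network traffic.
        entry["waiters"].append(pid)
        return False

    def unlock(self, name, pid):
        """Release the lock, converting it to a local waiter if there is one."""
        entry = self.entries[name]
        if entry["owner"] != pid:
            raise ValueError("only the owner can unlock")
        if entry["waiters"]:
            # Lock conversion: the server-side lock is kept and handed over.
            entry["owner"] = entry["waiters"].popleft()
        else:
            # No local waiter: send the unlock request to the server.
            self.network_requests += 1
            del self.entries[name]


table = ClientLockTable()
got_first = table.lock("^ORDERS", 101)    # sends one network request
got_second = table.lock("^ORDERS", 202)   # queued locally, no network request
table.unlock("^ORDERS", 101)              # converted to process 202
requests_after_conversion = table.network_requests
table.unlock("^ORDERS", 202)              # unlock request sent to the server
```

In this run only two network messages are sent for two lock and unlock pairs, which illustrates why local queuing and lock conversion reduce traffic to the lock server.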
Remote Lock Commands by Extended Reference
All extended references used in remote lock commands should use the same directory specification. This includes consistency between uppercase and lowercase. For example, VMS2$SYS is not equivalent to vms2$sys.
If you use logicals, all processes and applications must use the same logical name, not just resolve to the same physical directory name. In addition, logicals must be defined the same way on all cluster members as well as by all processes running on each member. System managers and applications developers need to work together to maintain consistency.
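Because directory specifications that differ only in case are treated as different, it can help to audit the specifications an application uses. The following Python sketch is one hypothetical way to do that; the function name case_conflicts and the sample list are assumptions for illustration, and the check covers letter case only, not logical name definitions.

```python
def case_conflicts(specs):
    """Return groups of directory specifications that differ only by case."""
    groups = {}
    for spec in specs:
        # Specifications that match when uppercased belong to the same group.
        groups.setdefault(spec.upper(), set()).add(spec)
    # Keep only the groups that contain more than one distinct spelling.
    return {key: sorted(found) for key, found in groups.items() if len(found) > 1}


used = ["VMS2$SYS", "vms2$sys", "VMS2$SYS", "VMS1$DATA"]
conflicts = case_conflicts(used)
```

Any entry in conflicts points to lock commands that look equivalent to a reader but would not refer to the same lock.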
This limitation is consistent with the ANSI standard regarding Caché ObjectScript locks and remote reference syntax.
In a cluster, references to remote globals on cluster-mounted databases can be made as a simple reference. However, certain techniques you may want to use to minimize disk contention require the use of extended reference.