Caché High Availability Guide
Using Veritas Cluster Server for Linux with Caché

Caché can be configured as an application controlled by Veritas Cluster Server (VCS) on Linux. This appendix highlights the key portions of the configuration of VCS including how to incorporate the Caché high availability agent into the controlled service. Refer to your Veritas documentation and consult with your hardware and operating system vendor(s) on all cluster configurations.

When using Caché in a high availability environment controlled by Veritas Cluster Server:
  1. Install the hardware and operating system according to your vendor recommendations for high availability, scalability and performance; see Hardware Configuration.
  2. Configure VCS with shared disks and a virtual IP (VIP). Verify that common failures are detected and the cluster continues operating; see Linux and Veritas Cluster Server.
  3. Install the VCS control scripts (online, offline, clean, monitor) and the Caché agent type definition; see Installing the VCS Caché Agent.
  4. Install Caché and your application according to the guidelines in this appendix and verify connectivity to your application through the VIP; see Installing Caché in the Cluster.
  5. Test disk failures, network failures, and system crashes, and test and understand your application’s response to such failures; see Application Considerations and Testing and Maintenance.
Hardware Configuration
Configure the hardware according to best practices for your application. In addition to adhering to the recommendations of your hardware vendor, consider the following:
Disk and Storage
Create LUNs/partitions, as required, for performance, scalability, availability and reliability. This includes using appropriate RAID levels, battery-backed and mirrored disk controller cache, multiple paths to the disk from each node of the cluster, and a partition on fast shared storage for the cluster quorum disk.
Networks/IP Addresses
Where possible, use bonded multi-NIC connections through redundant switches/routers to reduce single-points-of-failure.
Linux and Veritas Cluster Server
Prior to installing Caché and your Caché-based application, follow the recommendations described below when configuring Linux and VCS. These recommendations assume a two-node cluster where both nodes are identical. Other configurations are possible; consult with your hardware vendor and the InterSystems Worldwide Response Center (WRC) for guidance.
Linux
When configuring Linux on the nodes in the cluster, use the following guidelines:
  1. All nodes in the cluster must have identical userids/groupids (that is, the name and ID number must be identical on all nodes); this is required for Caché.
    These two users and two groups must be added and synchronized between cluster members (see the example following this list):
    1. Users
      1. Owner(s) of the instance(s) of Caché
      2. Effective user(s) assigned to each instance’s Caché jobs
    2. Groups
      1. Effective group(s) to which each instance’s Caché processes belong.
      2. Group(s) allowed to start and stop the instance(s).
  2. All volume groups required for Caché and the application are available to all nodes.
  3. Include all fully qualified public and private domain names in the hosts file on each node.
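The user and group accounts can be created explicitly with fixed numeric IDs so that they match on every node. The account names and ID numbers below are illustrative only; substitute the accounts your instance actually uses, and run the same commands on every cluster node:
    # Effective group for the instance's Caché processes and the group
    # allowed to start and stop the instance (names and IDs are examples only)
    groupadd -g 501 cacheusr
    groupadd -g 502 cachemgr

    # Instance owner and effective user for Caché jobs (names and IDs are examples only)
    useradd -u 501 -g cacheusr cacheusr
    useradd -u 502 -g cachemgr cachemgr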
Veritas Cluster Server
This document assumes Veritas Cluster Server (VCS) version 5.1 or newer. Other versions may work as well, but likely have different configuration options. Consult with Symantec/Veritas and the InterSystems Worldwide Response Center (WRC) for guidance.
In general you will follow these steps:
  1. Install and cable all hardware, disk and network.
  2. Create a cluster service group that includes the network paths and volume groups of the shared disk.
Be sure to include the entire set of volume groups, logical volumes and mount points required for Caché and the application to run. These include those mount points required for the main Caché installation location, your data files, journal files, and any other disk required for the application in use.
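For reference, such a service group combines VCS bundled resources for the disk group, the mount points, and the virtual IP. The following is a minimal main.cf sketch only; the group, resource, node, device, and address names are placeholders, and it assumes VxVM/vxfs storage (sites using LVM would use the corresponding LVM agents instead):
    group cacheprod_sg (
        SystemList = { nodeA = 0, nodeB = 1 }
        AutoStartList = { nodeA }
        )

        DiskGroup cacheprod_dg (
            DiskGroup = cachedg
            )

        Mount cacheprod_mnt (
            MountPoint = "/cacheprod"
            BlockDevice = "/dev/vx/dsk/cachedg/cachelv"
            FSType = vxfs
            FsckOpt = "-y"
            )

        NIC cacheprod_nic (
            Device = eth0
            )

        IP cacheprod_vip (
            Device = eth0
            Address = "10.0.0.100"
            NetMask = "255.255.255.0"
            )

        cacheprod_mnt requires cacheprod_dg
        cacheprod_vip requires cacheprod_nic

Define one Mount resource for each mount point that the Caché installation, databases, journals, and application require.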
Installing the VCS Caché Agent
The Caché VCS agent consists of five files and one soft link that must be installed on all servers in the cluster.
Sample Caché VCS agent scripts and type definition are included in a development install. These samples will be sufficient for most two-node cluster installations. Follow the instructions provided for copying the files to their proper locations in the cluster.
A development install is not required within the cluster itself; the files listed below can be copied from a development install outside the cluster to the cluster nodes.
Assuming a development install has been completed to the /cachesys directory, the following files are located in /cachesys/dev/cache/HAcluster/VCS/Linux/:
CacheTypes.cf
Definition of the Caché agent
clean
Script that is run if VCS cannot complete an offline or online event
monitor
Script run by VCS to check whether Caché is marked up, down, or in another state
offline
Script to take Caché down
online
Script to bring Caché up
  1. On all cluster nodes, create the directory to hold the files associated with the Caché agent:
    cd /opt/VRTSvcs/bin/
    mkdir Cache
    
  2. Create the link from the Caché agent to the VCS Script51Agent binary:
    cd /opt/VRTSvcs/bin/Cache/
    ln -s /opt/VRTSvcs/bin/Script51Agent CacheAgent
    
  3. Copy the Caché agent script files to the /opt/VRTSvcs/bin/Cache directory:
    cp <installdir>/dev/cache/HAcluster/VCS/Linux/monitor /opt/VRTSvcs/bin/Cache/
    cp <installdir>/dev/cache/HAcluster/VCS/Linux/clean /opt/VRTSvcs/bin/Cache/
    cp <installdir>/dev/cache/HAcluster/VCS/Linux/online /opt/VRTSvcs/bin/Cache/
    cp <installdir>/dev/cache/HAcluster/VCS/Linux/offline /opt/VRTSvcs/bin/Cache/
    
  4. Adjust the ownerships and permissions of the agent files:
    chown root:root /opt/VRTSvcs/bin/Cache/offline
    chown root:root /opt/VRTSvcs/bin/Cache/online
    chown root:root /opt/VRTSvcs/bin/Cache/monitor
    chown root:root /opt/VRTSvcs/bin/Cache/clean
    
    chmod 750 /opt/VRTSvcs/bin/Cache/offline
    chmod 750 /opt/VRTSvcs/bin/Cache/online
    chmod 750 /opt/VRTSvcs/bin/Cache/monitor
    chmod 750 /opt/VRTSvcs/bin/Cache/clean
    
  5. Copy the Caché agent type definition to the VCS configuration directory and adjust ownerships and permissions:
    cp <installdir>/dev/cache/HAcluster/VCS/Linux/CacheTypes.cf /etc/VRTSvcs/conf/config/
    chmod 600 /etc/VRTSvcs/conf/config/CacheTypes.cf
    chown root:root /etc/VRTSvcs/conf/config/CacheTypes.cf
    
  6. Edit your main.cf file and add the following include line at the top of the file:
    include "CacheTypes.cf"
You are now ready to install Caché in the cluster and configure VCS to control your Caché instance(s) using the Caché agent.
Installing Caché in the Cluster
After a service group has been created and configured, install Caché in the cluster using the procedures outlined below.
Note:
For information about upgrading Caché in an existing failover cluster, see Upgrading a Cluster in the “Upgrading Caché” chapter of the Caché Installation Guide.
These instructions assume that the VCS scripts have been placed in /opt/VRTSvcs/ and the configuration information in /etc/VRTSvcs/ as described in Installing the VCS Caché Agent, earlier in this appendix.
There are different procedures depending on whether you are installing only one instance of Caché or multiple instances of Caché. Installing a single instance of Caché in the cluster is common in production clusters. In development and test clusters it is common to have multiple instances of Caché controlled by the cluster software.
Note:
If it is possible that you will install multiple instances of Caché in the future, follow the procedure for multiple instances.
Installing a Single Instance of Caché
Use the following procedure to install and configure a single instance of Caché in the VCS cluster:
  1. Bring the service group online on one node. This should mount all required disks and allow for the proper installation of Caché.
    1. Check the file and directory ownerships and permissions on all mount points and subdirectories.
    2. Prepare to install Caché by reviewing the Installing Caché on UNIX® and Linux chapter of the Caché Installation Guide.
  2. Create a link from /usr/local/etc/cachesys to the shared disk. This forces the Caché registry and all supporting files to be stored on the shared disk resource you have configured as part of the service group.
    A good choice is a usr/local/etc/cachesys/ subdirectory under your installation directory.
    For example, assuming Caché is to be installed in /cacheprod/cachesys/, specify the following:
    mkdir -p /cacheprod/cachesys/usr/local/etc/cachesys
    mkdir -p /usr/local/etc/
    ln -s /cacheprod/cachesys/usr/local/etc/cachesys /usr/local/etc/cachesys
    
  3. Run Caché cinstall on the node with the mounted disks. Be sure the users and groups (either default or custom) have already been created on all nodes in the cluster, and that they all have the same UIDs and GIDs.
  4. Stop Caché and relocate the service group to the other node. Note that the service group does not yet control Caché.
  5. On the second node in the cluster, create the link in /usr/local/etc/ and the links in /usr/bin for ccontrol and csession:
    mkdir -p /usr/local/etc/
    ln -s /cacheprod/cachesys/usr/local/etc/cachesys /usr/local/etc/cachesys
    ln -s /usr/local/etc/cachesys/ccontrol /usr/bin/ccontrol
    ln -s /usr/local/etc/cachesys/csession /usr/bin/csession
    
  6. Manually start Caché using ccontrol start. Test connectivity to the cluster through the virtual IP address (VIP). Be sure the application, all interfaces, any ECP clients, and so on connect to Caché using the VIP as configured here.
  7. Be certain Caché is stopped on all nodes. Shut down VCS to prepare to add a reference to the Caché agent. Make sure the agent is installed in /opt/VRTSvcs/bin/Cache/, that the CacheTypes.cf configuration file is in /etc/VRTSvcs/conf/config/, and that ownerships and permissions match VCS requirements. See the Installing the VCS Caché Agent section of this appendix for more information about these requirements.
  8. Add the Caché agent configured to control your new instance to your cluster service group, as follows. This example assumes the instance being controlled is named CACHEPROD. See the Understanding the VCS Caché Agent Options section of this appendix for information about the Inst and CleanStop options.
    Cache cacheprod (
          Inst = CACHEPROD
          CleanStop = 0
    )
  9. The Caché resource must be configured to require the disk resource and, optionally, the IP resource (see the example following these steps).
  10. Start VCS and verify that Caché starts on the primary node.
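As an illustration of steps 8 and 9, using the placeholder resource names from the service-group sketch earlier in this appendix (cacheprod_mnt for the mount resource and cacheprod_vip for the VIP resource), the corresponding main.cf entries might look like the following; the resource names are assumptions, not requirements:
    Cache cacheprod (
          Inst = CACHEPROD
          CleanStop = 0
          )

    cacheprod requires cacheprod_mnt
    cacheprod requires cacheprod_vip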
Installing Multiple Instances of Caché
To install multiple instances of Caché, follow these steps:
  1. Bring the service group online on one node. This should mount all required disks and allow for the proper installation of Caché.
    1. Check the file and directory ownerships and permissions on all mount points and subdirectories.
    2. Prepare to install Caché by reviewing the Installing Caché on UNIX® and Linux chapter of the Caché Installation Guide.
  2. Run Caché cinstall on the node with the mounted disks. Be sure the users and groups (either default or custom) are already created on all nodes in the cluster, and that they all have the same UIDs and GIDs.
  3. The /usr/local/etc/cachesys directory and all its files must be available to all nodes at all times. To enable this, copy /usr/local/etc/cachesys from the first node you install to each node in the cluster. The following method preserves symbolic links during the copy process:
    cd /usr/local/
    rsync -av -e ssh etc root@node2:/usr/local/
    Verify that ownerships and permissions on the cachesys directory and its files are identical on all nodes.
    Note:
    In the future, keep the Caché registries on all nodes in sync using ccontrol create or ccontrol update or by copying the directory again; for example:
    ccontrol create CSHAD directory=/myshadow/ versionid=2013.1.475
  4. Stop Caché and relocate the service to the other node. Note that the service group does not yet control Caché.
  5. On the second node in the cluster, create the links in /usr/bin for ccontrol and csession, as follows:
    ln -s /usr/local/etc/cachesys/ccontrol /usr/bin/ccontrol
    ln -s /usr/local/etc/cachesys/csession /usr/bin/csession
    
  6. Manually start Caché using ccontrol start. Test connectivity to the cluster through the VIP. Be sure the application, all interfaces, any ECP clients, and so on connect to Caché using the VIP as configured here.
  7. Be certain Caché is stopped on all nodes. Shut down VCS to prepare to add a reference to the Caché agent. Make sure the agent is installed in /opt/VRTSvcs/bin/Cache/, that the CacheTypes.cf configuration file is in /etc/VRTSvcs/conf/config/, and that ownerships and permissions match VCS requirements. See the Installing the VCS Caché Agent section of this appendix for more information about these requirements.
  8. Add the Caché agent configured to control your new instance to your cluster service group, as follows. This example assumes the instance being controlled is named CACHEPROD. See the Understanding the VCS Caché Agent Options section of this appendix for information about the Inst and CleanStop options.
    Cache cacheprod (
          Inst = CACHEPROD
          CleanStop = 0
    )
  9. The Caché resource must be configured to require the disk resource and, optionally, the IP resource, as in the single-instance example above.
  10. Start VCS and verify that Caché starts on the primary node.
Notes on Adding a Second Instance of Caché to the Cluster
When you are ready to install a second instance of Caché within the same cluster, follow these additional steps:
  1. Configure VCS to add the disk and IP resources associated with the second instance of Caché.
  2. Bring VCS online so the disks are mounted on one of the nodes.
  3. Be sure the users and groups to be associated with the new instance are created and synchronized between nodes.
  4. On the node with the mounted disk, run cinstall following the procedures outlined in the "Installing Caché on UNIX® and Linux" chapter of the Caché Installation Guide.
  5. Stop Caché.
  6. Synchronize the Caché registry using the following steps:
    1. On the install node run
      ccontrol list
    2. Record the instance name, version ID and installation directory of the instance you just installed.
    3. On the other node, run the following command to create the registry entry, using the information you recorded from the recently installed instance:
      ccontrol create <instance_name> versionid=<version_ID> directory=<instance_directory>
  7. Add the Caché agent resource for this instance to your main.cf configuration file (see the sketch following these steps).
  8. The Caché resource must be configured to require the disk resource and optionally the IP resource.
  9. Start the cluster service group and verify that Caché starts.
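As an illustration, if the second instance is named CACHETEST and is controlled by its own mount and VIP resources (all names below are placeholders, and the disk and IP resource definitions are analogous to those for the first instance), the additional main.cf entries might look like this:
    Cache cachetest (
          Inst = CACHETEST
          CleanStop = 0
          )

    cachetest requires cachetest_mnt
    cachetest requires cachetest_vip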
Understanding the VCS Caché Agent Options
The VCS Caché agent has two options that can be configured as part of the resource:
Inst
Set to the name of the instance being controlled by this resource (there is no default).
CleanStop
Set to 1 for a ccontrol stop or 0 for an immediate ccontrol force.
CleanStop determines the behavior of Caché when VCS attempts to take the resource offline. When CleanStop is set to 1, Caché first uses ccontrol stop. When CleanStop is set to 0, Caché immediately uses ccontrol force. Consider the following consequences when deciding how to set this option:
ccontrol stop (CleanStop = 1)
Waits for processes to end cleanly, potentially delaying the stop, especially when some processes are unresponsive due to a hardware failure or fault. This setting can significantly lengthen time-to-recovery.
ccontrol force (CleanStop = 0)
Because it does not wait for processes to end, ccontrol force dramatically decreases time-to-recovery in most failovers caused by hardware failures or faults. However, while ccontrol force fully protects the structural integrity of the databases, it may result in transaction rollbacks at startup. This can lengthen the time required to restart Caché, especially if long transactions are involved.
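If you decide to change the setting later, the attribute can be modified from the VCS command line; cacheprod is the placeholder resource name used elsewhere in this appendix:
    # Make the cluster configuration writable, change the attribute, then save it
    haconf -makerw
    hares -modify cacheprod CleanStop 1
    haconf -dump -makero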
If a controlled failover is to occur, such as during routine maintenance, follow these steps:
  1. Notify users and disconnect their sessions, and stop batch and background jobs.
  2. Stop Caché from the command line using ccontrol stop <instance_name>.
  3. Fail over the cluster service group (see the example following these steps).
Even if CleanStop is set to 0, the ccontrol force command issued when the cluster service stops has no effect, because Caché is already cleanly stopped: all transactions were rolled back by the command-line ccontrol stop before processes were halted.
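Assuming an instance named CACHEPROD controlled in a service group named cacheprod_sg (both placeholders), the controlled failover described above might look like this from the command line:
    # Cleanly stop Caché outside of VCS control; transactions are rolled back here
    ccontrol stop CACHEPROD

    # Switch the service group to the other node; the agent's ccontrol force
    # issued during the offline has no effect because the instance is already down
    hagrp -switch cacheprod_sg -to nodeB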
Application Considerations
Consider the following for your applications:
Testing and Maintenance
Upon first setting up the cluster, be sure to test that failover works as planned. This also applies any time changes are made to the operating system, its installed packages, the disk, the network, Caché, or your application.
In addition to the topics described in this section, you should contact the InterSystems Worldwide Response Center (WRC) for assistance when planning and configuring your Veritas Cluster Server resource to control Caché. The WRC can check for any updates to the Caché agent, as well as discuss failover and HA strategies with you.
Failure Testing
Typical full-scale testing must go beyond a controlled service relocation. While service relocation testing is necessary to validate that the package configuration and the service scripts are all functioning properly, you should also test responses to simulated failures, such as disk failures, network failures, and system crashes.
Testing should include a simulated or real application load. Testing with an application load builds confidence that the application will recover in the event of an actual failure.
If possible, test with a heavy disk write load; during heavy disk writes the database is at its most vulnerable. Caché handles all recovery automatically using its CACHE.WIJ and journal files, but testing a crash during an active disk write ensures that all file systems and disk devices fail over properly.
Software and Firmware Updates
Keep software patches and firmware revisions up to date. Avoid known problems by adhering to a patch and update schedule.
Monitor Logs
Keep an eye on the VCS logs in /var/VRTSvcs/log/. The Caché agent logs time-stamped information to the “engine” log during cluster events. To troubleshoot any problems, search for the Caché agent error code 60022.
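For example, to look for recent messages containing the agent error code (engine_A.log is the usual default engine log name; adjust the path and name for your site):
    grep 60022 /var/VRTSvcs/log/engine_A.log | tail -n 20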
Use the Caché console log, the Caché Monitor and the Caché System Monitor to be alerted to problems with the database that may not be caught by the cluster software. (See the chapters Monitoring Caché Using the Management Portal, Using the Caché Monitor and Using the Caché System Monitor in the Caché Monitoring Guide for information about these tools.)