Using Pacemaker-based Red Hat Enterprise Linux HA with InterSystems IRIS


Red Hat Enterprise Linux (RHEL) release 7 includes Pacemaker as its high availability service management component. The InterSystems rgmanager/cman-based agent used with previous versions of RHEL cannot be used with Pacemaker-based clusters. This appendix explains how to use the InterSystems IRIS™ Open Cluster Framework (OCF) based resource agent to configure InterSystems IRIS as a resource controlled by RHEL 7 with the Pacemaker-based High Availability Add-On (HA).
The procedures here highlight the key portions of the configuration of RHEL HA, including how to incorporate the InterSystems IRIS OCF-based resource agent into the cluster. Refer to your Red Hat documentation and consult with your hardware and operating system vendors on all cluster configurations.
When using InterSystems IRIS in a high availability environment controlled by RHEL HA:
  1. Install the hardware and operating system according to your vendor recommendations for high availability, scalability and performance; for more information, see Hardware Configuration.
  2. Configure RHEL HA with shared disks and a virtual IP address (VIP), and verify that common failures are detected and the cluster continues operating; see Configuring Red Hat Enterprise Linux HA for more information.
  3. Install the InterSystems IRIS OCF-based resource agent script according to the information in Installing the InterSystems IRIS OCF-based Resource Agent.
  4. Install InterSystems IRIS and your application according to the guidelines in this appendix and verify connectivity to your application through the VIP; for more information, see Installing InterSystems IRIS in the Cluster.
  5. Test disk failures, network failures, and system crashes, and test and understand your application’s response to such failures; for more information, see Application Considerations and Testing and Maintenance.
Hardware Configuration
Configure the hardware according to best practices for your application. In addition to adhering to the recommendations of your hardware vendor, consider the following:
Configuring Red Hat Enterprise Linux HA
Prior to installing InterSystems IRIS and your InterSystems IRIS-based application, follow the recommendations in this section when configuring RHEL 7. These recommendations assume a cluster of two identical nodes. Other configurations are possible; consult with your hardware vendor and the InterSystems Worldwide Response Center (WRC) for guidance.
Red Hat Enterprise Linux
When configuring Linux on the nodes in the cluster, use the following guidelines:
Red Hat Enterprise Linux High Availability Add-On
This document assumes RHEL 7 or later, with the HA Add-On using Pacemaker as the high availability service management component. The script and directions here apply only to Pacemaker-based HA. RHEL 6.5 includes Pacemaker-based HA and may work as well; consult with Red Hat and the InterSystems Worldwide Response Center (WRC) for guidance.
In general, you will follow these steps:
  1. Install and cable all hardware, disk and network.
  2. Configure STONITH fencing resources.
  3. Create VIP and disk resources (file system, LVM, perhaps CLVM) that include the network paths and volume groups of the shared disk.
Be sure to include the entire set of volume groups, logical volumes and mount points required for InterSystems IRIS and the application to run. These include those mount points required for the main InterSystems IRIS installation location, your data files, journal files, and any other disk required for the application in use.
With the move to Pacemaker, the RHEL HA Add-on no longer supports qdisk or any quorum disk. With two-node clusters, therefore, a robust STONITH configuration is especially important to avoid a partitioned or unsynchronized cluster. Consult with Red Hat on other possibilities such as adding a third-node for quorum purposes only.
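As a rough sketch of steps 2 and 3, the following pcs commands show one possible configuration, assuming IPMI-based fencing and an exclusively activated LVM volume group on the shared disk. All node names, addresses, credentials, device paths, and resource names are placeholders; substitute the values for your environment and use the fence agent appropriate for your hardware.
    # Step 2: one fence device per node (fence_ipmilan is only an example)
    pcs stonith create fence-node1 fence_ipmilan pcmk_host_list="node1" ipaddr="node1-ipmi" login="admin" passwd="secret"
    pcs stonith create fence-node2 fence_ipmilan pcmk_host_list="node2" ipaddr="node2-ipmi" login="admin" passwd="secret"
    # Step 3: VIP and disk resources, grouped so they always move together
    pcs resource create cacheprod_vip ocf:heartbeat:IPaddr2 ip=10.0.0.100 cidr_netmask=24 --group cacheprod_grp
    pcs resource create cacheprod_vg ocf:heartbeat:LVM volgrpname=cacheprodvg exclusive=true --group cacheprod_grp
    pcs resource create cacheprod_fs ocf:heartbeat:Filesystem device=/dev/cacheprodvg/cacheprodlv directory=/cacheprod fstype=xfs --group cacheprod_grp
Repeat the Filesystem resource for every volume group, logical volume, and mount point that InterSystems IRIS and the application require.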
Installing the InterSystems IRIS OCF-based Resource Agent
The InterSystems IRIS OCF-based resource agent consists of one file that must be installed on all nodes in the cluster that will run InterSystems IRIS resources.
A sample InterSystems IRIS agent script is included in a development installation of InterSystems IRIS, or in the InterSystems IRIS distribution kit in the dist/dev/cache/HAcluster/RedHat directory. This sample is sufficient for most cluster installations without modification. No development installation is required; simply follow the instructions provided here for copying the InterSystems IRIS agent script file to its proper locations in the cluster.
After a development installation, the agent script file, CacheOCFagent, is located in installdir/dev/cache/HAcluster/RedHat.
To copy the agent script, do the following:
  1. Copy the script to the /usr/lib/ocf/resource.d/heartbeat/ directory on all cluster nodes, changing the name of the file to Cache, as follows:
    cp installdir/dev/cache/HAcluster/RedHat/CacheOCFagent /usr/lib/ocf/resource.d/heartbeat/Cache
  2. Adjust the ownerships and permissions of the agent file on each node:
    chown root:root /usr/lib/ocf/resource.d/heartbeat/Cache 
    chmod 555 /usr/lib/ocf/resource.d/heartbeat/Cache
You are now ready to install InterSystems IRIS in the cluster and configure RHEL HA to control your InterSystems IRIS instance(s) using the InterSystems IRIS agent.
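To confirm the installation, you can check the agent file and verify that Pacemaker sees it; this is an optional sanity check, not part of the InterSystems instructions:
    # On each node, confirm the agent's ownership and permissions
    ls -l /usr/lib/ocf/resource.d/heartbeat/Cache
    # From any node, confirm the Cache agent appears among the OCF heartbeat agents
    pcs resource agents ocf:heartbeat | grep Cache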
Installing InterSystems IRIS in the Cluster
After a resource group has been created and configured, install InterSystems IRIS in the cluster using the procedures outlined in this section. These instructions assume that the InterSystems IRIS resource script has been properly located as described in the previous section, Installing the InterSystems IRIS OCF-based Resource Agent.
Procedures differ depending on whether you are installing one instance of InterSystems IRIS or multiple instances. Installing a single instance is common in clusters dedicated to the production instance; in development and test clusters, it is common to have multiple instances controlled by the cluster software. If it is possible that you will install multiple instances in the future, follow the procedure for multiple instances.
Note:
For information about upgrading InterSystems IRIS in an existing failover cluster, see Upgrading a Cluster in the “Upgrading InterSystems IRIS” chapter of the Installation Guide.
Installing a Single Instance of InterSystems IRIS
To install a single instance of InterSystems IRIS in the cluster, use the following procedure.
Note:
If any InterSystems IRIS instance that is part of a failover cluster is to be added to an InterSystems IRIS mirror, you must use the procedure described in Installing Multiple Instances of InterSystems IRIS, rather than the procedure in this section.
  1. Bring the resource group online on one node. This should mount all required disks and allow for the proper installation of InterSystems IRIS.
    1. Check the file and directory ownerships and permissions on all mount points and subdirectories.
    2. Prepare to install InterSystems IRIS by reviewing the Installing InterSystems IRIS on UNIX® and Linux chapter of the Installation Guide.
  2. Create a link from /usr/local/etc/cachesys to the shared disk. This forces the InterSystems IRIS registry and all supporting files to be stored on the shared disk resource you have configured as part of the resource group.
    A good choice is to use a ./usr/local/etc/cachesys/ subdirectory under your installation directory. For example, assuming InterSystems IRIS is to be installed in /cacheprod/cachesys/, specify the following:
    mkdir -p /cacheprod/cachesys/usr/local/etc/cachesys
    mkdir -p /usr/local/etc/
    ln -s /cacheprod/cachesys/usr/local/etc/cachesys /usr/local/etc/cachesys
  3. Run InterSystems IRIS cinstall on the node with the mounted disks. Be sure the users and groups (either default or custom) are available on all nodes in the cluster, and that they all have the same UIDs and GIDs.
  4. Stop InterSystems IRIS and relocate all resources to the other node. Note that Pacemaker does not yet control InterSystems IRIS.
  5. On the second node in the cluster, create the link in /usr/local/etc/ and the links in /usr/bin for ccontrol and csession:
    mkdir -p /usr/local/etc/
    ln -s /cacheprod/cachesys/usr/local/etc/cachesys /usr/local/etc/cachesys
    ln -s /usr/local/etc/cachesys/ccontrol /usr/bin/ccontrol
    ln -s /usr/local/etc/cachesys/csession /usr/bin/csession
  6. Manually start InterSystems IRIS using ccontrol start. Test connectivity to the cluster through the VIP. Be sure the application, all interfaces, any ECP clients, and so on connect to InterSystems IRIS using the VIP as configured here.
  7. Be certain InterSystems IRIS is stopped on all nodes.
  8. Add the InterSystems IRIS resource configured to control your new instance to your cluster, as follows. This example assumes the instance being controlled is named CACHEPROD. See Understanding the Parameters of the InterSystems IRIS Resource for information about the Instance and cleanstop options.
    pcs resource create CacheProd ocf:heartbeat:Cache Instance=CACHEPROD cleanstop=1
  9. The InterSystems IRIS resource (CacheProd) must be colocated with, and ordered to start after, its disk resources and optionally its VIP resource(s). To prevent unexpected stops and restarts, InterSystems IRIS and its colocated resources should be configured to prefer their current location (resource-stickiness=INFINITY) and thus never fail back after a node reboots; see the sketch following this procedure.
  10. Verify that InterSystems IRIS starts in the cluster.
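The pcs commands below are a minimal sketch of the constraints described in step 9, assuming the disk and VIP resources were created in a group named cacheprod_grp (as in the earlier configuration sketch); adjust the resource and group names to your configuration.
    # Keep CacheProd on the same node as its disk/VIP group, and start it after the group
    pcs constraint colocation add CacheProd with cacheprod_grp INFINITY
    pcs constraint order cacheprod_grp then CacheProd
    # Prefer the current location so resources never fail back automatically
    pcs resource meta CacheProd resource-stickiness=INFINITY
Alternatively, adding CacheProd to the end of the same resource group (pcs resource group add cacheprod_grp CacheProd) implies both the colocation and the ordering.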
Installing Multiple Instances of InterSystems IRIS
To install multiple instances of InterSystems IRIS, use the procedure in this section.
Note:
If any InterSystems IRIS instance that is part of a failover cluster is to be added to an InterSystems IRIS mirror, the ISCAgent (which is installed with InterSystems IRIS) must be properly configured; see Configuring the ISCAgent in the “Mirroring” chapter of this guide for more information.
To install InterSystems IRIS on the first node, do the following:
  1. Bring the resource group online on one node. This should mount all required disks and allow for the proper installation of InterSystems IRIS.
    1. Check the file and directory ownerships and permissions on all mount points and subdirectories.
    2. Prepare to install InterSystems IRIS by reviewing the Installing InterSystems IRIS on UNIX® and Linux chapter of the Installation Guide.
  2. Run InterSystems IRIS cinstall on the node with the mounted disks. Be sure the users and groups (either default or custom) are available on all nodes in the cluster, and that they all have the same UIDs and GIDs.
  3. The /usr/local/etc/cachesys directory and all its files must be available to all nodes at all times. To enable this, after InterSystems IRIS is installed on the first node, copy /usr/local/etc/cachesys to each node in the cluster. The following method preserves symbolic links during the copy process:
    cd /usr/local
    rsync -av -e ssh etc root@node2:/usr/local/
  4. Verify that the ownerships and permissions on the cachesys directory and its files are identical on all nodes.
    Note:
    In the future, keep the InterSystems IRIS registries on all nodes in sync using ccontrol create or ccontrol update, or by copying the directory again; for example:
    ccontrol create CSHAD directory=/myshadow/ versionid=2018.1.475
  5. Stop InterSystems IRIS and relocate all resources to the other node. Note that Pacemaker does not yet control InterSystems IRIS.
  6. On the second node in the cluster, create the links in /usr/bin for ccontrol and csession, as follows:
    ln -s /usr/local/etc/cachesys/ccontrol /usr/bin/ccontrol
    ln -s /usr/local/etc/cachesys/csession /usr/bin/csession
    
  7. Manually start InterSystems IRIS using ccontrol start. Test connectivity to the cluster through the VIP. Be sure the application, all interfaces, any ECP clients, and so on connect to InterSystems IRIS using the VIP as configured here.
  8. Be certain InterSystems IRIS is stopped on all nodes.
  9. Add the InterSystems IRIS resource configured to control your new instance to your cluster, as follows. This example assumes the instance being controlled is named CACHEPROD. See Understanding the Parameters of the InterSystems IRIS Resource for information about the Instance and cleanstop options.
    pcs resource create CacheProd ocf:heartbeat:Cache Instance=CACHEPROD cleanstop=1
  10. The InterSystems IRIS resource (CacheProd) must be colocated with, and ordered to start after, its disk resources and optionally its VIP resource(s), as in the constraint sketch following the single-instance procedure. To prevent unexpected stops and restarts, InterSystems IRIS and its colocated resources should be configured to prefer their current location (resource-stickiness=INFINITY) and thus never fail back after a node reboots.
  11. Verify that InterSystems IRIS starts in the cluster.
When you are ready to install a second instance of InterSystems IRIS within the same cluster, follow these additional steps:
  1. Configure Red Hat HA to add the disk and IP resources associated with the second instance of InterSystems IRIS.
  2. Bring the new resources online so the disks are mounted on one of the nodes.
  3. Be sure the users and groups to be associated with the new instance are created and synchronized between nodes.
  4. On the node with the mounted disk, run cinstall following the procedures outlined in the "Installing InterSystems IRIS on UNIX® and Linux" chapter of the Installation Guide.
  5. Stop InterSystems IRIS.
  6. Synchronize the InterSystems IRIS registry using the following steps:
    1. On the install node, run:
      ccontrol list
    2. Record the instance name, version ID and installation directory of the instance you just installed.
    3. On the other node, run the following command to create the registry entry, using the information you recorded from the recently installed instance:
      ccontrol create instname versionid=vers_ID directory=installdir
  7. Add the InterSystems IRIS resource for this instance to your cluster as follows:
    pcs resource create instname ocf:heartbeat:Cache Instance=instname cleanstop=1
  8. The new InterSystems IRIS resource must be colocated with, and ordered to start after, its disk resources and optionally its VIP resource(s), as in the constraint sketch following the single-instance procedure. To prevent unexpected stops and restarts, InterSystems IRIS and its colocated resources should be configured to prefer their current location (resource-stickiness=INFINITY) and thus never fail back after a node reboots.
  9. Verify that InterSystems IRIS starts in the cluster.
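For any of the verification steps above, one way to check is to combine the cluster view with the InterSystems IRIS view (resource and instance names are those used in the examples):
    # Confirm the cluster reports the InterSystems IRIS resources as Started
    pcs status resources
    # On the node currently running the instance, confirm its status directly
    ccontrol list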
Understanding the Parameters of the InterSystems IRIS Resource
The InterSystems IRIS OCF-based resource agent has two parameters that can be configured as part of the resource: Instance, which specifies the name of the InterSystems IRIS instance the resource controls, and cleanstop, which controls the shutdown behavior the agent uses when stopping the instance. Both appear in the pcs resource create examples earlier in this appendix.
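If you need to inspect or change these parameters after the resource has been created, the standard pcs commands apply; CacheProd is the resource name used in the earlier examples:
    # Show the agent's parameters and their descriptions
    pcs resource describe ocf:heartbeat:Cache
    # Change a parameter on an existing resource
    pcs resource update CacheProd cleanstop=1
    # Review the resource's configured values
    pcs resource show CacheProd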
Application Considerations
Consider the following for your applications:
Testing and Maintenance
Upon first setting up the cluster, be sure to test that failover works as planned. This also applies any time changes are made to the operating system, its installed packages, the disk, the network, InterSystems IRIS, or your application.
In addition to the topics described in this section, you should contact the InterSystems Worldwide Response Center (WRC) for assistance when planning and configuring a RHEL HA cluster to control InterSystems IRIS. The WRC can check for updates to the InterSystems IRIS agent and discuss failover and HA strategies with you.
Failure Testing
Full-scale testing must go beyond a controlled service relocation. While service relocation testing is necessary to validate that the package configuration and the service scripts are functioning properly, you should also test responses to simulated failures. Be sure to test failures such as:
Testing should include a simulated or real application load. Testing with an application load builds confidence that the application will recover in the event of actual failure.
If possible, test with a heavy disk write load; during heavy disk writes the database is at its most vulnerable. InterSystems IRIS handles all recovery automatically using its CACHE.WIJ and journal files, but testing a crash during an active disk write ensures that all file system and disk devices are properly failing over.
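For example, a controlled relocation and a simulated crash might look like the following; the resource and node names are placeholders, and the sysrq crash halts the node immediately, so use it only on test systems:
    # Controlled relocation: move the resource, then remove the temporary location constraint
    pcs resource move CacheProd node2
    pcs resource clear CacheProd
    # Simulated hard crash of the active node (run on that node; requires sysrq to be enabled)
    echo 1 > /proc/sys/kernel/sysrq
    echo c > /proc/sysrq-trigger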
Software and Firmware Updates
Keep software patches and firmware revisions up to date. Avoid known problems by adhering to a patch and update schedule.
Monitor Logs
Keep an eye on /var/log/pacemaker.log and the messages file in /var/log/, as well as the InterSystems IRIS cconsole.log file for each instance. The InterSystems IRIS agent resource script logs time-stamped information to these logs during cluster events.
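For example, during testing and routine maintenance you might watch the cluster and its logs as follows; log locations can vary with your Pacemaker and corosync configuration:
    # Overall cluster, node, and resource state
    pcs status
    # Pacemaker and resource agent activity
    tail -f /var/log/pacemaker.log
    # General system log, including corosync and fencing messages
    tail -f /var/log/messages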
Use the InterSystems IRIS console log, the InterSystems IRIS Monitor and the InterSystems IRIS System Monitor to be alerted to problems with the database that may not be caught by the cluster software. (See the chapters Monitoring InterSystems IRIS Using the Management Portal, Using the InterSystems IRIS Monitor and Using the InterSystems IRIS System Monitor in the Monitoring Guide for information about these tools.)