Caché Installation Guide
File System and Storage Configuration Recommendations

This appendix provides general recommendations in the following areas:

- File System Recommendations
- Storage Configuration Recommendations

In addition, database configuration recommendations are outlined in the Configuring Databases section of the “Configuring Caché” chapter of the Caché System Administration Guide.
File System Recommendations
In the interests of performance and recoverability, InterSystems recommends a minimum of four separate file systems for Caché, to host the following:

- Installation files, executables, and system databases
- Database (CACHE.DAT) files
- Primary journal directory
- Alternate journal directory
In addition, you can add another separate file system to the configuration for the WIJ file, which by default is created in the install-dir\mgr\ directory. Ensure that this file system has enough space to allow the WIJ to grow to its maximum size, that is, the size of the shared (global) buffer pool. For more information, see the Write Image Journal chapter of the Caché Data Integrity Guide.
On UNIX®/Linux platforms, /usr/local/etc/cachesys is the Caché registry directory and therefore must be on a local file system.
In the event of a catastrophic disk failure that damages database files, the journal files are a key element in recovering from backup. Therefore, you should place the primary and alternate journal directories on storage devices that are separate from the devices used by database files and the WIJ. (Journals should be separated from the WIJ because damage to the WIJ could compromise database integrity.) Since the alternate journal device allows journaling to continue after an error on the primary journal device, the primary and alternate journal directories should also be on devices separate from each other. For practical reasons, these different devices may be different logical units (LUNs) on the same storage array; the general rule is the more separation the better, with separate sets of physical drives highly recommended. See Journaling Best Practices in the “Journaling” chapter of the Caché Data Integrity Guide for more information about separate journal storage.
Current storage arrays, especially SSD/Flash-based arrays, do not always allow for the type of segregation recommended above. When using such technologies, consult and follow the storage vendor’s recommendations for performance and resiliency.
In addition, this section includes information about the following:

- Supported File Systems on UNIX®/Linux Platforms
- File System Mount Options on UNIX®/Linux Platforms
Supported File Systems on UNIX®/Linux Platforms
For a complete list of file systems supported on UNIX®/Linux platforms, see “Supported File Systems” in the “Supported Technologies” chapter of InterSystems Supported Platforms.
File System Mount Options on UNIX®/Linux Platforms
Buffered I/O vs. Direct I/O
In general, most of the supported UNIX®/Linux file systems and operating systems offer two distinct I/O options, selected using either program control, a mount option, or both:

- Buffered I/O: reads and writes pass through the operating system’s file system cache, which adds a memory copy but can benefit some access patterns.
- Direct I/O: reads and writes bypass the file system cache, moving data directly between application buffers and the storage device.
The use of buffered and direct I/O in Caché varies by platform, file system, and the nature of the files that are stored on the file system, as follows:
The noatime Mount Option
Generally, it is advisable to disable updates to the file access time when this option is available. This can typically be done using the noatime mount option on various file systems.
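As an illustration, on Linux the noatime option can be set persistently in /etc/fstab or applied to an already-mounted file system. This is a hedged sketch: the device name, mount point, and file system type below are hypothetical examples, not values prescribed by Caché.

```shell
# Hypothetical /etc/fstab entry: mount a database file system with
# access-time updates disabled (device, mount point, and type are examples):
#
#   /dev/sdb1   /cachedb   xfs   defaults,noatime   0 0

# Alternatively, apply the option to a mounted file system without downtime:
#
#   mount -o remount,noatime /cachedb
```

After remounting, the active options can be confirmed with `mount | grep /cachedb`.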
Storage Configuration Recommendations
Many storage technologies are available today, from traditional magnetic spinning (HDD) devices to SSD and PCIe Flash-based devices. In addition, multiple storage access technologies are available, including NAS, SAN, FCoE, direct-attached, PCIe, and virtual storage with hyper-converged infrastructure.
The storage technology that is best for your application depends on application access patterns. For example, for applications that predominantly involve random reads, SSD or Flash-based storage would be an ideal solution; for applications that are mostly write intensive, traditional HDD devices might be the best approach.
The following guidelines are provided as general suggestions. Specific storage product providers may specify separate and even contradictory best practices that should be consulted and followed accordingly.
Storage Connectivity
The following considerations apply to storage connectivity.
Storage Area Network (SAN) Fibre Channel
Use multiple paths from each host to the SAN switches or storage controllers. Multiple HBAs increase protection against a single card failure; at a minimum, use at least a dual-port HBA.
To provide resiliency at the storage array layer, an array with dual controllers in either an active-active or active-passive configuration is recommended to protect from a storage controller failure, and to provide continued access even during maintenance periods for activities such as firmware updates.
If using multiple SAN switches for redundancy, a good general practice is to make each switch a separate SAN fabric to keep errant configuration changes on a single switch from impacting both switches and impeding all storage access.
Network Attached Storage (NAS)
With 10Gb Ethernet commonly available, for best performance 10Gb switches and host network interface cards (NICs) are recommended.
Having dedicated infrastructure is also advised, to isolate NAS traffic from normal network traffic on the LAN. This helps ensure predictable NAS performance between the hosts and the storage.
Jumbo frame support should be included to provide efficient communication between the hosts and storage.
Many network interface cards (NICs) provide TCP Offload Engine (TOE) support. TOE support is not universally considered advantageous: the overhead and gains depend greatly on the server’s available CPU cycles (or lack thereof). Additionally, TOE support has a limited useful lifetime, because system processing power rapidly catches up to, and in many cases exceeds, the TOE performance level of a given NIC.
Storage Configuration
The storage array landscape is ever-changing in technology features, functionality, and performance options, and multiple options can provide optimal performance and resiliency for Caché. The following guidelines provide general best practices for optimal Caché performance and data resiliency.
In the past, RAID10 was recommended for maximum protection and performance. However, storage controller capacities, RAID types and algorithm efficiencies, and controller features such as inline compression and deduplication now provide more options than ever before. Additionally, your application’s I/O patterns will help you and your storage vendor determine which RAID levels and configurations provide the best solution.
Where possible, it is best to use a storage block size close to that of the file type being stored. While most storage arrays have a lower limit on the block size that can be used for a given volume, choose the smallest available size that approaches the file type block size; for example, a 32KB or 64KB block size on the storage array is usually a viable option to effectively support CACHE.DAT files with an 8KB block format. The goal is to avoid excessive or wasted I/O on the storage array based on your application’s needs.
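The waste from a block-size mismatch can be illustrated with simple arithmetic. The sketch below (the sizes are illustrative, not requirements) shows the read amplification when an 8KB database block is served from volumes formatted with progressively larger block sizes:

```shell
# Illustrative arithmetic: a random read of one 8KB database block from a
# volume with a larger block size transfers the whole volume block.
db_block=8   # database block size in KB (example)
for array_block in 32 64 256; do
  # amplification = volume block size / database block size
  echo "array block ${array_block}KB -> read amplification $(( array_block / db_block ))x"
done
```

A 32KB volume block transfers 4x the data actually needed per random 8KB read, while a 256KB block transfers 32x, which is why a 32KB or 64KB array block size is usually the practical compromise.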
The following table is provided as a general overview of storage I/O within a Caché installation.
Database reads, mostly random
  When: Continuous, driven by user processes
  Performed by: User processes, each of which initiates disk I/O to read data
  Notes: Database reads are performed by daemons serving web pages, SQL queries, or direct user processes.

Database writes, ordered but non-contiguous
  When: Approximately every 80 seconds, or when pending updates reach a threshold percentage of the database cache, whichever comes first
  Performed by: Database write daemons (8 processes)
  Notes: Database writes are performed by a set of database system processes known as write daemons. User processes update the database cache, and the trigger (time or database cache percent full) commits the updates to disk using the write daemons. Typically expect anywhere from a few MBs to several GBs to be written during the write cycle, depending on update rates.

WIJ writes, sequential
  When: Approximately every 80 seconds, or when pending updates reach a threshold percentage of the database cache, whichever comes first
  Performed by: Database master write daemon (1 process)
  Notes: The WIJ is used to protect physical database file integrity from system failure during a database write cycle. Writes are approximately 256KB each in size.

Journal writes, sequential
  When: Every 64KB of journal data or every 2 seconds, or on a sync requested by ECP, Ensemble, or the application
  Performed by: Database journal daemon (1 process)
  Notes: Journal writes are sequential and variable in size, from 4KB to 4MB. There can be as few as a few dozen writes per second, up to several thousand per second for very large deployments using ECP and separate application servers.
Bottlenecks in storage are one of the most common problems affecting database system performance. A common error is sizing storage for data capacity only, rather than allocating a high enough number of discrete disks to support expected Input/Output Operations Per Second (IOPS).
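As a rough illustration of sizing by IOPS rather than capacity alone, the sketch below derives a minimum disk count from a target IOPS figure. The peak IOPS and per-disk IOPS values are assumed placeholders for illustration, not Caché requirements; substitute figures measured for your workload and quoted by your storage vendor.

```shell
# Hypothetical sizing sketch: derive the disk count from IOPS, not capacity.
peak_iops=4000      # assumed application peak IOPS (example value)
iops_per_disk=180   # assumed IOPS per 15K-rpm HDD (example value)

# Round up: (a + b - 1) / b gives the ceiling of a/b in integer arithmetic.
disks=$(( (peak_iops + iops_per_disk - 1) / iops_per_disk ))
echo "Minimum disks to meet the IOPS target: $disks"
```

A capacity-only calculation might call for far fewer disks, which is exactly the sizing error described above.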
The following lists target average and maximum response times for each I/O type.

Database block size random read (non-cached)
  Average response time: <=6 ms
  Maximum response time: <=15 ms
  Notes: Database blocks are a fixed 8KB, 16KB, 32KB, or 64KB; most reads to disk will not be cached, because of the large database cache on the host.

Database block size random write (cached)
  Average response time: <=1 ms
  Maximum response time: <2 ms
  Notes: All database file writes are expected to be cached by the storage controller cache memory.

4KB to 4MB journal write (without ECP)
  Average response time: <=2 ms
  Maximum response time: <=5 ms
  Notes: Journal writes are sequential and variable in size, from 4KB to 4MB. Write volume is relatively low when no ECP application servers are used.

4KB to 4MB journal write (with ECP)
  Average response time: <=1 ms
  Maximum response time: <=2 ms
  Notes: Journal synchronization requests generated by ECP impose a stringent response-time requirement to maintain scalability. Synchronization requests can trigger writes to the last block in the journal to ensure data durability.
Note that these figures are provided as guidelines, and any given application may have higher or lower tolerances and thresholds for ideal performance. Use these figures and I/O profiles as a starting point for discussions with your storage vendor.