File System and Storage Configuration Recommendations

This section provides general recommendations for file systems and for storage configuration.

In addition, database configuration recommendations are outlined in the Configuring Databases section of the “Configuring Caché” chapter of the Caché System Administration Guide.

File System Recommendations

For recommendations about the best file system to use for any given operating system, see “Supported File Systems” in the “Supported Technologies” chapter of the online InterSystems Supported Platforms document for this release.

In the interests of performance and recoverability, InterSystems recommends a minimum of four separate file systems for Caché, to host the following:

Installation files, executables, and system databases (including, by default, the write image journal, or WIJ, file)
Database files (and optionally the WIJ)
Primary journal directory
Alternate journal directory

In addition, you can add another separate file system to the configuration for the WIJ file, which by default is created in the install-dir\mgr\ directory. Ensure that such a file system has enough space to allow the WIJ to grow to its maximum size, that is, the size of the database cache as allocated on the Memory and Startup page (System Administration > Configuration > System Configuration > Memory and Startup); see Memory and Startup Settings in the “Configuring Caché” chapter of the Caché System Administration Guide. For more information on the WIJ, see the “Write Image Journal” chapter of the Caché Data Integrity Guide.
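The space requirement above can be checked with a short script. The following is a minimal Python sketch, not an InterSystems tool: the function name and the way the cache size is supplied are assumptions made for illustration, and the check only compares current free space against the configured database cache size.

```python
import shutil


def wij_fs_has_room(wij_dir: str, db_cache_bytes: int) -> bool:
    """Return True if the file system that holds wij_dir currently has
    at least db_cache_bytes of free space.

    db_cache_bytes is the database cache size from the Memory and
    Startup page, converted to bytes by the caller (an assumption of
    this sketch; the value is not read from Caché).
    """
    usage = shutil.disk_usage(wij_dir)  # named tuple: total, used, free
    return usage.free >= db_cache_bytes
```

For example, `wij_fs_has_room("/wij", 64 * 1024**3)` would test a hypothetical /wij mount point against a 64GB database cache. If a WIJ file already exists on that file system, its current size is counted as used space, so the result is conservative.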

Note:

On UNIX®, Linux, and macOS platforms, /usr/local/etc/cachesys is the Caché registry directory and therefore must be on a local file system.

In the event of a catastrophic disk failure that damages database files, the journal files are a key element in recovering from backup. Therefore, you should place the primary and alternate journal directories on storage devices that are separate from the devices used by database files and the WIJ. (Journals should be separated from the WIJ because damage to the WIJ could compromise database integrity.) Since the alternate journal device allows journaling to continue after an error on the primary journal device, the primary and alternate journal directories should also be on devices separate from each other. For practical reasons, these different devices may be different logical units (LUNs) on the same storage array; the general rule is the more separation the better, with separate sets of physical drives highly recommended. See Journaling Best Practices in the “Journaling” chapter of the Caché Data Integrity Guide for more information about separate journal storage.
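As a rough way to verify this separation on a running host, the sketch below compares the device ID that the operating system reports for each directory. It is a minimal Python illustration under stated assumptions: the role names and paths are hypothetical, and the device ID only distinguishes mounted file systems, so it cannot show whether two LUNs share the same physical drives on the array.

```python
import os
from itertools import combinations


def shared_devices(dirs: dict[str, str]) -> list[tuple[str, str]]:
    """Return pairs of role names whose directories report the same
    device ID (st_dev), meaning they sit on the same mounted file system.
    """
    dev = {role: os.stat(path).st_dev for role, path in dirs.items()}
    return [(a, b) for a, b in combinations(dev, 2) if dev[a] == dev[b]]
```

A call such as `shared_devices({"databases": "/db", "wij": "/wij", "journal": "/jrn1", "alt_journal": "/jrn2"})` (all paths hypothetical) returns an empty list when every directory is on its own file system. Any pair it returns is a candidate for relocation.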

The journal directories and the WIJ directory are not configured during installation. For information on changing them after you install Caché, see Configuring Journal Settings in the Caché Data Integrity Guide.

InterSystems does not support the use of symbolic links for database directories.
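One way to audit an existing layout for symbolic links is sketched below in Python. The function name is hypothetical, and the sketch makes a conservative assumption: it reports a link at any level of the path, not only the final directory.

```python
import os


def symlinked_components(path: str) -> list[str]:
    """Return every component of path (the path itself and each of its
    parent directories) that is a symbolic link, deepest first.
    """
    current = os.path.abspath(path)  # normalizes without resolving links
    found = []
    while True:
        if os.path.islink(current):
            found.append(current)
        parent = os.path.dirname(current)
        if parent == current:  # reached the file system root
            break
        current = parent
    return found
```

An empty result means that no part of the database directory path is a symbolic link.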

Note:

Current storage arrays, especially SSD/Flash-based arrays, do not always allow for the type of segregation recommended above. When using such a technology, consult and follow the storage vendor’s recommendations for performance and resiliency.

Storage Configuration Recommendations

Many storage technologies are available today, from traditional magnetic spinning HDD devices to SSD and PCIe Flash-based devices. Storage access technologies are equally varied and include NAS, SAN, FCoE, direct-attached, PCIe, and virtual storage with hyper-converged infrastructure.

The storage technology that is best for your application depends on application access patterns. For example, for applications that predominantly involve random reads, SSD or Flash-based storage would be an ideal solution, and for applications that are mostly write-intensive, traditional HDD devices might be the best approach.

The sections that follow provide general guidelines. Storage vendors may specify different and even contradictory best practices for their products; consult and follow those where they apply.

Storage Connectivity

The following considerations apply to storage connectivity.

Storage Area Network (SAN) Fibre Channel

Use multiple paths from each host to the SAN switches or storage controllers. Using multiple HBAs increases the level of protection by guarding against a single card failure; at a minimum, use a dual-port HBA.

To provide resiliency at the storage array layer, an array with dual controllers in either an active-active or active-passive configuration is recommended to protect from a storage controller failure, and to provide continued access even during maintenance periods for activities such as firmware updates.

If using multiple SAN switches for redundancy, a good general practice is to make each switch a separate SAN fabric to keep errant configuration changes on a single switch from impacting both switches and impeding all storage access.

Network Attached Storage (NAS)

With 10Gb Ethernet commonly available, 10Gb switches and host network interface cards (NICs) are recommended for best performance.

Having dedicated infrastructure is also advised to isolate traffic from normal network traffic on the LAN. This will help ensure predictable NAS performance between the hosts and the storage.

Jumbo frame support should be included to provide efficient communication between the hosts and storage.

Many network interface cards (NICs) provide TCP Offload Engine (TOE) support. TOE support is not universally considered advantageous. The overhead and gains greatly depend on the server’s CPU for available cycles (or lack thereof). Additionally, TOE support has a limited useful lifetime because system processing power rapidly catches up to the TOE performance level of a given NIC, or in many cases exceeds it.

Storage Configuration

The storage array landscape is ever-changing in technology features, functionality, and performance options, and multiple options will provide optimal performance and resiliency for Caché. The following guidelines provide general best practices for optimal Caché performance and data resiliency.

In the past, RAID10 was recommended for maximum protection and performance. However, storage controller capacities, RAID types and algorithm efficiencies, and controller features such as inline compression and deduplication provide more options than ever before. Additionally, your application’s I/O patterns will help you decide with your storage vendor which storage RAID levels and configuration provide the best solution.

Where possible, it is best to use block sizes similar to that of the file type. While most storage arrays have a lower limit on the block size that can be used for a given volume, you can approach the file type block size as closely as possible; for example, a 32KB or 64KB block size on the storage array is usually a viable option to effectively support CACHE.DAT files with 8KB block format. The goal here is to avoid excessive or wasted I/O on the storage array based on your application’s needs.

The following table is provided as a general overview of storage I/O within a Caché installation.

I/O Type	When	How	Notes
Database reads, mostly random	Continuous by user processes	User process initiates disk I/O to read data	Database reads are performed by daemons serving web pages, SQL queries, or direct user processes
Database writes, ordered but non-contiguous	Approx. every 80 seconds or when pending updates reach threshold percentage of database cache, whichever comes first	Database write daemons (8 processes)	Database writes are performed by a set of database system processes known as write daemons. User processes update the database cache and the trigger (time or database cache percent full) commits the updates to disk using the write daemons. Typically expect anywhere from a few MBs to several GBs that must be written during the write cycle depending on update rates.
WIJ writes, sequential	Approx. every 80 seconds or when pending updates reach threshold percentage of database cache, whichever comes first	Database master write daemon (1 process)	The WIJ is used to protect physical database file integrity from system failure during a database write cycle. Writes are approximately 256KB each in size.
Journal writes, sequential	Every 64KB of journal data or 2 seconds, or sync requested by ECP, Ensemble, or application	Database journal daemon (1 process)	Journal writes are sequential and variable in size from 4KB to 4MB. The rate can range from a few dozen writes per second to several thousand per second for very large deployments using ECP and separate application servers.

Bottlenecks in storage are one of the most common problems affecting database system performance. A common error is sizing storage for data capacity only, rather than allocating a high enough number of discrete disks to support expected Input/Output Operations Per Second (IOPS).
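The difference between sizing for capacity and sizing for IOPS can be illustrated with simple arithmetic. The Python sketch below is a deliberately simplified model: the function name and all figures in the example are hypothetical, and it ignores RAID overhead, write penalties, and controller caching, all of which your storage vendor should factor in.

```python
import math


def disks_required(capacity_gb: float, required_iops: float,
                   disk_capacity_gb: float, disk_iops: float) -> int:
    """Return the number of disks needed to satisfy both the capacity
    requirement and the IOPS requirement, whichever is larger.
    """
    by_capacity = math.ceil(capacity_gb / disk_capacity_gb)
    by_iops = math.ceil(required_iops / disk_iops)
    return max(by_capacity, by_iops)
```

With illustrative figures of 2000GB of data, 6000 required IOPS, and disks that each provide 1000GB and 200 IOPS, capacity alone calls for 2 disks while the IOPS requirement calls for 30, so `disks_required(2000, 6000, 1000, 200)` returns 30.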

I/O Type	Average Response Time	Maximum Response Time	Notes
Database block size random read (non-cached)	<=6 ms	<=15 ms	Database blocks are a fixed 8KB, 16KB, 32KB, or 64KB; most reads to disk will not be cached because of the large database cache on the host.
Database block size random write (cached)	<=1 ms	<2 ms	All database file writes are expected to be cached by the storage controller cache memory.
4KB to 4MB journal write (without ECP)	<=2 ms	<=5 ms	Journal writes are sequential and variable in size from 4KB to 4MB. Write volume is relatively low when no ECP application servers are used.
4KB to 4MB journal write (with ECP)	<=1 ms	<=2 ms	Journal synchronization requests generated from ECP impose a stringent response time requirement to maintain scalability. The synchronization requests issued can trigger writes to the last block in the journal to ensure data durability.

Please note that these figures are provided as guidelines, and that any given application may have higher or lower tolerances and thresholds for ideal performance. These figures and I/O profiles are to be used as a starting point for your discussions with your storage vendor.
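When you have measured latencies from a storage benchmark, they can be compared against the guideline figures in the preceding table. The Python sketch below shows one way to do so; the dictionary keys and function name are hypothetical, and the limits are copied from the table as starting points rather than hard requirements.

```python
# (average limit, maximum limit) in milliseconds, taken from the table.
# The table lists the cached-write maximum as "<2 ms"; this sketch
# treats every limit as inclusive for simplicity.
GUIDELINES_MS = {
    "db_random_read": (6.0, 15.0),
    "db_random_write": (1.0, 2.0),
    "journal_write_no_ecp": (2.0, 5.0),
    "journal_write_ecp": (1.0, 2.0),
}


def exceeds_guideline(io_type: str, latencies_ms: list[float]) -> bool:
    """Return True if the measured samples exceed either the average
    or the maximum response time guideline for the given I/O type.
    """
    avg_limit, max_limit = GUIDELINES_MS[io_type]
    average = sum(latencies_ms) / len(latencies_ms)
    return average > avg_limit or max(latencies_ms) > max_limit
```

For instance, `exceeds_guideline("db_random_read", [4.0, 5.0, 6.0])` returns False, while a sample set containing a 20 ms read returns True because it exceeds the 15 ms maximum.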