Sharding Reference

This page contains additional information about planning, deploying, and using a sharded configuration.

Planning an InterSystems IRIS Sharded Cluster

This section provides some first-order guidelines for planning a basic sharded cluster, and for adding compute nodes if appropriate. It is not intended to represent detailed instructions for a full-fledged design and planning process. The following sections address combining sharding with vertical scaling, planning a basic cluster of data nodes, and planning compute nodes.

Combine Sharding with Vertical Scaling

Planning for sharding typically involves considering the tradeoff between resources per system and number of systems in use. At the extremes, the two main approaches can be stated as follows:

  • Scale vertically to make each system and instance as powerful as feasible, then scale horizontally by adding additional powerful nodes.

  • Scale horizontally using multiple affordable but less powerful systems as a cost-effective alternative to one high-end, heavily-configured system.

In practice, in most situations, a combination of these approaches works best. Unlike other horizontal scaling approaches, InterSystems IRIS sharding is easily combined with InterSystems IRIS’s considerable vertical scaling capacities. In many cases, a cluster hosted on reasonably high-capacity systems with 4 to 16 data nodes will yield the greatest benefit.

Plan a Basic Cluster of Data Nodes

To use these guidelines, you need to estimate several variables related to the amount of data to be stored on the cluster.

  1. First, review the data you intend to store on the cluster to estimate the following:

    1. Total size of all the sharded tables to be stored on the cluster, including their indexes.

    2. Total size of the nonsharded tables (including indexes) to be stored on the cluster that will be frequently joined with sharded tables.

    3. Total size of all of the nonsharded tables (including indexes) to be stored on the cluster. (Note that the previous estimate is a subset of this estimate.)

  2. Translate these totals into estimated working sets, based on the proportion of the data that is regularly queried.

    Estimating working sets can be a complex matter. You may be able to derive useful information about these working sets from historical usage statistics for your existing database cache(s). In addition to or in place of that, divide your tables into the three categories and determine a rough working set for each by doing the following:

    • For significant SELECT statements frequently made against the table, examine the WHERE clauses. Do they typically look at a subset of the data that you might be able to estimate the size of based on table and column statistics? Do the subsets retrieved by different SELECT statements overlap with each other or are they additive?

    • Review significant INSERT statements for size and frequency. It may be more difficult to translate these into a working set, but as a simplified approach, you might estimate the average hourly ingestion rate in MB (records per second * average record size * 3600) and add that to the working set for the table (see the worked sketch after this list).

    • Consider any other frequent queries for which you may be able to specifically estimate results returned.

    • Bear in mind that while queries joining a nonsharded table and a sharded table count towards the working set NonshardSizeJoinedWS, queries against that same nonsharded data table that do not join it to a sharded table count towards the working set NonshardSizeTotalWS; the same nonsharded data can be returned by both types of queries, and thus would count towards both working sets.
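    As a concrete illustration of the ingestion-rate estimate, the following ObjectScript sketch uses hypothetical figures; the rate and record size are placeholders, not recommendations.

    // Hypothetical measurements for one table; substitute your own.
    set recsPerSec = 500      // average INSERT rate, records per second
    set avgRecBytes = 2000    // average record size, in bytes
    // records per second * average record size * 3600, converted to MB
    set hourlyMB = recsPerSec * avgRecBytes * 3600 / (1024 * 1024)
    write "Estimated hourly ingestion: ", $FNUMBER(hourlyMB, ",", 0), " MB", !   // about 3,433 MB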

    You can then add these estimates together to form a single estimate for the working set of each table, and add those estimates to roughly calculate the overall working sets. These overall estimates are likely to be fairly rough and may turn out to need adjustment in production. Add a safety factor of 50% to each estimate, and then record the final total data sizes and working sets as the following variables:

    Cluster Planning Variables

    • ShardSize, ShardSizeWS: Total size and working set of sharded tables (plus safety factor)

    • NonshardSizeJoined, NonshardSizeJoinedWS: Total size and working set of nonsharded tables that are frequently joined to sharded tables (plus safety factor)

    • NonshardSizeTotal, NonshardSizeTotalWS: Total size and working set of nonsharded tables (plus safety factor)

    • NodeCount: Number of data node instances

In reviewing the guidelines in the table that follows, bear the following in mind:

  • Generally speaking and all else being equal, more shards will perform faster due to the added parallelism, up to a point of diminishing returns due to overhead, which typically occurs at around 16 data nodes.

  • The provided guidelines represent the ideal or most advantageous configuration, rather than the minimum requirement.

    For example, as noted in Evaluating the Benefits of Sharding, sharding improves performance in part by caching data across multiple systems, rather than all data being cached by a single nonsharded instance, and the gain is greatest when the data in regular use is too big to fit in the database cache of a nonsharded instance. As indicated in the guidelines, for best performance the database cache of each data node instance in a cluster would equal at least the combined size of its share of the sharded data working set and the frequently joined nonsharded data working set, with performance decreasing as total cache size decreases (all else being equal). But as long as the total of all the data node caches is greater than or equal to the cache size of a given single nonsharded instance, the sharded cluster will outperform that nonsharded instance. Therefore, if it is not possible to allocate database cache memory on the data nodes equal to the recommended amount, get as close to it as you can. Furthermore, your initial estimates may turn out to need adjustment in practice.

  • Database cache refers to the database cache (global buffer pool) memory allocation that must be made for each instance. For guidelines for allocating memory to an InterSystems IRIS instance’s routine and database caches as well as the shared memory heap, see Shared Memory Allocations.

  • Default globals database indicates the target size of the database in question, which is the maximum expected size plus a margin for greater than expected growth. The file system hosting the database should be able to accommodate this total, with a safety margin there as well. For general information about InterSystems IRIS database size and expansion and the management of free space relative to InterSystems IRIS databases, and procedures for specifying database size and other characteristics when configuring instances manually, see Configuring Databases and Maintaining Local Databases.

    When deploying with the IKO, you can specify the size of the instance’s storage volume for data, which is where the default globals databases for the master and cluster namespaces are located, as part of deployment; this must be large enough to accommodate the target size of the default globals database.

    Important:

    When deploying manually, ensure that all instances have database directories and journal directories located on separate storage devices. This is particularly important when high volume data ingestion is concurrent with running queries. For guidelines for file system and storage configuration, including journal storage, see Storage Planning, File System Separation, and Journaling Best Practices.

  • The number of data nodes (NodeCount) and the database cache size on each data node are both variables. The desired relationship between the sum of the data nodes’ database cache sizes and the total working set estimates can be created by varying the number of shards and the database cache size per data node. This choice can be based on factors such as the tradeoff between system costs and memory costs; where more systems with lower memory resources are available, you can allocate smaller amounts of memory to the database caches, and when memory per system is higher, you can configure fewer servers. Generally speaking and all else being equal, more shards will perform faster due to the added parallelism, up to a point of diminishing returns (caused in part by increased sharding manager overhead). The most favorable configuration is typically in the 4-16 shard range; other factors aside, 8 data nodes with M memory each are likely to perform better than 64 data nodes with M/8 memory each, for example.

  • Bear in mind that if you need to add data nodes after the cluster has been loaded with data, you can automatically redistribute the sharded data across the new servers, which optimizes performance; see Add Data Nodes and Rebalance Data for more information. On the other hand, you cannot remove a data node with sharded data on it, and a server’s sharded data cannot be automatically redistributed to other data nodes, so adding data nodes to a production cluster involves considerably less effort than reducing the number of data nodes, which requires dropping all sharded tables before removing the data nodes, then reloading the data after.

  • Parallel query processing is only as fast as the slowest data node, so the best practice is for all data nodes in a sharded cluster to have identical or at least closely comparable specifications and resources. In addition, the configuration of all InterSystems IRIS instances in the cluster should be consistent; database settings such as collation and those SQL settings configured at instance level (default date format, for example) should be the same on all nodes to ensure correct SQL query results. Standardized procedures and use of an automated deployment method can help ensure this consistency.

The recommendations in the following table assume that you have followed the procedures for estimating total data and working set sizes described in the foregoing, including adding a 50% safety factor to the results of your calculations for each variable.

Cluster Planning Guidelines

  • Database cache on data nodes
    Should be at least: (ShardSizeWS / NodeCount) + NonshardSizeJoinedWS
    Notes: This recommendation assumes that your application requires 100% in-memory caching. Depending on the extent to which reads can be made from fast storage such as solid-state drives instead, the size of the cache can be reduced.

  • Default globals database for cluster namespace on each data node
    Should be at least: ShardSize / NodeCount, plus space for expected growth
    Notes: When data ingestion performance is a major consideration, consider setting the initial size of the database to the expected maximum size, thereby avoiding the performance impact of automatic database expansion. However, if running in a cloud environment, you should also consider the cost impact of paying for storage you are not using.

  • Default globals database for master namespace on node 1 (see Configuring Namespaces)
    Should be at least: NonshardSizeTotal, possibly plus space for expected growth
    Notes: Nonsharded data is likely to grow less over time than sharded data, but of course this depends on your application.

  • IRISTEMP database on each data node (temporary storage database for the master and cluster namespaces)
    Should be at least: No specific guideline. The ideal initial size depends on your data set, workload, and query syntax, but will probably be in excess of 100 GB and could be considerably more.
    Notes: Ensure that the database is located on the fastest possible storage, with plenty of space for significant expansion.

  • CPU
    Should be at least: No specific recommendations.
    Notes: All InterSystems IRIS servers can benefit from greater numbers of CPUs, whether or not sharding is involved. Vertical scaling of CPU, memory, and storage resources can always be used in conjunction with sharding to provide additional benefit, but it is not specifically required and is governed by the usual cost/performance tradeoffs.
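As a worked example of the first two guidelines, the following sketch computes per-node targets for a hypothetical 8-node cluster; all sizes are illustrative and are assumed to already include the 50% safety factor.

// Hypothetical totals, in GB, including the 50% safety factor.
set ShardSize = 2400, ShardSizeWS = 1200
set NonshardSizeJoinedWS = 100, NodeCount = 8
set cachePerNodeGB = (ShardSizeWS / NodeCount) + NonshardSizeJoinedWS   // 250 GB database cache per data node
set globalsDBPerNodeGB = ShardSize / NodeCount                          // 300 GB globals database, plus growth
write "Database cache per data node: ", cachePerNodeGB, " GB", !
write "Cluster-namespace globals database per data node: ", globalsDBPerNodeGB, " GB plus expected growth", !

In this example, each data node would be configured with roughly 250 GB of database cache and a cluster-namespace globals database target of 300 GB plus room for growth.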
Important:

All InterSystems IRIS instances in a sharded cluster must be of the same version, and all must have sharding licenses.

All data nodes in a sharded cluster should have identical or at least closely comparable specifications and resources; parallel query processing is only as fast as the slowest data node. In addition, the configuration of all InterSystems IRIS instances in the cluster should be consistent; database settings such as collation and those SQL settings configured at instance level (default date format, for example) should be the same on all nodes to ensure correct SQL query results. Standardized procedures and use of an automated deployment method can help ensure this consistency.

Because applications can connect to any data node's cluster namespace and experience the full dataset as if it were local, the general recommended best practice is to load balance application connections across all of the data nodes in a cluster. The IKO can automatically provision and configure a load balancer for the data nodes as needed under typical scenarios; if deploying a sharded cluster by other means, a load balancing mechanism is required. For an important discussion of load balancing a web server tier distributing application connections across data nodes, see Load Balancing, Failover, and Mirrored Configurations.

To maximize the performance of the cluster, it is a best practice to configure low-latency network connections between all of the data nodes, for example by locating them on the same subnet in the same data center or availability zone.

Plan Compute Nodes

As described in Overview of InterSystems IRIS Sharding, compute nodes cache the data stored on data nodes and automatically process read-only queries, while all write operations (insert, update, delete, and DDL operations) are executed on the data nodes. The scenarios most likely to benefit from the addition of compute nodes to a cluster are as follows:

  • When high volume data ingestion is concurrent with high query volume, one compute node per data node can improve performance by separating the query workload (compute nodes) from the data ingestion workload (data nodes).

  • When high multiuser query volume is a significant performance factor, multiple compute nodes per data node increases overall query throughput (and thus performance) by permitting multiple concurrent sharded queries to run against the data on each underlying data node. (Multiple compute nodes do not increase the performance of individual sharded queries running one at a time, which is why they are not beneficial unless multiuser query workloads are involved.) Multiple compute nodes also maintain workload separation when a compute node fails, as queries can still be processed on the remaining compute nodes assigned to that data node.

When planning compute nodes, consider the following factors:

  • If you are considering deploying compute nodes, the best approach is typically to evaluate the operation of your basic sharded cluster before deciding whether the cluster can benefit from their addition. Compute nodes can be easily added to an existing cluster using one of the automated deployment methods described in Automated Deployment Methods for Clusters or using the %SYSTEM.Cluster API (a minimal call sketch follows this list). For information about adding compute nodes, see Deploy Compute Nodes for Workload Separation and Increased Query Throughput.

  • For best performance, a cluster’s compute nodes should be colocated with the data nodes (that is, in the same data center or availability zone) to minimize network latency.

  • When compute nodes are added to a cluster, they are automatically distributed as evenly as possible across the data nodes. Bear in mind that adding compute nodes yields significant performance improvement only when there is at least one compute node per data node.

  • The recommended best practice is to assign the same number of compute nodes to each data node. Therefore, if you are planning eight data nodes, for example, recommended choices for the number of compute nodes include zero, eight, sixteen, and so on.

  • Because compute nodes support query execution only and do not store any data, their hardware profile can be tailored to suit those needs, for example by emphasizing memory and CPU and keeping storage to the bare minimum. All compute nodes in a sharded cluster should have closely comparable specifications and resources.

  • Follow the data node database cache size recommendations (see Plan a Basic Cluster of Data Nodes) for compute nodes; ideally, each compute node should have the same size database cache as the data node to which it is assigned.
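If you decide to add compute nodes to a node-level cluster with the %SYSTEM.Cluster API, attaching each one is a single call. The following is a minimal sketch only; the host name and port are hypothetical, and you should verify the cluster-URL format and full argument list for AttachAsComputeNode() in the %SYSTEM.Cluster Class Reference.

// Run on the new compute node after deploying InterSystems IRIS on it;
// "datanode1" and 1972 are hypothetical placeholders for an existing cluster node.
set status = $SYSTEM.Cluster.AttachAsComputeNode("IRIS://datanode1:1972")
if $SYSTEM.Status.IsError(status) { do $SYSTEM.Status.DisplayError(status) }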

The distinction between data and compute nodes is completely transparent to applications, which can connect to any node's cluster namespace and experience the full dataset as if it were local. Application connections can therefore be load balanced across all of the data and compute nodes in a cluster, and under most application scenarios this is the most advantageous approach. What is actually best for a particular scenario depends on whether you would prefer to optimize query processing or data ingestion. If sharded queries are most important, you can prioritize them by load balancing across the data nodes, so applications are not competing with shard-local queries for compute node resources; if high-speed ingestion using parallel load is most important, load balance across the compute nodes to avoid application activity on the data nodes. If queries and data ingestion are equally important, or you cannot predict the mix, load balance across all nodes.

The IKO allows you to automatically add a load balancer to your DATA node or COMPUTE node definitions; you can also create your own load balancing arrangement. For an important discussion of load balancing a web server tier distributing application connections across data nodes, see Load Balancing, Failover, and Mirrored Configurations.

Coordinated Backup and Restore of Sharded Clusters

When data is distributed across multiple systems, as in an InterSystems IRIS sharded cluster, backup and restore procedures may involve additional complexity. Where strict consistency of the data across a sharded cluster is required, independently backing up and restoring individual nodes is insufficient, because the backups may not all be created at the same logical point in time. This makes it impossible to be certain, when the entire cluster is restored following a failure, that ordering is preserved and the logical integrity of the restored databases is thereby ensured.

For example, suppose update A of data on data node S1 was committed before update B of data on data node S2. Following a restore of the cluster from backup, logical integrity requires that if update B is visible, update A must be visible as well. But if backups of S1 and S2 are taken independently, it is impossible to guarantee that the backup of S1 was made after A was committed, even if the backup of S2 was made after B was committed, so restoring the backups independently could lead to S1 and S2 being inconsistent with each other.

If, on the other hand, the procedures used coordinate either backup or restore and can therefore guarantee that all systems are restored to the same logical point in time — in this case, following update B — ordering is preserved and the logical integrity that depends on it is ensured. This is the goal of coordinated backup and restore procedures.

To greatly reduce the chances of having to use any of the procedures described here to restore your sharded cluster, you can deploy it with mirrored data servers, as described in Mirror for High Availability. Even if the cluster is unmirrored, most data errors (data corruption, for example, or accidental deletion of data) can be remedied by restoring the data node on which the error occurred from the latest backup and then recovering it to the current logical point in time using its journal files. The procedures described here are for use in much rarer situations requiring a cluster-wide restore.

This section covers the following topics:

Coordinated Backup and Restore Approaches for Sharded Clusters

Coordinated backup and restore of a sharded cluster always involves all of the data nodes in the cluster. The InterSystems IRIS Backup API includes a Backup.ShardedCluster class that supports three approaches to coordinated backup and restore of a cluster’s data nodes.

Bear in mind that the goal of all approaches is to restore all data servers to the same logical point in time, but the means of doing so varies. In one, it is the backups themselves that share a logical point in time, but in the others, InterSystems IRIS database journaling provides the common logical point in time, called a journal checkpoint, to which the databases are restored. The approaches include:

  • Coordinated backups

  • Uncoordinated backups followed by coordinated journal checkpoints

  • A coordinated journal checkpoint included in uncoordinated backups

To understand how these approaches work, it is important that you understand the basics of InterSystems IRIS data integrity and crash recovery, which are discussed in Introduction to Data Integrity. Database journaling, a critical feature of data integrity and recovery, is particularly significant for this topic. Journaling records all updates made to an instance’s databases in journal files. This makes it possible to recover updates made between the time a backup was taken and the moment of failure, or another selected point, by restoring updates from the journal files following restore from backup. Journal files are also used to ensure transactional integrity by rolling back transactions that were left open by a failure. For detailed information about journaling, see Journaling.

Considerations when selecting an approach to coordinated backup and restore include the following:

  • The degree to which activity is interrupted by the backup procedure.

  • The frequency with which the backup procedure should be performed to guarantee sufficient recoverability.

  • The complexity of the required restore procedure.

These issues are discussed in detail later in this section.

Coordinated Backup and Restore API Calls

The methods in the Backup.ShardedCluster class can be invoked on any data node. All of the methods take a ShardMasterNamespace argument; this is the name of the master namespace on data node 1 (IRISDM by default) that is accessible from all nodes in the cluster.

The available methods are as follows:

  • Backup.ShardedCluster.Quiesce()

    Blocks all activity on all data nodes of the sharded cluster, and waits until all pending writes have been flushed to disk. Backups of the cluster’s data nodes taken under Quiesce() represent a logical point in time.

  • Backup.ShardedCluster.Resume()

    Resumes activity on the data nodes after they are paused by Quiesce().

  • Backup.ShardedCluster.JournalCheckpoint()

    Creates a coordinated journal checkpoint and switches each data node to a new journal file, then returns the checkpoint number and the names of the precheckpoint journal files. The precheckpoint files are the last journal files on each data node that should be included in a restore; later journal files contain data that occurred after the logical point in time represented by the checkpoint.

  • Backup.ShardedCluster.ExternalFreeze()

    Freezes physical database writes, but not application activity, across the cluster, and then creates a coordinated journal checkpoint and switches each data node to a new journal file, returning the checkpoint number and the names of the precheckpoint journal files. The backups taken under ExternalFreeze() do not represent a logical point in time, but they include the precheckpoint journal files, thus enabling restore to the logical point in time represented by the checkpoint.

  • Backup.ShardedCluster.ExternalThaw()

    Resumes disk writes after they are suspended by ExternalFreeze().

You can review the technical documentation of these calls in the InterSystems Class Reference.
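For example, a coordinated backup window using the first two calls might look like the following minimal sketch. It assumes the default master namespace name IRISDM, assumes the methods return a %Status value, and leaves the actual backup step (snapshot, file copy, and so on) to your own tooling.

// Pause activity on all data nodes and flush pending writes.
set status = ##class(Backup.ShardedCluster).Quiesce("IRISDM")
if $SYSTEM.Status.IsError(status) { do $SYSTEM.Status.DisplayError(status)  quit }

// ... take backups of every data node here ...

// Resume normal operation; call this from the same job that called Quiesce().
set status = ##class(Backup.ShardedCluster).Resume("IRISDM")
if $SYSTEM.Status.IsError(status) {
    // A failure here may mean the backup images are unreliable and should be discarded.
    do $SYSTEM.Status.DisplayError(status)
}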

Procedures for Coordinated Backup and Restore

The steps involved in the three coordinated backup and restore approaches provided by the Sharding API are described in the following sections.

Data node backups should, in general, include not only database files but all files used by InterSystems IRIS, including the journal directories, write image journal, and installation data directory, as well as any needed external files. The locations of these files depend in part on how the cluster was deployed (see Deploying the Sharded Cluster); the measures required to include them in backups may have an impact on your choice of approach.

Important:

The restore procedures described here assume that the data node being restored has no mirror failover partner available, and would be used with a mirrored data node only in a disaster recovery situation, as described in Disaster Recovery of Mirrored Sharded Clusters, as well as in Disaster Recovery Procedures. If the data node is mirrored, remove the primary from the mirror, complete the restore procedure described, and then rebuild it as described in Rebuilding a Mirror Member.

Create Coordinated Backups

  1. Call Backup.ShardedCluster.Quiesce(), which pauses activity on all data nodes in the cluster (and thus all application activity) and waits until all pending writes have been flushed to disk. When this process is completed and the call returns, all databases and journal files across the cluster are at the same logical point in time.

  2. Create backups of all data nodes in the cluster. Although the database backups are coordinated, they may include open transactions; when the data nodes are restarted after being restored from backup, InterSystems IRIS recovery uses the journal files to restore transactional integrity by rolling these transactions back.

  3. When backups are complete, call Backup.ShardedCluster.Resume() to restore normal data node operation.

    Important:

    Resume() must be called within the same job that called Quiesce(). A failure return may indicate that the backup images taken under Quiesce() were not reliable and may need to be discarded.

  4. Following a failure, on each data node:

    1. Restore the backup image.

    2. Verify that the only journal files present are those in the restored image from the time of the backup.

      Caution:

      This is critically important because at startup, recovery restores the journal files and rolls back any transactions that were open at the time of the backup. If journal data later than the time of the backup exists at startup, it could be restored and cause the data node to be inconsistent with the others.

    3. Restart the data node.

    The data node is restored to the logical point in time at which database activity was quiesced.

Note:

As an alternative to the first three steps in this procedure, you can gracefully shut down all data nodes in the cluster, create cold backups, and restart the data nodes.

Create Uncoordinated Backups Followed by Coordinated Journal Checkpoints

  1. Create backups of the databases on all data nodes in the cluster while the data nodes are in operation and application activity continues. These backups may be taken at entirely different times using any method of your choice and at any intervals you choose.

  2. Call Backup.ShardedCluster.JournalCheckpoint() on a regular basis, preferably as a scheduled task. This method creates a coordinated journal checkpoint and returns, for each data node, the name of the last journal file to include in a restore in order to reach that checkpoint. Bear in mind that it is the time of the latest checkpoint and the availability of the precheckpoint journal files that dictate the logical point in time to which the data nodes can be recovered, rather than the timing of the backups.

    Note:

    Before switching journal files, JournalCheckpoint() briefly quiesces all data nodes in the sharded cluster to ensure that the precheckpoint files all end at the same logical moment in time; as a result, application activity may be very briefly paused during execution of this method.

  3. Ensure that for each data node, you store a complete set of journal files from the time of its last backup to the time at which the most recent coordinated journal checkpoint was created, ending with the precheckpoint journal file, and that all of these files will remain available following a server failure (possibly by backing up the journal files regularly). The database backups are not coordinated and may also include partial transactions, but when the data nodes are restarted after being restored from backup, recovery uses the coordinated journal files to bring all databases to the same logical point in time and to restore transactional integrity.

  4. Following a failure, identify the latest checkpoint available as a common restore point for all data nodes. This means that for each data node you must have a database backup that precedes the checkpoint, plus all intervening journal files up to the precheckpoint journal file.

    Caution:

    This is critically important because at startup, recovery restores the journal files and rolls back any transactions that were open at the time of the backup. If journal files later than the precheckpoint journal file exist at startup, they could be restored and cause the data node to be inconsistent with the others.

  5. On each data node, restore the databases from the backup preceding the checkpoint, restoring journal files up to the checkpoint. Ensure that no journal data after that checkpoint is applied. The simplest way to ensure that is to check if the server has any later journal files, and if so move or delete them, and then delete the journal log.

    The data node is now restored to the logical point in time at which the coordinated journal checkpoint was created.
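A minimal sketch of the checkpoint call in step 2, suitable for wrapping in a scheduled task, follows. It assumes the default master namespace IRISDM and assumes the checkpoint number and precheckpoint journal file names are returned by reference; confirm the exact signature in the Class Reference before using it.

// Create a coordinated journal checkpoint across all data nodes.
// .chkpt and .journals are assumed output arguments; record both with your backups.
set status = ##class(Backup.ShardedCluster).JournalCheckpoint("IRISDM", .chkpt, .journals)
if $SYSTEM.Status.IsError(status) { do $SYSTEM.Status.DisplayError(status)  quit }
write "Coordinated checkpoint ", chkpt, " created", !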

Include a Coordinated Journal Checkpoint in Uncoordinated Backups

  1. Call Backup.ShardedCluster.ExternalFreeze(). This method freezes physical database writes on all data nodes in the sharded cluster by suspending their write daemons; application activity continues, but updates are written to the journal files only and are not committed to disk. Before returning, the method creates a coordinated journal checkpoint and switches each data node to a new journal file, then returns the checkpoint number and the names of the precheckpoint journal files. At this point, the precheckpoint journal files represent a single logical point in time.

    Note:

    Backup.ShardedCluster.ExternalFreeze() internally calls Backup.ShardedCluster.JournalCheckpoint(), which in turn calls Backup.ShardedCluster.Quiesce() and Backup.ShardedCluster.Resume() to briefly quiesce the system while switching journal files. When Resume() completes it logs the message Backup.ShardedCluster.Resume: System resumed. This does not mean that the system is no longer frozen, only that it is no longer quiesced; after calling ExternalFreeze(), the system remains frozen until Backup.ShardedCluster.ExternalThaw() is called.

  2. Create backups of all data nodes in the cluster. The database backups are not coordinated and may include partial transactions, but when restoring the data nodes you will ensure that they are recovered to the journal checkpoint, bringing all databases to the same logical point in time and restoring transactional integrity.

    Note:

    By default, when the write daemons have been suspended by Backup.ShardedCluster.ExternalFreeze() for 10 minutes, application processes are blocked from making further updates (due to the risk that journal buffers may become full). However, this period can be extended using an optional argument to ExternalFreeze() if the backup process requires more time.

  3. When all backups are complete, call Backup.ShardedCluster.ExternalThaw() to resume the write daemons and restore normal data node operation.

    Important:

    A failure return may indicate that the backup images taken under ExternalFreeze() were not reliable and may need to be discarded.

  4. Following a failure, on each data node:

    1. Restore the backup image.

    2. Remove any journal files present in the restored image that are later than the precheckpoint journal file returned by ExternalFreeze().

    3. Follow the instructions in Starting InterSystems IRIS Without Automatic WIJ and Journal Recovery to manually recover the InterSystems IRIS instance. When you restore the journal files, start with the journal file that was switched to by ExternalFreeze() and end with the precheckpoint journal file returned by ExternalFreeze(). (Note that these may be the same file, in which case this is the one and only journal file to restore.)

      Note:

      If you are working with containerized InterSystems IRIS instances, see Upgrading InterSystems IRIS Containers for instructions for doing a manual recovery inside a container.

    The data node is restored to the logical point in time at which the coordinated journal checkpoint was created by the ExternalFreeze() method.

Note:

This approach requires that the databases and journal files on each data node be located such that a single backup can include them both.
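Putting the freeze and thaw calls together, the backup window for this approach might look like the following minimal sketch. It again assumes the default master namespace IRISDM, leaves the backup step to your own tooling, and omits capturing the checkpoint number and precheckpoint journal file names that ExternalFreeze() returns; consult the Class Reference for the exact signature.

// Suspend write daemons cluster-wide and create a coordinated journal checkpoint.
set status = ##class(Backup.ShardedCluster).ExternalFreeze("IRISDM")
if $SYSTEM.Status.IsError(status) { do $SYSTEM.Status.DisplayError(status)  quit }

// ... take backups of every data node here (databases plus journal files) ...

// Resume the write daemons.
set status = ##class(Backup.ShardedCluster).ExternalThaw("IRISDM")
if $SYSTEM.Status.IsError(status) {
    // A failure here may mean the backup images are unreliable and should be discarded.
    do $SYSTEM.Status.DisplayError(status)
}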

Disaster Recovery of Mirrored Sharded Clusters

Disaster recovery (DR) asyncs keep the same synchronized copies of mirrored databases as the backup failover member, the differences being that communication between an async and its primary is asynchronous, and that an async does not participate in automatic failover. However, a DR async can be promoted to failover member, becoming the backup, when one of the failover members has become unavailable; for example, when you are performing maintenance on the backup, or when an outage of the primary causes the mirror to fail over to the backup and you need to maintain the automatic failover capability while you investigate and correct the problem with the former primary. When a major failure results in an outage of both failover members, you can perform disaster recovery by manually failing over to a promoted DR async.

DR async mirror members make it possible to provide a disaster recovery option for a mirrored sharded cluster, allowing you to restore the cluster to operation following an outage of the mirror failover pairs in a relatively short time. Specifically, enabling disaster recovery for a mirrored cluster includes:

  • Configuring at least one DR async in every data node mirror in a mirrored sharded cluster.

    To help ensure that they remain available when a major failure creates a failover pair outage, DR asyncs are typically located in separate data centers or availability zones from the failover pairs. If the DR asyncs you manually fail over to in your disaster recovery procedure are distributed across multiple locations, the network latency between them may have a significant impact on the performance of the cluster. For this reason, it is a best practice to ensure that all of the data node mirrors in the cluster include at least one DR async in a common location.

    Important:

    If you add DR asyncs to the data node mirrors in an existing mirrored cluster as part of enabling disaster recovery (or for any other reason), or demote a backup member to DR async, you must call $SYSTEM.Sharding.VerifyShards() in the cluster namespace on one of the mirror primaries to update the cluster’s metadata for the additions.

  • Making regular coordinated backups as described in the previous section.

    Because the degree to which the cluster’s operation is interrupted by backups, the frequency with which backups should be performed, and the complexity of the restore procedure all vary with the coordinated backup and restore approach you choose, you should review these approaches to determine which is most appropriate to your circumstances and disaster recovery goals (the amount of data loss you are willing to tolerate, the speed with which cluster operation must be restored, and so on).

  • Planning, preparing, and testing the needed disaster recovery procedures, including the restore procedure described for the coordinated backup procedure you have selected, as well as familiarizing yourself with the procedures described in Disaster Recovery Procedures.
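As a reminder of the VerifyShards() call mentioned in the Important note above, a minimal sketch follows; it is shown without arguments, so check the Class Reference for optional ones.

// Run in the cluster namespace on one of the mirror primaries after adding
// or demoting DR asyncs, so that the cluster metadata reflects the change.
set status = $SYSTEM.Sharding.VerifyShards()
if $SYSTEM.Status.IsError(status) { do $SYSTEM.Status.DisplayError(status) }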

Assuming you have configured the needed DR asyncs and have been making regular coordinated backups, the general disaster recovery procedure for a mirrored sharded cluster would be as follows:

  1. In each data node mirror, manually fail over to a promoted DR async, as described in Disaster Recovery Procedures.

  2. Restore the most recent coordinated backup using the procedures described for the coordinated backup and restore approach you selected, as described in the appropriate section of Procedures for Coordinated Backup and Restore.

  3. To restore failover capability to the cluster, complete the failover pair in each mirror. If the data node mirrors all included multiple DR asyncs, promote another DR async to failover member in each. If there are no additional DR asyncs in the mirrors, configure a second failover member for each as described in Mirror Data Nodes for High Availability.

  4. Restore application access to the sharded cluster.

Note:

If the mirrored sharded cluster you recovered included compute nodes, these were very likely colocated with the data node failover pairs, and also unavailable due to the failure. In this case, a full recovery of the cluster would include minimizing network latency by deploying new compute nodes colocated with the recovered cluster, as described in Deploy Compute Nodes for Workload Separation and Increased Query Throughput. If the cluster’s existing compute nodes are still operational in the original location, they should be relocated to the new cluster location as soon as possible. A recovered cluster is operational without the compute nodes, but is lacking the benefits they provided.

Cloning a Sharded Cluster

Under some circumstances you may need to replicate an existing cluster on a different set of hosts, for example when you want to:

  • Stand up a test system based on a snapshot of a production cluster.

  • Restore a cluster from backup on new hosts following a failure that rendered some or all of the existing hosts unusable.

  • Move a cluster to a new location, such as from one data center to another.

This section describes a procedure for replicating an existing cluster on a new set of hosts, so that you can accomplish one of these tasks or something similar.

A sharded cluster is made up of nodes and namespaces that are configured to work together through multiple mappings and communication channels. Information about this configuration, including specific hostnames, port numbers, and namespaces, is stored and maintained in the master namespace on data node 1 (or the shard master in a namespace-level cluster). To replicate a cluster, you need to duplicate this configuration on the new cluster.

Note:

This procedure can be used only when the number of data nodes in the original (existing) and target (new) clusters are the same.

Because %SYSTEM.Sharding API calls are involved in the cloning process, the resulting cloned cluster is always a namespace-level cluster, which can be managed and modified using only the %SYSTEM.Sharding API and the namespace-level pages in the Management Portal.

The instructions for the procedure assume one node per host, but can be adapted for use with a namespace-level cluster with multiple nodes per host.

To clone a sharded cluster, follow these steps:

  1. Provision or identify the data node hosts for the target cluster, deploy InterSystems IRIS on them, and enable sharding using the Management Portal or the $SYSTEM.Sharding.EnableSharding API call.

  2. In the master namespace on data node 1 of the original cluster, call $SYSTEM.Sharding.GetConfig() to display the locations of the default globals database of the master namespace on node 1 and the default globals database of the shard namespace on each data node. These databases must be replicated on the corresponding nodes in the new cluster, so be sure to record the information accurately. If the default routines database of the master namespace is different from its globals database, include it in the information you record.

    The output of GetConfig() includes the shard number of each data node; you will need each original data node’s shard number to configure the corresponding target data node. Determine which target node will replicate which original node by assigning one of the original cluster’s shard numbers to each host in the target cluster.

  3. Block application access to the original cluster. When all writes have been flushed to disk, individually shut down all of the InterSystems IRIS instances in the cluster.

  4. Replicate the original cluster’s globals databases (and possibly the master’s routines database) on the target cluster. You can do this in one of two ways:

    • Back up the databases on the original cluster and restore them on the target cluster.

    • Copy the IRIS.DAT files from the original cluster to the target cluster.

    The directory paths do not need to be the same on the two clusters. Whichever method you use, ensure that you replicate each database on the target cluster node corresponding to the original’s shard number.

    Note:

    If the original cluster is mirrored, back up or copy the databases from just one failover member of each node.

  5. On each node of the target cluster, create a shard namespace with the replicated shard database as its default globals database. On node 1, do the same for the master namespace; include the routines database if it is different from the globals database.

  6. In the master namespace on node 1 of the target cluster:

    1. Call $SYSTEM.Sharding.ReassignShard() once for each target data node, specifying the hostname or IP address, superserver port, shard namespace, and shard number of each, making sure to specify the correct shard numbers to correspond to the original cluster, for example:

      set status = $SYSTEM.Sharding.ReassignShard(,"shard-host",shard-port,"shard-namespace","shard-number")
      
    2. Call $SYSTEM.Sharding.Reinitialize(), which verifies that the target cluster is of a compatible version; sets up all the mappings, ECP connections, and metadata needed to activate the target cluster; and automatically calls $SYSTEM.Sharding.VerifyShards() to complete activation and confirm that the configuration is correct.

      If the master namespace contains any user-defined global, routine, or package mappings to databases other than the globals and routines databases you replicated on the target cluster, Reinitialize() returns an error because these additional databases are not available in the new cluster. You can avoid the error by specifying 1 for the second argument, IgnoreMappings, when you call Reinitialize(), as follows:

      set status = $SYSTEM.Sharding.Reinitialize(,1)
      

      Following the Reinitialize() call, replicate the additional databases involved on target node 1, as you did the globals and routines databases, and when they are all in place call $SYSTEM.Sharding.VerifyShards() to propagate the relevant mappings to the shard namespaces.

  7. If desired, convert the cloned cluster to mirrored, add compute nodes, or both.

Sharding APIs

InterSystems IRIS provides two APIs for configuring and managing a sharded cluster: the %SYSTEM.Cluster API, used with the node-level architecture, and the %SYSTEM.Sharding API, used with the namespace-level architecture.

%SYSTEM.Cluster API

For more detail on the %SYSTEM.Cluster API methods and instructions for calling them, see the %SYSTEM.Cluster class documentation in the InterSystems Class Reference.

Use these methods to deploy and manage a node-level sharded cluster; each method is documented in its own Class Reference entry.
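For example, deploying a minimal node-level cluster with this API comes down to initializing node 1 and then attaching each additional data node to it. The following sketch uses a hypothetical host name and the default superserver port; verify the calls and the cluster-URL format in the Class Reference.

// On the first data node (node 1):
set status = $SYSTEM.Cluster.Initialize()
if $SYSTEM.Status.IsError(status) { do $SYSTEM.Status.DisplayError(status) }

// On each additional data node, pointing at node 1 ("datanode1" and 1972 are placeholders):
set status = $SYSTEM.Cluster.AttachAsDataNode("IRIS://datanode1:1972")
if $SYSTEM.Status.IsError(status) { do $SYSTEM.Status.DisplayError(status) }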

%SYSTEM.Sharding API

For more detail on the %SYSTEM.Sharding API methods and instructions for calling them, see the %SYSTEM.Sharding class documentation in the InterSystems Class Reference.

Use these methods to deploy and manage a namespace-level sharded cluster; each method is documented in its own Class Reference entry.

Deploying the Namespace-level Architecture

Use the procedures in this section to deploy an InterSystems IRIS sharded cluster with the older namespace-level architecture, consisting of a shard master, shard data servers, and optionally shard query servers, using the Management Portal or the %SYSTEM.Sharding API. These procedures assume each InterSystems IRIS instance is installed on its own system.

Because applications can connect to any shard data server’s cluster namespace and experience the full dataset as if it were local, the general recommended best practice for a namespace-level cluster that does not include shard query servers is to load balance application connections across all of the shard data servers. In general, this is also the best approach for clusters that include shard query servers, but there are additional considerations; for more information, see the end of Plan Compute Nodes.

Unlike node-level deployment, namespace-level deployment cannot create mirrors as part of cluster configuration. However, if you want to deploy a mirrored sharded cluster — that is, with a mirrored shard master data server and mirrored shard data servers — you can do any of the following:

  • Configure the primary of an existing mirror as the shard master data server, first adding the globals database of the intended master namespace to the mirror.

  • Create a mirror with the existing shard master data server as primary, adding the globals database of the master namespace to the mirror before you add the backup.

  • Assign the failover members of an existing mirror as a shard data server, first adding the globals database of the intended cluster namespace to the mirror.

  • Create a mirror with an assigned shard data server as primary, adding the globals database of the shard namespace to the mirror, and then provide information about the backup to reassign it as a mirrored shard data server.

You can include DR asyncs in mirrored sharded clusters, but not reporting asyncs. Bear in mind that the recommended best practice is to avoid mixing mirrored and nonmirrored nodes, that is, the shard master and all shard data servers should be mirrored, or none of them should be. The steps of these procedures, shown below, can be used to deploy both nonmirrored and mirrored clusters.

Provision or Identify the Infrastructure

Identify the needed number of networked host systems (physical, virtual, or cloud) — one host each for the shard master and the shard data servers.

If you want to deploy a mirrored cluster, provision two hosts for the shard master data server, and two or more (depending on whether you want to include DR async members) for each shard data server.

Important:

Be sure to review Provision or Identify the Infrastructure for requirements and best practices for the infrastructure of a sharded cluster.

Deploy InterSystems IRIS on the Cluster Nodes

This procedure assumes that each system hosts or will host a single InterSystems IRIS instance.

  1. Deploy an instance of InterSystems IRIS, either by creating a container from an InterSystems-provided image (as described in Running InterSystems Products in Containers) or by installing InterSystems IRIS from a kit (as described in the Installation Guide).

    Important:

    Be sure to review Deploy InterSystems IRIS on the Data Nodes for requirements and best practices for the InterSystems IRIS instances in a sharded cluster.

  2. Ensure that the storage device hosting each instance’s databases is large enough to accommodate the target globals database size, as described in Estimate the Database Cache and Database Sizes.

    All instances should have database directories and journal directories located on separate storage devices, if possible. This is particularly important when high volume data ingestion is concurrent with running queries. For guidelines for file system and storage configuration, including journal storage, see Storage Planning, File System Separation, and Journaling Best Practices.

  3. Allocate the database cache (global buffer pool) for each instance, depending on its eventual role in the cluster, according to the sizes you determined in Estimate the Database Cache and Database Sizes. For the procedure for allocating the database cache, see Memory and Startup Settings.

    Note:

    In some cases, it may be advisable to increase the size of the shared memory heap on the cluster members. For information on how to allocate memory to the shared memory heap, see gmheap.

    For guidelines for allocating memory to an InterSystems IRIS instance’s routine and database caches as well as the shared memory heap, see System Resource Planning and Management.
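For reference, the database cache and shared memory heap allocations correspond to parameters in the instance's CPF. The following hypothetical fragment assumes 8 KB database blocks, with the third globals field giving the 8 KB buffer allocation in MB (about 200 GB here) and gmheap given in KB (about 1 GB here); the sizes are illustrative only, so confirm the parameter formats in the Configuration Parameter File Reference before editing a CPF.

[config]
globals=0,0,204800,0,0,0
gmheap=1048576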

Configure the Namespace-Level Cluster using the Management Portal

Once you have provisioned or identified the infrastructure and installed InterSystems IRIS on the cluster nodes, follow the steps in this section to configure the namespace-level cluster using the Management Portal.

For information about opening the Management Portal in your browser, see the instructions for an instance deployed in a container or one installed from a kit.

Prepare the Shard Data Servers

Do the following on each desired instance to prepare it to be added to the cluster as a shard data server.

Note:

Under some circumstances, the API (which underlies the Management Portal) may be unable to resolve the hostnames of one or more nodes into IP addresses that are usable for interconnecting the nodes of a cluster. When this is the case, you can call $SYSTEM.Sharding.SetNodeIPAddress() (see %SYSTEM.Sharding API) to specify the IP address to be used for each node. To use $SYSTEM.Sharding.SetNodeIPAddress(), you must call it on every intended cluster node before making any other %SYSTEM.Sharding API calls on those nodes, for example:

set status = $SYSTEM.Sharding.SetNodeIPAddress("00.53.183.209")

When this call is used, you must use the IP address you specify for each shard data server node, rather than the hostname, when assigning the shards to the master namespace, as described in Configure the Shard Master Data Server and Assign the Shard Data Servers.

  1. Open the Management Portal for the instance and select System Administration > Configuration > System Configuration > Sharding > Enable Sharding, and on the dialog that displays, click OK. (The value of the Maximum Number of ECP Connections setting need not be changed as the default is appropriate for virtually all clusters.)

  2. Restart the instance. (There is no need to close the browser window or tab containing the Management Portal; you can simply reload it after the instance has fully restarted.)

  3. To create the shard namespace, navigate to the Configure Namespace-Level page (System Administration > Configuration > System Configuration > Sharding > Configure Namespace-Level) and click the Create Namespace button, then follow the instructions in Create/Modify a Namespace. (The namespace need not be interoperability-enabled.) Although the shard namespaces on the different shard data servers can have different names, you may find it more convenient to give them the same name.

    In the process, create a new database for the default globals database, making sure that it is located on a device with sufficient free space to accommodate the target size for the shard namespaces, as described in Estimate the Database Cache and Database Sizes. If data ingestion performance is a significant consideration, set the initial size of the database to its target size. Also select the globals database you created as the namespace’s default routines database.

    Important:

    If you are preparing an existing mirror to be a shard data server, when you create the default globals database for the shard namespace on each member — the primary first, then the backup, then any DR asyncs — select Yes at the Mirrored database? prompt to include the shard namespace globals database in the mirror.

    Note:

    As noted in Estimate the Database Cache and Database Sizes, the shard master data server and shard data servers all share a single default globals database physically located on the shard master and known as the master globals database. The default globals database created when a shard namespace is created remains on the shard, however, becoming the local globals database, which contains the data stored on the shard. Because the shard data server does not start using the master globals database until assigned to the cluster, for clarity, the planning guidelines and instructions in this document refer to the eventual local globals database as the default globals database of the shard namespace.

    A new namespace is automatically created with IRISTEMP configured as the temporary storage database; do not change this setting for the shard namespace.

  4. For a later step, record the DNS name or IP address of the host system, the superserver (TCP) port of the instance, and the name of the shard namespace you created.

    Note:

    From the perspective of another node (which is what you need in this procedure), the superserver port of a containerized InterSystems IRIS instance depends on which host port the superserver port was published or exposed as when the container was created. For details and examples, see Running an InterSystems IRIS Container with Durable %SYS and Running an InterSystems IRIS Container: Docker Compose Example, as well as Container networking in the Docker documentation.

    The default superserver port number of a noncontainerized InterSystems IRIS instance that is the only such on its host is 1972. To see or set the instance’s superserver port number, select System Administration > Configuration > System Configuration > Memory and Startup in the instance’s Management Portal. (For information about opening the Management Portal for the instance and determining the superserver port, see the instructions for an instance deployed in a container or one installed from a kit.)

Configure the Shard Master Data Server and Assign the Shard Data Servers

Do the following on the desired instance to configure it as the shard master data server and assign the prepared shard data servers to the cluster. You can also use the procedure to assign query shards to the cluster.

  1. Open the Management Portal for the instance, select System Administration > Configuration > System Configuration > Sharding > Enable Sharding, and on the dialog that displays, click OK. (The value of the Maximum Number of ECP Connections setting need not be changed as the default is appropriate for virtually all clusters.)

  2. Restart the instance. (There is no need to close the browser window or tab containing the Management Portal; you can simply reload it after the instance has fully restarted.)

  3. To create the master namespace, navigate to the Configure Namespace-Level page (System Administration > Configuration > System Configuration > Sharding > Configure Namespace-Level) and click the Create Namespace button, then follow the instructions in Create/Modify a Namespace. (The namespace need not be interoperability-enabled.)

    In the process, create a new database for the default globals database, making sure that it is located on a device with sufficient free space to accommodate the target size for the master namespace, as described in Estimate the Database Cache and Database Sizes. Also select the globals database you created as the namespace’s default routines database.

    Important:

    If you are configuring an existing mirror as the shard master data server, when you create the default globals database for the master namespace on each member — the primary first, then the backup, then any DR asyncs — select Yes at the Mirrored database? prompt to include the master namespace globals database in the mirror.

  4. To assign the shard data servers, ensure that the Namespace drop-down is set to the master namespace you just created. For each shard, click the Assign Shard button on the Configure Namespace-Level page to display the ASSIGN SHARD dialog, then enter the following information for the shard data server instance:

    • The name of its host (or the IP address, if you used the $SYSTEM.Sharding.SetNodeIPAddress() call on each node before starting this procedure, as described at the beginning of this section).

    • The superserver port.

    • The name of the shard namespace you created when preparing it.

    • Data as the role for the shard.

      Important:

      To assign a shard query server, select Query as the role for the shard, rather than Data.

    If you are assigning an existing mirror as a shard data server, begin with the primary and, after entering the above information, select Mirrored and provide the following information; the backup is then automatically assigned with the primary. (Shard query servers are never mirrored.)

    • The name of the mirror.

    • The name (or IP address) of the backup’s host.

    • The superserver port of the backup.

    • The mirror’s virtual IP address (VIP), if it has one.

    Finally, select Finish.

  5. After you assign a shard, it is displayed in the list on the Configure Namespace-Level page; if it is a mirror, both failover members are included.

  6. After you have assigned all the shard data servers (and optionally shard query servers), click Verify Shards to confirm that the ports are correct and all needed configuration of the nodes is in place so that the shard master can communicate with the shard data servers.

Deassign a Shard Server

You can use the Deassign link to the right of a listed shard data server or shard query server to remove the shard server from the cluster. When deassigning a mirrored shard data server, you must do this in the Management Portal of the primary.

Important:

If there are sharded tables or classes on the cluster (regardless of whether the tables contain data), you cannot deassign any of the shard data servers.

Configure the Cluster Nodes Using the API

Once you have provisioned or identified the infrastructure and installed InterSystems IRIS on the cluster nodes, follow the steps in this section to configure the namespace-level cluster using the API. Consult the %SYSTEM.Sharding class documentation in the InterSystems Class Reference for complete information about the calls described here, as well as other calls in the API.

Note:

Under some circumstances, the API may be unable to resolve the hostnames of one or more nodes into IP addresses that are usable for interconnecting the nodes of a cluster. When this is the case, you can call $SYSTEM.Sharding.SetNodeIPAddress() (see %SYSTEM.Sharding API) to specify the IP address to be used for each node. To use $SYSTEM.Sharding.SetNodeIPAddress(), you must call it on every intended cluster node before making any other %SYSTEM.Sharding API calls on those nodes, for example:

set status = $SYSTEM.Sharding.SetNodeIPAddress("00.53.183.209")

When this call is used, you must use the IP address you specify for each node, rather than the hostname, as the shard-host argument when calling $SYSTEM.Sharding.AssignShard() on the shard master to assign the node to the cluster, as described in the following procedure.

Prepare the Shard Data Servers

Do the following on each desired instance to prepare it to be added to the cluster as a shard data server.

Note:

Namespace-level deployment cannot create mirrors. You can, however, add the failover members of existing mirrors as shard data servers. To do this, create the mirrors, prepare the members using this procedure, and then assign them as shard data servers as described in the next procedure.

  1. Create the shard namespace using the Management Portal, as described in Create/Modify a Namespace. (The namespace need not be interoperability-enabled.) Although the shard namespaces on the different shard data servers can have different names, you may find it more convenient to give them the same name.

    Create a new database as the default globals database, making sure that it is located on a device with sufficient free space to accommodate its target size, as described in Estimate the Database Cache and Database Sizes. If data ingestion performance is a significant consideration, set the initial size of the database to its target size.

    Important:

    If you are preparing an existing mirror to be a shard data server, when you create the default globals database for the shard namespace on each member — the primary first, then the backup, then any DR asyncs — select Yes at the Mirrored database? prompt to include the shard namespace globals database in the mirror.

    Select the globals database you created for the namespace’s default routines database.

    Note:

    As noted in Estimate the Database Cache and Database Sizes, the shard master data server and the shard data servers all share a single default globals database, physically located on the shard master and known as the master globals database. The default globals database created with a shard namespace nevertheless remains on the shard, becoming the local globals database, which contains the data stored on that shard. Because the shard data server does not start using the master globals database until it is assigned to the cluster, for clarity the planning guidelines and instructions in this document refer to the eventual local globals database as the default globals database of the shard namespace.

    A new namespace is automatically created with IRISTEMP configured as the temporary storage database; do not change this setting for the shard namespace.

  2. For a later step, record the DNS name or IP address of the host system, the superserver (TCP) port of the instance, and the name of the shard namespace you created.

    Note:

    From the perspective of another node (which is what you need in this procedure), the superserver port of a containerized InterSystems IRIS instance depends on which host port the superserver port was published or exposed as when the container was created. For details and examples, see Running an InterSystems IRIS Container with Durable %SYS, Running an InterSystems IRIS Container: Docker Compose Example, and Container networking in the Docker documentation.

    The default superserver port number of a noncontainerized InterSystems IRIS instance that is the only InterSystems IRIS instance on its host is 1972. To see or set the instance’s superserver port number, select System Administration > Configuration > System Configuration > Memory and Startup in the instance’s Management Portal. (For information about opening the Management Portal for the instance and determining the superserver port, see the instructions for an instance deployed in a container or one installed from a kit.)

  3. In an InterSystems Terminal window, in any namespace, call $SYSTEM.Sharding.EnableSharding() (see %SYSTEM.Sharding API) to enable the instance to participate in a sharded cluster, as follows:

    set status = $SYSTEM.Sharding.EnableSharding()
    

    No arguments are required.

    Note:

    To see the return value (for example, 1 for success) for each API call detailed in these instructions, enter:

    zw status
    

    Reviewing status after each call is a good general practice, as a call might fail silently under some circumstances. If a call does not succeed (status is not 1), display the user-friendly error message by entering:

    do $SYSTEM.Status.DisplayError(status) 
    

    After making this call, restart the instance.
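
    Putting the enable call and the recommended status check together, a minimal Terminal sketch might look like the following; it is illustrative only, uses only the calls shown above plus $SYSTEM.Status.IsError(), and the restart described above is still required after a successful call.

      // Enable sharding on this instance, then check the returned %Status.
      set status = $SYSTEM.Sharding.EnableSharding()
      if $SYSTEM.Status.IsError(status) { do $SYSTEM.Status.DisplayError(status) }
      zw status    // 1 indicates success; restart the instance afterward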

Configure the Shard Master Data Server and Assign the Shard Data Servers

On the shard master data server instance, do the following to configure the shard master and assign the shard data servers.

  1. Create the master namespace using the Management Portal, as described in Create/Modify a Namespace. (The namespace need not be interoperability-enabled.)

    Ensure that the default globals database you create is located on a device with sufficient free space to accommodate its target size, as described in Estimate the Database Cache and Database Sizes. If data ingestion performance is a significant consideration, set the initial size of the database to its target size.

    Important:

    If you are configuring an existing mirror as the shard master data server, when you create the default globals database for the master namespace on each member — the primary first, then the backup, then any DR asyncs — select Yes at the Mirrored database? prompt to include the master namespace globals database in the mirror.

    Select the globals database you created for the namespace’s default routines database.

    Note:

    A new namespace is automatically created with IRISTEMP configured as the temporary storage database; do not change this setting for the master namespace. Because the intermediate results of sharded queries are stored in IRISTEMP, this database should be located on the fastest available storage with significant free space for expansion, particularly if you anticipate many concurrent sharded queries with large result sets.

  2. In an InterSystems Terminal window, in any namespace, do the following:

    1. Call $SYSTEM.Sharding.EnableSharding() (see %SYSTEM.Sharding API) to enable the instance to participate in a sharded cluster (no arguments are required), as follows:

      set status = $SYSTEM.Sharding.EnableSharding()
      

      After making this call, restart the instance.

    2. Call $SYSTEM.Sharding.AssignShard() (see %SYSTEM.Sharding API) once for each shard data server, to assign the shard to the master namespace you created, as follows:

      set status = $SYSTEM.Sharding.AssignShard("master-namespace","shard-host",shard-superserver-port,
          "shard-namespace")
      

      where the arguments represent the name of the master namespace you created and the information you recorded when preparing that shard data server, for example:

      set status = $SYSTEM.Sharding.AssignShard("master","shardserver3",1972,"shard3")
      

      If you used the $SYSTEM.Sharding.SetNodeIPAddress() call on each node before starting this procedure, as described at the beginning of this section, substitute the shard host’s IP address for shard-host, rather than its hostname. (A consolidated sketch combining shard assignment and verification follows this list.)

    3. To verify that you have assigned the shards correctly, you can issue the following command and verify the hosts, ports, and namespace names:

      do $SYSTEM.Sharding.ListShards()
      Shard   Host                       Port    Namespc  Mirror  Role    VIP
      1       shard1.internal.acme.com   56775   SHARD1
      2       shard2.internal.acme.com   56777   SHARD2
      ...
      
      Note:

      For important information about determining the superserver port of an InterSystems IRIS instance, see step 2 in Prepare the Shard Data Servers.

    4. To confirm that the ports are correct and all needed configuration of the nodes is in place so that the shard master can communicate with the shard data servers, call $SYSTEM.Sharding.VerifyShards() (see %SYSTEM.Sharding API) as follows:

      do $SYSTEM.Sharding.VerifyShards() 
      

      The $SYSTEM.Sharding.VerifyShards() call detects a number of error conditions. For example, if the port provided in a $SYSTEM.Sharding.AssignShard() call is open on the shard data server host but is not the superserver port of an InterSystems IRIS instance, the shard is not correctly assigned; the $SYSTEM.Sharding.VerifyShards() call indicates this.

    5. If you prepared and assigned the failover pairs of existing mirrors as shard data servers, you can use the $SYSTEM.Sharding.AddDatabasesToMirrors() call (see %SYSTEM.Sharding API) on the shard master data server to add the master and shard databases as mirrored databases on each primary, fully configuring each failover pair as a mirrored shard data server, as follows:

      set status = $SYSTEM.Sharding.AddDatabasesToMirrors("master-namespace")
      

      Bear in mind that the recommended best practice is to avoid mixing mirrored and nonmirrored shard data servers, but rather to deploy the cluster as either entirely mirrored or entirely nonmirrored.
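
    As a consolidated illustration of sub-steps 2 through 4, the following routine-style sketch assigns several shard data servers and then lists and verifies them. The host names, superserver port, master namespace, and shard namespace names are hypothetical, and the loop assumes the shard namespaces were given uniform names; at a Terminal prompt, the loop would be entered on a single line.

      // Illustrative only: hypothetical hosts shardserver1..shardserver3, port 1972,
      // master namespace "master", and shard namespaces shard1..shard3.
      for i=1:1:3 {
          set status = $SYSTEM.Sharding.AssignShard("master","shardserver"_i,1972,"shard"_i)
          if $SYSTEM.Status.IsError(status) { do $SYSTEM.Status.DisplayError(status) }
      }
      do $SYSTEM.Sharding.ListShards()     // review hosts, ports, and namespaces
      do $SYSTEM.Sharding.VerifyShards()   // confirm the shard master can reach each shard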

To add a DR async to the failover pair in a configured shard data server mirror, create databases on the new member corresponding to the mirrored databases on the first failover member, add the new member to the mirror as a DR async, and restore the databases from a backup made on the first failover member to automatically add them to the mirror.

To convert an existing nonmirrored cluster to a mirrored configuration, create a mirror on each existing shard data server and then call $SYSTEM.Sharding.AddDatabasesToMirrors() on the shard master data server.

Reserved Names

The following names are used by InterSystems IRIS and should not be used in the names of user-defined elements:

  • The package name IRIS.Shard is reserved for system-generated shard-local class names and should not be used for user-defined classes.

  • The schema name IRIS_Shard is reserved for system-generated shard-local table names and should not be used for user-defined tables.

  • The prefixes IRIS.Shard., IS., and BfVY. are reserved for globals of shard-local tables, and in shard namespaces are mapped to the shard’s local databases. User-defined global names and global names for nonsharded tables should not begin with these prefixes. Using these prefixes for globals other than those of shard-local tables can result in unpredictable behavior.
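
As a purely illustrative sketch of the last rule, user application globals simply need names that do not begin with the reserved prefixes; the global names below are hypothetical.

  // Hypothetical user globals: acceptable, because their names avoid the reserved prefixes.
  set ^MyApp.Config("retention") = 30
  set ^MyApp.Data(1) = "example row"

  // Avoid user globals whose names begin with the reserved prefixes, for example:
  //   ^IRIS.Shard.Anything, ^IS.Anything, ^BfVY.Anything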
