ECP Recovery Process, Guarantees, and Limitations

This page describes the ECP recovery process (applicable to distributed cache clusters). It includes information on guarantees and limitations. A summary is in ECP Recovery Protocol.

ECP Recovery Introduction

The simplest case of ECP recovery is a temporary network interruption that is long enough to be noticed, but short enough that the underlying TCP connection stays active during the outage. During the outage, the application server notices that the connection is nonresponsive and blocks new network requests for that connection. Once the connection resumes, processes that were blocked are able to send their pending requests.

If the underlying TCP connection is reset, the data server waits for the application server to reconnect for the duration of the Time interval for Troubled state setting (one minute by default). If the application server does not succeed in reconnecting during that interval, the data server resets its connection, rolls back its open transactions, and releases its locks. Any subsequent connection from that application server is converted into a request for a brand new connection, and the application server is notified that its connection has been reset.

The application server keeps a queue of locks to remove and transactions to roll back once the connection is reestablished. Because of this queue, processes on the application server can always halt, whether or not the data server on which they have pending transactions and locks is currently available. ECP recovery completes any pending Set and Kill operations that were queued for the data server before the network outage was detected, and only then completes the release of locks.
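The queueing behavior described above can be modeled as follows. This is a minimal Python sketch, not InterSystems IRIS code: the class and method names are hypothetical, and the relative order of rollbacks and unlocks within the deferred queue is an assumption. The only ordering taken from the text is that pending Set and Kill operations complete before locks are released.

```python
from collections import deque


class AppServerQueue:
    """Hypothetical model of work an application server holds during an outage."""

    def __init__(self):
        self.pending_updates = deque()   # Set/Kill queued before the outage was detected
        self.deferred_cleanup = deque()  # rollbacks and unlocks from halted processes

    def queue_update(self, operation):
        self.pending_updates.append(operation)

    def process_halted(self, open_transaction, held_locks):
        # The process halts immediately; its cleanup waits for the connection.
        if open_transaction:
            self.deferred_cleanup.append(("rollback", open_transaction))
        for lock_name in held_locks:
            self.deferred_cleanup.append(("unlock", lock_name))

    def drain_after_reconnect(self):
        """Return operations in the order they are sent once the connection is back."""
        sent = list(self.pending_updates) + list(self.deferred_cleanup)
        self.pending_updates.clear()
        self.deferred_cleanup.clear()
        return sent


queue = AppServerQueue()
queue.queue_update(("set", "^y", 1))
queue.process_halted(open_transaction="T1", held_locks=["^x"])
order = queue.drain_after_reconnect()
```

In this model the halted process does not wait for the data server: its rollback and unlock are recorded locally and delivered only after the queued update has been sent.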

Any time a data server learns that an application server has reset its own connection (due to application server restart, for example), even if it is still within the Time interval for Troubled state, the data server resets the connection immediately, rolling back transactions and releasing locks on behalf of that application server. Since the application server’s state was reset, there is no longer any state to be maintained by the data server on its behalf.
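The data server's handling of a reset TCP connection, as described in the preceding paragraphs, can be summarized as a small decision function. This is an illustrative Python model only; the function name, parameters, and return strings are hypothetical, and the 60-second default stands in for the one-minute default of the Time interval for Troubled state setting.

```python
def data_server_action(seconds_since_tcp_reset, reconnected,
                       app_server_state_was_reset=False, troubled_interval=60):
    """Return what the modeled data server does with an application server connection."""
    if app_server_state_was_reset:
        # The application server lost its own state (for example, it restarted),
        # so there is nothing to preserve: reset immediately.
        return "reset"
    if seconds_since_tcp_reset > troubled_interval:
        # Interval expired: roll back transactions and release locks.
        # A later connection attempt is treated as a brand new connection.
        return "reset"
    if reconnected:
        return "recover"
    return "wait"
```

For example, `data_server_action(10, reconnected=True)` returns `"recover"`, while `data_server_action(61, reconnected=True)` returns `"reset"` because the reconnection arrived after the interval had expired.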

The final case is when the data server shuts down, either gracefully or as a result of a crash. The application server maintains the application state and tries to reconnect to the data server for the Time to wait for recovery setting (20 minutes by default). The data server remembers the application server connections that were active at the time of the crash or shutdown; after restarting, it waits up to thirty seconds for those application servers to reconnect and recover their connections. Recovery involves several steps on the data server, some of which rely heavily on the data server journal file. The result of these steps is that:

The data server’s view of the current active transactions from each application server has been restored from the data server’s journal file.
The data server’s view of the current active Lock operations from each application server has been restored, by having the application server upload those locks to the data server.
The application server and the data server agree on exactly which requests from the application server can be ignored (because it is certain they completed before the crash) and which ones should be replayed. Therefore, the last recovery step is to simply let the pending network requests complete, but only those network requests that are safe to replay.
Finally, the application server delivers to the data server any pending unlock or rollback indications that it saved from jobs that halted while the data server was restarting.

All guarantees are maintained, even in the face of sudden and unanticipated data server crashes, as long as the integrity of the storage devices (for database, WIJ, and journal files) is maintained.

During the recovery of an ECP-configured system, InterSystems IRIS guarantees a number of recoverable semantics which are described in detail in ECP Recovery Guarantees. Limitations to these guarantees are described in detail in ECP Recovery Limitations.

ECP Recovery Guarantees

During the recovery of an ECP-configured system, InterSystems IRIS guarantees the recoverable semantics described in the following sections.

In the description of each guarantee, the first paragraph describes a specific condition. Subsequent paragraphs describe the data guarantee applicable to that particular situation.

In these descriptions, Process A, Process B, and so on refer to processes attempting to update globals on a data server. These processes may originate on the same or different application servers, or on the data server itself; in some cases the origins of processes are specified, in others they are not germane.

In-order Updates Guarantee

Process A updates two data elements sequentially, first global ^x and next global ^y, where ^x and ^y are located on the same data server.

If Process B sees the change to ^y, it also sees the change to ^x. This guarantee applies whether or not Process A and Process B are on the same application server as long as the two data items are on the same data server and the data server remains up.

Process B’s ability to view the data modified by Process A does not ensure that Set operations from Process B are restored after the Set operations from Process A. Only a Lock or a $Increment operation can ensure proper ordering of competing Set and Kill operations from two different processes during cluster failover or cluster recovery.

See the Loose Ordering in Cluster Failover or Restore limitation regarding the order in which competing Set and Kill operations from separate processes are applied during cluster dejournaling and cluster failover.

Important:

This guarantee does not apply if the data server crashes, even if ^x and ^y are journaled. See the Dirty Data Reads in ECP Without Locking limitation for a case in which processes that fit this description can see dirty data that never becomes durable before the data server crash.

ECP Lock Guarantee

Process B on DataServer S acquires a lock on global ^x, which was once locked by Process A.

Process B can see all updates on DataServer S done by Process A (while holding a lock on ^x). Also, if Process C sees the updates done by Process B on DataServer S (while holding a lock on ^x), Process C is guaranteed to also see the updates done by Process A on DataServer S (while holding a lock on ^x).

Serializability is guaranteed whether or not Process A, Process B, and Process C are located on the same application server or on DataServer S itself, as long as DataServer S stays up throughout.

Important:

The lock and the data it protects must reside on the same data server.

Clusters Lock Guarantee

Process B on a cluster member acquires a lock on global ^x in a clustered database, a lock that was once held by Process A.

Process B sees all updates to any clustered database done by Process A (while holding a lock on ^x).

Additionally, if Process C on a cluster member sees the updates on a clustered database made by Process B (while holding a lock on ^x), Process C also sees the updates made by Process A on any clustered database (while holding a lock on ^x).

Serializability is guaranteed whether or not Process A, Process B, and Process C are located on the same cluster member, and whether or not any cluster member crashes.

Important:

See the Dirty Data Reads When Cluster Member Crashes limitation regarding transactions on one cluster member seeing dirty data from a transaction on a cluster member that crashes.

Rollback Guarantee

Process A executes a TStart command, followed by a series of updates, and either halts before issuing a TCommit, or executes a TRollback before executing a TCommit.

All the updates made by Process A as part of the transaction are rolled back in the reverse order in which they originally occurred.
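The importance of reverse order can be seen in a small model. The following Python sketch is an illustration only and not how InterSystems IRIS implements journaling: a dictionary stands in for the database, `None` stands for an undefined node, and the function names are hypothetical.

```python
def run_transaction(db, updates):
    """Apply (key, value) updates in order; return an undo log of (key, old_value)."""
    undo_log = []
    for key, value in updates:
        undo_log.append((key, db.get(key)))  # None means the key was undefined
        db[key] = value
    return undo_log


def rollback(db, undo_log):
    """Undo the logged updates, most recent first."""
    for key, old_value in reversed(undo_log):
        if old_value is None:
            db.pop(key, None)
        else:
            db[key] = old_value


db = {"x": 1}
log = run_transaction(db, [("x", 2), ("y", 10), ("x", 3)])
rollback(db, log)
print(db)  # {'x': 1}
```

Because `x` is updated twice in the transaction, undoing the entries oldest-first would leave `x` at the intermediate value 2; undoing them newest-first restores the original value.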

Important:

See the rollback-related limitations: Conflicting, Non-Locked Change Breaks Rollback, Journal Discontinuity Breaks Rollback, and Asynchronous TCommit Converts to Rollback for more information.

Commit Guarantee

Process A makes a series of updates on DataServer S and halts after starting the execution of a TCommit.

On each DataServer S that is part of the transaction, the data modifications on DataServer S are either committed or rolled back. If the process that executes the TCommit has the Perform Synchronous Commit property turned on (SynchCommit=1, in the configuration file) and the TCommit operation returns without errors, the transaction is guaranteed to have durably committed on all the data servers that are part of the transaction.

Important:

If the transaction includes updates to more than one server (including the local server) and the TCommit cannot complete successfully, some servers that are part of the transaction may have committed the updates while others may have rolled them back.

Transactions and Locks Guarantee

Process A executes a TStart for Transaction T, locks global ^x on DataServer S, and unlocks ^x (unlock does not specify the “immediate unlock” lock type).

InterSystems IRIS guarantees that the lock on ^x is not released until the transaction has been either committed or rolled back. No other process can acquire a lock on ^x until Transaction T either commits or rolls back on DataServer S.

Once Transaction T commits on DataServer S, a Process B that acquires a lock on ^x sees the changes on DataServer S made by Process A during Transaction T. Any other process that sees changes on DataServer S made by Process B (while holding a lock on ^x) sees the changes on DataServer S made by Process A (while executing Transaction T). Conversely, if Transaction T is rolled back on DataServer S, a Process B that acquires a lock on ^x sees none of the changes made by Process A on DataServer S.
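The deferred release of a lock inside a transaction can be sketched as a toy lock table. This is a hypothetical Python model written for illustration; the class and method names are assumptions, it ignores lock counts and lock types, and it does not represent the InterSystems IRIS lock manager.

```python
class LockTable:
    """Hypothetical model of deferred unlock inside a transaction."""

    def __init__(self):
        self.owner = {}            # lock name -> owning process
        self.in_transaction = set()
        self.deferred = {}         # process -> lock names to release at transaction end

    def tstart(self, process):
        self.in_transaction.add(process)

    def lock(self, process, name):
        holder = self.owner.get(name)
        if holder is not None and holder != process:
            return False           # another process still owns the lock
        self.owner[name] = process
        return True

    def unlock(self, process, name):
        if self.owner.get(name) != process:
            return
        if process in self.in_transaction:
            # Not an "immediate unlock": hold the lock until the transaction ends.
            self.deferred.setdefault(process, set()).add(name)
        else:
            del self.owner[name]

    def end_transaction(self, process):
        """Called for both commit and rollback."""
        self.in_transaction.discard(process)
        for name in self.deferred.pop(process, set()):
            self.owner.pop(name, None)


locks = LockTable()
locks.tstart("A")
locks.lock("A", "^x")
locks.unlock("A", "^x")
blocked = locks.lock("B", "^x")   # False: Transaction T is still open
locks.end_transaction("A")
granted = locks.lock("B", "^x")   # True: lock released at transaction end
```

The unlock issued by Process A is recorded but has no visible effect until the transaction ends, which is why Process B cannot observe a half-finished transaction through the lock.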

Important:

See the Conflicting, Non-Locked Change Breaks Rollback limitation for more information.

ECP Rollback Only Guarantee

Process A on AppServer C makes changes on DataServer S that are part of a Transaction T, and DataServer S unilaterally rolls back those changes (which can happen in certain network outages or data server outages).

All subsequent network requests to DataServer S by Process A are rejected with <NETWORK> errors until Process A explicitly executes a TRollback command.

Additionally, if any process on AppServer C completes a network request to DataServer S between the rollback on DataServer S and the TCommit of Transaction T (AppServer C finds out about the rollback-only condition before the TCommit), Transaction T is guaranteed to roll back on all data servers that are part of Transaction T.
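The rollback-only condition behaves like a flag on the process that is cleared only by an explicit TRollback. The Python sketch below models that behavior under stated assumptions: the class, its methods, and the `NetworkError` exception are hypothetical stand-ins, with the exception representing the <NETWORK> error.

```python
class NetworkError(Exception):
    """Stands in for the <NETWORK> error in this model."""


class AppServerProcess:
    def __init__(self):
        self.rollback_only = False

    def data_server_rolled_back(self):
        # The data server unilaterally rolled back this process's transaction.
        self.rollback_only = True

    def request(self, operation):
        if self.rollback_only:
            raise NetworkError("<NETWORK>")
        return f"ok: {operation}"

    def trollback(self):
        # An explicit TRollback acknowledges the condition and clears it.
        self.rollback_only = False


process_a = AppServerProcess()
process_a.data_server_rolled_back()
try:
    process_a.request("Set ^x=1")
    rejected = False
except NetworkError:
    rejected = True
process_a.trollback()
result = process_a.request("Set ^x=1")
```

In the model, every request fails until `trollback()` is called, which mirrors the requirement that application code handle the error and roll back before continuing.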

ECP Transaction Recovery Guarantee

A data server crashes in the middle of an application server transaction, restarts, and completes recovery within the application server recovery timeout interval.

The transaction can be completed normally without violating any of the described guarantees. The data server does not perform any data operations that violate the ordering constraints defined by lock semantics. The only exception is the $Increment function (see ECP and Clusters $Increment Limitation for more information). Any transactions that cannot be recovered are rolled back in a way that preserves lock semantics.

Important:

InterSystems IRIS expects, but does not guarantee, that in the absence of continuing faults (whether in the network, the data server, or the application server hardware or software), all or most of the transactions pending on a data server at the time of a data server outage are recovered.

ECP Lock Recovery Guarantee

DataServer S has an unplanned shutdown, restarts, and completes recovery within the recovery interval.

The ECP Lock Guarantee still applies as long as all the modified data is journaled. If data is not being journaled, updates made to the data server before the crash can disappear without notice to the application server. InterSystems IRIS no longer guarantees that a process that acquires the lock sees all the updates that were made earlier by other processes while holding the lock.

If DataServer S shuts down gracefully, restarts, and completes recovery within the recovery interval, the ECP Lock Guarantee still applies whether or not data is being journaled.

Updates that are part of a transaction are always journaled; the ECP Transaction Recovery Guarantee applies in a stronger form. Other updates may or may not be journaled, depending on whether or not the destination global in the destination database is marked for journaling.

$Increment Ordering Guarantee

The $Increment function induces a loose ordering on a series of Set and Kill operations from separate processes, even if those operations are not protected by a lock.

For example: Process A performs some Set and Kill operations on DataServer S and performs a $Increment operation to a global ^x on DataServer S. Process B performs a subsequent $Increment to the same global ^x. Any process, including Process B, that sees the result of Process B incrementing ^x, sees all changes on DataServer S that Process A made before incrementing ^x.

Important:

See ECP and Clusters $Increment Limitation for more information.

ECP Sync Method Guarantee

Process A updates a global located on DataServer S, and issues a $system.ECP.Sync() call to S. Process B then issues a $system.ECP.Sync() to S. Process B can see all updates performed by Process A on DataServer S prior to its $system.ECP.Sync() call.

$system.ECP.Sync() is relevant only for processes running on an application server. If either Process A or Process B is running on DataServer S itself, then that process does not need to issue a $system.ECP.Sync(). If both are running on DataServer S, then neither needs $system.ECP.Sync(), and the guarantee reduces to the statement that global updates are immediately visible to processes running on the same server.

Important:

$system.ECP.Sync() does not guarantee durability; see the Dirty Data Reads in ECP Without Locking limitation.

ECP Recovery Limitations

During the recovery of an ECP-configured system, the InterSystems IRIS guarantees are subject to the limitations described in the following sections.

ECP and Clusters $Increment Limitation

If a data server crashes while the application server has a $Increment request outstanding to the data server and the global is journaled, InterSystems IRIS attempts to recover the $Increment results from the journal; it does not re-increment the reference.

ECP Cache Liveness Limitation

In the absence of continuing faults, application servers observe data that is no more than a few seconds out of date, but this is not guaranteed. Specifically, if an ECP connection to the data server becomes nonfunctional (network problems, data server shutdown, data server backup operation, and so on), the user process may observe data that is arbitrarily stale, up to an application server connection-timeout value. To ensure that data is not stale, use the Lock command around the data-fetch operation, or use $system.ECP.Sync(). Any network request that makes a round trip to the data server updates the contents of the application server ECP network cache.
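The difference between a cached read and a read preceded by a round trip can be illustrated with a toy model. The Python below is a sketch under simplifying assumptions, not the ECP cache implementation: a dictionary stands in for the data server, the class and method names are hypothetical, and a round trip is modeled as refreshing the entire cache.

```python
class AppServerView:
    """Hypothetical model of an application server's cached view of a data server."""

    def __init__(self, data_server):
        self.data_server = data_server   # dict standing in for the data server's globals
        self.cache = dict(data_server)   # local copy, possibly stale

    def read_cached(self, name):
        return self.cache.get(name)

    def round_trip(self):
        # Stands in for a Lock command or a $system.ECP.Sync() call.
        self.cache = dict(self.data_server)

    def read_fresh(self, name):
        self.round_trip()
        return self.cache.get(name)


server = {"^x": 1}
view = AppServerView(server)
server["^x"] = 2                   # updated on the data server by another process
stale = view.read_cached("^x")     # 1: the cache has not been refreshed
fresh = view.read_fresh("^x")      # 2: the round trip refreshed the cache
```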

ECP Routine Revalidation Limitation

If an application server downloads routines from a data server and the data server restarts (planned or unplanned), the routines downloaded from the data server are marked as if they had been edited.

Additionally, if the connection to the data server suffers a network outage (neither application server nor data server shuts down), the routines downloaded from the data server are marked as if they had been edited. In some cases, this behavior causes spurious <EDITED> errors as well as <ERRTRAP> errors.

Conflicting, Non-Locked Change Breaks Rollback

In InterSystems IRIS, the Lock command is only advisory. If Process A starts a transaction that updates global ^x under the protection of a lock on global ^y, and another process modifies ^x without the protection of a lock on ^y, the rollback of ^x may not restore its original value.

On the rollback of Set and Kill operations, if the current value of the data item is what the operation set it to, the value is reset to what it was before the operation. If the current value is different from what the specific Set or Kill operation set it to, the current value is left unchanged.

If a data item is sometimes modified inside a transaction, and sometimes modified outside of a transaction and outside the protection of a Lock command, rollback is not guaranteed to work. To be effective, locks must be used everywhere a data item is modified.
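The rollback rule for a single Set operation, and the way a conflicting non-locked change defeats it, can be shown in a few lines. This Python sketch is a model of the rule as stated above, not IRIS code; the function name is hypothetical, a dictionary stands in for the database, and only Set operations are modeled.

```python
def rollback_set(db, key, value_set, value_before):
    """Undo one journaled Set: restore the old value only if the current
    value is still what the Set wrote. Return True if the value was restored."""
    if db.get(key) == value_set:
        db[key] = value_before
        return True
    return False  # a conflicting change intervened; leave the current value alone


db = {"x": "old"}
db["x"] = "A"   # Process A, inside a transaction and holding a lock on ^y
db["x"] = "B"   # another process modifies x without taking the lock on ^y
restored = rollback_set(db, "x", value_set="A", value_before="old")
```

Here `restored` is `False` and `db["x"]` remains `"B"`: Process A's transaction is rolled back, but the data item does not return to its pre-transaction value.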

Journal Discontinuity Breaks Rollback

Rollback depends on the reliability and completeness of the journal. If something interrupts the continuity of the journal data, rollbacks do not succeed past the discontinuity. InterSystems IRIS silently ignores this type of transaction rollback.

A journal discontinuity can be caused by executing ^JRNSTOP while InterSystems IRIS is running, by deleting the Write Image Journal (WIJ) file after an InterSystems IRIS shutdown and before restart, or by an I/O error during journaling on a system that is not set to freeze the system on journal errors.

ECP Can Miss Error After Recovery

A Set or Kill operation completes on a data server, but receives an error. The data server crashes after completing that packet, but before delivering that packet to the application server system.

ECP recovery does not replay this packet, but the application server has not found out about the error. As a result, the application server misses Set or Kill operations on the data server.

Partial Set or Kill Leads to Journal Mismatch

There are certain cases where a Set or Kill operation can be journaled successfully, but receive an error before actually modifying the database. Given the particular way rollback of a data item is defined, this should not ever break transaction rollback; but the state of a database after a journal restore may not match the state of that database before the restore.

Loose Ordering in Cluster Failover or Restore

Cluster dejournaling is loosely ordered. The journal files from the separate cluster members are only synchronized wherever a lock, a $Increment, or a journal marker event occurs. This affects the database state after either a cluster failover or a cluster crash where the entire cluster must be brought down and restored. The database may be restored to a state that is different from the state just before the crash. The $Increment Ordering Guarantee places additional constraints on how different the restored database can be from its original form before the crash.

Dirty Data Reads When Cluster Member Crashes

Cluster Member A completes updates in Transaction T1 and commits that transaction, but in non-synchronous transaction commit mode. Transaction T2 on a different cluster Member B acquires the locks once owned by Transaction T1. Cluster Member A crashes before all the information from Transaction T1 is written to disk.

Transaction T1 is rolled back as part of cluster failover. However, Transaction T2 on Member B could have seen data from Transaction T1 that later was rolled back as part of cluster failover, despite following the rules of the locking protocol. Additionally, if Transaction T2 has modified some of the same data items as Transaction T1, the rollback of Transaction T1 may fail because only some of the transaction data has rolled back.

A workaround is to use synchronous commit mode for transactions on cluster Member A. When using synchronous commit mode, Transaction T1 is durable on disk before its locks are released, so Transaction T1 is not rolled back once the application sees that it is complete.
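The effect of the commit mode on what survives a crash can be sketched with a simplified model. This Python is illustrative only and rests on loud assumptions: the class and method names are hypothetical, "memory" stands for data visible to later lock holders, "disk" stands for durable data, and a non-synchronous commit is modeled as not yet having reached disk at the moment of the crash.

```python
class ClusterMember:
    """Hypothetical model of visible versus durable state on one cluster member."""

    def __init__(self):
        self.memory = {}   # what later lock holders can read
        self.disk = {}     # what survives a crash

    def commit(self, updates, synchronous):
        self.memory.update(updates)
        if synchronous:
            self.disk.update(updates)   # durable before the locks are released
        # In both modes the transaction's locks are released after this point.

    def crash_and_failover(self):
        # Anything that never reached disk is rolled back.
        self.memory = dict(self.disk)


member_a = ClusterMember()
member_a.commit({"^x": 1}, synchronous=False)
seen_by_t2 = member_a.memory.get("^x")           # 1: T2 reads T1's data
member_a.crash_and_failover()
after_crash = member_a.memory.get("^x")          # None: T1 was rolled back

member_sync = ClusterMember()
member_sync.commit({"^x": 1}, synchronous=True)
member_sync.crash_and_failover()
after_crash_sync = member_sync.memory.get("^x")  # 1: T1 was durable
```

In the non-synchronous case, Transaction T2 has already acted on a value that no longer exists after failover; in the synchronous case, the value it saw is still present.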

Dirty Data Reads in ECP Without Locking

If an incoming ECP transaction reads data without locking, it may see data that is not durable on disk and that may disappear if the data server crashes. It can only see such data when the data location is set by other ECP connections or by the local data server system itself. It can never see nondurable data that is set by this connection itself. There is no possibility of seeing nondurable data when locking is used both in the process reading the data and the process writing the data. Seeing nondurable data in this way violates the In-order Updates Guarantee, and there is no easy workaround other than to use locking.

Asynchronous TCommit Converts to Rollback

If the data server side of a transaction receives an asynchronous error condition, such as a <FILEFULL>, while updating a database, and the application server does not see that error until the TCommit, the transaction is automatically rolled back on the data server. However, rollbacks are synchronous, whereas TCommit operations are usually asynchronous, because a rollback changes blocks that the application server should be notified of before the application server process surrenders any locks.

The data server and the database are fine, but on the application server, if the locks are passed to another process, you may temporarily see data that is about to be rolled back. However, the application server does not usually do anything that causes asynchronous errors.
