Journaling

Global journaling records all global update operations performed on a database and, used in conjunction with backup, makes it possible to restore a database to its state immediately before a failure or crash.

While backup is the cornerstone of physical recovery, it is not the complete answer. Restoring a database from backup does not recover global updates made since that backup, which may have been created a number of hours before the point at which physical integrity was lost. These post-backup updates can be restored to the database from journal files after the database is restored from backup, bringing the database up to date. Any transactions open at the time of the failure are rolled back to ensure transaction integrity.

This chapter discusses the following topics:

Journaling Overview

Each instance of Caché keeps a journal, a set of files that keeps a time-sequenced log of updates that have been made to databases since the last backup. The process is redundant and logical and does not use the Caché write daemon. Caché transaction processing works with journaling to maintain the logical integrity of data following a failure.

Together, backup and journaling allow you to recreate your database. If a failure renders your database corrupt, inaccessible or unusable, you can restore the most recent backup and then apply the changes in the journal to recreate your database to the point of failure. This method of recovering from a loss of physical integrity is known as “roll forward” recovery. The journal is also used for rolling back incomplete transactions.
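The roll forward sequence described above can be illustrated with a minimal Python sketch. It is a toy model only, not Caché code: the Record type, its field names, and the in-memory dictionary standing in for a database are all assumptions made for illustration, and nested transactions are ignored.

```python
from dataclasses import dataclass
from typing import Optional

_MISSING = object()  # marks "global did not exist before the update"


@dataclass
class Record:
    """Hypothetical journal record; the real journal format is different."""
    op: str                      # "SET", "KILL", "TSTART" or "TCOMMIT"
    pid: int                     # process that performed the operation
    name: Optional[str] = None   # global name for SET/KILL
    value: Optional[str] = None  # new value for SET


def roll_forward(backup: dict, journal: list) -> dict:
    """Replay journal records onto a copy of the backup, then undo open transactions."""
    db = dict(backup)
    open_txns = {}  # pid -> list of (name, previous value), in update order
    for rec in journal:
        if rec.op == "TSTART":
            open_txns[rec.pid] = []
        elif rec.op == "TCOMMIT":
            open_txns.pop(rec.pid, None)
        elif rec.op in ("SET", "KILL"):
            if rec.pid in open_txns:
                # Remember the pre-update value so the change can be undone.
                open_txns[rec.pid].append((rec.name, db.get(rec.name, _MISSING)))
            if rec.op == "SET":
                db[rec.name] = rec.value
            else:
                db.pop(rec.name, None)
    # Roll back transactions still open at the point of failure, newest update first.
    for undo_log in open_txns.values():
        for name, old in reversed(undo_log):
            if old is _MISSING:
                db.pop(name, None)
            else:
                db[name] = old
    return db


backup = {"^A": "1"}
journal = [
    Record("SET", 1, "^A", "2"),   # non-transactional update after the backup
    Record("TSTART", 2),
    Record("SET", 2, "^B", "x"),
    Record("TCOMMIT", 2),          # committed, so it survives recovery
    Record("TSTART", 3),
    Record("SET", 3, "^A", "3"),
    Record("KILL", 3, "^B"),
    # failure occurs here: process 3 never commits
]
restored = roll_forward(backup, journal)
print(restored)
```

In this sketch the post-backup update and the committed transaction are reapplied, while the two updates made by the transaction that was still open at the point of failure are undone.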

The journaling state is a property of the database, not individual globals. A database can have only one of two global journaling states: Yes or No. By default, all databases you create are journaled (the Global Journal State is Yes). In newly installed Caché instances, the CACHEAUDIT, CACHESYS, and USER databases are journaled; the CACHELIB, CACHETEMP, DOCBOOK, CACHE, and SAMPLES databases are not. Operations to globals in CACHETEMP are never journaled; map temporary globals to the Caché temporary database, CACHETEMP.

Important:

Be sure to read Consequences of Not Journaling Databases for important information about limits to the recovery of non-journaled databases.

When Caché starts, it reapplies all journal entries since the last write daemon pass. Since user processes update the journal concurrently, rather than through the write daemon, this approach provides added assurance that updates prior to a crash are preserved.

In addition to recording all updates to journaled databases, the journal contains all updates to non-journaled databases that are part of transactions (primarily Set and Kill operations). This greatly improves the reliability of the system, avoiding post-recovery inconsistencies due to updates to globals that may or may not be journaled, and that may or may not be involved in transactions. (Set and Kill operations on local and process-private variables are not journaled.)

Journaling global operations in databases mounted on a cluster depends on the database setting. The local Caché instance does not journal transaction operations to globals on remote nodes. In a network configuration, journaling is the responsibility of the node on which the global actually resides, not the one that requests the Set or Kill. Thus, if node B performs a Set at the request of node A, the journal entry appears in the journal on node B, not node A.

Note:

If you need to journal restore a privately mounted database on a Caché cluster node from a point prior to the last crash/start/restart of the node, use the cluster journal restore procedure (see Cluster Journal Restore in the “Cluster Journaling” chapter of this guide) instead of noncluster journal restore because open ECP transactions from the crashed node are transferred to the surviving node rather than rolled back. InterSystems recommends that you back up the databases after a cluster failover recovery, which makes it unnecessary to start journal restore from a pre-failover point.

The following topics provide greater detail of how journaling works:

Differences Between Journaling and Write Image Journaling

In this chapter, “the journal” refers to the journal file; “journaling” refers to the writing of global update operations to the journal file. Do not confuse the Caché journal described in this chapter with write image journaling, which is described in the “Write Image Journaling and Recovery” chapter of this guide. Journaling and write image journaling have different functions, as follows:

  • Journaling provides a complete record of all database modifications (as long as you have journaling enabled for the database). In the event that some modifications are lost, for example because they occurred after the most recent backup of a recovered database, you restore them to the database by restoring the contents of the journal file.

  • Write image journaling provides a record of all database modifications that have been made in memory but not yet written to the database. When a system crash occurs, the system automatically writes the contents of the write image journal to the database when it restarts.

Protecting Database Integrity

The Caché recovery process is designed to provide maximal protection:

  • It uses the “roll forward” approach. If a system crash occurs, the recovery mechanism completes the updates that were in progress. By contrast, other systems employ a “roll back” approach, undoing updates to recover. While both approaches protect internal integrity, the roll forward approach used by Caché does so with reduced data loss.

  • It protects the sequence of updates; if an update is present in the database following recovery, all preceding updates are also present. Other systems which do not correctly preserve update sequence may yield a database that is internally consistent but logically invalid.

  • It protects the incremental backup file structures, as well as the database. You can run a valid incremental backup following recovery from a crash.

Automatic Journaling of Transactions

In a Caché application, you can define a unit of work, called a transaction. Caché transaction processing uses the journal to store transactions. Caché journals any global update that is part of a transaction regardless of the global journal state setting for the database in which the affected global resides.

You use commands to:

  • Indicate the beginning of a transaction.

  • Commit the transaction, if the transaction completes normally.

  • Roll back the transaction, if an error is encountered during the transaction.

Caché supports many SQL transaction processing commands. See the “Transaction Processing” chapter of Using Caché ObjectScript for details on these commands.

Rolling Back Incomplete Transactions

If a transaction does not complete, Caché rolls back the transaction using the journal entries, returning the globals involved to their pre-transaction values. As part of updating the database, Caché rolls back incomplete transactions by applying the changes in the journal, that is, by performing a journal restore. This happens in the following situations:

  • During recovery, which occurs as part of Caché startup after a system crash.

  • When you halt your process while transactions are in progress.

  • When you use the Terminate option to terminate a process from the Process page of the Management Portal (System Operation > Process). If you terminate a process initiated by the Job command, the system automatically rolls back any incomplete transactions in it. If you terminate a user process, the system sends a message to the user asking whether it should commit or roll back incomplete transactions.

You can write roll back code into your applications. The application itself may detect a problem and request a rollback. Often this is done from an error-handling routine following an application-level error.

See the Managing Transactions Within Applications section of the “Transaction Processing” chapter of Using Caché ObjectScript for more information.

Consequences of Not Journaling Databases

Databases that are not journaled do not participate in journal recovery and transaction rollback at Caché startup. As a consequence, the following conditions apply after a failure, backup restore, and restart:

  • The most recent updates to non-journaled databases can be lost; the data in these databases will represent an earlier moment in time than the data in journaled databases.

  • Transactions or portions of transactions in non-journaled databases can be left partially committed. The durability provided by synchronous-commit transactions does not apply to databases that are not journaled.

  • While updates to non-journaled databases that are part of transactions are journaled, these journal records are used only to roll back transactions during normal operation. These journal records do not provide a means to roll back transactions at startup, nor can they be used to recover data at startup or following a backup restore.

The Journal Write Cycle

The operation that writes the contents of the journal buffer to the journal file is called a journal sync. A journal sync is guaranteed to write all operations currently in the journal buffer to the current journal file.

The frequency with which the journal is synced depends on the operating circumstances of the Caché instance involved. A journal sync can be triggered:

  • Once every two (2) seconds if the system is idle.

  • In an ECP configuration, by the data server when responding to specific requests (for example, $Increment) from the application servers to guarantee ECP semantics.

  • By a TCOMMIT (in synchronous commit mode, which causes the data involved in that transaction to be flushed to disk) if you are using Caché transactions.

  • As part of every database write cycle by the write daemon.

  • When the journal buffer is full.
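To make the relationship between the journal buffer and a journal sync concrete, the following Python sketch models two of the triggers listed above (a full buffer and a synchronous TCOMMIT). It is a simplified illustration, not how Caché is implemented: the JournalBuffer class, its capacity measured in records, and the list standing in for the journal file are all hypothetical.

```python
class JournalBuffer:
    """Toy model of a journal buffer that is synced to a journal file."""

    def __init__(self, capacity):
        self.capacity = capacity   # assumed limit, counted in records
        self.pending = []          # records held in memory only
        self.file = []             # stands in for the journal file on disk
        self.sync_reasons = []     # why each sync happened

    def sync(self, reason):
        # A sync writes every record currently in the buffer to the file.
        self.file.extend(self.pending)
        self.pending.clear()
        self.sync_reasons.append(reason)

    def append(self, record):
        self.pending.append(record)
        if len(self.pending) >= self.capacity:
            self.sync("buffer full")

    def tcommit(self, synchronous):
        self.append("TCOMMIT")
        # In synchronous commit mode the commit forces a sync;
        # otherwise the records wait for some other trigger.
        if synchronous and self.pending:
            self.sync("synchronous TCOMMIT")


buf = JournalBuffer(capacity=8)
buf.append("SET ^A")
buf.tcommit(synchronous=False)   # nothing has reached the file yet
at_risk = list(buf.pending)      # what would be lost if the system failed now
buf.append("SET ^B")
buf.tcommit(synchronous=True)    # forces a sync of everything buffered so far
```

Records in pending are the ones that would be lost if the system failed before the next sync, which is why a larger buffer trades performance against the amount of journal data at risk.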

Journal Files and Journal History Log

Journal files are stored in the primary journal directory (install-dir\Mgr\journal by default) and are logged in the journal history log file, install-dir\Mgr\journal.log, which contains a list of all journal files maintained by the instance. The log is used by all journal-related functions, utilities, and APIs to locate journal files.

The journal history log file is updated as follows:

  • Entries are added to the log when a new journal file is created.

  • Entries are purged periodically, starting from the beginning of the file, if the corresponding journal file identified by the entry no longer exists and the entry is 30 days old or older. Purging stops when an entry is reached that does not satisfy both criteria.
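The purge rule for the journal history log can be sketched in Python as follows. This is an illustration of the rule as stated above, not the actual implementation; the representation of a log entry as a (creation date, file name) pair and the function name purge_history are assumptions, since the real journal.log format is not described here.

```python
from datetime import date, timedelta


def purge_history(entries, existing_files, today, min_age_days=30):
    """Drop leading entries whose file is gone and that are at least min_age_days old."""
    keep_from = 0
    for created, filename in entries:
        file_missing = filename not in existing_files
        old_enough = (today - created) >= timedelta(days=min_age_days)
        if not (file_missing and old_enough):
            break  # purging stops at the first entry that fails either criterion
        keep_from += 1
    return entries[keep_from:]


today = date(2014, 6, 1)
entries = [
    (date(2014, 4, 1), "20140401.001"),   # file gone and entry old: purged
    (date(2014, 4, 2), "20140402.001"),   # old, but the file still exists: purging stops
    (date(2014, 4, 3), "20140403.001"),   # would qualify, but lies after the stop point
    (date(2014, 5, 30), "20140530.001"),
]
existing = {"20140402.001", "20140530.001"}
remaining = purge_history(entries, existing, today)
```

Because purging works from the beginning of the file and stops at the first entry that does not qualify, an old entry whose journal file still exists protects every entry after it.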

Caution:

Do not modify the journal.log file. If the file is modified outside of the journal utilities, it may be viewed as being corrupt, which may disable journaling. If the file is corrupt, contact the InterSystems Worldwide Response Center (WRC) for guidance. If journaling is disabled (that is, Caché is not able to update the journal.log file), rename the corrupt log file and restart journaling.

InterSystems recommends that you include the journal.log file in your backup strategy to ensure that it is available when needed for a journal restore following a backup restore; for information about backup and restore strategies and procedures, see the “Backup and Restore” chapter in this guide.

If the journal.log file is missing (for example, if you renamed the file because it is corrupt), the system creates a new one when a new journal file is created, but information about previous journal files is lost because the log file only lists journal files created since it was created. Unlisted journal files are not available for journal-related functions, utilities, and APIs that use the journal.log file. However, for journal restores, if the journal.log file is missing or you do not want to use the existing log file, you can specify the journal files manually (see the “Restore Globals from Journal Files Using ^JRNRESTO” section in this chapter).

In addition, you can use the journal.log file to migrate/restore journal files to different locations, as follows:

  1. Copy the journal files and journal.log file to a location other than the install-dir\Mgr directory on the target Caché instance.

  2. On the target system, run the ^JRNRESTO routine, and enter No in response to the following prompt:

    Are journal files created by this Cache instance and located in their original
    paths? (Uses journal.log to locate journals)?
    
    
  3. When prompted, specify the locations (on the target system) of the copied journal files and journal.log file; ^JRNRESTO uses the log file to validate the range of journal files you want to migrate/restore to the target system.

  4. Complete the process as described in the “Restore Globals from Journal Files Using ^JRNRESTO” section in this chapter.

Note:

When a Caché instance becomes a member of a mirror, the following journaling changes to support mirroring occur:

  • On becoming primary, a journal switch is triggered to a new journal file prefixed with MIRROR-mirror_name, for example MIRROR-MIR21-20120921.001. From that point, all journal files are written as mirror journal files and logged to the mirrorjrn-mirror_name.log, for example mirrorjrn-MIR21-20120921.log, as well as to journal.log.

  • On becoming backup or async, mirror journal files received from the primary are written to the configured journal directory along with the local instance’s standard journal files, and a copy of the primary’s mirror journal log (mirrorjrn-mirror_name.log) is created in install-dir\Mgr and continuously updated.

For more information about the role of journal files in mirroring, see Mirror Synchronization in the “Mirroring” chapter of the Caché High Availability Guide.

Using Temporary Globals and CACHETEMP

Nothing mapped to the CACHETEMP database is ever journaled.

Since the globals in a namespace may be mapped to different databases, some may be journaled and some may not be. It is the journal property for the database to which the global is mapped that determines if Caché journals the global operation. The difference between CACHETEMP and a database with the journal property set to No is that nothing in CACHETEMP, not even transactional updates, is journaled.

Note:

A database configured with the Journal globals property set to No (see Create Local Databases in the “Configuring Caché” chapter of the Caché System Administration Guide) continues to journal global Set/Kill operations in journal transactions, which can cause the journal file to become very large. CACHETEMP, however, does not journal Set/Kill operations, even when they are in a journal transaction.

If you need to exclude new z/Z* globals from journaling, map the globals to a database with the journal property set to No. To always exclude z/Z* globals from journaling, you must map them in every namespace to the CACHETEMP database.

Caché does not journal temporary globals. Some of the system globals designated by Caché as temporary and contained in CACHETEMP are:

  • ^%cspSession

  • ^CacheTemp*

  • ^mtemp*

Journal Management Classes and Globals

See the class documentation for %SYS.Journal.System in the InterSystems Class Reference for information on available journaling methods and queries. It is part of the %SYS.Journal package.

Also, Caché uses the ^%SYS("JOURNAL") global node to store information about the journal file. For example:

  • ^%SYS("JOURNAL","ALTDIR") stores the name of the alternate journal directory.

  • ^%SYS("JOURNAL","CURDIR") stores the name of the current journal directory.

  • ^%SYS("JOURNAL","CURRENT") stores journal status and the journal file name.

You can view this information from the System Explorer > Globals page of the Management Portal.

Configuring Journaling

There are a number of factors to consider in planning and configuring journaling. Topics in this section include the following:

Enabling Journaling

By default, journaling is enabled for the Caché databases CACHESYS, CACHEAUDIT, and USER. You can enable or disable journaling on each database from the Local Databases page of the Management Portal (System Administration > Configuration > System Configuration > Local Databases). Click Edit on the row corresponding to the database and click Yes or No in the Global Journal State box.

The default setting of the journal state for new databases is Yes. When you first mount a database from an earlier release of Caché, the value is set to Yes, regardless of the previous setting for new globals and regardless of the previous settings of individual globals within that database.

You can change the global journal setting for a database on a running system. If you do this, Caché warns you of the potential consequences and audits the change if auditing is enabled.

Journal File Naming

A journal file’s name consists of an optional user-defined prefix, a base name consisting of the date on which it is created in the format yyyymmdd, and a suffix in the format .nnn used to incrementally number the journal files created during one calendar day. When a journal file fills, the system automatically switches to a new one with the same prefix and base name but with the suffix increased by one. The base name changes only if a new calendar day begins while the journal file is in use.

For example, the first journal file that is active on April 27, 2014 is named 20140427.001. When it fills, the system starts a new one called 20140427.002. If midnight passes and the date changes while the journal file is in use, however, it is renamed 20140428.001.
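The naming scheme can be expressed as a short Python sketch. The function name next_journal_name and its parameters are hypothetical, and the sketch makes two assumptions beyond the text: a change of date is modeled simply as restarting the numbering at .001 under the new base name, and the case of more than 999 files in one day is not handled.

```python
from datetime import date


def next_journal_name(current: str, today: date, prefix: str = "") -> str:
    """Return the journal file name that follows `current` (assumed to start with `prefix`)."""
    base, suffix = current[len(prefix):].split(".")
    today_base = today.strftime("%Y%m%d")
    if base == today_base:
        # Same calendar day: keep the base name and increase the suffix by one.
        return f"{prefix}{base}.{int(suffix) + 1:03d}"
    # New calendar day: new base name, numbering starts again.
    return f"{prefix}{today_base}.001"


same_day = next_journal_name("20140427.001", date(2014, 4, 27))
new_day = next_journal_name("20140427.002", date(2014, 4, 28))
with_prefix = next_journal_name("PROD20140427.009", date(2014, 4, 27), prefix="PROD")
```

With the dates from the example above, same_day is 20140427.002 and new_day is 20140428.001; the PROD prefix is an invented value used only to show that the prefix is carried over unchanged.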

Journaling Best Practices

The following are some important points to consider when planning and configuring journaling:

  • Journal files are stored in both a primary journal directory and an alternate journal directory (for use if the primary directory becomes unwritable for any reason).

    In the interests of both performance and recoverability, InterSystems recommends placing the primary and alternate journal directories on storage devices that are separated from the devices used by databases and the write image journal (WIJ), as well as separated from each other. For practical reasons, these different devices may be different logical unit numbers (LUNs) on the same storage area network (SAN), but the general rule is the more separation the better, with separate sets of physical drives highly recommended. The major benefits of this separation between database/WIJ storage and primary and alternate journal storage include the following:

    • Isolating journal directories from failures that may compromise the databases or WIJ ensures that journal files will be available for use in restoring the database after such a failure.

    • Separating primary and alternate journal directories ensures that when an outage occurs on the device on which the primary directory is located, journaling can continue.

    • Separating journal I/O paths is a key factor in achieving the I/O concurrency that most applications require.

    For simplicity and convenience, Caché installation creates the directory install-dir\Mgr\journal, configures it as both the primary and alternate journal directory, and creates in it the first journal file for the default journaled databases. InterSystems recommends, however, that you identify and prepare separate storage devices for the primary and alternate journal directories and reconfigure these settings (as described in Configuring Journal Settings) as soon as possible after installation.

    Note:

    Journal files should always be backed up along with database files, as described in the “Backup and Restore” chapter of this guide. Consider replicating journal files offsite for disaster recovery purposes, enabling recovery from a failure involving multiple storage devices at the primary data center. Caché mirroring, Caché shadowing, disk-level replication or file system shadowing can be used for this purpose.

  • Verify that journaling is enabled for all databases (other than those that contain only transient data).

    Important:

    Be sure to read Consequences of Not Journaling Databases for important information about limits to the recovery of non-journaled databases.

  • Consider setting the journal Freeze on error option to Yes. If a failure causes the system to be unable to write to both the primary and the alternate journal devices, this setting causes the system to freeze, making it unavailable to users and ensuring that no data is lost. Alternatively, you can set the Freeze on error option to No, which lets the system continue and leads to journaling being disabled, keeping the system available but compromising data integrity and recoverability. See Journal I/O Errors for more information about Freeze on error.

  • Do not purge a journal file unless it was closed prior to the last known good backup, as determined by the backup validation procedure. Set the number of days and the number of successful backups for which journal files are kept accordingly.

  • To ensure optimal performance during a journal restore, consider increasing the size of the generic memory heap (gmheap); see Restore Journal Files for more information.

Configuring Journal Settings

To configure Caché journaling, navigate to the System Administration > Configuration > System Configuration > Journal Settings page of the Management Portal.

You can edit the following settings:

  • Primary journal directory — Enter the name of a directory in which to store the journal files. The directory name may be up to 214 characters long.

  • Secondary journal directory — Enter the name of an alternate directory for journaling to use if the current directory becomes unwritable for any reason. (You can also manually switch journal directories, as described in Switch Journal Directories.) The directory name may be up to 214 characters long.

    Important:

    InterSystems recommends placing the primary and alternate journal directories on storage devices that are separated from the devices used by databases and the write image journal (WIJ), as well as separated from each other; see Journaling Best Practices for more information.

  • Start new journal file every — Enter the number of megabytes for the maximum size of the journal file after which the journal file switches. The default size is 1024 MB; the maximum size is 4079 MB.

  • Journal File Prefix (optional) — Enter an alphanumeric prefix for journal file names.

  • When to purge journal files — You can set either or both of the following two options. If you enter nonzero values for both settings, purging occurs when a journal file meets whichever of the two conditions occurs first. If you set 0 (zero) for one and not the other, purging is determined by the nonzero setting. If both are 0, the automatic purging of journal files (and journal history) is disabled.

    • After this many days — Enter the number of days after which to purge (valid values: 0-100).

      Note:

      When you set the number of days after which to purge journal files, the last journal file from the day before the purge limit is also retained. For example, if After this many days is set to 1 and the purge is run at midnight on April 1, the last journal file created on March 30 is retained along with those created on March 31.

    • After this many successive successful backups — Enter the number of consecutive successful backups after which to purge (valid values: 0-10).

      This includes Caché online backups, an external backup using $$BACKUP^DBACK("","E"), or the use of the Backup.General.ExternalSetHistory() method to add to the backup history.

    Note:

    If After this many days is set to 0 (no time-based purge) and After this many successive successful backups is set to 1, journal files are not purged until there have been two successful backups; that is, there must be two successful backups for the “successive” criterion to be met.

    You can also update these settings using the ^JRNOPTS routine or by selecting option 7, Edit Journal Properties, from the ^JOURNAL routine menu. See Update Journal Settings Using ^JRNOPTS for details.

    Note:

    Journal files are sometimes retained even if they meet the criteria of the purge setting. When this happens, the event is recorded in the console log and the reason (for example, that the journal file contains open transactions) is provided.

  • Freeze on error — Select Yes or No. This setting controls the behavior when an error occurs in writing to the journal. The default is No. See the Journal I/O Errors section for a detailed explanation of this setting.

    Note:

    When a Caché instance is the primary failover member of a mirror (see the “Mirroring” chapter of the Caché High Availability Guide), the instance’s Freeze on error configuration is automatically overridden to freeze all journaled global updates when a journal I/O error occurs, regardless of the current setting. If the current setting is No, behavior reverts to this setting when the instance is no longer a primary failover member.

  • Journal CSP Session — Select Yes or No. This setting controls whether or not Caché Server Page (CSP) session journaling is enabled. The default is No.

  • Write image journal entry — Enter the location of the write image journal (WIJ) file. See the Write Image Journaling section of the “Write Image Journaling and Recovery” chapter of this guide for a detailed explanation of this setting.

  • Target size for the wij (MB) (0=not set) — Enter the target size for the WIJ file.
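The way the two When to purge journal files options combine can be sketched in Python. This models only the zero and nonzero logic and the valid ranges stated above; the function name and its arguments are hypothetical, and whether each individual criterion has been met (including the retention details covered in the notes) is passed in as a boolean rather than computed.

```python
def purge_is_due(days_setting, backups_setting, age_criterion_met, backup_criterion_met):
    """Combine the two purge options; a setting of 0 switches that criterion off."""
    if not 0 <= days_setting <= 100:
        raise ValueError("After this many days must be between 0 and 100")
    if not 0 <= backups_setting <= 10:
        raise ValueError("After this many successive successful backups must be between 0 and 10")
    by_age = days_setting != 0 and age_criterion_met
    by_backups = backups_setting != 0 and backup_criterion_met
    # With both settings nonzero, whichever criterion is met first triggers the purge.
    # With both settings 0, neither term can be true, so automatic purging is disabled.
    return by_age or by_backups


both_disabled = purge_is_due(0, 0, True, True)
age_only = purge_is_due(2, 0, True, False)
backups_ignored = purge_is_due(2, 0, False, True)
either = purge_is_due(2, 2, False, True)
```

Here both_disabled and backups_ignored are False, because a criterion whose setting is 0 never triggers a purge, while age_only and either are True.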

Note:

All of the settings on this page are included in the instance’s cache.cpf file. For information about the journal settings, see [Journal] in the Caché Parameter File Reference; for information about the WIJ settings, see Write Image Journal (WIJ) File in the “Write Image Journaling and Recovery” chapter of this guide, as well as targwijsiz and wijdir in the [config] section of the Caché Parameter File Reference.

You are not required to restart Caché after changing most of these settings (except where indicated), but any change causes a new journal file to begin.

There are two additional configuration settings affecting journaling, as follows:

  • jrnbufs — Specifies the amount of memory allocated to journal buffers; the default is 64 MB, the maximum is 1024 MB, and the minimum is 16 MB for Unicode instances and 8 MB for 8-bit instances. Increasing this setting means increasing the amount of journal data that can be held in memory, which improves journaling performance, but increases the maximum amount of journal data that could be lost in the event of a system failure because it was written to the buffer after the last journal sync (see The Journal Write Cycle).

    To change the jrnbufs setting, navigate to the Advanced Memory Settings page of the Management Portal (System Administration > Configuration > Additional Settings > Advanced memory). The jrnbufs setting can also be changed by editing the cache.cpf file; for more information, see jrnbufs in the Caché Parameter File Reference.

  • SynchCommit — Specifies when the TCOMMIT command requests that journal data involved in a transaction be flushed to disk: when this setting is true, TCOMMIT does not complete until the journal data write operation completes; when it is false (the default), TCOMMIT does not wait for the write operation to complete.

    To change the SynchCommit setting, navigate to the Compatibility Settings page of the Management Portal (System Administration > Configuration > Additional Settings > Compatibility). For more information on SynchCommit, see SynchCommit in Caché Additional Configuration Settings Reference and TCOMMIT in Caché ObjectScript Reference.

Journaling Operation Tasks

Once journaling is configured there are several tasks you can perform:

Start Journaling

If journaling is stopped, you can start it using the ^JRNSTART routine or by selecting option 1, Begin Journaling, from the ^JOURNAL routine menu. See Start Journaling Using ^JRNSTART for details.

Note:

You cannot start journaling from the Management Portal.

When you start journaling, Caché audits the change if auditing is enabled.

Stop Journaling

Stopping journaling system wide has a number of generally undesirable consequences, as described in the Journal Freeze on Error Setting is No section. Both shadowing and transaction processing are affected.

When you stop journaling, transaction processing ceases. If a transaction is in progress when you stop journaling, the complete transaction may not be entered in the journal. To avoid this problem, it is best to make sure all users are off the system before stopping journaling.

If you stop journaling and Caché crashes, the startup recovery process does not roll back incomplete transactions started before journaling stopped since the transaction may have been committed but not journaled.

In contrast, transactions are not affected in any adverse way by switching journal files. Rollback correctly handles transactions spanning multiple journal files created by journal switching; so, if possible, it is better to switch journal files than to stop journaling.

You can stop journaling using the ^JRNSTOP routine or by selecting option 2, Stop Journaling, from the ^JOURNAL routine menu. See Stop Journaling Using ^JRNSTOP for details.

Note:

You cannot stop journaling from the Management Portal.

When you stop journaling, Caché audits the change if auditing is enabled.

View Journal Files

You can view a journal file on the Home > Journals > View Journal page of the Management Portal.

  1. Click Journals from the System Operations menu of the home page to list the instance’s journal files. Use the Filter box to shorten the list if necessary.

  2. If the instance is configured as a mirror member, all journal files including mirrored and nonmirrored are displayed by default. Optionally click the link containing the mirror name, for example Mirror Journal Files Of 'MUNDANE', to display a list of mirror journal files only. If the instance is configured as a reporting async member of multiple mirrors, there is a separate link for the journal files from each mirror. To return to displaying all journal files, click the All Journal Files link.

    Note:

    For information about mirror journal files, see Journal Files and Journal History Log in this chapter and Mirror Synchronization in the “Mirroring” chapter of the Caché High Availability Guide.

  3. To view a journal file, click View in the row of the journal file you want to see.

  4. The journal file is displayed record by record on the System Operation > Journals > View Journal page. You can:

    1. Click in the Offset column of a record to view a dialog box containing its details.

    2. Choose whether to color code the records by the time of entry, the process that performed the operation recorded in the journal, the type of operation, whether the operation was part of a transaction, the global involved in the operation, or the database involved in the operation.

      Note:

      The Journal File Operations table in the section Display Journal Records Using ^JRNDUMP provides information about the values that appear in the Type column of the display to indicate the type of operation represented by each record.

    3. Search for a particular record or set of records using the Match boxes and the Search button.

      1. For a manual search, set the first drop-down to the column you want to search by, select an operator such as “equal to” or “not equal to”, and enter the value you want to match in the right-most box, then click Search.

      2. To match a particular cell in one of the columns, just double-click in that cell. For example, to find all journal records containing KILL operations, double-click in any cell in the Type column containing KILL. The operator drop-down is automatically set to “equal to” but you can change that before pressing Search.

You can also use the ^JRNDUMP utility to display the entire journal and the SELECT^JRNDUMP entry point to display selected entries. See Display Journal Records Using ^JRNDUMP for details.
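The Match search described above amounts to filtering records on one column with a comparison operator. The following Python sketch illustrates that idea only; the record fields and the `match` helper are hypothetical and do not reflect the Management Portal's internal data model.

```python
import operator

# Hypothetical journal records; the keys loosely mirror the columns described
# above and are assumptions made for illustration.
records = [
    {"offset": 131088, "type": "SET", "global": "^Patient", "in_transaction": True},
    {"offset": 131156, "type": "KILL", "global": "^Patient", "in_transaction": True},
    {"offset": 131200, "type": "SET", "global": "^Order", "in_transaction": False},
    {"offset": 131264, "type": "KILL", "global": "^Order", "in_transaction": False},
]

# The two operators named in the text, mapped to Python comparisons.
OPERATORS = {"equal to": operator.eq, "not equal to": operator.ne}


def match(records, column, op_name, value):
    """Return the records whose `column` satisfies the chosen operator."""
    compare = OPERATORS[op_name]
    return [r for r in records if compare(r[column], value)]


# Equivalent of double-clicking a KILL cell in the Type column.
kills = match(records, "type", "equal to", "KILL")
for record in kills:
    print(record["offset"], record["type"], record["global"])
```

Changing the operator to "not equal to" before searching corresponds to passing a different `op_name` to the same helper.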

Switch Journal Files

The system automatically switches the journal file in the following situations:

  • After a successful backup of a Caché database.

  • When the current journal file grows to the maximum file size allowed (configurable on the Journal Settings page).

  • When the journal directory becomes unavailable and you specified an alternate directory.

  • After updating settings on the Journal Settings page.

Switching the journal file is preferable to stopping and starting journaling because during the latter process, any global operations that occur after stopping but before restarting are not journaled.

To manually switch journal files:

  1. Navigate to the System Operation > Journals page of the Management Portal.

  2. Click Switch Journal above the list of database journal files.

  3. Confirm the journal switch by clicking OK.

You can also switch journal files using the ^JRNSWTCH routine or by selecting option 3, Switch Journal File from the ^JOURNAL routine menu. See Switch Journal Files Using ^JRNSWTCH for details.

Switch Journal Directories

As described in Configuring Journal Settings, journaling automatically switches to the secondary journaling directory (assuming it is configured) if the primary directory becomes unwritable for any reason. To manually switch journaling directories, do the following:

  1. Navigate to the System Operation > Journals page of the Management Portal.

  2. Click Switch Directory above the list of database journal files.

  3. Confirm the journal switch by clicking OK.

You can also switch journal directories by selecting option 13, Switch Journaling to Secondary/Primary Directory from the ^JOURNAL routine menu. See Switch Journaling Directories Using SWDIR^JOURNAL for details.

Display Journal File Profiles

You can display the global profile of a journal file, showing the globals that appear in the file’s records and the number of records each appears in, on the Journal Profile page (Home > Journals > Journal Profile).

  1. Click Journals from the System Operations menu of the home page to list the instance’s journal files. Use the Filter box to shorten the list if necessary.

  2. To display a journal file profile, click Profile in the row of the appropriate journal file. The Journal Profile page displays with the profile on it. If the journal file has a large number of records, it may take a little while to build the profile.

  3. You can sort the journal profile by global or by the cumulative size, in bytes, of all the records in which each global appears.

  4. If the journal file is the current one, you can use the Recalculate button to build the profile again after some time has passed.
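As a rough illustration of what a global profile contains, the following Python sketch aggregates a list of records into per-global record counts and cumulative sizes, then sorts the result both ways. The sample data and the `build_profile` function are hypothetical; this is not how Caché computes the profile internally.

```python
from collections import defaultdict

# Hypothetical (global name, record size in bytes) pairs standing in for the
# records of one journal file.
records = [
    ("^Patient", 120),
    ("^Order", 64),
    ("^Patient", 200),
    ("^Audit", 48),
    ("^Order", 96),
]


def build_profile(records):
    """Count the records and sum the record sizes for each global."""
    profile = defaultdict(lambda: {"records": 0, "bytes": 0})
    for name, size in records:
        profile[name]["records"] += 1
        profile[name]["bytes"] += size
    return dict(profile)


profile = build_profile(records)

# The two sort orders described above: by global name, or by cumulative size.
by_global = sorted(profile.items())
by_size = sorted(profile.items(), key=lambda item: item[1]["bytes"], reverse=True)

for name, stats in by_size:
    print(name, stats["records"], stats["bytes"])
```

Because the current journal file keeps growing, rebuilding the profile later would simply rerun the aggregation over a longer record list.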

Check Journal File Integrity

You can check the integrity of a journal file on the Journals page. This operation verifies that the journal file ends where it is expected to end, confirming that no records are missing from the end of the file.

  1. Click Journals from the System Operations menu of the home page to list the instance’s journal files. Use the Filter box to shorten the list if necessary.

  2. To run an integrity check on a journal file, click Integrity Check in the row of the appropriate journal file. The Journal Integrity Check page displays.

  3. Select Check Details to scan the journal file record by record from the beginning to detect potential missing records.

  4. Click OK. A link to the System Operation > Background Tasks page appears, letting you view the status and results of the integrity check.

View Journal File Summaries

You can view information about a journal file on the Journal File Summary page. For example, you can find out whether the journal file is encrypted, and what databases are affected by the operations recorded in the journal file.

  1. Click Journals from the System Operations menu of the home page to list the instance’s journal files. Use the Filter box to shorten the list if necessary.

  2. To view information about a journal file, click Summary in the row of the appropriate journal file. The Journal File Summary page displays.

Purge Journal Files

You can schedule a task to run regularly that purges obsolete journal files. A new Caché instance contains a pre-scheduled Purge Journal task that runs after the daily Switch Journal task, which runs at midnight. For information about purging mirror journal files, see Purging Mirror Journal Files.

The purge process deletes journal files based on the When to purge journal files setting on the Journal Settings page; for information, see Configure Journal Settings in this chapter.

Note:

Journal files are sometimes retained even if they meet the criteria of the purge setting. When this happens, the event is recorded in the console log and the reason (for example, that the journal file contains open transactions) is provided.

You can also purge journal files using the PURGE^JOURNAL routine or by selecting option 6, Purge Journal Files from the ^JOURNAL routine menu. See Purge Journal Files Using PURGE^JOURNAL for details.

Note:

The configured journal purge settings can be overridden by the %ZJRNPURGE routine; for more information, contact the InterSystems Worldwide Response Center (WRC).

Purging Mirror Journal Files

Mirror journal files are subject to additional purge criteria because they must be successfully distributed by the primary failover member to the other mirror members and dejournaled on each to synchronize the mirrored databases (see Mirror Synchronization in the “Mirroring” chapter of the Caché High Availability Guide for a full description of this process). Transmission of the files to the backup is synchronous and always rapid when the mirror is operating normally, but transmission to asynchronous (async) members may take longer and may be delayed when an async is disconnected from the mirror. Backup and DR async members must also follow the same policy as the primary, since they are eligible to become primary in failover or disaster recovery situations. Mirror journal files are therefore purged as follows:

  • On the primary failover member, a file is purged when the local journal file purge criteria have been met (see Configure Journal Settings) and when it has been received by the backup (if there is one) and all async members, whichever takes longer. If an async has been disconnected from the mirror for more than 14 days, however, files are purged even if that async has not yet received them.

  • On the backup failover member (if there is one) and any disaster recovery (DR) async members, a file is purged when it has been fully dejournaled on that member, when local journal file purge criteria have been met, and when it has been received by all async members, with the same exception for asyncs that have been disconnected for more than 14 days.

  • On reporting async members, mirror journal files are purged immediately after they have been dejournaled by default, to ensure that async mirror members do not run out of space (particularly when they are receiving journal files from multiple mirrors). You can optionally configure a reporting async to instead retain the files and purge them according to local journal file purge criteria; see Editing or Removing an Async Member in the “Mirroring” chapter.

No mirror journal file containing a currently open transaction is ever purged on any mirror member.

Note:

When a mirror journal file is retained longer than would be dictated by local journal file purge criteria, this is recorded in the member’s console log and the reason is provided.

You can modify the defaults for purging mirror journal files with the SYS.Mirror.JrnPurgeDefaultWait() method.
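The purge rules for mirror journal files can be summarized as a decision function. The Python sketch below is one way to model them under stated assumptions: the class names, field names, and role strings are all hypothetical, and the function only restates the rules in this section rather than reproducing Caché's actual purge logic.

```python
from dataclasses import dataclass, field
from typing import List

# Asyncs disconnected for more than this many days no longer hold up a purge.
ASYNC_DISCONNECT_LIMIT_DAYS = 14


@dataclass
class AsyncPeer:
    has_received_file: bool
    days_disconnected: int = 0  # 0 means currently connected


@dataclass
class FileState:
    has_open_transaction: bool
    local_criteria_met: bool          # local journal file purge criteria
    dejournaled_locally: bool = True  # relevant on backup and async members
    backup_exists: bool = False
    received_by_backup: bool = False
    asyncs: List[AsyncPeer] = field(default_factory=list)


def asyncs_satisfied(asyncs):
    """True if every async has the file or has been disconnected too long."""
    return all(
        a.has_received_file or a.days_disconnected > ASYNC_DISCONNECT_LIMIT_DAYS
        for a in asyncs
    )


def may_purge(role, f, retain_on_reporting=False):
    """Decide whether a mirror journal file may be purged on a member."""
    if f.has_open_transaction:
        return False  # never purged on any member
    if role == "primary":
        backup_ok = (not f.backup_exists) or f.received_by_backup
        return f.local_criteria_met and backup_ok and asyncs_satisfied(f.asyncs)
    if role in ("backup", "dr_async"):
        return (f.dejournaled_locally and f.local_criteria_met
                and asyncs_satisfied(f.asyncs))
    if role == "reporting_async":
        if retain_on_reporting:
            # Assumption: the file must still be dejournaled before purging.
            return f.dejournaled_locally and f.local_criteria_met
        return f.dejournaled_locally  # default: purge right after dejournaling
    raise ValueError(f"unknown role: {role}")


stale = FileState(False, True, backup_exists=True, received_by_backup=True,
                  asyncs=[AsyncPeer(has_received_file=False, days_disconnected=15)])
print(may_purge("primary", stale))
```

In this model, an async that has been disconnected for exactly 14 days still blocks the purge, because the exception applies only after more than 14 days.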

Restore Journal Files

After a system crash or disk hardware failure, recreate your database by restoring your backup copies. If you have been journaling and your journal files are still accessible, you can then bring the databases up to date by applying to them the changes recorded in the journal files since the last backup.

To restore the journal files:

  1. First confirm that all users exit Caché.

  2. Stop journaling if it is enabled.

  3. Restore the latest backup of your database. See the “Backup and Restore” chapter of this guide for more information.

  4. Run the journal restore utility. See the Restore Globals From Journal Files Using ^JRNRESTO section for details.

  5. Restart journaling if it is disabled.

Note:

You cannot run the journal restore process from the Management Portal.

Journaling Utilities

Caché provides several utilities to perform journaling tasks. The ^JOURNAL utility provides menu choices to run some common journaling utilities, which you can also run independently. There are also several other journaling utilities, which you run from the %SYS namespace.

The following sections describe the journaling utilities in detail. The sample procedures in these sections show C:\MyCache as the Caché installation directory.

Perform Journaling Tasks Using ^JOURNAL

The following example shows the menu available by invoking the ^JOURNAL routine; the full menu is not repeated in subsequent examples.

%SYS>Do ^JOURNAL
 
 1) Begin Journaling (^JRNSTART)
 2) Stop Journaling (^JRNSTOP)
 3) Switch Journal File (^JRNSWTCH)
 4) Restore Globals From Journal (^JRNRESTO)
 5) Display Journal File (^JRNDUMP)
 6) Purge Journal Files (PURGE^JOURNAL)
 7) Edit Journal Properties (^JRNOPTS)
 8) Activate or Deactivate Journal Encryption (ENCRYPT^JOURNAL())
 9) Display Journal status (Status^JOURNAL)
10) -not available-
11) -not available-
12) Journal catch-up for mirrored databases (MirrorCatchup^JRNRESTO)
13) Switch Journaling to Secondary Directory (SWDIR^JOURNAL)

Option?

Note:

The -not available- text for options 10 and 11 is replaced as follows:

  • Option 10) Cluster Journal Restore (CLUMENU^JRNRESTO) is displayed only on a cluster node; for information, see Restore Cluster Journal Using CLUMENU^JRNRESTO in this section.

  • Option 11) Manage pending or in progress transaction rollback (Manage^JRNROLL) is displayed if pending or in-progress transaction rollbacks are encountered when you run the ^STURECOV (at system startup) or ^MIRROR (at primary mirror member startup) routine; for more information, see Manage Transaction Rollback Using Manage^JRNROLL in this section.

Enter the appropriate menu number option to start that particular routine. Press Enter without entering an option number to exit the utility. The following subsections describe the options available through the ^JOURNAL utility:

Start Journaling Using ^JRNSTART

To start journaling, run ^JRNSTART or enter 1 at the Option prompt of the ^JOURNAL menu, as shown in the following examples.

Example of running ^JRNSTART directly:

%SYS>Do ^JRNSTART

Example of starting journaling from the ^JOURNAL menu:

%SYS>Do ^JOURNAL
 
 1) Begin Journaling (^JRNSTART)
 ...
Option? 1

If journaling is running when you select this option, a message similar to the following is displayed:

Already journaling to C:\MyCache\mgr\journal\20151113.001 

Stop Journaling Using ^JRNSTOP

To stop journaling, run ^JRNSTOP or enter 2 at the Option prompt of the ^JOURNAL menu, as shown in the following examples.

Note:

When the Freeze on error flag (see Configure Journal Settings in this chapter) is set to “Yes,” stopping journaling is allowed (although there is a risk of data loss) and does not cause the instance to freeze.

Example of running ^JRNSTOP directly:

%SYS>Do ^JRNSTOP
 
Stop journaling now? No => Yes


Example of stopping journaling from the ^JOURNAL menu:

%SYS>Do ^JOURNAL
 
 ...
 2) Stop Journaling (^JRNSTOP)
 ...
Option? 2
Stop journaling now? No => Yes

If journaling is not running when you select this option, you see a message similar to the following:

Not journaling now.

Switch Journal Files Using ^JRNSWTCH

To switch the journal file, run ^JRNSWTCH or enter 3 at the Option prompt of the ^JOURNAL menu, as shown in the following example:

%SYS>Do ^JOURNAL
 
 ...
 3) Switch Journal File (^JRNSWTCH)
 ...
Option? 3
Switching from: C:\MyCache\mgr\journal\20151113.002
To:             C:\MyCache\mgr\journal\20151113.003

The utility displays the names of the previous and current journal files.

Switch Journaling Directories Using SWDIR^JOURNAL

To switch journaling directories, assuming a secondary directory is configured as described in Configuring Journal Settings, run SWDIR^JOURNAL or enter 13 at the Option prompt of the ^JOURNAL menu, as shown in the following example:

%SYS>Do ^JOURNAL
 
 ...
13) Switch Journaling to Secondary Directory (SWDIR^JOURNAL)

Option? 13
Journaling to \\remote\MyCache\journal_secondary\MIRROR-MIRRORONE-20150720.007

The utility displays the name of the current journaling directory and journal file following the switch.

Restore Globals From Journal Files Using ^JRNRESTO

The Caché ^JRNRESTO routine is used after a database is restored from backup to return it to its state immediately prior to a failure by applying updates from journal files. This is called a journal restore, and the process of applying the changes is called dejournaling. A journal restore dejournals all journal records created between the creation of the backup and the failure. For example, if the database was backed up early Tuesday morning and crashed on Wednesday afternoon, after you restore the Tuesday backup, you can restore updates from the journal files created on Tuesday and Wednesday.
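Choosing which journal files to dejournal comes down to ordering the files and taking the range between the backup and the failure. The Python sketch below assumes unprefixed file names in the YYYYMMDD.NNN format used in this chapter's examples; the helper names are hypothetical, and mirror journal files or files with a configured prefix would need different parsing.

```python
def journal_key(name):
    """Split 'YYYYMMDD.NNN' into (date, sequence) integers for ordering."""
    date_part, seq_part = name.split(".")
    return int(date_part), int(seq_part)


def files_to_restore(names, first, last):
    """Return the files from `first` through `last`, oldest first."""
    low, high = journal_key(first), journal_key(last)
    ordered = sorted(names, key=journal_key)
    return [n for n in ordered if low <= journal_key(n) <= high]


# File names in arbitrary order, as they might appear in a directory listing.
available = [
    "20130217.002", "20130212.001", "20130215.001", "20130214.002",
    "20130216.001", "20130213.001", "20130217.001", "20130214.001",
]

for name in files_to_restore(available, "20130215.001", "20130217.002"):
    print(name)
```

The sketch only orders and selects by name. It does not check that every file in the range is actually present or intact, which ^JRNRESTO offers to do before the restore starts.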

If sufficient computing and memory resources are available, up to four jobs can perform the updates to separate databases in parallel within a journal restore operation, provided that you disable journaling of the updates during the restore and do not choose to abort the operation on both database-related and journal-related problems. This is called parallel dejournaling and increases the performance of the operation.

Parallel dejournaling is used only when the host system has at least eight CPUs and the Caché instance involved has enough generic memory available to allocate for this purpose. In practice, parallel dejournaling will not be used in journal restores on most Caché instances unless generic memory is increased. The number of parallel dejournaling jobs can never exceed the size of the generic memory heap in MB divided by 200; for example, to support four dejournaling jobs running in parallel, the generic heap must be greater than or equal to 800 MB. (Even if you do not have enough memory available to support parallel dejournaling, dejournaling throughput may improve if you increase the size of the generic memory heap from the default.)
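The limits above can be expressed as a small calculation. The Python sketch below computes an upper bound on the number of dejournaling jobs from the CPU count and the generic memory heap size in MB. The function name is hypothetical, and the result is only a ceiling: the actual number also depends on how much of the heap is free at the time of the restore.

```python
MAX_PARALLEL_JOBS = 4   # at most four jobs per journal restore
MIN_CPUS = 8            # parallel dejournaling requires at least eight CPUs
MB_PER_JOB = 200        # heap size in MB divided by 200 caps the job count


def dejournal_job_upper_bound(cpu_count, gmheap_mb):
    """Upper bound on dejournaling jobs; 1 means no parallel dejournaling."""
    if cpu_count < MIN_CPUS:
        return 1
    return max(1, min(MAX_PARALLEL_JOBS, gmheap_mb // MB_PER_JOB))


for cpus, heap in [(8, 800), (8, 400), (4, 800), (16, 2000)]:
    print(cpus, heap, dejournal_job_upper_bound(cpus, heap))
```

For example, eight CPUs with an 800 MB heap allow up to four jobs, while the same heap on a four-CPU host yields a bound of one, meaning the restore runs serially.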

Note:

To change the size of the generic memory heap or gmheap (sometimes known as the shared memory heap or SMH), navigate to the Advanced Memory Setting page (System Administration > Configuration > Additional Settings > Advanced Memory); see Advanced Memory Settings in the “Caché Additional Configuration Settings” chapter of the Caché Additional Configuration Settings Reference for more information.

Parallel dejournaling is also used by Caché mirroring; for more information, see Configuring Parallel Dejournaling in the “Mirroring” chapter of the Caché High Availability Guide.

^JRNRESTO restores only to databases whose journal state is Yes at the time of the journal restore. The first time it encounters each database, the routine checks and records its journal state. The restore process skips journal records for databases whose journal state is No. If no databases are marked as being journaled, the routine asks if you wish to terminate the restore; you can then change the database journal state to Yes on specific databases and restart ^JRNRESTO.

Note:

The journal state of a database at the time of restore determines what action is taken; Caché stores nothing in the journal about the current journal state of the database when a given journal record is written. This means that changes to databases whose journal state is Yes are durable, but changes to other databases may not be. Caché ensures physical consistency, but not necessarily application consistency, if transactions involve databases whose journal state is No.

^JRNRESTO lets you make several decisions about the journal restore. Using ^JRNRESTO, you can do the following:

  • Restore global updates to databases in the Caché instance in which you are running the routine, or to databases in another Caché instance. You can choose to restore updates for all globals to all databases in the current instance, or to select individual databases in the current or another instance and optionally specify which globals to restore to each.

  • Restore mirror journal files to a mirrored database (catch up a mirrored database) or to a non-mirrored database. On a mirror member, you are prompted to indicate whether you are catching up a mirrored database, as noted in the following procedure; if so, the procedure is redirected to the MirrorCatchup^ entry point to ^JRNRESTO (see Restore Journal to Mirrored Database Using MirrorCatchup^JRNRESTO).

  • Apply existing journal filters (see Filter Journal Records Using ^ZJRNFILT) to the restore.

  • Select a range of journal files to restore from.

  • Disable journaling of updates during the restore to make the operation faster.

Caution:

If you use journal restore scripts based on prompts, you should update the scripts because some prompts may have changed since the last release.

To restore global updates from journal files:

  1. Run the ^JRNRESTO routine in the %SYS namespace, then press <Enter> at the Restore the Journal? prompt to continue.

  2. If you are running the routine on a mirror member, the following prompt is displayed.

    Catch-up mirrored databases? No =>
    
    
    • If you are restoring mirror journal files to a mirrored database in the same mirror in which the mirror journal files were created, enter yes; the procedure is redirected to the MirrorCatchup^ entry point to ^JRNRESTO (see Restore Journal to Mirrored Database Using MirrorCatchup^JRNRESTO).

    • If you are restoring mirror journal files to a non-mirrored database, or are not restoring mirror journal files, enter no or <Enter> and continue to use the procedure described here.

  3. If you have existing journal filters (see Filter Journal Records Using ^ZJRNFILT), specify whether you want to use them:

    Use current journal filter (ZJRNFILT)? 
    Use journal marker filter (MARKER^ZJRNFILT)? 
    
    
  4. Choose whether you want to restore all journaled globals to all databases in the current Caché instance, or to specify one or more databases and optionally specify which globals to restore to each.

    Process all journaled globals in all directories? 
    
    
    • Enter Yes if you want to restore all globals to all databases in the current instance.

    • Enter No or <Enter> if you want to restore only selected databases in the current or another instance. Then do the following:

      • Indicate whether the journal files were created under a different operating system from that of the current system. This is important because the directory paths you specify for the databases you want to restore must exactly match the paths in the journal files, which are in canonical form. If you respond with No, ^JRNRESTO puts the directory paths you enter into canonical form for the current operating system so they will match those in the journal files. If you respond with Yes, ^JRNRESTO does not canonicalize the paths you enter, because the canonical form in the journal files is different from the canonical form on the current system. In the latter case, you must take care to enter directory paths in the canonical form appropriate to the operating system of the journal files to ensure that they will match.

        For example:

        • if you are working on a Windows system and enter No at this prompt, then enter the path c:\intersystems\cache\mgr\user, ^JRNRESTO automatically canonicalizes this to c:\intersystems\cache\mgr\user\ to match journal files created on a Windows system.

        • if you are working on a Unix system and enter Yes at this prompt because the journal files were created on a Windows system, you must be sure to enter the canonical form of the path, c:\intersystems\cache\mgr\user\, to ensure that it matches the journal files, because ^JRNRESTO cannot canonicalize it for you.

      • Specify each database you want to restore by entering its directory path; this indicates the source database from which the journal records were taken. Press <Enter> at the Redirect to directory prompt to indicate that source and target are the same and restore global updates to the source database. If you are restoring to a different database, for example because you have restored the source database from backup to a different system, enter the directory path of the target database.

        If you are restoring mirror journal files to a non-mirrored database, at the Directory to restore prompt, you can do either of the following:

        • Enter the directory path of the source database and then either <Enter> or the directory path of the target non-mirrored database, as described in the foregoing.

        • Enter the full, case-sensitive mirror database name of the source mirrored database, for example, :mirror:JLAP:MIRRORDB, which can be found using the List mirrored databases option on the Mirror Status menu of the ^MIRROR utility, and then specify the directory path of the target non-mirrored database.

        Note:

        If you are restoring mirror journal files to a mirrored database, you will not have reached this point in the procedure; see Restore Journal to Mirrored Database Using MirrorCatchup^JRNRESTO.

      • For each database you specify, either confirm that you want to restore updates for all globals or enter one or more globals to restore.

      • When you have entered all the databases, press <Enter> at the Directory to restore prompt, then confirm the list of specified databases and globals.

      For example:

      Process all journaled globals in all directories? no
      Are journal files imported from a different operating system? No => No
       
      Directory to restore [? for help]: c:\intersystems\cache23\mgr\user\
      Redirect to Directory: c:\intersystems\cache23\mgr\user\
       => --> c:\intersystems\cache23\mgr\user\
      Process all globals in c:\intersystems\cache23\mgr\user\? No => yes
       
      Directory to restore [? for help]: c:\intersystems\cache23\mgr\samples\
       Redirect to Directory: c:\intersystems\cache23\mgr\samples\
       => --> c:\intersystems\cache23\mgr\samples\
      Process all globals in c:\intersystems\cache23\mgr\samples\? No => no
      
      Global ^Aviation.AircraftD
      Global ^
      
      Directory to restore [? for help]:
      
      Processing globals from the following datasets:
       1. c:\intersystems\cache23\mgr\user\   All Globals
       2. c:\intersystems\cache23\mgr\Samples\   Selected Globals:
                ^Aviation.AircraftD
      
      Specifications correct? Yes =>  Yes
      
      
      Note:

      If you are redirecting two or more databases to the same directory, you must make the same global selection for all of these databases: either enter yes to process all globals, or enter no and then the same list of globals to process. If you try to restore multiple databases to a single directory and the global selections are not all the same, the utility gives you the opportunity to either change your database redirection and global selections or cancel the operation.

  5. Specify the journal files to restore from, which should be from the same Caché instance as the source databases you are restoring, by specifying the correct journal history log (see Journal History Log in this chapter).

    Are journal files created by this Cache instance and located in their original 
    paths? (Uses journal.log to locate journals)?
    
    
    • Enter yes or <Enter> at the prompt to use the journal history log of the current Caché instance to identify the journal files to process. For example, if you entered yes at the Process all journaled globals in all directories? prompt at the start of the process, enter yes here to restore all databases in the current instance from the current instance’s journal files.

    • If you entered no at the Process all journaled globals in all directories? prompt and then specified databases in another Caché instance, enter no here to specify the journal history log and journal file directory path of that instance, or files copied from that instance, so that the databases can be restored from that instance’s journal files.

      Important:

      If you are using a journal history log from another Caché instance, you must use a copy of the file, not the actual log.

      For example,

      Are journal files created by this Cache instance and located in their original
      paths? (Uses journal.log to locate journals)? no
      If you have a copy of the journal history log file from the Cache
      instance where the journal files were created, enter its full path below;
      otherwise, press ENTER and continue.
      Journal history log: c:\cache23_journals\journal.log
      
      Specify the location of the journal files to be processed
      Directory of the journal files:  c:\cache23_journals\journal\
      Directory of the journal files: 
      
      
  6. Specify the range of journal files you want to process. Bear in mind the following:

    • If Caché switched to multiple journal files since the restored backup, you must restore the journal files in order from the oldest to the most recent. For example, if you have three journal files to restore, 20130214.001, 20130214.002, and 20130215.001, you must restore them in the following order:

      20130214.001

      20130214.002

      20130215.001

    • When you back up with Caché online backup, information about the oldest journal file required for transaction rollback during restore is displayed at the beginning of the third and final pass and stored in the backup log. See the “Backup and Restore” chapter of this guide for more information.

    Respond to the prompts as follows:

    • If you entered yes at the Are journal files created by this Cache instance prompt, or answered no and then specified the journal history log and journal file location of another instance, you can enter the pathnames of the first and last journal files to process. You can also enter ? at either prompt to see a numbered list of the files in the specified location, then enter the numbers of the files, for example:

      Specify range of files to process
      Enter ? for a list of journal files to select the first and last files from
      First file to process:  ?
       
      1) c:\intersystems\cache2\mgr\journal\20130212.001
      2) c:\intersystems\cache2\mgr\journal\20130213.001
      3) c:\intersystems\cache2\mgr\journal\20130214.001
      4) c:\intersystems\cache2\mgr\journal\20130214.002
      5) c:\intersystems\cache2\mgr\journal\20130215.001
      6) c:\intersystems\cache2\mgr\journal\20130216.001
      7) c:\intersystems\cache2\mgr\journal\20130217.001
      8) c:\intersystems\cache2\mgr\journal\20130217.002
      
      First file to process:  5 c:\intersystems\cache2\mgr\journal\20130215.001
      Final file to process:
        c:\intersystems\cache2\mgr\journal\20130217.002 => 
      
      Prompt for name of the next file to process? No => no
      
      
    • If you entered no at the Are journal files created by this Cache instance prompt and did not specify a journal history log, processing continues with prompts that attempt to identify the specific journal files you want to process. For example:

      Journal history log:
      Specify range of files to process (names in YYYYMMDD.NNN format)
       
      from:     <20130212.001> [?] => 20130215.001
       
      through:  <20130217.001> [?] => 20130217.002
       
      Prompt for name of the next file to process? No => no
       
      Provide or confirm the following configuration settings:
       
      Journal File Prefix: [?] =>
       
      Files to dejournal will be looked for in:
           c:\intersystems\cache\mgr\journal\
      in addition to any directories you are going to specify below, UNLESS
      you enter a minus sign ('-' without quotes) at the prompt below,
      in which case ONLY directories given subsequently will be searched
      
      Directory to search: {return when done} -
           [Directory search list is emptied]
      Directory to search: {return when done} c:\intersystems\cache2\mgr\journal
      Directory to search: {return when done}
      Here is a list of directories in the order they will be searched for files:
           c:\intersystems\cache2\mgr\journal\
      
      
  7. Process the journal files:

    Prompt for name of the next file to process? No => No 
    The following actions will be performed if you answer YES below:
     
    * Listing journal files in the order they will be processed
    * Checking for any missing journal file on the list ("a broken chain")
     
    The basic assumption is that the files to be processed are all
    currently accessible. If that is not the case, e.g., if you plan to
    load journal files from tapes on demand, you should answer NO below.
    Check for missing journal files? Yes => Yes
    
    
  8. If one or more journal files within the range you specified are missing, you are given the opportunity to abort the operation. If you do not, or if no files are missing, the process proceeds with an opportunity to check journal integrity before starting the restore:

    Journal files in the order they will be processed:
    1. c:\intersystems\cache2\mgr\journal\20130215.001
    2. c:\intersystems\cache2\mgr\journal\20130216.001
    3. c:\intersystems\cache2\mgr\journal\20130217.001
    4. c:\intersystems\cache2\mgr\journal\20130217.002
    
    While the actual journal restore will detect a journal integrity problem
    when running into it, you have the option to check the integrity now
    before performing the journal restore. The integrity checker works by
    scanning journal files, which may take a while depending on file sizes.
    Check journal integrity? No => No
    
    
  9. If the current journal file is included in the restore, you must switch journaling to another file, and are prompted to do so:

    The journal restore includes the current journal file.
    You cannot do that unless you stop journaling or switch
         journaling to another file.
    Do you want to switch journaling? Yes => yes
    Journaling switched to c:\intersystems\cache2\mgr\journal\20130217.003
    
  10. Next, choose whether to disable journaling of updates during the restore to make the operation faster.

    You may disable journaling of updates for faster restore for all
    databases other than mirrored databases. You may not want to do this
    if a database to restore is being shadowed as the shadow will not
    receive the updates.
    Do you want to disable journaling the updates? Yes => yes
    Updates will NOT be journaled
    
    
    Important:

    If you do not disable journaling of updates during the restore, parallel dejournaling will not be used to increase performance, as described at the beginning of this section.

    If journaling is disabled but database updates continue, you cannot use the last good journals to do a manual restore unless you can assure either of the following:

    • You know exactly what will be updated and can control what is restored to the satisfaction of the application.

    • You have restored the database(s) involved from the last backup and accept that after applying the journals you will have lost the data written when journaling was off.

    InterSystems recommends that you run the following commands after completing the journal restore under these circumstances to verify that the object IDs are not out of sync; only IDs that are found to be out of sync are reported in the array, errors:

    Do CheckIDCounters^%apiOBJ(.errors)
    zwrite errors
    
  11. After confirming or changing the default options for the restore job, confirm the restore to begin:

    Before we job off restore daemons, you may tailor the behavior of a
    restore daemon in certain events by choosing from the options below:
     
         DEFAULT:    Continue despite database-related problems (e.g., a target
         database is not journaled, cannot be mounted, etc.), skipping affected
         updates
     
         ALTERNATE:  Abort if an update would have to be skipped due to a
         database-related problem (e.g., a target database is not journaled,
         cannot be mounted, etc.)
     
         DEFAULT:    Abort if an update would have to be skipped due to a
         journal-related problem (e.g., journal corruption, some cases of missing
         journal files, etc.)
     
         ALTERNATE:  Continue despite journal-related problems (e.g., journal
         corruption, some missing journal files, etc.), skipping affected updates
     
    Would you like to change the default actions? No => No
      
    Start the restore? Yes =>
    
    Important:

    If you choose to abort due to both database-related problems and journal-related problems, parallel dejournaling will not be used to increase performance, as described at the beginning of this section.

  12. The progress of the journal restore is displayed at intervals, and when the job is complete a list of databases updated by the restore is displayed:

    c:\MyCache1\mgr\journal\20150406.001
     35.73%  70.61% 100.00%
    c:\MyCache1\mgr\journal\20150406.002
     35.73%  70.61% 100.00%
    c:\MyCache1\mgr\journal\20150406.003
     100.00%
    [Journal restore completed at 20150407 02:25:31]
     
    The following databases have been updated:
    
    1. c:\MyCache1\mgr\source22\
    2. c:\MyCache1\mgr\source23\
    3. c:\MyCache1\mgr\cache\
    4. c:\MyCache1\mgr\cachelib\
    5. c:\MyCache1\mgr\cachetemp\
    
    The following databases have been skipped:
    
    1. /bench/user/cache/162/
    2. /scratch1/user/cache/p750.162/mgr/
    3. /scratch1/user/cache/p750.162/mgr/cache/
    4. /scratch1/user/cache/p750.162/mgr/cachelib/
    5. /scratch1/user/cache/p750.162/mgr/cachetemp/
    6. /scratch1/user/cache/p750.162/mgr/user/
    
Rolling Back Incomplete Transactions

Restoring the journal also rolls back incomplete transactions. Ensure that users have completed all transactions so that the restore does not attempt to roll back active processes.

To ensure that transactions are all complete before you restore your backup and clear the journal file, InterSystems strongly recommends the following:

  • If you need to roll back transactions for your own process, the process must halt or use the TROLLBACK command.

  • If you need to roll back transactions system-wide, shut down Caché and restart it to ensure that no users are on the system.
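
As a minimal sketch of the first recommendation, a process can check whether it still has a transaction open and roll it back before the restore. The $TLEVEL special variable and the argumentless form of TROLLBACK are standard ObjectScript; where you place this check in your own application is an assumption left to you.

```
 ; Minimal sketch: roll back anything this process still has open.
 ; $TLEVEL is the number of transaction levels currently open for the process.
 If $TLEVEL>0 TROLLBACK
 Write "Open transaction levels: ",$TLEVEL,!
```

An argumentless TROLLBACK rolls back all open levels for the current process only; it does not affect transactions held by other processes.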

Restoring Mirror Journal Files

You can restore mirror journal files to either mirrored or non-mirrored databases. If you are restoring to a mirrored database, see step 2 of the procedure in Restore Globals From Journal Files Using ^JRNRESTO and the section Restore Journal to Mirrored Database Using MirrorCatchup^JRNRESTO. If you are restoring to a non-mirrored database, see step 4 of the procedure in Restore Globals From Journal Files Using ^JRNRESTO.

Filter Journal Records Using ^ZJRNFILT

InterSystems provides a journal filter mechanism to manipulate the journal file. The journal filter program is a user-written routine called ^ZJRNFILT whose format is shown below. This is called by the Caché journal restore program, ^JRNRESTO, and ensures that only selected records are restored. Create the ^ZJRNFILT routine using the following format:

ZJRNFILT(jid,dir,glo,type,restmode,addr,time)

Argument | Type | Description
jid | input | Job ID which you can use to identify the PID that generated the journal
dir | input | Full pathname of the directory containing the CACHE.DAT file to be restored, as specified in the journal record
glo | input | Global in journal record
type | input | Command type in journal record (S for Set, K for Kill)
addr | input | Address of the journal record
time | input | Time stamp of the record (in $horolog format). This is the time the journal buffer is created, not when the Set or Kill operation occurs, so it represents the earliest this particular operation could have happened.
restmode | output | 0: do not restore record; 1: restore record

^ZJRNFILT Considerations

Consider the following when using ^ZJRNFILT:

  • If the startup routine (^STU) calls ^JRNRESTO, it does not call the filter routine under any circumstances.

  • Journal restore only calls the journal filter (^ZJRNFILT) if it exists. If it does exist, the restore procedure prompts you to confirm the use of the filter in the restore process.

  • If you answer yes to use the journal filter, for every record in the journal file to restore, the routine calls the journal filter ^ZJRNFILT with the indicated input arguments to determine whether to restore the current record.

  • You can use any logic in your ^ZJRNFILT routine to determine whether or not to restore the record. Return confirmation through the output restmode argument.

  • If you are using the directory name, dir, in the ^ZJRNFILT routine logic, specify the full directory pathname.

  • The entire global reference is passed to ^ZJRNFILT for use in program logic.

  • When the journal restore process completes, it prompts you to confirm whether to rename the ^ZJRNFILT routine or delete it. If you choose to rename the filter, the utility renames it ^XJRNFILT and deletes the original ^ZJRNFILT.

  • The restore process aborts with an appropriate error message if any errors occur in the ^ZJRNFILT routine.
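
The point about the directory name can be illustrated with a filter that restores records for a single database only. This is a sketch under stated assumptions, not a definitive implementation: the path c:\mycache\mgr\user\ is hypothetical, and the lowercase comparison assumes a platform where pathnames are not case-sensitive.

```
ZJRNFILT(jid,dir,glo,type,restmode,addr,time)    /*Filter*/
 Set restmode=0                                  /*Skip records by default*/
 /* c:\mycache\mgr\user\ is a hypothetical directory; use your own full pathname */
 If $zconvert(dir,"L")="c:\mycache\mgr\user\" Set restmode=1
 Quit
 ;
```

Starting from restmode=0 and enabling only the wanted directory means that records for any database you did not anticipate are skipped rather than restored.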

^ZJRNFILT Examples

Two globals, ^ABC and ^XYZ, are journaled. While journaling is turned on, the following code is executed, and the journal file records the Set and Kill operations for these globals:

 For I=1:1:500 Set ^ABC(I)=""
 For I=1:1:500 Set ^XYZ(I)=""
 For I=1:1:100 Kill ^ABC(I)
  1. To restore all records for ^ABC only, the ^ZJRNFILT routine looks like this:

    ZJRNFILT(jid,dir,glo,type,restmode,addr,time)    /*Filter*/
     Set restmode=1                                  /*Return 1 for restore*/
     If glo["XYZ" Set restmode=0                     /*except when it is ^XYZ*/
     Quit
     ; 
  2. To restore all records except the kill on ^ABC, the ^ZJRNFILT routine looks like this:

    ZJRNFILT(jid,dir,glo,type,restmode,addr,time)    /*Filter*/
     Set restmode=1                                  /*Return 1 for restore*/
     If glo["^ABC",type="K" Set restmode=0           /*except if a kill on ^ABC*/
     Quit
     ;
    
  3. In some cases (for example, when the jid is a PID or on a mirror member), remsysid is not the actual ECP system ID. In these cases, use the %SYS.Journal.Record.GetRealPIDSYSinFilter method to return the real ECP system ID as well as the real PID.

    To pull the real PID (and ECP system ID) in a filter, the ^ZJRNFILT routine looks like this:

    ZJRNFILT(jidsys,dir,glo,type,restmode,addr,time) ;
     SET restmode=0 ;test only
     SET pid=##class(%SYS.Journal.Record).GetRealPIDSYSinFilter(jidsys,.ecpsysid)
     DO ##class(%SYS.System).WriteToConsoleLog($SELECT(pid="":"jid="_+jidsys,1:"pid="_pid)_",ecpsysid="_ecpsysid)
     QUIT
    Note:

    The jidsys argument in ^ZJRNFILT contains two components: jid and remsysid, separated by a comma.

  4. To restore all records after a specific time, the ^ZJRNFILT routine looks like this:

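    ZJRNFILT(jid,dir,glo,type,restmode,addr,time)               /*Filter*/
     Set restmode=1                                             /*Return 1 for restore*/
     New cut Set cut=$zdatetimeh("08/14/2015 14:18:31")         /*cutoff in $horolog format: days,seconds*/
     /* < compares numerically and would see only the day portion, so compare days and seconds separately */
     If (+time<+cut)||((+time=+cut)&&($piece(time,",",2)<$piece(cut,",",2))) Set restmode=0  /*except if before Aug 14 2015 2:18:31 pm*/
     Quit
     ;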
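
The contains operator ([) used in examples 1 and 2 also matches any global whose name merely includes the string, such as ^XYZ2. The following sketch matches the global name exactly instead. It assumes that glo arrives as a caret, the global name, and any subscripts (as the "^ABC" test in example 2 suggests); the local variable name is illustrative.

```
ZJRNFILT(jid,dir,glo,type,restmode,addr,time)    /*Filter*/
 Set restmode=1                                  /*Return 1 for restore*/
 New name Set name=$piece(glo,"(",1)             /*global name without subscripts*/
 If name="^XYZ" Set restmode=0                   /*skip ^XYZ only, not ^XYZ2*/
 Quit
 ;
```

The New command keeps the helper variable from overwriting a variable of the same name in the calling restore routine.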
    

Display Journal Records Using ^JRNDUMP

To display the records in the journal file, enter 5 at the Option prompt of the ^JOURNAL menu or run ^JRNDUMP as shown in the following example:

  1. Run the ^JRNDUMP utility from the system manager namespace by entering:

    %SYS>DO ^JRNDUMP
    
       Journal                 Directory & prefix
     
       20151113.001            C:\MyCache\Mgr\Journal\
       20151113.002 [JRNSTART] C:\MyCache\mgr\journal\
       20151113.003            C:\MyCache\mgr\journal\
       20151113.004            C:\MyCache\mgr\journal\
       20151114.001            C:\MyCache\mgr\journal\
       20151115.001            C:\MyCache\mgr\journal\
       20151115.002            C:\MyCache\mgr\journal\
       20151115.003            C:\MyCache\mgr\journal\
    >  20151115.004            C:\MyCache\mgr\journal\
     
    
    
  2. The routine displays a list of journal files. A greater-than sign (>) appears to the left of the currently selected file followed by a prompt:

    Pg(D)n,Pg(U)p,(N)ext,(P)rev,(G)oto,(E)xamine,(Q)uit =>
    
    

    Use these options to navigate to the journal file you wish to locate:

    • If the instance is a mirror member, enter M to limit the list to mirror journal files only. (For information about mirror journal files, see Journal Files and Journal History Log in this chapter and Mirror Synchronization in the “Mirroring” chapter of the Caché High Availability Guide.)

    • Enter D or U to page through the list of journal files.

    • Enter N or P to move the > to the desired journal file.

    • Enter G to directly enter the full pathname of the journal file to display.

    • Enter E to display the contents of the selected journal file.

    • Enter I to display information about the selected journal file and, optionally, a list of databases from the journal.

    • Enter Q or <Enter> to quit the routine.

  3. When you enter I and then accept the currently selected journal file or specify a different one, information like the following is displayed:

    Journal: C:\MyCache\mgr\journal\20151113.003
    File GUID: 97734819-CA75-4CB1-9C3E-74D294784D23
    Max Size: 1073741824
    Time Created: 2015-11-13 10:44:52
    File Count: 22
    Min Trans: 22,3497948
    Prev File: C:\MyCache\mgr\journal\20151113.002
    Prev File GUID: 8C5D3476-F12C-4258-BF6C-7423876653A4
    Prev File End: 0
    Next File: C:\MyCache\mgr\journal\20151113.004
    Next File GUID: 4F4D20B1-D38C-473E-8CF0-4D04C6AF90B0
    
    (D)atabases,(Q)uit =>
    

    Min Trans is the file count and offset of the minimal transaction position, that is, any open transaction must have started at or later than that point.

    If the selected file is a mirror journal file, additional information is displayed.

    Entering Q at the prompt at the bottom returns you to the journal file list. Enter D to display database information like the following:

    Journal: C:\MyCache\mgr\journal\20151113.003
      sfn  Directory or Mirror DB Name
    ==========================================================================
        0  C:\MyCache\mgr\
        1  C:\MyCache\mgr\cachelib\
        2  C:\MyCache\mgr\cachetemp\
        3  :mirror:MIR:MIRTEST
        5  C:\MyCache\mgr\cache\
        6  C:\MyCache\mgr\user\
     
    (P)rev,(N)ext,(Q)uit =>
    

    Enter Q to return to the journal file information display.

  4. After you enter G or E, the utility displays the journal file name and begins listing the contents of the file by offset address. For example:

    Journal: C:\MyCache\mgr\journal\20150330.002
       Address   Proc ID Op Directory        Global & Value
    ===============================================================================
        131088      2980 S  C:\MyCache\mgr\  SYS("shdwcli","doctest","remend") = 1+
        131156      2980 S  C:\MyCache\mgr\  SYS("shdwcli","doctest","end") = 1013+
        131220      2980 S  C:\MyCache\mgr\  SYS("shdwcli","doctest","jrnend") = 1+
    ...
    
    
  5. At the bottom of the current listing page is information about the journal file and another prompt:

    Last record:     573004;   Max size: 1073741824
    (N)ext,(P)rev,(G)oto,(F)ind,(E)xamine,(Q)uit =>
    
    

    Use these options to navigate to the journal record you wish to display:

    • Enter N or P to display the next or previous page of addresses.

    • Enter G to move the display to a particular address.

    • Enter F to search for a particular string within the journal file.

    • Enter E to enter the address of a journal record and display its contents.

    • Enter Q to return to the list of journal files.

  6. After entering E or G, enter an address at the prompt. The E option displays the contents of the journal record at or near the address you entered; the G option displays the page of journal records starting at that location.

    For either option, the utility locates the record closest to the offset address you specify; the value you enter does not need to be the valid address of a journal record. You can also enter 0 (zero) to go to the beginning of the journal file, or -1 to go to the end of the journal file.

  7. You may browse through a display of the journal records using N or P to display the next or previous journal record contents, respectively. When you are finished displaying records, enter Q at the prompt to return to the list of journal records.

There are different types of journal records:

  • The journal header is 8192 bytes long. It appears once at the start of every journal file. The ^JRNDUMP utility does not display the journal header record.

  • Journal data records.

  • Journal markers

The following is a sample journal file data record as displayed by ^JRNDUMP. The example shows how a Set command is recorded. The new value is recorded, but not the old value, because the Set occurred outside a transaction:

Journal: C:\MyCache\mgr\journal\20150119.004
 
Address:                 233028
Type:                    Set
In transaction:          No
Process ID:              4836
ECP system ID:           0
Time stamp:              60284,53240
Collation sequence:      5
Prev address:            232984
Next address:            0
 
Global:    ^["^^C:\MyCache\mgr\"]ABC
New Value: 2
 
 
(N)ext,(P)rev,(Q)uit =>

In a transaction, the old value is also recorded, to allow transaction rollback, as seen in this second example:

Journal: C:\MyCache\mgr\journal\20151115.004
 
Address:                 204292
Type:                    Set
In transaction:          Yes
Process ID:              458772
ECP system ID:           0
Time stamp:              60584,52579 - 11/15/2015 14:36:19
Collation sequence:      5
Prev address:            204224
Next address:            204372 

Global:    ^["^^C:\MyCache\mgr\"]ABC
New Value: 5
Old Value: 2
 
 
(N)ext,(P)rev,(Q)uit =>

The following is an example of a journal marker record created by an incremental backup:

Journal: C:\MyCache\mgr\journal\20151115.004
 
Address:                 210848
Type:                    JrnMark
Marker ID:               -1
Marker text:             NOV 15 2015;03:14PM;Incremental
Marker seq number:       1
Prev marker address:     0
Time stamp:              60584,52579 - 11/15/2015 14:36:19
Prev address:            210744
Next address:            210940
 
 
 
(N)ext,(P)rev,(Q)uit =>

The following table describes each field in the journal data record.

Journal Data Record Fields Displayed by ^JRNDUMP
Field | Description
Address | Location of this record in number of bytes from beginning of file. This is the only field where you enter a value to select a record.
Type | The type of operation recorded in this journal record entry. See the Journal File Operations table for possible types.
In transaction | Whether or not the update occurred in a transaction.
Process ID | Process ID number for the process issuing the command.
ECP system ID | ECP system ID number (0 if a local process).
Time stamp | Creation time of the journal buffer, in $HOROLOG and human-readable format. This is not the time the Set or Kill operation occurs, so it represents the earliest this particular operation could have happened.
Collation sequence | Collation sequence of the global being updated.
Prev address | Location of previous record (0 indicates this is the first record).
Next address | Location of next record (0 indicates this is the last record).
Cluster sequence # | Sequencing for globals in cluster-mounted databases. During cluster failover, journal entries from different nodes are updated in order of this cluster time sequencing.
Mirror Database Name | If a mirror journal file, the mirror name for the database on which the operation occurred.
Global | Extended reference of global being updated.
New Value | For a Set operation, the value assigned to the global.
Old Value | For a Set or Kill operation in a transaction, the value that was in the global before the operation.
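
When a record shows only the $HOROLOG form of the Time stamp field, it can be converted at the terminal. The lines below are a small sketch: ts is a stand-in variable (here filled from the current time), and in practice you would set it to the days,seconds value copied from the record.

```
 Set ts=$horolog                 ; stand-in for a Time stamp value copied from a journal record
 Write $zdatetime(ts,3),!        ; date and time in ODBC format (YYYY-MM-DD HH:MM:SS)
 Write "days=",$piece(ts,",",1)," seconds=",$piece(ts,",",2),!
```

The first piece is a count of days and the second is seconds since midnight, which is why the two parts have to be compared separately when ordering time stamps.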

The following table lists and describes the journal operations displayed in the Op column of a ^JRNDUMP journal file display and the Type field of a ^JRNDUMP journal record listing. For example, in the previous example of a journal file display, S in the Op column represents a JRNSET operation, while in the examples of journal record displays, Set appears in the Type field to indicate a JRNSET operation. Note that the Type column of the journal record display in the Management Portal (see View Journal Files) differs for some operations from the Type field of the ^JRNDUMP listing; for example, a JRNNSET operation is indicated by RemoteSET in the portal and by NSet in ^JRNDUMP output. These differences are shown in the table.

The table also shows the codes that can be specified to filter journal records by operation when using the SELECT^JRNDUMP function.

Journal File Operations
Operation | Description | Op in file listing | Type in ^JRNDUMP record listing | Type in Management Portal record listing | Numeric SELECT code | Alpha SELECT code
JRNSET | set a node, local | S [1] | Set | SET | 6 | s
JRNNSET | set a node, remote | S [1] | NSet | RemoteSET | 10 | s
JRNMIRSET | internal mirror operation [2] | S [1] | Mirror Set | MirrorSET | 19 | s
JRNBITSET | set a specified bit position in a node | b [1] | BitSet | BitSET | 14 | bs
JRNKILL | kill a node, local | K [1] | KillNode | KILL | 7 | k
JRNNKILL | kill a node, remote | K [1] | NKill | RemoteKILL | 11 | k
JRNKILLDES | kill a descendant node | k [1] | KillDesc | KILLdes | 8 | k
JRNMIRKILL | internal mirror operations [2] | k [1] | Mirror Kill | MirrorKILL | 20 | k
JRNZKILL | kill a node without killing subordinate nodes, local | k [1] | ZKill | ZKILL | 9 | zk
JRNNZKILL | kill a node without killing subordinate nodes, remote | k [1] | NZKill | RemoteZKILL | 12 | zk
JRNBEGTRANS | begin a transaction | BT | BeginTrans | BeginTrans | 4 | --
JRNTBEGINLEVEL | begin transaction level | BTL | BeginTrans with Level | BeginTrans with level | 16 | --
JRNCOMMIT | commit a transaction | CT | CommitTrans | CommitTrans | 5 | --
JRNTCOMMITLEVEL | commit isolated transaction level | CTL | CommitTrans with Level | CommitTrans with level | 18 | --
JRNTCOMMITPENDLEVEL | commit pending transaction level | PTL | CommitTrans Pending with Level | CommitTrans Pending with level | 17 | --
JRNMARK | journal marker | M | JrnMark | Marker | 13 | --
JRNBIGNET | ECP networking | NN | NetReq | netsyn | 15 | --
JRNTROLEVEL | roll back a transaction | RB | Rollback | Rollback | 21 | --

[1] T is appended when the operation occurs within a transaction, for example ST for a Set operation within a transaction or kT for a ZKill operation within a transaction.

[2] Operation is ignored during journal restore.

Select Journal Records to Dump

The function SELECT^JRNDUMP lets you display any or all of the records in the journal file. Caché dumps selected records from the journal file, starting from the beginning of the file, based on the arguments passed to the function.

The syntax to use the SELECT entry point of the ^JRNDUMP utility is as follows:

SELECT^JRNDUMP(%jfile,%pid,%dir,%glo,%gloall,%operation,%remsysid)
Argument | Description
%jfile | Journal file name. Default is the current journal file. You must specify the fully qualified path of the journal file.
%pid | Process ID in the journal record. Default is any process.
%dir | Directory in the journal record. Default is any directory.
%glo | Global reference in the journal record. Default is any global.
%gloall | Indicator of whether to list entries for all global nodes whose reference contains the name specified in %glo. 0: exact match of the global reference with the name specified in %glo; 1: partial match, that is, all records with a global reference that contains the name specified in %glo. Default is 0.
%operation | Operation type of the journal record. Default is any operation. Use the numeric or alphabetic codes listed in the Journal File Operations table in the previous section.
%remsysid | ECP system ID of journal record. Default is any system. [1]

[1] If %pid is specified, then %remsysid defaults to local system (0); otherwise, it defaults to any system, the same as if it is specified as 0. That is, you cannot select journal entries only from the local system.

You may pass the null string for any argument, in which case the routine uses the defaults.

As with other terminal functions, you can use the Device: prompt to direct the output of SELECT^JRNDUMP to a device other than the terminal, or to a file. (See the chapter “I/O Devices and Commands” in the Caché I/O Device Guide for information on user device selection.) If you direct the output to a file, you are prompted for parameters. You must include R and W when writing to a file; if it is an existing file, include A to append output to the existing content rather than overwriting it; if a new file, you must include N. Enter ? at the Parameters? prompt to display all possible choices.

Note:

If the file you are overwriting is longer than the current output, the excess lines from the original file may not be removed from the updated file.

SELECT^JRNDUMP Examples

The following examples show different ways to select specific journal records.

To select all records in a journal file with the process ID 1203 and send the output to a new file called JRNDUMP.OUT:

%SYS>Do SELECT^JRNDUMP("C:\MyCache\mgr\journal\120020507.009","1203")

Device: SYS$LOGIN:JRNDUMP.OUT   
Parameters: "RWN"=>

To select all records in the journal file that contain the global reference ^ABC:

DO SELECT^JRNDUMP("C:\MyCache\mgr\journal\20050327.001","","","^ABC",1)

To select only records that have an exact match to the global reference ^ABC:

DO SELECT^JRNDUMP("C:\MyCache\mgr\journal\20050327.001","","","^ABC",0)

Note:

Records that are not an exact match, such as ^ABC(1) or ^ABC(100), are not selected.

To select only records for local Set operations of global ^ABC:

DO SELECT^JRNDUMP("C:\MyCache\mgr\journal\20050327.001","","","^ABC","","6")

To select only records for local and remote Set operations of global ^ABC:

DO SELECT^JRNDUMP("C:\MyCache\mgr\journal\20050327.001","","","^ABC","","s")
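
Two further calls, sketched from the argument table and the Journal File Operations table, combine the remaining arguments. The first uses the alpha code k to select kill records for any global; the second limits a partial match on ^ABC to one database directory. The journal file name is reused from the examples above, and the directory C:\MyCache\mgr\user\ is an illustrative assumption.

```
 ; All records whose operation has the alpha SELECT code k, for any process, directory, or global
 DO SELECT^JRNDUMP("C:\MyCache\mgr\journal\20050327.001","","","","","k")
 ; Partial match on ^ABC, limited to records for one database directory
 DO SELECT^JRNDUMP("C:\MyCache\mgr\journal\20050327.001","","C:\MyCache\mgr\user\","^ABC",1)
```

Passing the null string for the arguments you do not care about keeps their defaults, so each call narrows the selection only on the arguments actually supplied.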

Purge Journal Files Using PURGE^JOURNAL

To purge files, use the PURGE^JOURNAL routine or enter 6 at the Option prompt of the ^JOURNAL menu, as shown in the following examples.

Example of running PURGE^JOURNAL directly:

zn "%SYS"
%SYS>Do PURGE^JOURNAL

Example of purging journal files from the ^JOURNAL menu:

%SYS>Do ^JOURNAL
 
 ...
 6) Purge Journal Files (PURGE^JOURNAL)
 ...
Option? 6

1) Purge any journal NOT required for transaction rollback or crash recovery
2) Purge journals based on existing criteria (2 days or 2 backups)
 
Option?

The routine reports on the action taken in response to the option you specify. For example:

Option? 1

The following files have been purged (listed from latest to oldest):

3. c:\intersystems\cache\mgr\journal\20150714.001
2. c:\intersystems\cache\mgr\journal\20150713.001
1. c:\intersystems\cache\mgr\journal\20150710.003

If no files are purged, the following message is displayed:


None purged

Update Journal Settings Using ^JRNOPTS

As an alternative to using the Journal Settings page of the Management Portal, you can update the basic journal configuration settings using the ^JRNOPTS routine or by entering 7 at the Option prompt of the ^JOURNAL menu. To change the setting, type the new value at the prompt and press Enter. For example:

%SYS>Do ^JRNOPTS
 
1) Primary Journal Directory: C:\MyCache\Mgr\Journal\
2) Alternate Journal Directory: D:\cachesys\altjournal\ 
3) Journal File Size Limit (MB) 1024
4) Journal File Prefix: 
5) Journal Purge Options: 2 days OR 2 backups, whichever comes first

Entering a question mark (?) displays Help. For example:

Journal File Prefix: ?
  Enter an alphanumeric string ('_' allowed) or . to reset prefix to null

If you change any of the settings, then press Enter at a Change Property? prompt, you are given the option to activate the changes:

Change Property?
Save and activate changes? Yes =>
*** Journal options updated.

If you do not change any settings, you see the following message:

  
*** Nothing changed

Journal Encryption Using ENCRYPT^JOURNAL

For information on option 8) Activate or Deactivate Journal Encryption (ENCRYPT^JOURNAL), see the Configuring Caché Database Encryption Startup Settings section of the “Managed Key Encryption” chapter of the Caché Security Administration Guide, which describes details on journal file encryption.

Display Journal Status Using Status^JOURNAL

Choosing option 9) Display Journal status displays a concise overview of journal status information including the following:

  • Current journal directory and its remaining space

  • Alternate journal directory (if different) and its remaining space

  • Current journal file, its maximum size, and space used

  • Journaling state, which can be one of the following:

    • Enabled

    • Disabled (stopped)

    • Disabled due to I/O error (suspended)

    • Frozen due to I/O error

    • Journal switch in progress (paused)

    Though suspended and frozen due to I/O error are the same journal state, the system takes different action; when frozen, it discards journal data.

  • If applicable, the process IDs of any process running ^JRNSTART, ^JRNSTOP, or ^JRNSWTCH

For example:

%SYS>Do ^JOURNAL
 
 ...
 9) Display Journal status (Status^JOURNAL)
 ...
Option? 9
 
Current journal directory:  C:\MyCache\Mgr\Journal\
Current journal directory free space (KB):  53503904
Alternate journal directory:  C:\MyCache\Mgr\
Alternate journal directory free space (KB): 53503904
Current journal file:  C:\MyCache\mgr\journal\20151129.001
Current journal file maximum size:    1073741824
Current journal file space used:         1979276
Journaling is enabled.

Restore Cluster Journal Using CLUMENU^JRNRESTO

Option 10) Cluster Journal Restore (CLUMENU^JRNRESTO) is displayed only on a cluster node; for information, see Cluster Journal Restore in the “Cluster Journaling” chapter of this guide.

Manage Transaction Rollback Using Manage^JRNROLL

Caché provides the ^JRNROLL utility to roll back partially completed transactions for records in the journal; use the Manage entry point (Manage^JRNROLL) when transaction rollbacks are pending or in progress at system startup or primary mirror member startup.

To start managing transaction rollback, run Manage^JRNROLL or enter 11 at the Option prompt of the ^JOURNAL menu, as shown in the following example:

%SYS>Do ^JOURNAL
 
 ...
 11) Manage pending or in progress transaction rollback (Manage^JRNROLL)
 ...
Option? 11

Choosing option 11) Manage pending or in progress transaction rollback (Manage^JRNROLL) displays a message similar to the following:

Transaction rollback is pending or in progress 
Do you wish to run Manage^JRNROLL? Yes => Yes 
Rollback operations currently in progress 
     ID   Phase     MB Remaining   Current Open Transaction Count 
     1    scan      307            2 
          Rollback at system startup at 11/29/2015 15:54:35 (578MB) 
          20151129.004 has 2 open transaction(s) starting at offset 11303 
          2 file(s) remaining to process   

1) Restart pending rollback 
2) Interrupt transaction rollback 
3) Redisplay rollback information  

Option?  

This option displays the state of transaction rollbacks, including the current phase (scanning or rollback), the amount of data (in MB) remaining to be processed, and the number of open transactions found.

In addition, it lists sub-options that let you manage the listed transaction rollbacks. For example, you can interrupt the operation, in which case it is queued as a “pending” operation; then you can restart pending rollbacks.

Note:

On a mirror, transaction rollback is executed twice: once for non-mirrored databases (at system startup); then for mirrored databases (when a system becomes the primary mirror member). As a result, when starting the primary mirror member, it may be necessary to interrupt the rollback twice, resulting in two pending operations. Restarting the pending operations performs the non-mirror and mirror rollbacks separately.

During rollback, messages are written to the console log (cconsole.log) every 10% of the way through (more or less) indicating how much space is left to process and how many open transactions are listed.

When journal files are purged, files required for pending transaction rollback are retained, even if they would otherwise have been deleted.

Restore Journal to Mirrored Database Using MirrorCatchup^JRNRESTO

You can restore mirror journal files to mirrored databases by entering 12 at the Option prompt of the ^JOURNAL menu, or by answering yes at the Catch-up mirrored databases? prompt when using Option 4, Restore Globals From Journal (^JRNRESTO). For example:

%SYS>Do ^JOURNAL

 ...
 12) Journal catch-up for mirrored databases (MirrorCatchup^JRNRESTO) 
 ...
Option? 12

Specify the list of mirrored databases you want to catch-up.
Enter database, * for all, ? for a list or to end list? *
Enter database or to end list?
Starting catch-up for the following mirrored database(s):
     sfn #6: c:\intersystems\cache20121x\mgr\mirrordb3\
Catch-up succeeded. 

To catch up mirrored databases, journaling need not be running, but it must have been started at least once to ensure that the current journal directory is available from memory.

Recover from Startup Errors Using ^STURECOV

If the journal or transaction restore process encounters errors, such as <FILEFULL> or <DATABASE>, during the Caché startup procedure, the procedure logs the errors in the console log (cconsole.log) and starts the system in single-user mode.

Caché provides a utility, ^STURECOV, to help you recover from the errors and start Caché in multiuser mode. The routine has several options which you can use to retry the failed operation and bring the system up, or ignore the errors and bring the system up. The journal restore phase tries to do as much work as possible before it aborts. If a database triggers more than three errors, it aborts the recovery of that database and leaves the database dismounted.

Note:

The ^STURECOV utility does not work on a mirror member on which transaction rollback is pending or in progress because the system does not activate a mirrored database read/write until transaction rollback has been completed. In this case, Caché lets you run the Manage^JRNROLL routine, which provides a way to force the system to come up and store transaction rollback information which can be used to roll back transactions after the system is up and running. For more information, see Manage Transaction Rollback Using Manage^JRNROLL in this section.

During transaction rollback, the first error in a database causes the rollback process to skip that database in the future. The process does not fully replay transactions that reference that database; it stores them for rollback during the recovery process.

When Caché encounters a problem during the dejournaling phase of startup it generates a series of console log messages similar to the following:

08/10-11:19:47:024 ( 2240) System Initialized. 
08/10-11:19:47:054 ( 2256) Write daemon started. 
08/10-11:19:48:316 ( 1836) Performing Journal Recovery 
08/10-11:19:49:417 ( 1836) Error in JRNRESTB: <DATABASE>restore+49^JRNRESTB 
     C:\MyCache\mgr\journal\20150810.004 addr=977220 
     ^["^^C:\MyCache\mgr\jo1666\"]test(4,3,28) 
08/10-11:19:49:427 ( 1836) Error in JRNRESTB: <DATABASE>restore+49^JRNRESTB 
     C:\MyCache\mgr\journal\20150810.004 addr=977268 
     ^["^^C:\MyCache\mgr\test\"]test(4,3,27) 
08/10-11:19:49:437 ( 1836) Error in JRNRESTB: <DATABASE>restore+49^JRNRESTB 
     C:\MyCache\mgr\journal\20150810.004 addr=977316 
     ^["^^C:\MyCache\mgr\test\"]test(4,3,26) 
08/10-11:19:49:447 ( 1836) Error in JRNRESTB: <DATABASE>restore+42^JRNRESTB 
     C:\MyCache\mgr\journal\20150810.004 addr=977748 
     ^["^^C:\MyCache\mgr\test\"]test(4,2,70) 
08/10-11:19:50:459 ( 1836) Too many errors restoring to C:\MyCache\mgr\test\. 
 Dismounting and skipping subsequent records 
08/10-11:19:50:539 ( 1836) 4 errors during journal restore, 
see console.log file for details. 
Startup aborted, entering single user mode. 
 

If the errors are from transaction rollback, then the output looks similar to this:

08/11-08:55:08:732 ( 428) System Initialized. 
08/11-08:55:08:752 ( 1512) Write daemon started. 
08/11-08:55:10:444 ( 2224) Performing Journal Recovery 
08/11-08:55:11:165 ( 2224) Performing Transaction Rollback 
08/11-08:55:11:736 ( 2224) Max Journal Size: 1073741824 
08/11-08:55:11:746 ( 2224) START: C:\MyCache\mgr\journal\20150811.011 
08/11-08:55:12:487 ( 2224) Journaling selected globals to 
     C:\MyCache\mgr\journal\20150811.011 started. 
08/11-08:55:12:487 ( 2224) Rolling back transactions ... 
08/11-08:55:12:798 ( 2224) Error in %ROLLBACK: <DATABASE>set+2^%ROLLBACK 
     C:\MyCache\mgr\journal\20150811.010 addr=984744 
     ^["^^C:\MyCache\mgr\test\"]test(4,1,80) 
08/11-08:55:12:798 ( 2224) Rollback of transaction for process id #2148 
 aborted at offset 984744 in C:\MyCache\mgr\journal\20150811.010. 
08/11-08:55:13:809 ( 2224) C:\MyCache\mgr\test\ dismounted - 
      Subsequent records will not be restored 
08/11-08:55:13:809 ( 2224) Rollback of transaction for process id #924 
 aborted at offset 983464 in C:\MyCache\mgr\journal\20150811.010. 
08/11-08:55:14:089 ( 2224) STOP: C:\MyCache\mgr\journal\20150811.011 
08/11-08:55:14:180 ( 2224) 1 errors during journal rollback, 
see console.log file for details. 
Startup aborted, entering single user mode. 
 

Both output listings end with the same instructions:

Enter Cache' with 
     C:\MyCache\bin\cache -sC:\MyCache\mgr -B 
and D ^STURECOV for help recovering from the errors. 

When Caché cannot start properly, it starts in single-user mode. While in this mode, execute the special commands indicated in these instructions to enter Caché. For example, for a Windows installation, enter the following:

C:\MyCache\bin\>cache -sC:\MyCache\mgr -B

UNIX®/Linux systems have a slightly different syntax.

This command runs the Caché executable from the Caché installation bin directory (install-dir\bin). The -s argument specifies the pathname of the system manager’s directory (install-dir\mgr), and the -B argument inhibits all logins except one emergency login.

You are now in the manager’s namespace and can run the startup recovery routine, ^STURECOV:

Do ^STURECOV

The ^STURECOV journal recovery menu appears as follows:


Journal recovery options 
-------------------------------------------------------------- 
1) Display the list of errors from startup 
2) Run the journal restore again 
3) Bring down the system prior to a normal startup
4) Dismount a database 
5) Mount a database 
6) Database Repair Utility 
7) Check Database Integrity 
8) Reset system so journal is not restored at startup 
9) Display instructions on how to shut down the system 
10) Display Journaling Menu (^JOURNAL)
-------------------------------------------------------------- 
H) Display Help
E) Exit this utility
--------------------------------------------------------------
 
Enter choice (1-10) or [Q]uit/[H]elp?

Option 9 appears on the menu only on UNIX®/Linux systems.

Before starting the system in multiuser mode, correct the errors that prevented the journal restore or transaction rollback from completing. You have several options regarding what to do:

  • Option 1 — The journal restore and transaction rollback procedure tries to save the list of errors in the ^%SYS() global. This is not always possible depending on what is wrong with the system. If this information is available, this option displays the errors.

  • Option 2 — This option performs the same journal restore and transaction rollback that was performed when the system was started. Because the amount of data is small, it should not be necessary to try to restart from the point where the error occurred.

  • Option 3 — When you are satisfied that the system is ready for use, use this option to bring the instance down prior to restarting it in a normal fashion.

  • Option 4 — This option lets you dismount a database. Generally, use this option if you want to let users back on a system but you want to prevent them from accessing a database which still has problems (^DISMOUNT utility).

  • Option 5 — This option lets you mount a database (^MOUNT utility).

  • Option 6 — This option lets you edit the database structure (^REPAIR utility).

  • Option 7 — This option lets you validate the database structure (^INTEGRIT utility).

  • Option 8 — This updates the system so that it does not attempt journal restore or transaction rollback at startup. This applies only to the next time the startup process is run. Use this in situations where you cannot get journal recovery to complete and you need to allow users back on the system. Consider dismounting the databases which have not been recovered. This operation is not reversible. You can perform journal restore manually using the ^JRNRESTO utility.

  • Option 9 — It is not possible to shut down the system from this utility, but this option displays instructions on how to shut the system down from the UNIX® command line.

  • Option 10 — This option brings up the journaling menu which allows you to browse and restore journal files. There are options which start and stop journaling but these are not generally of interest when resolving problems with journaling at startup.

Take whatever corrective action is necessary to resolve the problem. This may involve using the ^DATABASE routine to extend the maximum size of the database, freeing space on the file system, or using the ^INTEGRIT and ^REPAIR utilities to find and correct database degradation. As you do this work, you can use Option 2 of the ^STURECOV utility to retry the journal replay/transaction rollback as many times as necessary. You can display any errors you encounter, including those from when the system started, using Option 1. When you have corrected all the problems and Option 2 runs without any errors, use Option 3 to bring the instance down so that you can restart it normally in multiuser mode.

If you find that you cannot resolve the problem, but you still want to bring the system up, use Option 8 to clear the information in the Caché image journal (.wij file) that triggers journal restore and transaction rollback at startup. The option also logs the current information in the console log. Once this completes, use Option 3 to bring the instance down and then restart it normally. Use this facility with care, as it is not reversible.

If Caché was unable to store the errors during startup in the ^%SYS() global for ^STURECOV to display, you may get an initial message before the menu that looks like this:


There is no record of any errors during the prior startup 
This could be because there was a problem writing the data 
Do you want to continue ? No => yes 
Enter error type (? for list) [^] => ? 

Supported error types are: 
     JRN - Journal and transaction rollback 

Enter error type (? for list) [^] => JRN 
 

This chapter covers only journaling errors, which are one of the error types that this utility tries to handle. Other error types are discussed in the appropriate sections of the documentation.

Caution:

Use the ^STURECOV utility only when the system is in single-user mode following an error during startup. Running it while the system is in any other state (for example, up and running normally) can cause serious damage to your data: if you ask it to, the utility restores journal information, and this information may not be the most current data. The ^STURECOV utility warns you in this situation, but it lets you force it to run.

Convert Journal Files Using ^JCONVERT and ^%JREAD

The ^JCONVERT routine is a utility that reads journal files and converts them to a common file in variable record format. The ^%JREAD utility can then read this file and apply journal transactions to databases on a different system. The ^JCONVERT utility exists on older InterSystems database products as well as all versions of Caché. Use these utilities to move journal data between different system versions that do not have compatible journal files.

For example, if you are converting to a new version of Caché and need to minimize downtime, perform the following steps:

  1. Enable journaling on the old system.

  2. Run a backup on the old system; this switches to a new journal file on the old system.

  3. Continue journaling on the old system.

  4. Restore the backup of the old system on the new system and perform any necessary conversions.

  5. Stop the old system and run ^JCONVERT on the journal files created on the old system since the backup.

  6. Apply the transactions from the old system to the new system using the file created from ^JCONVERT as input to ^%JREAD on the new system.

The ^JCONVERT utility uses the same process as the journal restore utility to select and filter the journal files for processing. You can include a range of journal files as input and create one output file. See the Restore Globals From Journal Files Using ^JRNRESTO section for details on selecting and filtering journal files.

The converted file is in variable record format. The default character encoding is UTF8, which is compatible with the current ^%JREAD utility on all platforms, and can be moved among platforms with binary FTP. If you answer NO at the Use UTF8 character translation? prompt, no character encoding is applied.

Globals in the journal file are stored with a specific directory reference appended to the global reference. You can choose either to include the directory reference in the converted file, or exclude it. If you include it, you can always filter it out or change it later during the ^%JREAD procedure.

The directory reference determines where ^%JREAD sets the global on the target system. If you do not include the directory reference, ^%JREAD makes all sets in the current directory. If you do include the directory reference, the utility makes sets in the same directory as on the source system unless translated by a ^%ZJREAD program you supply. If the target system is on a different operating system or the databases reside in different directories on the target system, you must supply a ^%ZJREAD routine to translate the directory reference.

The ^%JREAD routine reads a common journal file format and applies the journal transactions to the databases on the target system. During the import of records, if a ^%ZJREAD routine exists, the utility calls it for each journal transaction allowing you to manipulate the journal records. You can reference the following variables in your ^%ZJREAD routine:

      type    - Transaction type
      gref    - Global reference
      value   - Global value
      %ZJREAD - 1:Apply transaction, 0:Do not apply transaction

If you decide not to apply a transaction, set the variable %ZJREAD to 0 (zero) to skip the record. You can also modify the other variables. For example, you can change the directory specification by modifying gref.

The following is an example ^%ZJREAD routine. It looks for transactions that contain updates to %SYS("JOURNAL") nodes and prevents them from being applied. You can copy this and modify it to suit your needs:

%ZJREAD;
 /*The following variables are defined; you can modify them
   before the transaction gets applied
 
        type - Transaction type
        gref - Global reference
        value - Global value
        %ZJREAD - 1:Apply transaction, 0:Do not apply transaction
 */
 If gref["SYS(""JOURNAL""" Set %ZJREAD=0
 Quit
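
The same hook can translate directory references when the target databases reside elsewhere. The following is a minimal sketch rather than InterSystems-supplied code. It assumes that gref contains the source directory as a literal substring, in the form shown in the console log examples earlier in this chapter. The target directory is a hypothetical placeholder.

```objectscript
%ZJREAD ;
 ; Sketch only: newDir is a hypothetical path; substitute your own directories.
 ; Assumption: the source directory appears verbatim inside gref.
 New oldDir,newDir
 Set oldDir="C:\MyCache\mgr\test\"
 Set newDir="/cachesys/mgr/test/"
 ; Skip updates to %SYS("JOURNAL") nodes, as in the example above
 If gref["SYS(""JOURNAL""" Set %ZJREAD=0 Quit
 ; Rewrite the directory portion of the reference when it matches
 If gref[oldDir Set gref=$Replace(gref,oldDir,newDir)
 Quit
```

The contains operator ([) is case-sensitive, so oldDir must match the case in which the directory was recorded on the source system.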

Sample Run of ^JCONVERT

The following is a sample run of the ^JCONVERT utility:

%SYS>Do ^JCONVERT
 
Journal Conversion Utility  [ Cache Format --> Common Format ]


The converted file will be in variable record format.
The default character translation UTF8 is compatible with current ^%JREAD 
on all platforms and can be moved among platforms with binary FTP.
If you answer NO, no character translation will be applied.
 
Use UTF8 character translation? <Yes>

 
Globals in the journal file are stored with a specific directory reference
appended to the global reference. You can choose either to include
the directory reference in the converted file, or exclude it. Note that
if you include it, you can always filter it out or change it later during
the %JREAD procedure.  The directory reference determines where ^%JREAD sets
the global on the target system.  If the directory reference is not included,
all sets are made to the current directory.  If the directory reference is
included, sets will be made to the same directory as on the source system
unless translated by a ^%ZJREAD program you supply.  If the target system
is on a different operating system or the databases reside in different
directories on the target system, the ^%ZJREAD program must be used to
translate the directory reference.
 
Include the directory reference? <Yes>
 
Enter common journal file name:  common.jrn
 
Common journal file: common.jrn
Record separator: Variable
Directory reference: Yes
 
Use current journal filter (ZJRNFILT)? no
Use journal marker filter (MARKER^ZJRNFILT)? no
Process all journaled globals in all directories?   enter Yes or No, please
Process all journaled globals in all directories? yes
Specify range of files to process (names in YYYYMMDD.NNN format)
 
from:     <20151201.001> [?] => 20151202.001
 
through:  <20151204.001> [?] =>
 
Prompt for name of the next file to process? No => No
 

Provide or confirm the following configuration settings:
 
Journal File Prefix: =>
 
Files to dejournal will be looked for in:
     C:\MyCache\mgr\journal\
     C:\MyCache\mgr\
in addition to any directories you are going to specify below, UNLESS
you enter a minus sign ('-' without quotes) at the prompt below,
in which case ONLY directories given subsequently will be searched
 
Directory to search: <return when done>
Here is a list of directories in the order they will be searched for files:
     C:\MyCache\mgr\journal\
     C:\MyCache\mgr\

You may tailor the response to errors by choosing between the alternative
actions described below.  Otherwise you will be asked to select an action
at the time an error actually occurs.
 
     Either Continue despite database-related problems (e.g., a target
     database is not journaled, cannot be mounted, etc.), skipping affected
     updates
 
     or     Abort if an update would have to be skipped due to a
     database-related problem (e.g., a target database is not journaled,
     cannot be mounted, etc.)
 
     Either Abort if an update would have to be skipped due to a
     journal-related problem (e.g., journal corruption, some cases of missing
     journal files, etc.)
 
     or     Continue despite journal-related problems (e.g., journal
     corruption, some missing journal files, etc.), skipping affected updates
 
     Either Apply sorted updates to databases before aborting
 
     or     Discard sorted, not-yet-applied updates before aborting (faster)
 
Would you like to specify error actions now? No => yes
 
 
     1.  Continue despite database-related problems (e.g., a target database
     is not journaled, cannot be mounted, etc.), skipping affected updates
 
     2.  Abort if an update would have to be skipped due to a database-related
     problem (e.g., a target database is not journaled, cannot be mounted,
     etc.)
 
Select option [1 or 2]:  1
 
     1.  Abort if an update would have to be skipped due to a journal-related
     problem (e.g., journal corruption, some cases of missing journal files,
     etc.)
 
     2.  Continue despite journal-related problems (e.g., journal corruption,
     some missing journal files, etc.), skipping affected updates
 
Select option [1 or 2]:  2
 
     1.  Apply sorted updates to databases before aborting
 
     2.  Discard sorted, not-yet-applied updates before aborting (faster)
 
Select option [1 or 2]:  2
Based on your selection, this restore will
 
** Continue despite database-related problems (e.g., a target database is not
journaled, cannot be mounted, etc.), skipping affected updates
 
** Continue despite journal-related problems (e.g., journal corruption, some
missing journal files, etc.), skipping affected updates
 
** Discard sorted, not-yet-applied updates before aborting (faster)
 
 
 
C:\MyCache\mgr\journal\20151202.001
  13.98%  14.93%  15.95%  17.14%  18.25%  19.27%  20.49%  21.63%  22.65%  23.84%
  24.99%  25.97%  27.10%  28.25%  29.31%  30.50%  31.72%  32.84%  33.84%  34.84%
  35.84%  36.85%  37.91%  38.99%  40.10%  41.08%  42.03%  42.97%  43.93%  44.94%
  45.95%  47.05%  48.11%  49.07%  50.04%  51.02%  52.03%  53.07%  54.14%  55.25%
  56.21%  57.17%  58.15%  59.14%  60.18%  61.24%  62.33%  63.28%  64.20%  65.15%
  66.10%  67.11%  68.13%  69.05%  69.94%  70.83%  71.61%  72.41%  73.09%  73.85%
  74.59%  75.32%  76.06%  76.75%  77.73%  78.70%  79.65%  80.59%  81.53%  82.46%
  83.40%  84.33%  85.27%  86.05%  86.59%  87.13%  87.67%  88.23%  88.78%  89.34%
  89.89%  90.61%  93.28%  94.38%  97.12%  98.21%  99.93%100.00%
***Journal file finished at 11:31:36
 
 
C:\MyCache\mgr\journal\20151203.001
  14.01%  14.96%  15.98%  17.18%  18.29%  19.31%  20.53%  21.67%  22.69%  23.88%
  25.03%  26.01%  27.15%  28.30%  29.36%  30.55%  31.78%  32.90%  33.90%  34.90%
  35.91%  36.92%  37.99%  39.06%  40.17%  41.16%  42.11%  43.05%  44.01%  45.03%
  46.04%  47.14%  48.20%  49.17%  50.14%  51.11%  52.13%  53.17%  54.25%  55.36%
  56.33%  57.29%  58.27%  59.26%  60.30%  61.36%  62.46%  63.40%  64.33%  65.28%
  66.23%  67.24%  68.26%  69.19%  70.08%  70.97%  71.76%  72.56%  73.25%  74.01%
  74.75%  75.47%  76.22%  76.91%  77.89%  78.87%  79.83%  80.77%  81.70%  82.64%
  83.58%  84.52%  85.46%  86.24%  86.78%  87.32%  87.87%  88.42%  88.98%  89.53%
  90.09%  90.81%  93.49%  94.59%  97.33%  98.42%100.00%
***Journal file finished at 11:31:37
 
 
C:\MyCache\mgr\journal\20151204.001
  13.97%  14.92%  15.93%  17.12%  18.24%  19.25%  20.47%  21.61%  22.62%  23.82%
  24.96%  25.94%  27.07%  28.22%  29.28%  30.46%  31.69%  32.80%  33.80%  34.80%
  35.80%  36.81%  37.87%  38.94%  40.05%  41.04%  41.98%  42.92%  43.88%  44.89%
  45.90%  47.00%  48.06%  49.02%  49.98%  50.96%  51.97%  53.01%  54.08%  55.19%
  56.15%  57.11%  58.08%  59.07%  60.12%  61.17%  62.26%  63.20%  64.13%  65.07%
  66.02%  67.03%  68.05%  68.97%  69.86%  70.75%  71.53%  72.33%  73.01%  73.77%
  74.51%  75.23%  75.98%  76.67%  77.64%  78.61%  79.56%  80.50%  81.43%  82.37%
  83.30%  84.24%  85.17%  85.95%  86.49%  87.03%  87.57%  88.13%  88.68%  89.23%
  89.79%  90.51%  93.18%  94.27%  97.01%  98.10%  99.81%100.00%
***Journal file finished at 11:31:38
 
[journal operation completed]
Converted 26364 journal records 

Set Journal Markers Using ^JRNMARK

To set a journal marker in a journal file, call the ADD^JRNMARK function as follows:

SET rc=$$ADD^JRNMARK(id,text)

Argument   Description
id         Marker ID (for example, -1 for backup)
text       Marker text of any string up to 256 characters (for example, “timestamp” for backup)
rc         Journal location of the marker (journal offset and journal file name, delimited by a comma) or, if the operation failed, a negative error code followed by a comma and a message describing the error. Note that a journal offset must be a positive number.
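
As a rough illustration of handling the return value described above, the following sketch sets a marker and separates the success and failure cases. The marker ID 100 and the marker text are arbitrary values chosen for illustration, not values defined by InterSystems.

```objectscript
 ; Sketch only: the ID (100) and the text are hypothetical example values.
 Set rc=$$ADD^JRNMARK(100,"before nightly load")
 If +rc>0 {
   ; Success: rc is "offset,journal file name"
   Set offset=$Piece(rc,",",1)
   Set file=$Piece(rc,",",2,$Length(rc,","))
   Write !,"Marker set at offset ",offset," in ",file
 } Else {
   ; Failure: rc is "negative error code,message"
   Write !,"Marker not set, error ",+rc,": ",$Piece(rc,",",2,$Length(rc,","))
 }
```

Because a journal offset is always positive and an error code is always negative, the numeric value of rc (+rc) is enough to tell the two cases apart. Taking all remaining comma-delimited pieces keeps a file name or message intact if it contains a comma.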

Manipulate Journal Files Using ^JRNUTIL

InterSystems provides several functions in the ^JRNUTIL routine. You can use these functions for writing site-specific routines to manipulate journal records and files.

The following table lists the functions available in the routine.

Functions Available in ^JRNUTIL

Journaling Task                                        Function Syntax
Close a journal file                                   $$CLOSEJRN^JRNUTIL(jrnfile)
Delete a journal file                                  $$DELFILE^JRNUTIL(jrnfile)
Read a record from a journal file into a local array   $$GETREC^JRNUTIL(addr,jrnode)
Switch to a different journal file directory           $$JRNSWCH^JRNUTIL(newdir)
Open a journal file                                    $$OPENJRN^JRNUTIL(jrnfile)
Use an opened journal file                             $$USEJRN^JRNUTIL(jrnfile)

Important:

The DELFILE^JRNUTIL function does not check for open transactions before deleting the journal file.

The following table describes the arguments used in the utility.

Argument   Description
addr       Address of the journal record.
jrnfile    Name of journal file.
newdir     New journal file directory.
jrnode     Local variable passed by reference to return journal record information.

Manage Journaling at the Process Level Using %NOJRN

If journaling is enabled system-wide, you can stop journaling for Set and Kill operations on globals within a particular process by issuing a call to the ^%NOJRN utility from within an application or from programmer mode as follows:

%SYS>DO DISABLE^%NOJRN

Journaling remains disabled until one of the following events occurs:

  • The process halts.

  • The process issues the following call to reactivate journaling:

    %SYS>DO ENABLE^%NOJRN
    
Note:

Disabling journaling using DISABLE^%NOJRN does not affect mirrored databases.

You must have at least read access to the %Admin_Manage resource to use DISABLE^%NOJRN.
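
One way to make sure that journaling is re-enabled even when the unjournaled work fails is to wrap that work in a TRY block. The following is a sketch under stated assumptions: the label LoadScratch and the global ^ScratchLoad are hypothetical names, and the loop stands in for whatever updates your application performs.

```objectscript
LoadScratch(count) ;
 ; Sketch only: this label and ^ScratchLoad are hypothetical names.
 New i
 Do DISABLE^%NOJRN
 Try {
   For i=1:1:count {
     Set ^ScratchLoad(i)=$ZDateTime($Horolog,3)
   }
 } Catch ex {
   ; Re-enable journaling before passing the error on to the caller
   Do ENABLE^%NOJRN
   Throw ex
 }
 Do ENABLE^%NOJRN
 Quit
```

Updates made while journaling is disabled cannot be recovered from the journal, so this pattern suits only data that can be rebuilt from another source.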

Journal I/O Errors

When Caché encounters a journal file I/O error, the response depends on the Freeze on error journal setting, which is on the Journal Settings page of the Management Portal (System Administration > Configuration > System Configuration > Journal Settings). The Freeze on error setting works as follows:

  • When the Freeze on error setting is No (the default), the journal daemon retries the failed operation until it succeeds or until one of several conditions is met, at which point all journaling is disabled. This approach keeps the system available, but disabling journaling compromises data integrity and recoverability.

  • When Freeze on error is set to Yes, all journaled global updates are frozen. This protects data integrity at the expense of system availability.

The Freeze on error setting also affects application behavior when a local transaction rollback fails.

InterSystems recommends you review your business needs and determine the best approach for your environment. The following sections describe the impact of each choice:

Journal Freeze on Error Setting is No

If you configure Caché not to freeze on a journal file I/O error, the journal daemon retries the failed operation periodically (typically at one-second intervals) until either it succeeds or one of the following conditions is met:

  • The daemon has been retrying the operation for a predetermined time period (typically 150 seconds)

  • The system cannot buffer any further journaled updates

When one of these conditions is met, journaling is disabled and database updates are no longer journaled. As a result, the journal is no longer a reliable source from which to recover databases if the system crashes. The following conditions exist when journaling is disabled:

  • Transaction rollback fails, generating <ROLLFAIL> errors and leaving transactions partly committed.

  • Shadowing becomes undependable once updates to the source databases are no longer journaled because it relies on journaling of the source databases.

  • Crash recovery of uncommitted data is nonexistent.

  • Full recovery no longer exists. You are able to recover only to the last backup.

  • ECP lock and transaction recoverability guarantees are compromised.

  • If the system crashes, Caché startup recovery does not attempt to roll back incomplete transactions started before it disabled journaling because the transactions may have been committed, but not journaled.

What to do if journaling is disabled?

To summarize, if journaling is disabled, perform the following steps:

  1. Resolve the problem — As soon as possible, resolve the problem that disabled journaling.

  2. Switch the journal file — The journal daemon retries the failed I/O operation periodically in an attempt to preserve the journal data accumulated before journaling was disabled. If necessary, you can switch the journal file to a new directory to resolve the error; however, Caché does not re-enable journaling automatically even if it succeeds with the failed I/O operation and switches journaling to a new file. It also does not re-enable journaling if you switch the journal file manually.

  3. Back up the databases — Back up the databases on the main server; the backup automatically re-enables journaling if you have not already done so.

    InterSystems strongly recommends backing up your databases as soon as possible after the error to avoid potential data loss. In fact, performing a Caché online backup when journaling is disabled due to an I/O error restarts journaling automatically, provided that the error condition that resulted in the disabling of journaling has been resolved and you have sufficient privileges to do so. You can also enable journaling by running ^JRNSTART.

    When a successful backup operation restarts journaling, Caché discards any pending journal I/O, since any database updates covered by the pending journal I/O are included in the backup.

    Important:

    Starting journaling requires higher privileges than running a backup.

  4. Restore shadow databases — If using shadowing, restore the backup to the shadow(s) to synchronize the databases, and restart the shadow from the new journal file started since the backup.

Journal Freeze on Error Setting is Yes

If you configure Caché to freeze on a journal file I/O error, all journaled global updates are frozen immediately upon such an error. This prevents the loss of journal data at the expense of system availability. Global updates are also frozen if the journal daemon has been unable to complete a journal write for at least 30 seconds.

The journal daemon retries the failed I/O operation and unfreezes global updates after it succeeds. Meanwhile, the freezing of global updates causes other jobs to hang. The typical outcome is that Caché hangs until you resolve the journaling problem, with the system appearing to be down to operational end-users. While Caché is hung you can take corrective measures, such as freeing up disk space, switching the journal to a different disk, or correcting a hardware failure.

The advantage to this option is that once the problem is resolved and Caché resumes normal operation, no journal data has been lost. The disadvantage is that the system is less available or unavailable while the problem is being solved.

Caché posts alerts (severity 3) to the cconsole.log file periodically while the journal daemon is retrying the failed I/O operation.

Impact of Journal Freeze on Error Setting on Transaction Rollback with TROLLBACK

The Freeze on error setting you choose can also have significant implications for application behavior unrelated to journaling. When an application attempts to roll back an open transaction using the TROLLBACK command (see TROLLBACK in the Caché ObjectScript Reference) and the attempt fails, the application faces the same tradeoff as with a journal I/O error: data integrity versus availability. Like journaling, TROLLBACK uses the Freeze on error setting to determine the appropriate behavior, as follows:

  • When the Freeze on error setting is No (the default), the process initiating the transaction and the TROLLBACK receives an error, the transaction is closed, and the locks retained for the transaction are released. This approach keeps the application available, but compromises data integrity and recoverability.

  • When Freeze on error is set to Yes, the initiating process halts and CLNDMN makes repeated attempts to roll back the open transaction. During the CLNDMN retry period, locks retained for the transaction remain intact, and as a result the application might hang. This protects data integrity at the expense of application availability.

If CLNDMN repeatedly tries and fails to roll back an open transaction for a dead job (as reported in the console log), you can use the Manage^CLNDMN utility to manually close the transaction.

Note:

The Freeze on error setting affects local (non-ECP) transaction rollback only.

Special Considerations for Journaling

Review the following special considerations when using Caché journaling:

Performance

While journaling is crucial to ensuring the integrity of your database, it can consume disk space and slow performance, depending on the number of global updates being journaled.

Journaling affects performance because updates result in double processing, as the change is recorded in both the database and the journal file. Caché uses a flat file journaling scheme to minimize the adverse effect on performance.

UNIX® File System Recommendations

The “Supported File Systems” table in the “Supported Technologies” section of the online InterSystems Supported Platforms document for this release outlines the file systems recommended and supported by InterSystems on UNIX®/Linux platforms, and includes notes about mount options for optimum journaling performance.

Note:

When you configure the primary or alternate journal directory on a file system that does not have the recommended mount option, a message like the following is entered in the console log:

The device for the new journal file was not mounted with a recommended option (cio).

System Clock Recommendations

All operating systems supported by Caché provide Network Time Protocol (NTP) clients, which keep the system clock synchronized to a reference system, as well as facilities that automatically adjust the system clock between daylight saving time and standard time.

It is recommended that you rely on the automatic clock management features of the operating system to keep the system clock synchronized and regulated rather than adjust the system clock manually.

If you must make manual time adjustments for tasks such as testing, be sure to use a test environment (rather than the production environment) when performing such tasks. Furthermore, manual adjustments should be made with care because non-chronological events – such as adjusting the clock forward or backward – may cause issues for some utilities.

Disabling Journaling for Filing Operations

Under certain circumstances, it may be useful or necessary to disable journaling for filing operations, such as object saves and deletes. There are two ways to do this:

  • When you open an object (typically with %OpenId or %Open), specify a concurrency value of 0. However, if the object is already open with a higher concurrency value, then specifying a concurrency of 0 is not effective.

  • Suspend object filer transaction processing for the current process. To do this, call $system.OBJ.SetTransactionMode(0) (which is the SetTransactionMode method of the %SYSTEM.OBJ class; you can invoke it through the special $system object). The SetTransactionMode method takes a value of 0 or 1: 0 turns off object filer transactions and 1 turns them on. Note that this setting affects the entire process, not just the current filing operation.

Important:

While certain circumstances call for disabling journaling, make sure that it is necessary before doing so. Otherwise, the journal may not include all the data required for recovery, which can result in the permanent loss of data.
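
The two approaches described in this section might look like the following sketch. The class Sample.Person (from the SAMPLES namespace), the object ID 1, and the property value are illustrative assumptions only; substitute your own class and data.

```objectscript
 ; Sketch only: Sample.Person, the ID 1, and the name are illustrative.
 ; Approach 1: open the object with a concurrency value of 0
 ; (the second argument of %OpenId)
 Set person=##class(Sample.Person).%OpenId(1,0)
 If $IsObject(person) {
   Set person.Name="Smith,John"
   Set sc=person.%Save()
 }
 ; Approach 2: turn off object filer transactions for this process,
 ; then turn them back on when the work is done
 Do $system.OBJ.SetTransactionMode(0)
 Set person=##class(Sample.Person).%OpenId(1)
 If $IsObject(person) {
   Set person.Name="Smith,John"
   Set sc=person.%Save()
 }
 Do $system.OBJ.SetTransactionMode(1)
```

Because the transaction mode applies to the whole process, the second approach turns the setting back on as soon as the filing operation completes.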
