System Monitor Health State

System Monitor maintains a single value summarizing overall system health in a register in shared memory. The value is based on notifications posted to the messages log (see Monitoring Log Files in the “Monitoring InterSystems IRIS Using the Management Portal” chapter of this guide), including both system alerts generated directly by the InterSystems IRIS instance and alerts and warnings generated by System Monitor and its Health Monitor component.

At startup, the system health state is set based on the number of system (not System Monitor) alerts posted to the messages log during the startup process. Once System Monitor is running, the health state can be elevated by either system alerts or System Monitor alerts or warnings. The state is cleared to the next lower level when 30 minutes have elapsed since the last system alert or System Monitor alert or warning was posted. The following table shows how the system health state is determined.

System Monitor Health State

GREEN (0)
    Set at startup when: no system alerts are posted during startup
    Set following startup when: 30 minutes (if state was YELLOW) or 60 minutes (if state was RED) have elapsed since the last system alert or System Monitor alert or warning was posted
    Cleared to: n/a

YELLOW (1)
    Set at startup when: one to four system alerts are posted during startup
    Set following startup when: state is GREEN and
      • one system alert is posted
        OR
      • one or more System Monitor alerts and/or warnings are posted, but not alerts sufficient to set RED, as below
    Cleared to: GREEN when 30 minutes have elapsed since the last system alert or System Monitor alert or warning was posted

RED (2)
    Set at startup when: five or more system alerts are posted during startup
    Set following startup when:
      • state is YELLOW and one system alert is posted
        OR
      • state is GREEN or YELLOW and, during a 30-minute period, System Monitor alerts from at least five different sensors or three System Monitor alerts from a single sensor are posted
    Cleared to: YELLOW when 30 minutes have elapsed since the last system alert or System Monitor alert or warning was posted

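The transition rules in the table can be sketched as a small state model. The Python below is an illustrative model only, not InterSystems code: the names `startup_state`, `HealthModel`, `post`, and `state` are hypothetical, times are plain seconds, and it assumes the state drops one level for each full 30 minutes without a posted alert or warning, which is how the 30- and 60-minute figures for GREEN are read here.

```python
from collections import Counter
from dataclasses import dataclass, field

GREEN, YELLOW, RED = 0, 1, 2
CLEAR_AFTER = 30 * 60   # quiet seconds needed to drop one level
RED_WINDOW = 30 * 60    # window for the System Monitor sensor rule


def startup_state(system_alerts: int) -> int:
    """State at startup from the number of system alerts posted."""
    if system_alerts == 0:
        return GREEN
    return YELLOW if system_alerts <= 4 else RED


@dataclass
class HealthModel:
    """Hypothetical model of the health state after startup."""
    level: int = GREEN            # state as of the last posted event
    last_event: float = 0.0       # time of the last alert or warning
    monitor_alerts: list = field(default_factory=list)  # (time, sensor)

    def state(self, now: float) -> int:
        # One level is cleared per full quiet period since the last event.
        quiet_periods = int((now - self.last_event) // CLEAR_AFTER)
        return max(GREEN, self.level - quiet_periods)

    def post(self, now: float, kind: str, sensor: str = "") -> None:
        """kind is 'system_alert', 'monitor_alert', or 'monitor_warning'."""
        current = self.state(now)
        if kind == "system_alert":
            # GREEN -> YELLOW, YELLOW -> RED, RED stays RED.
            new = min(RED, current + 1)
        else:
            # Any System Monitor alert or warning raises GREEN to YELLOW.
            new = max(current, YELLOW)
            if kind == "monitor_alert":
                self.monitor_alerts.append((now, sensor))
                # Keep only alerts inside the 30-minute window.
                self.monitor_alerts = [
                    (t, s) for t, s in self.monitor_alerts
                    if now - t <= RED_WINDOW
                ]
                per_sensor = Counter(s for _, s in self.monitor_alerts)
                if len(per_sensor) >= 5 or max(per_sensor.values()) >= 3:
                    new = RED
        self.level = new
        self.last_event = now
```

For example, under these assumptions, two system alerts a minute apart take a GREEN model to RED, and `state()` then reports YELLOW 30 minutes after the second alert and GREEN 60 minutes after it.
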
Note:

A fourth state, HUNG, can occur when global updates are blocked. Specifically, the following events change the state to HUNG:

    The journal daemon is paused for more than 5 seconds or frozen (see Journal I/O Errors in the “Journaling” chapter of the Data Integrity Guide).

    Any of switches 10, 11, 13, or 14 is set (see Using Switches in the “Managing InterSystems IRIS Remotely” chapter of Specialized System Tools and Utilities).

    The write daemon is stopped for any reason or sets the updates locked flag for more than 3 seconds.

    The number of available buffers falls into the critical region and remains there for more than 5 seconds.

When the health state changes to HUNG, the reason is written to the messages log.
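
The four HUNG conditions above can be read as a single predicate over a snapshot of system conditions. The sketch below is a hypothetical illustration of that logic, not an InterSystems API: the `Snapshot` fields and the `hung_reasons` function are invented for this example, and only the thresholds (5 seconds, switches 10, 11, 13, and 14, 3 seconds, 5 seconds) come from the list.

```python
from dataclasses import dataclass
from typing import List

# Switches that change the state to HUNG when set.
HUNG_SWITCHES = frozenset({10, 11, 13, 14})


@dataclass
class Snapshot:
    """Hypothetical snapshot of the conditions listed above."""
    journal_paused_secs: float = 0.0
    journal_frozen: bool = False
    switches_set: frozenset = frozenset()
    write_daemon_stopped: bool = False
    updates_locked_secs: float = 0.0
    buffers_critical_secs: float = 0.0


def hung_reasons(s: Snapshot) -> List[str]:
    """Return every condition that would change the state to HUNG."""
    reasons = []
    if s.journal_frozen or s.journal_paused_secs > 5:
        reasons.append("journal daemon paused or frozen")
    blocking = sorted(HUNG_SWITCHES & s.switches_set)
    if blocking:
        reasons.append(f"switches set: {blocking}")
    if s.write_daemon_stopped or s.updates_locked_secs > 3:
        reasons.append("write daemon stopped or updates locked")
    if s.buffers_critical_secs > 5:
        reasons.append("available buffers in critical region")
    return reasons
```

An empty list means none of the listed conditions holds; a non-empty list corresponds to the reason that would be written to the messages log.
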

You can view the System Monitor health state using:

    The View System Health option on the View System Data menu of ^%SYSMONMGR (which does not report HUNG).

    The $SYSTEM.Monitor API, which lets you access the system status directly. Use $SYSTEM.Monitor.State() to return the system status; see also the SetState, Clear, Alert, GetAlerts, and ClearAlerts methods.

    The iris list and iris qlist commands (which do not include health state on Windows).

Note:

When System Monitor is not running, the System Monitor health state is always GREEN.
