System Monitor Health State
System Monitor maintains a single value summarizing overall system health in a register in shared memory. This value is based on notifications posted to the messages log (see Monitoring Log Files in the “Monitoring InterSystems IRIS Using the Management Portal” chapter of this guide), including both system alerts generated directly by the InterSystems IRIS instance and alerts and warnings generated by System Monitor and its Health Monitor component.
At startup, the system health state is set based on the number of system (not System Monitor) alerts posted to the messages log during the startup process. Once System Monitor is running, the health state can be elevated by either system alerts or System Monitor alerts or warnings. The health state is cleared to the next lower level when 30 minutes have elapsed since the last system alert or System Monitor alert or warning was posted. The following table shows how the system health state is determined.
|State|Set at startup when ...|Set following startup when ...|Cleared to ...|
|---|---|---|---|
|GREEN|no system alerts are posted during startup|30 minutes (if state was YELLOW) or 60 minutes (if state was RED) have elapsed since the last system alert or System Monitor alert or warning was posted|n/a|
|YELLOW|one to four system alerts are posted during startup|state is GREEN and …|GREEN when 30 minutes have elapsed since the last system alert or System Monitor alert or warning was posted|
|RED|five or more system alerts are posted during startup| |YELLOW when 30 minutes have elapsed since the last system alert or System Monitor alert or warning was posted|
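The startup and clearing rules in the table can be modeled in a few lines. The following Python sketch is illustrative only and is not InterSystems code: the names `Health`, `startup_state`, and `state_after_quiet` are hypothetical, and the numeric values of the enumeration are just an ordering chosen for the sketch, not values returned by any API. It does not model how the state is elevated after startup.

```python
from enum import IntEnum

# From the text: the state drops one level after 30 minutes with no new
# system alert or System Monitor alert or warning.
CLEAR_INTERVAL_MIN = 30


class Health(IntEnum):
    """Ordering used by this sketch only (assumption, not an API value)."""
    GREEN = 0
    YELLOW = 1
    RED = 2


def startup_state(system_alerts: int) -> Health:
    """State at startup, from the number of system alerts posted during startup."""
    if system_alerts == 0:
        return Health.GREEN
    if system_alerts <= 4:
        return Health.YELLOW
    return Health.RED


def state_after_quiet(state: Health, quiet_minutes: float) -> Health:
    """State after quiet_minutes have passed with no new alert or warning."""
    levels = int(quiet_minutes // CLEAR_INTERVAL_MIN)
    return Health(max(Health.GREEN, state - levels))


if __name__ == "__main__":
    print(startup_state(3).name)                      # YELLOW
    print(state_after_quiet(Health.RED, 45).name)     # YELLOW
    print(state_after_quiet(Health.RED, 60).name)     # GREEN
```

Under this model a RED state needs two quiet 30-minute intervals to reach GREEN, which matches the 60-minute figure in the GREEN row.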
A fourth state, HUNG, can occur when global updates are blocked. Specifically, the following events change the state to HUNG:
The journal daemon is paused for more than 5 seconds or frozen (see Journal I/O Errors in the “Journaling” chapter of the Data Integrity Guide).
Any of switches 10, 11, 13, or 14 is set (see Using Switches in the “Managing InterSystems IRIS Remotely” chapter of Specialized System Tools and Utilities).
The write daemon is stopped for any reason or sets the updates locked flag for more than 3 seconds.
The number of available buffers falls into the critical region and remains there for more than 5 seconds.
When the health state changes to HUNG, the reason is written to the messages log.
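The four HUNG conditions listed above can be expressed as one predicate. This Python sketch is only a model of the documented thresholds: the `Observations` structure and its field names are hypothetical and do not correspond to any InterSystems IRIS data structure, and the reason strings are invented for illustration rather than copied from the messages log.

```python
from dataclasses import dataclass

# Switches that block global updates, per the list above.
HUNG_SWITCHES = frozenset({10, 11, 13, 14})


@dataclass
class Observations:
    """Hypothetical snapshot of the monitored conditions (durations in seconds)."""
    journal_daemon_paused_s: float = 0.0
    journal_daemon_frozen: bool = False
    switches_set: frozenset = frozenset()
    write_daemon_stopped: bool = False
    updates_locked_s: float = 0.0
    buffers_critical_s: float = 0.0


def hung_reasons(obs: Observations) -> list:
    """Return the reasons the state would be HUNG; an empty list means not HUNG."""
    reasons = []
    if obs.journal_daemon_frozen or obs.journal_daemon_paused_s > 5:
        reasons.append("journal daemon paused for more than 5 seconds or frozen")
    blocking = sorted(obs.switches_set & HUNG_SWITCHES)
    if blocking:
        reasons.append(f"switches set: {blocking}")
    if obs.write_daemon_stopped or obs.updates_locked_s > 3:
        reasons.append("write daemon stopped or updates locked for more than 3 seconds")
    if obs.buffers_critical_s > 5:
        reasons.append("available buffers critical for more than 5 seconds")
    return reasons


if __name__ == "__main__":
    sample = Observations(switches_set=frozenset({10, 12}), updates_locked_s=4.0)
    for reason in hung_reasons(sample):
        print(reason)
```

Returning a list of reasons rather than a single boolean mirrors the behavior described above, where the reason for a HUNG state is recorded when the state changes.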
You can view the System Monitor health state using:
The $SYSTEM.Monitor API, which lets you access the system status directly. Use $SYSTEM.Monitor.State() to return the system status; see also the SetState, Clear, Alert, GetAlerts, and ClearAlerts methods.
The iris list and iris qlist commands (which do not include health state on Windows).
When System Monitor is not running, the System Monitor health state is always GREEN.