
Monitoring InterSystems IRIS Using REST API

Every InterSystems IRIS® data platform instance contains a REST interface that provides statistics about the instance. The REST API provides a way to gather information from multiple machines running InterSystems IRIS, allowing you to monitor in detail all instances that comprise your application.

This appendix describes the metrics the /api/monitor service provides. These metrics are compatible with Prometheus, an open-source monitoring and alerting tool. Configuring Prometheus to scrape multiple connected InterSystems IRIS instances provides a cohesive view of your entire system, making it easier to evaluate whether the system is behaving properly and efficiently.

Note:

For an introduction to creating and using REST interfaces, see First Look: Developing REST Interfaces with InterSystems Products.

/api/monitor Service

The /api/monitor service provides information about the InterSystems IRIS instance on which it runs. By default, the /api/monitor web application is enabled with “Unauthenticated” access. For information about setting up authentication for this service, see the Securing REST Services chapter in Creating REST Services.

This API has the following two endpoints:

  • /metrics Endpoint, which returns all instance metrics, and can be configured to return specific application metrics.

  • /alerts Endpoint, which returns any system alerts that have been posted since the endpoint was last scraped.

Note:

InterSystems IRIS logs any errors in the SystemMonitor.log file, which is located in the install-dir/mgr directory.
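
As a rough illustration of how a client might call the two endpoints, the following Python sketch builds the endpoint URL and fetches the response body. It assumes the default “Unauthenticated” access and the default web server port 52773; the function names monitor_url and fetch and the host name in the usage note are hypothetical, not part of any InterSystems API.

```python
from urllib.request import urlopen

DEFAULT_PORT = 52773  # default web server port; adjust for your instance


def monitor_url(host: str, endpoint: str, port: int = DEFAULT_PORT) -> str:
    """Build the URL of an /api/monitor endpoint ("metrics" or "alerts")."""
    if endpoint not in ("metrics", "alerts"):
        raise ValueError(f"unknown endpoint: {endpoint!r}")
    return f"http://{host}:{port}/api/monitor/{endpoint}"


def fetch(host: str, endpoint: str, port: int = DEFAULT_PORT,
          timeout: float = 5.0) -> str:
    """Return the response body as text (assumes unauthenticated access)."""
    with urlopen(monitor_url(host, endpoint, port), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

For example, fetch("iris-host", "metrics") would return the metrics text for a hypothetical host named iris-host. If you have secured the web application, the request would also need credentials, which this sketch does not handle.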

/metrics Endpoint

The /metrics endpoint returns a list of metrics, which are described in the Metric Descriptions section. The Create Application Metrics section contains instructions for how to define custom metrics.

To configure Prometheus to scrape an instance of InterSystems IRIS, follow the instructions in First Steps With Prometheus (https://prometheus.io/docs/introduction/first_steps/).
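
The Prometheus side of this setup might look something like the configuration generated by the sketch below. It writes a minimal scrape_configs section that points Prometheus at /api/monitor/metrics on each instance. The job name, scrape interval, and target host names are placeholders chosen for illustration; only metrics_path reflects this document.

```python
def scrape_config(targets, job_name="iris", interval="15s"):
    """Render a minimal Prometheus scrape_configs section as YAML text.

    targets is a list of "host:port" strings, one per InterSystems IRIS
    instance. job_name and interval are illustrative defaults.
    """
    lines = [
        "scrape_configs:",
        f"  - job_name: {job_name}",
        f"    scrape_interval: {interval}",
        "    metrics_path: /api/monitor/metrics",
        "    static_configs:",
        "      - targets:",
    ]
    lines += [f"          - '{target}'" for target in targets]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical host names; replace them with your own instances.
    print(scrape_config(["iris-a:52773", "iris-b:52773"]), end="")
```

The printed text can be merged into a prometheus.yml file. Listing several targets under one job is one way to get the combined view of multiple instances described earlier.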

Metric Descriptions

The metrics are returned in a text-based format, described in the Exposition Formats page of the Prometheus documentation (https://prometheus.io/docs/instrumenting/exposition_formats/). Each metric appears on its own line, with a single space separating the metric name (and its label, if any) from the value.
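
To make the line format concrete, here is a minimal Python sketch that splits one line of /metrics output into a name, a dictionary of labels, and a numeric value. It is a simplified reading of the exposition format, not a full parser: it assumes no trailing timestamp and no escaped quotes inside label values. The function name parse_metric_line is hypothetical.

```python
import re

# Metric name, optionally followed by {label="value",...}
_HEAD = re.compile(r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>.*)\})?$')
_LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse_metric_line(line):
    """Return (name, labels, value), or None for blank and comment lines."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    # Split on the last space, so spaces inside label values are preserved.
    head, _, raw_value = line.rpartition(" ")
    match = _HEAD.match(head)
    if match is None:
        raise ValueError(f"unrecognized metric line: {line!r}")
    labels = dict(_LABEL.findall(match.group("labels") or ""))
    return match.group("name"), labels, float(raw_value)
```

For example, the line iris_db_size_mb{id="USER",dir="/iris/mgr/user/"} 11 would yield the name iris_db_size_mb, the labels id and dir, and the value 11.0. The database name and path in that line are made up for illustration.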

All the included InterSystems IRIS metrics are listed in the table below. Metric names with a label appear here with line breaks to improve readability.

Metric Name Description
iris_cpu_pct
{id="ProcessType"}
Percent of CPU usage by InterSystems IRIS process type. ProcessType can be any of the following:
ECPWorker, ECPCliR, ECPCliW, ECPSrvR, ECPSrvW, LICENSESRV, WDSLAVE, WRTDMN, JRNDMN, GARCOL, CSPDMN, CSPSRV, ODBCSRC, MirrorMaster, MirrorPri, MirrorBack, MirrorPre, MirrorSvrR, MirrorJrnR, MirrorSK, MirrorComm
(For more information about InterSystems IRIS Processes, see Securing InterSystems Products and Operating System Resources.)
iris_cpu_usage Percent of CPU usage for all programs on the operating system
iris_csp_activity
{id="IPaddress:port"}
Number of web requests served by the Web Gateway Server since it was started
iris_csp_actual_connections
{id="IPaddress:port"}
Number of current connections to this server by the Web Gateway Server
iris_csp_gateway_latency
{id="IPaddress:port"}
Amount of time to obtain a response from the Web Gateway Server when fetching iris_csp_ metrics, in milliseconds
iris_csp_in_use_connections
{id="IPaddress:port"}
Number of current connections to this server by the Web Gateway Server that are processing a web request
iris_csp_private_connections
{id="IPaddress:port"}
Number of current connections to this server by the Web Gateway Server that are reserved for state-aware applications (Preserve mode 1)
iris_csp_sessions Number of currently active web session IDs on this server
iris_cache_efficiency Ratio of global references to physical reads and writes, as a percent
iris_db_expansion_size_mb
{id="database"}
Amount by which to expand database, in megabytes
iris_db_free_space
{id="database"}
Free space available in database, in megabytes (This metric is only updated once per day, and may not reflect recent changes.)
iris_db_latency
{id="database"}
Amount of time to complete a random read from database, in milliseconds
iris_db_max_size_mb
{id="database"}
Maximum size to which database can grow, in megabytes
iris_db_size_mb
{id="database",dir="path"}
Size of database, in megabytes
iris_directory_space
{id="database",dir="path"}
Free space available on the database directory’s storage volume, in megabytes
iris_disk_percent_full
{id="database",dir="path"}
Percent of space filled on the database directory’s storage volume
iris_ecp_conn Total number of active client connections on this ECP application server
iris_ecp_conn_max Maximum active client connections from this ECP application server
iris_ecp_connections Number of servers synchronized when this ECP application server synchronizes with its configured ECP data servers
iris_ecp_latency Latency between the ECP application server and the ECP data server, in milliseconds
iris_ecps_conn Total active client connections to this ECP data server per second
iris_ecps_conn_max Maximum active client connections to this ECP data server
iris_glo_a_seize_per_sec Number of ASeizes on the global resource per second (For more information, see Considering Seizes, ASeizes, and NSeizes in the “Monitoring Performance Using ^mgstat” section of Monitoring Guide.)
iris_glo_n_seize_per_sec Number of NSeizes on the global resource per second (For more information, see Considering Seizes, ASeizes, and NSeizes in the “Monitoring Performance Using ^mgstat” section of Monitoring Guide.)
iris_glo_ref_per_sec Number of references to globals located on local databases per second
iris_glo_ref_rem_per_sec Number of references to globals located on remote databases per second
iris_glo_seize_per_sec Number of seizes on the global resource per second (For more information, see Considering Seizes, ASeizes, and NSeizes in the “Monitoring Performance Using ^mgstat” section of Monitoring Guide.)
iris_glo_update_per_sec Number of updates (SET and KILL commands) to globals located on local databases per second
iris_glo_update_rem_per_sec Number of updates (SET and KILL commands) to globals located on remote databases per second
iris_jrn_block_per_sec Journal blocks written to disk per second
iris_jrn_free_space
{id="JournalType",dir="path"}
Free space available on each journal directory’s storage volume, in megabytes. JournalType can be WIJ, primary, or secondary
iris_jrn_size
{id="JournalType"}
Current size of each journal file, in megabytes. JournalType can be WIJ, primary, or secondary
iris_license_available Number of licenses not currently in use
iris_license_consumed Number of licenses currently in use
iris_license_percent_used Percent of licenses currently in use
iris_log_reads_per_sec Logical reads per second
iris_obj_a_seize_per_sec Number of ASeizes on the object resource per second (For more information, see Considering Seizes, ASeizes, and NSeizes in the “Monitoring Performance Using ^mgstat” section of Monitoring Guide.)
iris_obj_del_per_sec Number of objects deleted per second
iris_obj_hit_per_sec Number of object references per second, in process memory
iris_obj_load_per_sec Number of objects loaded from disk per second, not in shared memory
iris_obj_miss_per_sec Number of object references not found in memory per second
iris_obj_new_per_sec Number of objects initialized per second
iris_obj_seize_per_sec Number of seizes on the object resource per second (For more information, see Considering Seizes, ASeizes, and NSeizes in the “Monitoring Performance Using ^mgstat” section of Monitoring Guide.)
iris_page_space_percent_used Percent of maximum allocated page file space used
iris_phys_mem_percent_used Percent of physical memory (RAM) currently in use
iris_phys_reads_per_sec Physical database blocks read from disk per second
iris_phys_writes_per_sec Physical database blocks written to disk per second
iris_process_count Total number of active InterSystems IRIS processes
iris_rtn_a_seize_per_sec Number of ASeizes on the routine resource per second (For more information, see Considering Seizes, ASeizes, and NSeizes in the “Monitoring Performance Using ^mgstat” section of Monitoring Guide.)
iris_rtn_call_local_per_sec Number of local routine calls per second
iris_rtn_call_miss_per_sec Number of routine calls not found in memory per second
iris_rtn_call_remote_per_sec Number of remote routine calls per second
iris_rtn_load_per_sec Number of routines locally loaded from or saved to disk per second
iris_rtn_load_rem_per_sec Number of routines remotely loaded from or saved to disk per second
iris_rtn_seize_per_sec Number of seizes on the routine resource per second (For more information, see Considering Seizes, ASeizes, and NSeizes in the “Monitoring Performance Using ^mgstat” section of Monitoring Guide.)
iris_sam_get_db_sensors_seconds Amount of time it took to collect iris_db* sensors, in seconds
iris_sam_get_jrn_sensors_seconds Amount of time it took to collect iris_jrn* sensors, in seconds
iris_smh_available
{id="purpose"}
Shared memory available by purpose, in kilobytes (For more information, including a list of identifiers for purpose, see Generic (Shared) Memory Heap Usage in “Monitoring InterSystems IRIS Using the Management Portal” section of Monitoring Guide.)
iris_smh_percent_full
{id="purpose"}
Percent of allocated shared memory in use by purpose (For more information, including a list of identifiers for purpose, see Generic (Shared) Memory Heap Usage in “Monitoring InterSystems IRIS Using the Management Portal” section of Monitoring Guide.)
iris_smh_total Shared memory allocated for current instance, in kilobytes
iris_smh_total_percent_full Percent of allocated shared memory in use for current instance
iris_smh_used
{id="purpose"}
Shared memory in use by purpose, in kilobytes (For more information, including a list of identifiers for purpose, see Generic (Shared) Memory Heap Usage in “Monitoring InterSystems IRIS Using the Management Portal” section of Monitoring Guide.)
iris_system_alerts The number of alerts posted to the messages log since system startup
iris_system_alerts_new Whether new alerts are available on the /api/monitor/alerts endpoint, as a Boolean
iris_system_state A number representing the system monitor health state (For more information, see System Monitor Health State in the “Using System Monitor” section of Monitoring Guide.)
iris_trans_open_count Number of open transactions on the current instance
iris_trans_open_secs Average duration of open transactions on the current instance, in seconds
iris_trans_open_secs_max Duration of longest currently open transaction on the current instance, in seconds
iris_wd_buffer_redirty Number of database buffers the write daemon wrote during the most recent cycle that were also written in the prior cycle
iris_wd_buffer_write Number of database buffers the write daemon wrote during its most recent cycle
iris_wd_cycle_time Amount of time the most recent write daemon cycle took to complete, in milliseconds
iris_wd_proc_in_global Number of processes actively holding global buffers at start of the most recent write daemon cycle
iris_wd_size_write Size of database buffers the write daemon wrote during its most recent cycle, in kilobytes
iris_wd_sleep Amount of time that the write daemon was inactive before its most recent cycle began, in milliseconds
iris_wd_temp_queue Number of in-memory buffers the write daemon used at the start of its most recent cycle
iris_wd_temp_write Number of in-memory buffers the write daemon wrote during its most recent cycle
iris_wdwij_time Amount of time the write daemon spent writing to the WIJ file during its most recent cycle, in milliseconds
iris_wd_write_time Amount of time the write daemon spent writing buffers to databases during its most recent cycle, in milliseconds
iris_wij_writes_per_sec WIJ physical block writes per second
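
One way to act on these metrics outside of Prometheus is a simple threshold check. The Python sketch below takes samples as (name, labels, value) tuples and returns those that exceed a per-metric limit. The metric names come from the table above, but the threshold values are hypothetical and should be tuned for your own deployment.

```python
# Hypothetical limits, in percent; these are not recommended values.
THRESHOLDS = {
    "iris_disk_percent_full": 90.0,
    "iris_license_percent_used": 80.0,
    "iris_smh_total_percent_full": 85.0,
}


def over_threshold(samples, thresholds=THRESHOLDS):
    """Return the samples whose value exceeds the limit for their metric.

    samples is an iterable of (name, labels, value) tuples. Metrics with
    no configured limit are ignored.
    """
    return [
        (name, labels, value)
        for name, labels, value in samples
        if name in thresholds and value > thresholds[name]
    ]
```

Because labeled metrics such as iris_disk_percent_full are reported once per database, keeping the labels in each result shows which database directory triggered the check.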

Create Application Metrics

To add custom application metrics to those returned by the /metrics endpoint:

  1. Create a new class that inherits from %SYS.Monitor.SAM.Abstract.

  2. Define the PRODUCT parameter as the name of your application. This can be anything except for iris, which is reserved for the InterSystems IRIS metrics.

  3. Implement the GetSensors() method to define the desired custom metrics, as follows:

    • The method must contain one or more calls to the SetSensor() method. This method sets the name and value for an application metric. The values should be integers or floating point numbers to ensure compatibility with Prometheus and InterSystems SAM.

      You can optionally define a label for the metric, though if you do, you must always define a label for that particular metric.

      Note:

      For best practices when choosing metric and label names, see Metric and Label Naming in the Prometheus documentation (https://prometheus.io/docs/practices/naming/).

    • The method must return $$$OK if successful.

    Important:

    A slow implementation of GetSensors() can negatively impact system performance. Be sure to test that your implementation of GetSensors() is efficient, and avoid implementations that could time out or hang.

  4. Compile the class. An example is shown below:

    /// Example of a custom class for the /metrics API
    Class MyMetrics.Example Extends %SYS.Monitor.SAM.Abstract
    {
    
    Parameter PRODUCT = "myapp";
    
    /// Collect metrics from the specified sensors
    Method GetSensors() As %Status
    {
       do ..SetSensor("my_counter",$increment(^MyCounter),"my_label")
       do ..SetSensor("my_gauge",$random(100))
       return $$$OK
    }
    
    }
  5. Use the AddApplicationClass() method of the SYS.Monitor.SAM.Config class to add the custom class to the /metrics configuration. Pass as arguments the name of the class and the namespace where it is located.

    For example, enter the following in the Terminal from the %SYS namespace:

    %SYS>set status = ##class(SYS.Monitor.SAM.Config).AddApplicationClass("MyMetrics.Example", "USER")
    
    %SYS>zw status
    status=1
  6. Review the output of the /metrics endpoint by pointing your browser to http://<instance-host>:52773/api/monitor/metrics (where 52773 is the default WebServer port). The metrics you defined should appear after the InterSystems IRIS metrics, such as:

    [...]
    myapp_my_counter{id="my_label"} 1
    myapp_my_gauge 92
    

The /metrics endpoint now returns the custom metrics you defined. The InterSystems IRIS metrics include an “iris_” prefix, while your custom metrics use the value of PRODUCT as a prefix.
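
Because the prefix identifies the source of each metric, a client can separate application metrics from the built-in ones by name alone. The following Python sketch shows one possible approach. The function name split_metrics is hypothetical, and "myapp" matches the PRODUCT value used in the example class above.

```python
def split_metrics(names, product):
    """Separate metric names into built-in, application, and other groups.

    product is the PRODUCT value of the custom class, for example "myapp".
    """
    groups = {"iris": [], product: [], "other": []}
    for name in names:
        if name.startswith("iris_"):
            groups["iris"].append(name)
        elif name.startswith(product + "_"):
            groups[product].append(name)
        else:
            groups["other"].append(name)
    return groups
```

Matching on the full prefix plus an underscore, rather than splitting each name at its first underscore, keeps the sketch correct even if a PRODUCT value itself contains an underscore.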

/alerts Endpoint

The /alerts endpoint fetches the most recent alerts from the alerts.log file and returns them in JSON format, such as:

{"time":"2019-08-15T10:36:38.313Z","severity":2,"message":"Failed to allocate 1150MB shared memory using large pages.  Switching to small pages."}

When /alerts is called, it returns the alerts that have been generated since the previous time /alerts was called. The iris_system_alerts_new metric is a Boolean that indicates whether new alerts have been generated.
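
A client that consumes /alerts might decode the response along the lines of the Python sketch below. The exact shape of a multi-alert response is not specified here, so the sketch makes an assumption: it accepts a single JSON object, a JSON array, or one JSON object per line. The function names and the min_severity parameter are hypothetical.

```python
import json


def parse_alerts(body):
    """Decode an /alerts response body into a list of alert dictionaries."""
    text = body.strip()
    if not text:
        return []
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        # Assumption: several alerts may arrive as one JSON object per line.
        data = [json.loads(line) for line in text.splitlines() if line.strip()]
    return [data] if isinstance(data, dict) else list(data)


def at_least(alerts, min_severity):
    """Keep only alerts whose severity is min_severity or higher."""
    return [alert for alert in alerts if alert.get("severity", 0) >= min_severity]
```

Because each call returns only the alerts posted since the previous call, a polling client should store or forward every result it receives. Checking iris_system_alerts_new first is one way to avoid calling /alerts when nothing new has been posted.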

For more information about when and how alerts are generated, see the Using Log Monitor chapter of this guide.