Monitoring InterSystems IRIS via REST
Every InterSystems IRIS® data platform instance contains a REST interface that provides statistics about the instance. This REST API provides a way to gather information from multiple machines running InterSystems IRIS, allowing you to monitor in detail all instances that comprise your application. The API follows the OpenMetrics standard.
This topic describes the metrics provided by the /api/monitor service. These metrics are compatible with Prometheus, an open-source monitoring and alerting tool. Configuring Prometheus to scrape multiple connected InterSystems IRIS instances provides a cohesive view of your entire system, making it easier to evaluate whether the system is behaving properly and efficiently.
Introduction to /api/monitor Service
The /api/monitor service provides information about the InterSystems IRIS instance on which it runs. By default, the /api/monitor web application is enabled with “Unauthenticated” access. For information about setting up authentication for this service, see Securing REST Services.
This API has the following three endpoints:
-
/api/monitor/metrics, which returns all instance metrics, and can be configured to return specific application metrics.
-
/api/monitor/alerts, which returns any system alerts that have been posted since the endpoint was last scraped.
-
/api/monitor/interop/interfaces, which returns the number and type of production interfaces running within a specified time span.
InterSystems IRIS logs any errors in the SystemMonitor.log file, which is located in the install-dir/mgr directory.
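As an illustration of how a client might call these endpoints, the following Python sketch builds the endpoint URLs and issues a plain GET request. The base URL (host and port) is a placeholder that you must replace with the address of your own instance, and the helper names endpoint_url and fetch are hypothetical. The sketch assumes the default unauthenticated access; a secured instance would need credentials added to the request.

```python
import urllib.request

# Placeholder base URL (an assumption): substitute the host and web server
# port of the InterSystems IRIS instance you want to query.
BASE_URL = "http://localhost:52773"

# The three endpoints listed above.
ENDPOINTS = {
    "metrics": "/api/monitor/metrics",
    "alerts": "/api/monitor/alerts",
    "interfaces": "/api/monitor/interop/interfaces",
}


def endpoint_url(name, base_url=BASE_URL):
    """Build the full URL for one of the /api/monitor endpoints."""
    return base_url.rstrip("/") + ENDPOINTS[name]


def fetch(name, base_url=BASE_URL, timeout=10):
    """Issue a GET request and return the response body as text.

    Assumes unauthenticated access; network errors are raised to the caller.
    """
    with urllib.request.urlopen(endpoint_url(name, base_url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Example usage (requires a reachable instance):
#     print(fetch("metrics")[:500])
```

In a production setup, Prometheus typically performs this scrape for you; a script like this is mainly useful for a quick manual check.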
/api/monitor/metrics
The /api/monitor/metrics endpoint returns a list of metrics, which are described in Metric Descriptions. You can also enable the collection of additional metrics about active interoperability productions, as described in Interoperability Metrics. Create Application Metrics contains instructions for how to define custom metrics.
To configure Prometheus to scrape an instance of InterSystems IRIS, follow the instructions in First Steps With Prometheus (https://prometheus.io/docs/introduction/first_steps/).
Metric Descriptions
The metrics are returned in a text-based format, described in the Prometheus documentation (https://prometheus.io/docs/instrumenting/exposition_formats/). Each metric is listed on a single line, with a single space separating the name from the value. Each unique metric is preceded by # HELP and # TYPE comment lines (described in https://prometheus.io/docs/instrumenting/exposition_formats/#comments-help-text-and-type-information). Where applicable, InterSystems IRIS also includes a # UNIT comment line; this comment specifies the unit of measurement used for the metric. You can configure custom application metrics to supply these comment lines as well.
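To make the line format concrete, the following Python sketch parses a response body into (name, labels, value) tuples and skips the # comment lines. It is a minimal illustration rather than a complete parser for the exposition format: it does not unescape label values or handle optional timestamps, and the function name parse_metrics is hypothetical.

```python
import re

# A sample line is: metric name, optional {label="value",...}, then the value.
_SAMPLE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
# One label pair inside the braces; escaped characters are kept as written.
_LABEL = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and # HELP / # TYPE / # UNIT comment lines.
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE.match(line)
        if match is None:
            continue  # ignore anything that does not look like a sample
        name, raw_labels, value = match.groups()
        labels = dict(_LABEL.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples
```

Each returned tuple pairs a metric name from the table below with its labels (for example, the id label) and its numeric value.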
InterSystems IRIS metrics are listed in the table below. Metric names with a label appear here with line breaks to improve readability.
This table contains metrics for the version of InterSystems IRIS documented here. As metrics may be added in newer versions, be sure this documentation matches your version of InterSystems IRIS.
Metric Name | Description |
---|---|
iris_cpu_pct
{id="ProcessType"} |
Percent of CPU usage by InterSystems IRIS process type. ProcessType can be any of the following:
ECPWorker, ECPCliR, ECPCliW, ECPSrvR, ECPSrvW, LICENSESRV, WDAUX, WRTDMN, JRNDMN, GARCOL, CSPDMN, CSPSRV, ODBCSRC, MirrorMaster, MirrorPri, MirrorBack, MirrorPre, MirrorSvrR, MirrorJrnR, MirrorSK, MirrorComm (see Secure InterSystems Processes and Operating System Resources). |
iris_cpu_usage | Percent of CPU usage for all programs on the operating system |
iris_csp_activity
{id="IPaddress:port"} |
Number of web requests served by the Web Gateway Server since it was started |
iris_csp_actual_connections
{id="IPAddress:port"} |
Number of current connections to this server by the Web Gateway Server |
iris_csp_gateway_latency
{id="IPaddress:port"} |
Amount of time to obtain a response from the Web Gateway Server when fetching iris_csp_ metrics, in milliseconds |
iris_csp_in_use_connections
{id="IPaddress:port"} |
Number of current connections to this server by the Web Gateway Server that are processing a web request |
iris_csp_private_connections
{id="IPaddress:port"} |
Number of current connections to this server by the Web Gateway Server that are reserved for state-aware applications (Preserve mode 1) |
iris_csp_sessions | Number of currently active web session IDs on this server |
iris_cache_efficiency | Ratio of global references to physical reads and writes, as a percent |
iris_db_expansion_size_mb
{id="database"} |
Amount by which to expand database, in megabytes |
iris_db_free_space
{id="database"} |
Free space available in database, in megabytes (This metric is only updated once per day, and may not reflect recent changes.) |
iris_db_latency
{id="database"} |
Amount of time to complete a random read from database, in milliseconds |
iris_db_max_size_mb
{id="database"} |
Maximum size to which database can grow, in megabytes |
iris_db_size_mb
{id="database",dir="path"} |
Size of database, in megabytes |
iris_directory_space
{id="database",dir="path"} |
Free space available on the database directory’s storage volume, in megabytes |
iris_disk_percent_full
{id="database",dir="path"} |
Percent of space filled on the database directory’s storage volume |
iris_ecp_conn | Total number of active client connections on this ECP application server |
iris_ecp_conn_max | Maximum active client connections from this ECP application server |
iris_ecp_connections | Number of servers synchronized when this ECP application server synchronizes with its configured ECP data servers |
iris_ecp_latency | Latency between the ECP application server and the ECP data server, in milliseconds |
iris_ecps_conn | Total active client connections to this ECP data server per second |
iris_ecps_conn_max | Maximum active client connections to this ECP data server |
iris_glo_a_seize_per_sec | Number of Aseizes on the global resource per second (see Considering Seizes, ASeizes, and NSeizes) |
iris_glo_n_seize_per_sec | Number of Nseizes on the global resource per second (see Considering Seizes, ASeizes, and NSeizes) |
iris_glo_ref_per_sec | Number of references to globals located on local databases per second |
iris_glo_ref_rem_per_sec | Number of references to globals located on remote databases per second |
iris_glo_seize_per_sec | Number of seizes on the global resource per second (see Considering Seizes, ASeizes, and NSeizes) |
iris_glo_update_per_sec | Number of updates (SET and KILL commands) to globals located on local databases per second |
iris_glo_update_rem_per_sec | Number of updates (SET and KILL commands) to globals located on remote databases per second |
iris_jrn_block_per_sec | Journal blocks written to disk per second |
iris_jrn_free_space
{id="JournalType",dir="path"} |
Free space available on each journal directory’s storage volume, in megabytes. JournalType can be WIJ, primary, or secondary |
iris_jrn_size
{id="JournalType"} |
Current size of each journal file, in megabytes. JournalType can be WIJ, primary, or secondary |
iris_license_available | Number of licenses not currently in use |
iris_license_consumed | Number of licenses currently in use |
iris_license_days_remaining | Number of days before the InterSystems IRIS license expires. Supports up to one decimal place |
iris_license_percent_used | Percent of licenses currently in use |
iris_log_reads_per_sec | Logical reads per second |
iris_obj_a_seize_per_sec | Number of Aseizes on the object resource per second (see Considering Seizes, ASeizes, and NSeizes) |
iris_obj_del_per_sec | Number of objects deleted per second |
iris_obj_hit_per_sec | Number of object references per second, in process memory |
iris_obj_load_per_sec | Number of objects loaded from disk per second, not in shared memory |
iris_obj_miss_per_sec | Number of object references not found in memory per second |
iris_obj_new_per_sec | Number of objects initialized per second |
iris_obj_seize_per_sec | Number of seizes on the object resource per second (see Considering Seizes, ASeizes, and NSeizes) |
iris_page_space_percent_used | Percent of maximum allocated page file space used |
iris_phys_mem_percent_used | Percent of physical memory (RAM) currently in use |
iris_phys_reads_per_sec | Physical database blocks read from disk per second |
iris_phys_writes_per_sec | Physical database blocks written to disk per second |
iris_process_count | Total number of active InterSystems IRIS processes |
iris_rtn_a_seize_per_sec | Number of Aseizes on the routine resource per second (see Considering Seizes, ASeizes, and NSeizes) |
iris_rtn_call_local_per_sec | Number of local routine calls per second |
iris_rtn_call_miss_per_sec | Number of routine calls not found in memory per second |
iris_rtn_call_remote_per_sec | Number of remote routine calls per second |
iris_rtn_load_per_sec | Number of routines locally loaded from or saved to disk per second |
iris_rtn_load_rem_per_sec | Number of routines remotely loaded from or saved to disk per second |
iris_rtn_seize_per_sec | Number of seizes on the routine resource per second (see Considering Seizes, ASeizes, and NSeizes) |
iris_sam_get_db_sensors_seconds | Amount of time it took to collect iris_db* sensors, in seconds |
iris_sam_get_jrn_sensors_seconds | Amount of time it took to collect iris_jrn* sensors, in seconds |
iris_sam_get_sql_sensors_seconds | Amount of time it took to collect iris_sql* sensors, in seconds |
iris_sam_get_wqm_sensors_seconds | Amount of time it took to collect iris_wqm* sensors, in seconds |
iris_smh_available
{id="purpose"} |
Shared memory available by purpose, in kilobytes (For more information, including a list of identifiers for purpose, see Generic (Shared) Memory Heap Usage.) |
iris_smh_percent_full
{id="purpose"} |
Percent of allocated shared memory in use by purpose (For more information, including a list of identifiers for purpose, see Generic (Shared) Memory Heap Usage.) |
iris_smh_total | Shared memory allocated for current instance, in kilobytes |
iris_smh_total_percent_full | Percent of allocated shared memory in use for current instance |
iris_smh_used
{id="purpose"} |
Shared memory in use by purpose, in kilobytes (For more information, including a list of identifiers for purpose, see Generic (Shared) Memory Heap Usage.) |
iris_sql_active_queries
{id="namespace"} |
The number of SQL statements currently executing |
iris_sql_active_queries_95_percentile
{id="namespace"} |
For the current set of active SQL statements, the 95th percentile elapsed time since a statement began executing |
iris_sql_active_queries_99_percentile
{id="namespace"} |
For the current set of active SQL statements, the 99th percentile elapsed time since a statement began executing |
iris_sql_commands_per_second
{id="namespace"} |
Average number of ObjectScript commands executed to perform SQL queries, per second |
iris_sql_queries_avg_runtime
{id="namespace"} |
Average SQL statement runtime, in seconds |
iris_sql_queries_avg_runtime_std_dev
{id="namespace"} |
Standard deviation of the average SQL statement runtime |
iris_sql_queries_per_second
{id="namespace"} |
Average number of SQL statements, per second |
iris_system_alerts | The number of alerts posted to the messages log since system startup |
iris_system_alerts_log | The number of alerts currently located in the alerts log |
iris_system_alerts_new | Whether new alerts are available on the /api/monitor/alerts endpoint, as a Boolean |
iris_system_state | A number representing the system monitor health state (see System Monitor Health State.) |
iris_trans_open_count | Number of open transactions on the current instance |
iris_trans_open_secs | Average duration of open transactions on the current instance, in seconds |
iris_trans_open_secs_max | Duration of longest currently open transaction on the current instance, in seconds |
iris_wd_buffer_redirty | Number of database buffers the write daemon wrote during the most recent cycle that were also written in the prior cycle |
iris_wd_buffer_write | Number of database buffers the write daemon wrote during its most recent cycle |
iris_wd_cycle_time | Amount of time the most recent write daemon cycle took to complete, in milliseconds |
iris_wd_proc_in_global | Number of processes actively holding global buffers at start of the most recent write daemon cycle |
iris_wd_size_write | Size of database buffers the write daemon wrote during its most recent cycle, in kilobytes |
iris_wd_sleep | Amount of time that the write daemon was inactive before its most recent cycle began, in milliseconds |
iris_wd_temp_queue | Number of in-memory buffers the write daemon used at the start of its most recent cycle |
iris_wd_temp_write | Number of in-memory buffers the write daemon wrote during its most recent cycle |
iris_wdwij_time | Amount of time the write daemon spent writing to the WIJ file during its most recent cycle, in milliseconds |
iris_wd_write_time | Amount of time the write daemon spent writing buffers to databases during its most recent cycle, in milliseconds |
iris_wij_writes_per_sec | WIJ physical block writes per second |
iris_wqm_active_worker_jobs
{id="category"} |
Average number of worker jobs running logic that are not blocked |
iris_wqm_commands_per_sec
{id="category"} |
Average number of commands executed in this Work Queue Management category, per second |
iris_wqm_globals_per_sec
{id="category"} |
Average number of global references run in this Work Queue Management category, per second |
iris_wqm_max_active_worker_jobs
{id="category"} |
Maximum number of active workers since the last log entry was recorded |
iris_wqm_max_work_queue_depth
{id="category"} |
Maximum number of entries in the queue of this Work Queue Management category since the last log |
iris_wqm_waiting_worker_jobs
{id="category"} |
Average number of idle worker jobs waiting for a group to connect to and do work for |
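As one way to act on these metrics outside of Prometheus, the following Python sketch checks a few percentage metrics from the table against limits. It assumes the metrics have already been parsed into (name, labels, value) tuples; the threshold values and the function name find_breaches are hypothetical and should be adapted to your environment.

```python
# Hypothetical limits (assumptions); choose values that suit your system.
THRESHOLDS = {
    "iris_disk_percent_full": 90.0,
    "iris_license_percent_used": 80.0,
    "iris_phys_mem_percent_used": 95.0,
}


def find_breaches(samples, thresholds=THRESHOLDS):
    """Return (name, id label, value, limit) for each sample above its limit.

    samples is an iterable of (name, labels, value) tuples, where labels
    is a dict such as {"id": "USER", "dir": "/data/user/"}.
    """
    breaches = []
    for name, labels, value in samples:
        limit = thresholds.get(name)
        if limit is not None and value > limit:
            breaches.append((name, labels.get("id", ""), value, limit))
    return breaches
```

In a Prometheus deployment, checks of this kind are more commonly expressed as alerting rules; the sketch only shows how the metric names and labels fit together.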
Interoperability Metrics
The interoperability production data collected by the /metrics endpoint, as described in this section, is very granular: it provides detailed information such as the number of messages processed and the average number of characters processed. For a broader overview of the status of your production interfaces, use the /interop/interfaces endpoint instead.
In addition to the metrics described in the previous section, an InterSystems IRIS instance can also record metrics about active interoperability productions and include them in the output of the /metrics endpoint. The recording of these interoperability metrics is disabled by default. To enable it, you must perform the following steps for each interoperability production you want to monitor:
-
Open a Terminal session for the InterSystems IRIS instance running the production you want to monitor. If necessary, switch to the namespace associated with the production by executing the following command:
set $namespace = "[interopNS]"
where [interopNS] is the namespace name.
-
In the Terminal, execute the following command to enable the collection of metrics for the active production within the current namespace (SAM refers to System Alerting and Monitoring, the InterSystems monitoring solution):
do ##class(Ens.Util.Statistics).EnableSAMForNamespace()
Note: If the recording of metrics is enabled for a namespace but the corresponding production is not active, the /metrics endpoint does not return any metrics.
The Ens.Util.Statistics class provides methods for customizing the output of the /metrics endpoint. For example, invoking the method DisableSAMIncludeHostLabel will provide aggregated metrics for the entire production instead of providing them for each host individually.
The metrics available after completing this step are described in the Basic Interoperability Metrics and Message Retention Metrics tables below.
-
To collect additional metrics for a production, enable activity monitoring by invoking the class method Ens.Util.Statistics.EnableStatsForProduction in the corresponding namespace using the Terminal. You must also add the Ens.Activity.Operation.Local business operation to the production. This process is detailed in Enabling Activity Monitoring.
The additional metrics available after completing this step are described in the Activity Volume Metrics table below.
-
To collect additional HTTP transmission metrics for EnsLib.HTTP.OutboundAdapter or the EnsLib.SOAP.OutboundAdapter (used in productions), enable the reporting of HTTP metrics for the corresponding business operation by performing the following steps:
-
Open the Management Portal for the InterSystems IRIS instance containing the web client you want to monitor.
-
Select Interoperability and choose the namespace containing the web client.
-
Select Configure > Production to open the Production Configuration page.
-
Select the operation which uses the HTTP or SOAP outbound adapter.
-
In the Alerting Control section of the Production Settings > Settings panel, select the Provide Metrics for SAM check box.
-
Select Apply to save your settings.
The additional metrics available after completing this step are described in the HTTP Metrics table below.
Note: Currently, HTTP transmission metrics are only collected for business operations which invoke actors using the Queue style (not inProc). For more information on the difference between these invocation styles, see Defining a Business Operation Class.
InterSystems IRIS interoperability metrics are listed in the tables below. Metric names with a label appear here with line breaks to improve readability.
These tables contain metrics for the version of InterSystems IRIS documented here. As metrics may be added in newer versions, be sure this documentation matches your version of InterSystems IRIS.
Metric Name | Description |
---|---|
iris_interop_alert_delay
{id="namespace",host="host",production="production"} |
Number of hosts within the production and namespace that have triggered a Queue Wait Alert. If output has been configured to include host labels, the hosts that have triggered Queue Wait Alerts are provided separately and the value will be 1. |
iris_interop_hosts
{id="namespace",status="status",host="host",production="production"} |
Number of hosts within the production and namespace which currently have the specified status. If output has been configured to include host labels, the status of each host is provided separately and the value will be 1. status can be OK, Error, Retry, Starting, Inactive, or Unconfigured. |
iris_interop_messages
{id="namespace",host="host",production="production"} |
Number of messages processed since the production started. If output has been configured to include host labels, the number of messages processed by each host is provided separately |
iris_interop_messages_per_sec
{id="namespace",host="host",production="production"} |
Average number of messages processed within the production and namespace in a second over the most recent sampling interval. If output has been configured to include host labels, the number of messages processed by each host is provided separately |
iris_interop_queued
{id="namespace",host="host",production="production"} |
Number of messages currently queued within the production and namespace. If output has been configured to include host labels, the number of messages currently queued for each host is provided separately. |
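When output includes host labels, each host is reported as a separate sample. The following Python sketch shows one way to roll such samples up to a single total for each namespace and production. It assumes the metrics have already been parsed into (name, labels, value) tuples, and the function name total_by_production is hypothetical.

```python
from collections import defaultdict


def total_by_production(samples, metric_name):
    """Sum a metric across hosts, keyed by (namespace, production).

    samples is an iterable of (name, labels, value) tuples. The id label
    carries the namespace and the production label carries the production
    name, as shown in the table above.
    """
    totals = defaultdict(float)
    for name, labels, value in samples:
        if name == metric_name:
            key = (labels.get("id", ""), labels.get("production", ""))
            totals[key] += value
    return dict(totals)
```

Summing is appropriate for counts such as iris_interop_queued or iris_interop_messages; it is not meaningful for averages.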
The Message Retention Metrics, described in the following table, can help you to determine if your production data is being purged appropriately. These metrics are tabulated daily, or upon restart of the Ens.MonitorService service.
Metric Name | Description |
---|---|
iris_interop_oldest_message_header_days{id="namespace",production="production"} |
Age (in days) of the oldest message header that is currently stored within the namespace and production. |
iris_interop_oldest_message_header_count{id="namespace",production="production"} |
Number of message headers of the age specified by iris_interop_oldest_message_header_days which are currently stored within the namespace and production. |
iris_interop_header_count_older_than{id="namespace",days="n",production="production"} |
Number of message headers more than n days old which are currently stored within the namespace and production. InterSystems IRIS provides this metric for the following values of n: 1, 7, 14, 28, 56. |
iris_interop_namespace_storage_mb{id="namespace",production="production"} |
Total volume (in megabytes) of occupied storage space within databases which are used exclusively by the namespace and production. |
iris_interop_session_count{id="namespace",production="production"} |
Number of unique sessions recorded for a namespace and production. |
iris_interop_session_storage_kb{id="namespace",production="production"} |
Average volume (in kilobytes) of storage space used per session within databases for the namespace and production. |
InterSystems IRIS provides Activity Volume Metrics and HTTP Metrics (described in the two tables which follow) only if you perform additional activation steps as described at the beginning of this section.
Metric Name | Description |
---|---|
iris_interop_avg_processing_time
{id="namespace",hosttype="HostType",host="host",production="production",messagetype="MessageType"} |
Average length of time required to process a message of the specified MessageType within the production and namespace, in seconds. HostType can be service, operation, or actor (that is, process). MessageType is user-defined; if no MessageType is specified, "-" is returned. If output has been configured to include host labels, the message processing time for each host is provided separately. |
iris_interop_avg_queueing_time
{id="namespace",hosttype="HostType",host="host",production="production",messagetype="MessageType"} |
Average duration that a message of the specified MessageType spent in the queue while being processed by a host of HostType within the production and namespace, in seconds. HostType can be service, operation, or actor (that is, process). MessageType is user-defined; if no MessageType is specified, "-" is returned. If output has been configured to include host labels, the queueing time for each host is provided separately. |
iris_interop_sample_count
{id="namespace",hosttype="HostType",host="host",production="production",messagetype="MessageType"} |
Number of messages of the specified MessageType processed by a host of HostType within the production and namespace over the most recent sampling interval. HostType can be service, operation, or actor (that is, process). MessageType is user-defined; if no MessageType is specified, "-" is returned. If output has been configured to include host labels, the number of messages processed by each host is provided separately. |
iris_interop_sample_count_per_sec
{id="namespace",hosttype="HostType",host="host",production="production",messagetype="MessageType"} |
Number of messages of the specified MessageType processed per second by a host of HostType within the production and namespace, averaged over the most recent sampling interval. HostType can be service, operation, or actor (that is, process). MessageType is user-defined; if no MessageType is specified, "-" is returned. If output has been configured to include host labels, the number of messages processed by each host is provided separately. |
Metric Name | Description |
---|---|
iris_interop_avg_http_received_chars
{id="namespace",host="host",production="production"} |
Average number of characters received per HTTP or SOAP response within the production and namespace over the most recent sampling interval. If output has been configured to include host labels, the average number of characters received by each host is provided separately. |
iris_interop_avg_http_sent_chars
{id="namespace",host="host",production="production"} |
Average number of characters sent per HTTP or SOAP request within the production and namespace over the most recent sampling interval. If output has been configured to include host labels, the average number of characters sent by each host is provided separately. |
iris_interop_avg_http_ttfc
{id="namespace",host="host",production="production"} |
Time to First Character (TTFC): average length of time between the start of an HTTP or SOAP request and the first character of the corresponding response, in seconds. If output has been configured to include host labels, the TTFC for each host is provided separately |
iris_interop_avg_http_ttlc
{id="namespace",host="host",production="production"} |
Time to Last Character (TTLC): average length of time between the start of an HTTP or SOAP request and the last character of the corresponding response. If output has been configured to include host labels, the TTLC for each host is provided separately. |
iris_interop_http_sample_count
{id="namespace",host="host",production="production"} |
Number of HTTP or SOAP transmissions sent within the production and namespace over the most recent sampling interval. If output has been configured to include host labels, the number of transmissions sent by each host is provided separately. |
iris_interop_http_sample_count_per_sec
{id="namespace",host="host",production="production"} |
Number of HTTP or SOAP transmissions sent per second within the production and namespace, averaged over the most recent sampling interval. If output has been configured to include host labels, the number of transmissions sent by each host per second is provided separately. |
The /interop/interfaces Endpoint
This section describes the /interop/interfaces endpoint and the parameters that can be used to filter its results. Calls to this endpoint return the number of unique interfaces that have run within a specified time span.
The metrics returned by the /interop/interfaces endpoint are enumerated by interface type. Interface types are as follows:
-
Inbound Interfaces — Inbound business services.
-
Outbound Interfaces — Outbound business operations.
-
Web APIs — Manually created CSP applications. These are defined on the Web Applications page of the management portal, not within a production. Web API interfaces can feed in to REST and SOAP services or can be used for calls to custom code outside of a production.
For a discussion of inbound and outbound production interfaces, see Formal Overview of Productions.
The /interop/interfaces endpoint can be used in the following ways:
When passed without parameters, returns the number of unique interfaces that have ever been run, and the number that are currently active, enumerated by interface type.
Returns the number of unique interfaces active during the specified time span, enumerated by interface type.
The format for start-date and end-date is YYYY-MM-DD{THH:MM:SS}, and both are interpreted in the current time zone. For example, 2024-02-19T16:30:00 is 4:30 PM on February 19, 2024, in the local time zone.
If the time is omitted for either parameter, the default time for start-date is the first second of the specified day (that is, midnight); the default for end-date is the last second of the specified day.
If end-date is not included, the time span is from the specified start until the current date-time.
Returns the number of unique interfaces active during the year specified by N (January 1 through December 31 for past years; January 1 through the current date for the current year), enumerated by interface type. For example, if N is 2, the specified year is two years ago. If N is 0 or not specified, the time span is the current year to date. That is, if N is 0 and the current date is February 19, 2024, the time span is January 1, 2024 to February 19, 2024. If N is 2 and the current year is 2024, the time span is January 1, 2022 to December 31, 2022.
Returns the number of unique interfaces active during the month specified by N, enumerated by interface type. For example, if N is 2, the specified month is two months ago. If N is 0 or not specified, the time span is the current month to date. That is, if N is 0 and the current date is February 19, the time span is February 1 to February 19. If N is 2 and the current month is June, the time span is April 1 to April 30.
Returns the number of unique interfaces active during the day specified by N, enumerated by interface type. For example, if N is 2, the specified day is two days ago. If N is 0 or not specified, the time span is the current day. That is, if N is 0 and the current time is 11:53AM, the time span is today from midnight to 11:53AM. If N is 2 and the current day is February 19, the time span is midnight on February 17 to 11:59:59 PM on February 17.
Returns the number of unique interfaces active in the specified namespace, enumerated by interface type.
The Interoperability Usage page in the Management Portal also provides easy access to these metrics, but direct use of the API endpoint enables more flexible output filtering.
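The time span rules above can be sketched in Python as follows. This is only a client-side illustration of how the defaults and the value N resolve to concrete start and end times; it does not call the endpoint, and the function names resolve_range and past_window are hypothetical.

```python
from datetime import datetime, time, timedelta


def resolve_range(start, end=None, now=None):
    """Expand start-date and end-date strings into concrete datetimes.

    A date without a time defaults to midnight for the start and to
    23:59:59 for the end; a missing end-date means "until now".
    """
    now = now or datetime.now()

    def parse(text, default_time):
        if "T" in text:
            return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S")
        day = datetime.strptime(text, "%Y-%m-%d").date()
        return datetime.combine(day, default_time)

    begin = parse(start, time(0, 0, 0))
    finish = parse(end, time(23, 59, 59)) if end else now
    return begin, finish


def past_window(unit, n=0, now=None):
    """Return (start, end) for the year, month, or day that is n units ago.

    When n is 0 the window runs from the start of the current unit to now.
    """
    now = now or datetime.now()
    if unit == "year":
        start = datetime(now.year - n, 1, 1)
        next_start = datetime(now.year - n + 1, 1, 1)
    elif unit == "month":
        index = now.year * 12 + (now.month - 1) - n  # months since year 0
        start = datetime(index // 12, index % 12 + 1, 1)
        index += 1
        next_start = datetime(index // 12, index % 12 + 1, 1)
    elif unit == "day":
        start = datetime(now.year, now.month, now.day) - timedelta(days=n)
        next_start = start + timedelta(days=1)
    else:
        raise ValueError("unit must be 'year', 'month', or 'day'")
    end = now if n == 0 else next_start - timedelta(seconds=1)
    return start, end
```

For instance, with a current date in June, past_window("month", 2) covers April 1 through the last second of April 30, matching the month example above.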
Create Application Metrics
To add custom application metrics to those returned by the /metrics endpoint:
-
Create a new class that inherits from %SYS.Monitor.SAM.Abstract.
-
Define the PRODUCT parameter as the name of your application. This can be anything except for iris, which is reserved for the InterSystems IRIS metrics.
-
Implement the GetSensors() method to define the desired custom metrics, as follows:
-
The method must contain one or more calls to the SetSensor() method. This method sets the name and value for an application metric. The values should be integers or floating point numbers to ensure compatibility with Prometheus.
You can optionally define a label for the metric, though if you do, you must always define a label for that particular metric.
Note: For best practices when choosing metric and label names, see Metric and Label Naming in the Prometheus documentation (https://prometheus.io/docs/practices/naming/).
-
The method can optionally include labels for a metric by invoking the SetSensorLabels() method. However: if you define a label for a metric, you must include that label every time you set that metric.
-
The method can optionally define HELP, TYPE, and UNIT information for a metric by invoking the SetSensorInfo() method. The output of the /metrics endpoint will include this information as comments. Refer to the Prometheus documentation for more information: https://prometheus.io/docs/instrumenting/exposition_formats/#comments-help-text-and-type-information
-
The method must return $$$OK if successful.
Important: A slow implementation of GetSensors() can negatively impact system performance. Be sure to test that your implementation of GetSensors() is efficient, and avoid implementations that could time out or hang.
-
Compile the class. An example is shown below:
/// Example of a custom class for the /metrics API
Class MyMetrics.Example Extends %SYS.Monitor.SAM.Abstract
{

Parameter PRODUCT = "myapp";

/// Collect metrics from the specified sensors
Method GetSensors() As %Status
{
    do ..SetSensor("my_counter",$increment(^MyCounter),"my_label")
    do ..SetSensor("my_gauge",$random(100))
    return $$$OK
}

}
-
Use the AddApplicationClass() method of the SYS.Monitor.SAM.Config class to add the custom class to the /metrics configuration. Pass as arguments the name of the class and the namespace where it is located.
For example, enter the following in the Terminal from the %SYS namespace:
%SYS>set status = ##class(SYS.Monitor.SAM.Config).AddApplicationClass("MyMetrics.Example", "USER")
%SYS>w status
status=1
Note: When you upgrade your InterSystems IRIS system, you will need to redo this step.
-
Ensure that the /api/monitor web application has the necessary Application Roles to access the custom metrics. For details on how to edit application roles, see Edit an Application: The Application Roles Tab.
This step grants /api/monitor access to the data needed for the custom metric. For example, if the custom metric class is located in the USER database (protected by the %DB_USER resource), grant /api/monitor the %DB_USER role.
-
Review the output of the /metrics endpoint by pointing your browser to a URL with the following form, using the <baseURL> for your system: http://<baseURL>/api/monitor/metrics. The metrics you defined should appear after the InterSystems IRIS metrics, such as:
[...]
myapp_my_counter{id="my_label"} 1
myapp_my_gauge 92
The /metrics endpoint now returns the custom metrics you defined. The InterSystems IRIS metrics include an iris_ prefix, while your custom metrics use the value of PRODUCT as a prefix.
/api/monitor/alerts
The /api/monitor/alerts endpoint fetches the most recent alerts from the alerts.log file and returns them in JSON format, such as:
{"time":"2019-08-15T10:36:38.313Z","severity":2,"message":"Failed to allocate 1150MB shared memory using large pages. Switching to small pages."}
When /api/monitor/alerts is called, it returns the alerts that have been generated since the previous time /api/monitor/alerts was called. The iris_system_alerts_new metric is a Boolean that indicates whether new alerts have been generated.
For more information about when and how alerts are generated, see Using Log Monitor.
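A client that polls this endpoint might process the response as in the following Python sketch. It assumes the response body contains either a single JSON object or a JSON array of objects with the time, severity, and message fields shown above, and that time always uses the timestamp format in the example. The function name parse_alerts is hypothetical.

```python
import json
from datetime import datetime, timezone

# Timestamp layout taken from the example alert above; the trailing Z
# denotes UTC.
_TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def parse_alerts(body):
    """Parse an alerts response body into a list of dicts.

    Accepts a single JSON object or a JSON array (an assumption), and
    converts each "time" string into a timezone-aware datetime.
    """
    if not body.strip():
        return []  # nothing posted since the previous call
    data = json.loads(body)
    alerts = data if isinstance(data, list) else [data]
    for alert in alerts:
        parsed = datetime.strptime(alert["time"], _TIME_FORMAT)
        alert["time"] = parsed.replace(tzinfo=timezone.utc)
    return alerts
```

Because the endpoint returns only the alerts generated since the previous call, a client should store or forward each batch as it is received; checking iris_system_alerts_new first avoids unnecessary calls.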