Caché Distributed Data Management Guide
Monitoring Distributed Applications

A running ECP application consists of one or more ECP data server systems—data providers—connected to one or more ECP application server systems—data consumers. Between each application server and data server that share data, there is an ECP connection: a TCP/IP connection that ECP uses to send data and commands.

You can monitor the status of the servers and connections in an ECP application from the [Home] > [Configuration] > [ECP Settings] page of the Management Portal.
The ECP Settings page has two subsections:
  1. This System as an ECP Data Server displays settings for the data server as well as the status of the ECP service. See Configuring Distributed Systems for more information.
  2. This System as an ECP Application Server displays settings for the application server and a list of data servers—systems providing data to this node—connected to this application server as well as their status.
The following sections describe status information for connections:
ECP Connection Information
The [Home] > [Configuration] > [ECP Settings] page of the Management Portal displays a list of the current ECP data server connections on the application server side. The [Home] > [Configuration] > [ECP Settings] > [ECP Application Servers] page displays a list of the current ECP application server connections on the data server side of a connection.
ECP Data Server Connections
The This System as an ECP Application Server list displays the following information for each ECP data server connection:
Server Name
The logical name of the ECP data server system on the application server for this connection.
Host Name
The host name of the ECP server system for this connection as entered when the server was added to the application server configuration.
IP Port
The IP port number used to connect to the ECP server system.
Status
The current status of this connection. Each connection has a current operating state. These states are described in the ECP Connection States section.
Edit
If the current status of this connection is Not Connected or Disabled, you can edit the port and host information of the data server.
Change Status
From each data server row you can change the status of an existing ECP connection with that data server. See the ECP Connection Operations section for more information.
Delete
You can delete the data server information from the application server side.
ECP Application Server Connections
Click ECP Application Servers to view a list of ECP application servers that are connected to this system:
The ECP Application Servers list displays the following information for each ECP application server connection:
Client Name
The logical name of the ECP application server system on the data server for this connection.
Status
The current status of this connection. Each connection has a current operating state. These states are described in the ECP Connection States section.
Client IP
The IP address or host name of the ECP application server system for this connection.
IP Port
The IP port number used to connect to the ECP server system.
ECP Connection States
In a running system, an ECP connection can be in one of the following states:
Not Connected
The connection is defined but has not been used yet.
Connection in Progress
The connection is in the process of establishing itself. This is a transitional state that lasts only until the connection is established.
Normal
The connection is operating normally and has been used recently.
Trouble
The connection has encountered a problem. If possible, the connection automatically corrects itself.
Disabled
The connection has been manually disabled by a system administrator. Any application making use of this connection receives a <NETWORK> error.
The following sections describe each connection state as it relates to being on the application server or data server side:
Application Server Connection States
The following sections describe the application server side of each of the connection states:
Application Server Not Connected State
An application server-side ECP connection starts out in the Not Connected state. In this state, there are no ECP daemons for the connection. If an application server process makes a network request, daemons are created for the connection and the connection enters the Connection in Progress state.
Application Server Connection in Progress State
In the Connection in Progress state, a network daemon exists for the connection and actively tries to establish a new connection. A user process must wait until the connection completes before it can submit requests to the network. While the connection is in the Connection in Progress state, the user process waits on each request for up to 20 seconds for the connection to complete. When the connection is established, it enters the Normal state. If the connection is not established within that time, the user process receives a <NETWORK> error.
The application server ECP daemon attempts to create a new connection to the data server in the background. If no connection is established within 20 minutes, the connection returns to the Not Connected state and the daemon for the connection goes away.
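The per-request wait described above can be pictured with a small simulation. This is only a sketch of the timing rule, not ECP code: the function wait_for_connection, the NetworkError class, and the polling interval are hypothetical names and values chosen for illustration; only the 20-second and 20-minute figures come from the text.

```python
import time


class NetworkError(Exception):
    """Stands in for the <NETWORK> error a user process receives (illustrative)."""


REQUEST_WAIT_SECONDS = 20          # per-request wait described in the text
DAEMON_GIVE_UP_SECONDS = 20 * 60   # background attempts stop after 20 minutes


def wait_for_connection(is_established, timeout=REQUEST_WAIT_SECONDS,
                        poll_interval=0.5, clock=time.monotonic,
                        sleep=time.sleep):
    """Block until is_established() is true or the timeout elapses.

    Returns the name of the state the connection enters on success and
    raises NetworkError if the connection did not complete in time.
    The clock and sleep parameters exist only so the sketch can be tested
    without real waiting.
    """
    deadline = clock() + timeout
    while True:
        if is_established():
            return "Normal"
        if clock() >= deadline:
            raise NetworkError("<NETWORK>")
        sleep(poll_interval)
```

A caller that does not want to fail on the first timeout could catch NetworkError and retry the request while the background daemon is still attempting the connection.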
Application Server Normal State
After a connection completes, it enters the Normal (data transfer) state. In this state, the ECP application server-side daemons exist and actively send requests and receive answers across the network. The connection stays in the Normal state until the connection becomes unworkable or until the application server or the data server requests a shutdown of the connection.
Application Server Trouble State
If the connection from application server to data server encounters problems, the application server ECP connection enters the Trouble state. In this state, application server ECP daemons exist and actively try to restore the connection. An underlying TCP connection may or may not still exist. The recovery method is similar whether the underlying TCP connection is reset and must be recreated or simply stops working temporarily.
While the connection is in the Trouble state, the application server attempts to reconnect to the data server to perform ECP connection recovery. During this interval, existing network requests are preserved, and any application server-side user process that issues a new network request blocks, waiting for the connection to resume. If the connection returns within the trouble timeout (the Time to wait for recovery setting, which currently defaults to 20 minutes), it returns to the Normal state and the blocked network requests proceed.
For example, if a data server goes offline, any application server connected to it has its state set to Trouble until the data server becomes available. If the problem is corrected gracefully, the connection's state reverts to Normal; otherwise, if the connection does not recover from the Trouble state, it reverts to Not Connected.
Applications continue running until they require network access. All locally cached data is available to the application while the server is not responding.
Application Server Transitional Recovery States
Transitional recovery states are part of the Trouble state. If there is no current TCP connection to the data server, and a new connection is established, the application server and data server engage in a recovery protocol which flushes the application server cache, recovers transactions and locks, and returns to the Normal state.
Similarly, if the data server shuts down, either gracefully or as a result of a crash, and then restarts, it enters a short period (approximately 30 seconds) where it allows application servers to reconnect and recover their existing sessions. Once again, the application server and the data server engage in the recovery protocol.
If connection recovery is not complete within 20 minutes, the application server gives up on connection recovery. Specifically, the application server returns errors to all pending network requests and changes the connection state to Not Connected. If it has not already done so, the data server rolls back all the transactions from this application server and releases all the locks from this application server the next time this application server connects to the data server.
If the recovery is successful, the connection returns to the Normal state and the blocked network requests proceed.
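The two outcomes of the Trouble state (blocked requests proceed on recovery, or receive errors when recovery is abandoned) can be sketched as a toy model. The TroubledConnection class and its methods are hypothetical and exist only for illustration; the 20-minute limit is the default quoted in the text, and the real recovery protocol is considerably more involved.

```python
from dataclasses import dataclass, field

TROUBLE_TIMEOUT_SECONDS = 20 * 60  # default recovery limit quoted in the text


@dataclass
class TroubledConnection:
    """Toy model of one application server connection in the Trouble state."""
    trouble_started: float
    state: str = "Trouble"
    pending: list = field(default_factory=list)

    def submit(self, request):
        # Requests are preserved, not failed, while recovery is attempted.
        self.pending.append(request)

    def resolve(self, now, reconnected):
        """Return (request, outcome) pairs once the Trouble state ends."""
        elapsed = now - self.trouble_started
        if reconnected and elapsed <= TROUBLE_TIMEOUT_SECONDS:
            # Recovery succeeded: blocked requests proceed.
            self.state = "Normal"
            outcome = [(r, "proceed") for r in self.pending]
        elif elapsed > TROUBLE_TIMEOUT_SECONDS:
            # Recovery abandoned: every pending request gets an error.
            self.state = "Not Connected"
            outcome = [(r, "error") for r in self.pending]
        else:
            return []  # still in Trouble; requests stay blocked
        self.pending.clear()
        return outcome
```

The model keeps requests queued rather than failing them immediately, which mirrors the statement that existing network requests are preserved during the Trouble interval.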
Application Server Disabled State
An ECP connection is marked Disabled if an administrator declares that it is disabled. In this state, no daemons exist and any network requests that would use that connection immediately receive <NETWORK> errors.
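The application server-side states described in the preceding sections can be summarized as a transition table. The following is a minimal sketch under stated assumptions: the event names (network_request, established, and so on) are invented for illustration and are not ECP identifiers, and transitions the text does not spell out, such as a requested shutdown, are deliberately left out.

```python
from enum import Enum


class State(Enum):
    NOT_CONNECTED = "Not Connected"
    CONNECTION_IN_PROGRESS = "Connection in Progress"
    NORMAL = "Normal"
    TROUBLE = "Trouble"
    DISABLED = "Disabled"


# (current state, event) -> next state; event names are hypothetical.
TRANSITIONS = {
    (State.NOT_CONNECTED, "network_request"): State.CONNECTION_IN_PROGRESS,
    (State.CONNECTION_IN_PROGRESS, "established"): State.NORMAL,
    (State.CONNECTION_IN_PROGRESS, "connect_timeout"): State.NOT_CONNECTED,
    (State.NORMAL, "problem"): State.TROUBLE,
    (State.TROUBLE, "recovered"): State.NORMAL,
    (State.TROUBLE, "recovery_timeout"): State.NOT_CONNECTED,
}


def next_state(state, event):
    """Return the state that follows `state` when `event` occurs."""
    if event == "admin_disable":
        # An administrator can disable the connection from any state.
        return State.DISABLED
    # Events with no listed transition leave the state unchanged.
    return TRANSITIONS.get((state, event), state)
```

For example, next_state(State.TROUBLE, "recovery_timeout") yields State.NOT_CONNECTED, matching the case where recovery is abandoned.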
Data Server Connection States
The following sections describe the data server side of each of the connection states:
Data Server Free States
When an ECP server first comes up, all incoming ECP connections are in an initial “unassigned” Free state and are available for connections from any ECP application server that is listed in the connection access control list. If a connection from an application server previously existed and has since gone away, but does not require any recovery steps, the connection is placed in the “idle” Free state. The only difference between these two states is that in the idle state, this connection block is already assigned to a particular application server, rather than being available for any application server that passes the access control list.
Data Server Normal State
In the data server Normal state, the application server connection is normal. At any point in the processing of incoming connections, whenever the application server disconnects from the data server (except as part of the data server’s own shutdown sequence), the data server rolls back any pending transactions and releases any incoming locks from that application server, and places the application server connection in the “idle” Free state.
Data Server Trouble States
If the application server is not responding, the data server shows a Trouble state. If the data server crashes or shuts down, it remembers the connections that were active at the time of the crash or shutdown. After restarting, the data server waits for a brief time (usually 30 seconds) for application servers to reclaim their sessions (locks and open transactions). If an ECP application server does not complete recovery during this waiting period, all pending work on that connection is rolled back and the connection is placed in the “idle” Free state.
Data Server Recovering State
The data server connection is in a recovery state for a very short time while the application server is in the process of reclaiming its session. The data server keeps the application server connection in the Trouble state for a brief time (the Time interval for Troubled state setting, which currently defaults to 60 seconds) so that the application server can reclaim the connection; if it does not, the data server releases the application server's resources (rolls back all open transactions and releases locks) and then sets the state to Free.
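The data server-side states can be modeled in the same spirit. This sketch is an interpretation of the descriptions above, not ECP internals: the event names are hypothetical, and the transitions from the idle Free state to Normal and from Recovering to Normal are assumptions inferred from the text rather than stated in it.

```python
TROUBLE_INTERVAL_SECONDS = 60  # default quoted in the text

# Cleanup the data server performs when it gives up on a session.
CLEANUP = ("roll back open transactions", "release locks")

# (state, event) -> (next state, cleanup actions); event names are hypothetical.
DATA_SERVER_TRANSITIONS = {
    ("Free (unassigned)", "app_server_connects"): ("Normal", ()),
    ("Free (idle)", "app_server_connects"): ("Normal", ()),          # assumed
    ("Normal", "app_server_disconnects"): ("Free (idle)", CLEANUP),
    ("Normal", "app_server_not_responding"): ("Trouble", ()),
    ("Trouble", "app_server_reclaims_session"): ("Recovering", ()),
    ("Recovering", "recovery_complete"): ("Normal", ()),             # assumed
    ("Trouble", "trouble_interval_expired"): ("Free (idle)", CLEANUP),
}


def data_server_step(state, event):
    """Return (next_state, cleanup_actions) for one event on a connection."""
    # Events with no listed transition leave the state unchanged.
    return DATA_SERVER_TRANSITIONS.get((state, event), (state, ()))
```

Returning the cleanup actions alongside the next state makes explicit that rollback and lock release happen whenever a connection falls back to Free without completing recovery.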
ECP Connection Operations
From the [Home] > [Configuration] > [ECP Settings] page of the Management Portal on an ECP application server, you can change the status of the ECP connection. From each data server row, click Change Status to display the connection information and choose one of the following actions:
Change to Disabled
Set the state of this connection to Disabled. This releases any locks held for the ECP application server, rolls back any open transactions involving this connection, and purges cached blocks from the data server. If this is an active connection, the change in status sends an error to all applications waiting for network replies from the data server.
Change to Normal
Set the state of this connection to Normal.
Change to Not Connected
Set the state of this connection to Not Connected. As with changing the state to disabled, this releases any locks held for the ECP application server, rolls back any open transactions involving this connection, and purges cached blocks from the data server. If this is an active connection, the change in status sends an error to all applications waiting for network replies from the data server.
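The side effects of the three Change Status choices can be summarized in a short function. This is a descriptive sketch only: change_status is a hypothetical helper, not a Management Portal or ECP API, and it merely lists the effects named in the text for each target state.

```python
def change_status(target, was_active):
    """Return (new_state, side_effects) for a Change Status choice.

    `target` is one of the three choices offered in the portal, and
    `was_active` says whether the connection was in use at the time.
    """
    if target not in ("Disabled", "Normal", "Not Connected"):
        raise ValueError(f"unsupported target state: {target!r}")

    effects = []
    if target in ("Disabled", "Not Connected"):
        # Both choices discard the session state held for this connection.
        effects += [
            "release locks held for the application server",
            "roll back open transactions on this connection",
            "purge cached blocks from the data server",
        ]
        if was_active:
            effects.append("send an error to applications waiting for replies")
    # The text lists no side effects for Change to Normal.
    return target, effects
```

The function shows that Disabled and Not Connected have identical cleanup effects; they differ only in the resulting state.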