How alarms work
Alarms allow you to continuously monitor the web services running on the GAS for signs of failure and to measure performance.
- Counting alarms report on failures such as DVM crashes, failures to start or connect, and request/response errors. Monitoring these alarms allows you to detect configuration or network issues.
- Timer alarms report the time taken to start a DVM and process requests, and measure request frequencies in averages and real time, for example AVG_REQUEST_TIME and REAL_REQUEST_TIME. Monitoring these alarms allows you to detect network issues.
DVM_NOT_STARTED and DVM_NOT_CONNECTED alarms for each service.
You can configure an alarm on a service even if it is not running. The configuration takes effect the next time the service is started.
Alarm thresholds
- For counting alarms, the threshold is typically a single (1) occurrence.
- For timer alarms, the threshold is typically one (1) second.
You may need to modify alarm thresholds in order to better represent conditions in your network.
For instance, if you know your system is slow, you may decide to raise a DVM_NOT_CONNECTED alarm after 5 attempts, instead of the default one attempt.
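The threshold behavior described above can be illustrated with a minimal sketch. This is not the GAS implementation: the class names CountingAlarm and TimerAlarm, their fields, the counter reset, and the use of "greater than or equal to" for the comparison are all assumptions made for illustration. Only the alarm names and the typical thresholds (one occurrence, one second) come from this topic.

```python
from dataclasses import dataclass


@dataclass
class CountingAlarm:
    """Hypothetical model of a counting alarm (not the GAS implementation)."""
    name: str
    threshold: int = 1   # occurrences needed before the alarm is raised
    count: int = 0       # occurrences seen since the alarm was last raised

    def record(self) -> bool:
        """Record one occurrence; return True if the alarm should be raised."""
        self.count += 1
        if self.count >= self.threshold:
            self.count = 0  # assumption: the counter resets after a raise
            return True
        return False


@dataclass
class TimerAlarm:
    """Hypothetical model of a timer alarm (not the GAS implementation)."""
    name: str
    threshold_seconds: float = 1.0  # typical threshold of one second

    def check(self, elapsed_seconds: float) -> bool:
        """Return True if the measured time reaches the threshold."""
        return elapsed_seconds >= self.threshold_seconds


# A slow system: raise DVM_NOT_CONNECTED only on the fifth failed attempt.
slow_system = CountingAlarm("DVM_NOT_CONNECTED", threshold=5)
attempts = [slow_system.record() for _ in range(5)]

# A timer alarm with the typical one-second threshold.
request_timer = TimerAlarm("REAL_REQUEST_TIME")
fast_request = request_timer.check(0.4)
slow_request = request_timer.check(1.5)
```

In this sketch, the first four failed attempts do not raise the alarm and the fifth does, which mirrors raising the threshold from one attempt to five. A request taking 0.4 seconds stays under the timer threshold, while one taking 1.5 seconds reaches it.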
Consider the trade-offs when setting thresholds. Setting a threshold too low can create complacency, because you must manage a large volume of monitoring data from a system that shows no signs of disruption. Setting a threshold too high reduces the volume of monitoring data collected and fetched for analysis, but it can also mean that you are not alerted to potential problems in a timely manner.
When an alarm is raised
When the GAS raises an alarm, it writes the event to a monitoring data (.dat) file for the session. The monitoring data written to the file specifies which alarm was raised, along with details of the event. To receive an email alert, configure the GAS to send you a notification when an alarm is raised. For details, see Alert script example.
To better understand what is happening, you may need to raise the monitoring level of the service to gather more information.
To analyze the monitoring data and debug potential issues, fetch the data to a file for transfer to a Microsoft® Excel™ spreadsheet, or to a third-party monitoring system where you can process and graph the data.