SiteScope User's Guide


SiteScope Alerts

You can instruct SiteScope to alert you when it detects a problem in your Web environment. SiteScope offers several types of alerts, including e-mail, electronic pager, and SNMP Trap. An alert definition contains instructions that tell SiteScope how to respond when a monitor changes state, for example, from a normal condition to an error or warning condition. You can create an alert that instructs SiteScope to notify you via your pager or send you e-mail when a specific condition is detected. You can also have SiteScope respond to problems by automatically initiating a recovery or action script with the versatile Script Alert.

Introducing SiteScope Alert Media

The following table describes the alert media types available with SiteScope.

Alert Type

Description

E-Mail Alert

The E-Mail Alert sends a problem notification and description as an e-mail message. The message content can be customized to include custom text and specific monitoring results.

Pager Alert

The Pager Alert is used to send notification to an electronic pager. Alerts can be sent as alphanumeric pages that include specific monitoring information.

Script Alert

The Script Alert is used to initiate the execution of a script or other program. Script Alerts provide you with the capability to run automated recovery actions based on particular monitoring results. This might include automatically rebooting a service or moving files.

SNMP Trap Alert

The SNMP Trap Alert can be used to send an SNMP trap to an enterprise management console. This gives you interoperability with network management applications.

Sound Alert

The Sound Alert allows you to have SiteScope play an audio file on the server where SiteScope is running.

Database Alert

The Database Alert allows you to log problem notification and descriptions to a database.

Disable or Enable Monitor Alert

The Disable or Enable Monitor Alert is used to automatically disable or enable an individual monitor or group of monitors. This is useful to suppress redundant alerting from monitors watching elements that are dependent on a single service that may have gone down.

Log Event Alert

The Log Event Alert allows you to log problem notifications as events into the Windows NT Event Log.

Post Alert

The Post Alert can be used to send problem notification and descriptions to another server application as a CGI POST.

Using SiteScope Alerts

SiteScope alerts can be used in several ways to notify you of conditions in your Web environment. Alerts can be associated explicitly with one or more individual monitors, with one or more groups of monitors, with a combination of monitors and groups, or globally with all monitors on a particular installation of SiteScope. Global and group-wise alerting is generally the most efficient but may not provide the needed control. You can use the Global and Group Alert Filtering feature on each alert definition page to create filter criteria that control global and group alerts. Filter criteria restrict the alert to only those monitors that meet the criteria. For example, creating a global alert with a filter criterion of Monitor Type: CPU Monitor will create an alert that is only triggered for CPU monitor types.
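The filtering idea can be illustrated with a minimal sketch. This is not SiteScope code: the Monitor and GlobalAlert classes, their field names, and the sample monitors are assumptions made up for the illustration. It only shows how a monitor-type filter narrows a global alert to the matching monitors.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Monitor:
    # Hypothetical record describing one monitor.
    name: str
    monitor_type: str
    group: str


@dataclass
class GlobalAlert:
    # Hypothetical global alert with an optional monitor-type filter.
    name: str
    monitor_type_filter: Optional[str] = None  # None means "no filter"

    def applies_to(self, monitor: Monitor) -> bool:
        """A global alert covers every monitor unless a filter excludes it."""
        if self.monitor_type_filter is None:
            return True
        return monitor.monitor_type == self.monitor_type_filter


# Illustrative monitors; the names and groups are invented for this sketch.
monitors = [
    Monitor("web01 CPU", "CPU Monitor", "Servers"),
    Monitor("web01 disk", "Disk Space Monitor", "Servers"),
    Monitor("home page", "URL Monitor", "Web"),
]

cpu_alert = GlobalAlert("page on CPU errors", monitor_type_filter="CPU Monitor")
matched = [m.name for m in monitors if cpu_alert.applies_to(m)]
print(matched)  # ['web01 CPU']
```

Without the filter, the same global alert would apply to all three monitors.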

The table below shows an overview of the different alert types, associations, and considerations.

Alert Class

Description

Global Alerts

Alerts that are triggered when any monitor on a given SiteScope reports the category status defined for the alert. New groups and monitors added after the alert definition is created are automatically associated with the alert.

Group Alerts

Alerts that are triggered when any monitor within the associated group or groups reports the category status defined for the alert. New subgroups and monitors added within the associated group or groups after the alert definition is created are automatically associated with the alert.

Individual Monitor Alerts

Alerts that are triggered when any associated monitor reports the category status defined for the alert. New monitors added after the alert definition is created are NOT automatically associated with the alert but can be added by editing the alert definition.

Understanding when SiteScope Alerts are sent

By default, SiteScope sends one alert as soon as any monitor it is associated with detects an error condition. The options presented in the When section of the alert definition page allow you to control when alerts are actually sent in relation to when a given condition is detected. For example, you can choose to have SiteScope generate an alert only after an error condition persists for a specific interval corresponding to a given number of monitor runs. This is useful for frequently run monitors that watch dynamic, rapidly changing environment parameters. In some cases, a single error condition may not warrant any intervention.

The options in the When section are as follows:

"When" Option

Description

Always, after the condition has occurred at least N times

Only cause an alert after the condition occurs consecutively at least the number of times indicated in the text box (N). This is a repeating alert. Once this condition is met, the alert is triggered each time the associated monitor is run until the monitored system reports a change in status. Enter a value of one to have the alert triggered for the first detected error or warning. Enter a number greater than one if you want to alert only on conditions that persist for more than a single scheduled monitor run.

Once, after the condition occurs exactly N times

Only cause an alert after the condition occurs consecutively for exactly the number of times indicated in the text box (N). This is a once-only alert. Once this condition is met, the alert is triggered once. Enter a value of one to have the alert triggered for the first detected error or warning. Enter a number greater than one if you want to alert only on conditions that persist for more than one scheduled monitor run.

Initial alert X and repeat every Y times afterwards

Only cause an alert after the condition occurs X consecutive times, and then repeat the alert every Y consecutive times thereafter. This is a repeating alert. Once the initial alert condition is met, the alert is triggered again each time the associated monitor has run the number of times indicated in the second text box (Y), until the monitored system reports a change in status. Enter a value of one for the initial alert value (X) to have the alert triggered for the first detected error or warning.

Once, after N group errors

Cause an alert the first time that any monitor in the associated monitor group consecutively reports the trigger condition for the number of times indicated in the text box (N). This is a once-only group-wise alert. Once this condition is met, the alert is triggered once. Enter a value of one to have the alert triggered for the first detected group error or warning. Enter a number greater than one if you want to alert only on conditions that persist for more than one scheduled monitor run.

Once, when all monitors of this group are in error

Only cause an alert when all of the monitors in the associated monitor group are in error. This is a once-only group-wise alert. Once this condition is met, the alert is triggered once. Use this alert for monitor groups used to watch redundant systems where a single failure may be acceptable but multiple failures are not.
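The counting behavior of the three single-monitor options can be sketched in a few lines. This is only an illustration of the logic described in the table, not SiteScope's implementation: the function name, the mode names ("always", "once", "initial_repeat"), and the parameters are assumptions. The two group-wise options are left out.

```python
def alert_points(statuses, mode, n=1, repeat=1, trigger="error"):
    """Return the sample numbers at which an alert would be sent.

    statuses -- one reading per monitor run, e.g. "good" or "error"
    mode     -- "always", "once", or "initial_repeat" (hypothetical names)
    n        -- consecutive readings required before the first alert
    repeat   -- for "initial_repeat": readings between repeated alerts
    """
    fired = []
    count = 0  # consecutive readings that match the trigger condition
    for sample, status in enumerate(statuses):
        # Any reading that does not match the trigger resets the count.
        count = count + 1 if status == trigger else 0
        if count == 0:
            continue
        if mode == "always" and count >= n:
            fired.append(sample)
        elif mode == "once" and count == n:
            fired.append(sample)
        elif mode == "initial_repeat" and count >= n and (count - n) % repeat == 0:
            fired.append(sample)
    return fired


# Illustrative status sequence: seven errors, a recovery, then one more error.
readings = ["good"] + ["error"] * 7 + ["good", "error", "good"]

print(alert_points(readings, "always", n=3))                    # [3, 4, 5, 6, 7]
print(alert_points(readings, "once", n=3))                      # [3]
print(alert_points(readings, "initial_repeat", n=3, repeat=2))  # [3, 5, 7]
```

The count is reset by any reading that does not match the trigger, so the single error at sample 9 never reaches the threshold of three.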

The following examples show different alert configurations that send alerts after the error condition has persisted for more than one monitor run. Note that the sample interval corresponds to how often the monitor is run. If a monitor runs every fifteen seconds and the alert is set to be sent after the third error reading, the alert will be sent 30 seconds after the error was first detected. If the monitor runs once every hour with the same alert setup, the alert will not be sent until 2 hours after the first error reading.
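The arithmetic behind these figures is that the alert is sent on the Nth consecutive error reading, which falls N - 1 run intervals after the first one. A small sketch, with a hypothetical function name:

```python
def delay_until_alert(run_interval_seconds, readings_required):
    """Seconds between the first error reading and the reading that sends the alert."""
    if readings_required < 1:
        raise ValueError("readings_required must be at least 1")
    # The first error reading is run 1, so only (N - 1) further intervals elapse.
    return (readings_required - 1) * run_interval_seconds


print(delay_until_alert(15, 3))    # 30 (seconds)
print(delay_until_alert(3600, 3))  # 7200 (seconds, i.e. 2 hours)
```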

Example 1a. Alert sent for each error reading after condition persists for at least three monitor runs. Compare with Example 1b below.

Alert setup:

Always, after the condition has occurred at least 3 times

sample interval   status    count
0                 good      c=0
1                 error     c=1
2                 error     c=2
3                 error     c=3  alert!
4                 error     c=4  alert!
5                 error     c=5  alert!
6                 error     c=6  alert!
7                 error     c=7  alert!
8                 good      c=0
9                 error     c=1
10                good      c=0

Example 1b. Alert sent for each error reading after condition persists for at least three monitor runs. Shows how the count is reset when the monitor returns one non-error reading between consecutive error readings. Compare with Example 1a above.

Alert setup:

Always, after the condition has occurred at least 3 times

sample interval   status    count
0                 good      c=0
1                 error     c=1
2                 error     c=2
3                 good      c=0
4                 error     c=1
5                 error     c=2
6                 error     c=3  alert!
7                 warning   c=0
8                 good      c=0
9                 error     c=1
10                good      c=0

Example 2. Alert sent ONLY ONCE after condition persists for at least three monitor runs, regardless of how long the error is returned thereafter.

Alert setup:

Once, after the condition occurs exactly 3 times

sample interval   status    count
0                 good      c=0
1                 error     c=1
2                 error     c=2
3                 error     c=3  alert!
4                 error     c=4
5                 error     c=5
6                 error     c=6
7                 error     c=0
8                 error     c=1
9                 error     c=0
10                error     c=0

Example 3a. Alert sent on the fifth error reading and for every third consecutive error reading thereafter. Compare with Example 3b below.

Alert setup:

Initial alert 5 and repeat every 3 times afterwards

sample interval   status    count
0                 good      c=0
1                 error     c=1
2                 error     c=2
3                 error     c=3
4                 error     c=4
5                 error     c=5  alert!
6                 error     c=6
7                 error     c=7
8                 error     c=8  alert!
9                 error     c=9
10                error     c=10

Example 3b. Alert sent on the third error reading and for every fifth consecutive error reading thereafter. Compare with Example 3a above.

Alert setup:

Initial alert 3 and repeat every 5 times afterwards

sample interval   status    count
0                 good      c=0
1                 error     c=1
2                 error     c=2
3                 error     c=3  alert!
4                 error     c=4
5                 error     c=5
6                 error     c=6
7                 error     c=7
8                 error     c=8  alert!
9                 error     c=9
10                error     c=10

Because you can create multiple alerts and associate more than one alert to a monitor, you can tell SiteScope to take more than one action for a given situation. For example, you can create one alert that tells SiteScope to page you whenever any monitor returns an error status. You can then create another alert that tells SiteScope to run a script file to delete files in the /tmp directory on your server if your Disk Space Monitor returns an error. Then if your disk ever became too full, SiteScope would page you because of the first alert definition and would run the script to delete files in the /tmp directory because of the second alert definition.
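The way several alert definitions can respond to the same monitor reading can be sketched as follows. The dictionary layout, the action strings, and the monitor names are assumptions made for this illustration and do not reflect how SiteScope stores alert definitions.

```python
def actions_for(monitor_name, status, alerts):
    """Return the action of every alert definition that matches this reading."""
    return [
        alert["action"]
        for alert in alerts
        if alert["on"] == status
        and (alert["monitors"] is None or monitor_name in alert["monitors"])
    ]


# Two hypothetical alert definitions, mirroring the scenario described above.
alerts = [
    # monitors=None stands for "any monitor".
    {"action": "page administrator", "on": "error", "monitors": None},
    {"action": "run script to delete files in /tmp", "on": "error",
     "monitors": {"Disk Space"}},
]

print(actions_for("Disk Space", "error", alerts))
# ['page administrator', 'run script to delete files in /tmp']
print(actions_for("CPU", "error", alerts))
# ['page administrator']
```

A disk space error matches both definitions, so both actions are taken, while an error on any other monitor only results in a page.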

SiteScope alerts are generated when there is a change in state for a monitor reading. Thus you can set an alert for OK or warning conditions as well as error conditions. One way to take advantage of this is to add two alerts, one alert on error and one alert on OK. Set the alerts to be sent after the condition is detected 3 times. For the OK alert, check the box marked "Only allow alert if monitor was previously in error at least 3 times". This will prevent unmatched OK alerts, such as when a monitor was disabled for any reason (manually, by schedule, or by a "depends on" relationship) and then starts up again. This can also be used to ensure that an OK alert is only sent after a corresponding error alert was sent. With these two alerts you will get a page when a link or service goes down (the monitor detects a change from OK to error), and another when it comes back up (the monitor detects a change from error to OK). The following diagram is an example of using two alerts with a monitor.

Example 4. Alert on error sent once for error after condition persists for at least three monitor runs. Alert on OK sent once for good status after at least one error or warning interval.

Alert on Error setup:

On Error
Once, after the condition occurs exactly 3 times

Alert on OK setup:

On OK
Once, after the condition occurs exactly 1 time
Only allow alert if monitor was previously in error at least N times

sample interval   status    count
0                 good      c=0
1                 error     c=1
2                 error     c=2
3                 error     c=3  alert!
4                 error     c=4
5                 error     c=5
6                 error     c=6
7                 error     c=7
8                 good      c=1  alert!
9                 good      c=2
10                good      c=3
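The pairing of an error alert with a guarded OK alert can be sketched as a small simulation. This is an assumption-laden illustration, not SiteScope's logic: the function name, the parameter names, and the default values (error alert after 3 readings, OK alert after 1 reading, at least 3 prior error readings) are all chosen for the sketch.

```python
def paired_alerts(statuses, error_after=3, ok_after=1, min_prior_errors=3):
    """Return (sample, alert) pairs for a once-only error alert and a guarded OK alert."""
    events = []
    current, run_len = None, 0  # status of the current run and its length
    last_error_run = 0          # length of the most recent completed error run
    for sample, status in enumerate(statuses):
        if status == current:
            run_len += 1
        else:
            if current == "error":
                last_error_run = run_len  # remember the error run that just ended
            current, run_len = status, 1
        if status == "error" and run_len == error_after:
            events.append((sample, "error alert"))
        elif (status == "good" and run_len == ok_after
              and last_error_run >= min_prior_errors):
            events.append((sample, "OK alert"))
            last_error_run = 0  # each error run is matched by at most one OK alert
    return events


readings = ["good"] + ["error"] * 7 + ["good"] * 3
print(paired_alerts(readings))
# [(3, 'error alert'), (8, 'OK alert')]

# A single error reading is too short to send either alert.
print(paired_alerts(["good", "error", "good"]))
# []
```

The guard on the OK alert is what prevents an unmatched notification when the monitor reports good without a preceding error run of sufficient length.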

