SiteScope User's Guide


SunONE Server Monitor

The SiteScope SunONE Server Monitor allows you to monitor the availability of SunONE or iPlanet 6.x servers using the stats-xml performance metrics file (iwsstats.xml or nesstats.xml) facility. When you provide the URL of this stats-xml file, SiteScope parses and displays all metrics reported in the file and lets you choose which of them to monitor as counters. In addition, several derived counters are available that measure percent utilization of certain system resources. Error and warning thresholds for the monitor can be set on as many as ten SunONE server performance statistics or HTTP response codes.

Usage Guidelines

Use the SunONE Server Monitor to monitor performance metrics reported in the stats-xml file of SunONE servers. You can monitor multiple parameters or counters with a single monitor instance. This allows you to watch server loading for performance, availability, and capacity planning. Create a separate monitor instance for each SunONE server you are running.

Before you can use the SunONE Server Monitor, the "stats-xml" service option must be enabled on each Web server you want to monitor. This normally requires that you manually edit the obj.conf configuration file for each server instance. For iPlanet 6.0 servers, the entry has the following syntax:

<Object name="stats-xml">
ObjectType fn="force-type" type="text/xml"
Service fn="stats-xml"
</Object>

Each server instance must be restarted for the changes to become effective.

The default run schedule for this monitor is every 10 minutes, but you can change it to run more or less often using the Update every setting.

Completing the SunONE Server Monitor Form

To display the SunONE Server Monitor Form, either click the Edit link for an existing SunONE Server Monitor in a monitor table, or click the add a Monitor link on a group's detail page and click the SunONE monitor type link from the list of available monitor types.

Complete the items on the SunONE Server Monitor Form as follows. First you need to click the Choose Server link to specify details of the target SunONE server. Then click the Browse Counters button to display the entire list of metrics found in the stats-xml file retrieved from the target SunONE server you specified. When you have completed your counter selections, click the Add Monitor button.

Server
The Web server to be monitored. Click the choose server link on the same line to bring up the server selection screen. The server selection screen presents the following options:
  • Stats-XML URL: Specify the URL to the stats-xml file on the target SunONE server. This is usually in the form http://server_id:port/stats-xml/stats-xml-file, where stats-xml-file is either nesstats.xml or iwsstats.xml.
  • HTTP Proxy: Optionally, a proxy server can be used to access the server. Enter the domain name and port of an HTTP Proxy Server.
  • Proxy Server User Name: If the proxy server requires a name and password to access the server, enter the name here. Note: your proxy server must support Proxy-Authenticate for these options to function.
  • Proxy Server Password: If the proxy server requires a name and password to access the server, enter the password here.
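As an illustration of the Stats-XML URL form described above, the following minimal Python sketch assembles such a URL. The helper name build_stats_xml_url, the host name, and the port are assumptions made for this example only; they are not part of SiteScope.

```python
from urllib.parse import urlunsplit

# The two stats-xml file names mentioned in this guide.
KNOWN_STATS_FILES = ("iwsstats.xml", "nesstats.xml")


def build_stats_xml_url(host: str, port: int, stats_file: str = "iwsstats.xml") -> str:
    """Return http://host:port/stats-xml/<stats_file> (hypothetical helper)."""
    if stats_file not in KNOWN_STATS_FILES:
        raise ValueError(f"unexpected stats-xml file name: {stats_file!r}")
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return urlunsplit(("http", f"{host}:{port}", f"/stats-xml/{stats_file}", "", ""))


# Example host and port are placeholders, not real servers.
print(build_stats_xml_url("webhost.example.com", 8080, "nesstats.xml"))
```

Opening the resulting URL in a browser before creating the monitor is a quick way to confirm that the stats-xml service is enabled on the target server.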

Counters
This text box is read-only; its contents can be set only by clicking the choose counters link, which brings up the Browse counters page. Use the selection features on this page to expand or contract the counter tree and select the counters you want to monitor. An explanation of the counters available for the SunONE/iPlanet 6.0 server is found below. When you have selected the counters you want to monitor, click the Choose Counters button to record your selection.

The default timeout for retrieving the list of counters from the server (so that you can choose from them) is 120 seconds. If your server has an unusually large set of counters, retrieval might exceed this timeout. You can specify a different timeout period in the master.config file by adding the following line:

_sunOneMonitorGetBrowseTreeTimeout=nnn

where nnn is the number of seconds to wait before timing out (must be greater than 0).
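The rule for this setting (a positive number of seconds, with 120 used when the line is absent) can be sketched in Python as follows. The function parse_browse_tree_timeout is a hypothetical helper for checking a value before you edit master.config; it assumes simple name=value lines and does not reflect how SiteScope itself reads the file.

```python
SETTING = "_sunOneMonitorGetBrowseTreeTimeout"
DEFAULT_TIMEOUT_SECONDS = 120  # default stated in this guide


def parse_browse_tree_timeout(config_text: str) -> int:
    """Return the configured timeout in seconds, or the default if unset."""
    for line in config_text.splitlines():
        name, sep, value = line.strip().partition("=")
        if sep and name == SETTING:
            seconds = int(value)  # raises ValueError for non-numeric text
            if seconds <= 0:
                raise ValueError("timeout must be greater than 0")
            return seconds
    return DEFAULT_TIMEOUT_SECONDS


# Example: raise the timeout to five minutes (value chosen for illustration).
print(parse_browse_tree_timeout("_sunOneMonitorGetBrowseTreeTimeout=300"))
```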

Update every
Select how often the monitor should access the URL entered above. The default interval is to run or update the monitor once every 10 minutes. Use the drop-down list to the right of the text box to specify another update interval in increments of seconds, minutes, hours, or days. The update interval must be 15 seconds or longer.

Title
Enter a title text for this monitor. This text is displayed in the group detail page, in report titles, and other places in the SiteScope interface. If you do not enter a title text, SiteScope will create a title based on the host, server, or URL being monitored.

Advanced Options

The Advanced Options section presents a number of ways to customize monitor behavior and display. Use this section to customize error and warning thresholds, disable the monitor, set monitor-to-monitor dependencies, customize display options, and enter other monitor specific settings required for special infrastructure environments. The options for this monitor type are described below. Complete the entries as needed and click the Add or Update button to save the settings.

Disable
Check this box to temporarily disable this monitor and any associated alerts. To enable the monitor again, clear the box.

Timeout
The number of seconds that the monitor should wait for a response from the server before timing out. Once this time period passes, the monitor will log an error and report an error status.

Note: Depending on the activity on the server, building the server monitor statistics Web page may take more than 15 seconds. You should test the monitor with a Timeout value of more than 60 seconds to ensure that the server can build and serve the server monitor statistics Web page before the SiteScope monitor is scheduled to run again.

Verify Error
Check this box if you want SiteScope to automatically run this monitor again if it detects an error. When an error is detected, the monitor will immediately be scheduled to run again once.

Note: To change the run frequency of this monitor when an error is detected, use the Update Every (on error) option below.

Note: The status returned by the Verify Error run of the monitor will replace the status of the originally scheduled run that detected an error. This may cause the loss of important performance data if the data from the verify run is different than the initial error status.

Warning: Use of this option across many monitor instances may result in significant monitoring delays if multiple monitors are rescheduled to verify errors at the same time.

Update Every (on error)
Use this option to set a different monitoring interval for monitors that have registered an error condition. For example, you may want SiteScope to monitor this item every 10 minutes normally, but as often as every 2 minutes if an error has been detected. Note that this increased scheduling will also affect the number of alerts generated by this monitor.
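The effect of this option on scheduling can be pictured with the short Python sketch below, which uses the 10-minute and 2-minute values from the example above. The function next_run_delay and its parameter names are assumptions made for illustration; SiteScope's internal scheduler is not documented here.

```python
from typing import Optional


def next_run_delay(last_status: str,
                   update_every: int = 600,
                   update_every_on_error: Optional[int] = 120) -> int:
    """Return the number of seconds to wait before the next run.

    update_every_on_error=None models the option being left unset.
    """
    if last_status == "error" and update_every_on_error is not None:
        return update_every_on_error
    return update_every


print(next_run_delay("good"))   # normal interval, in seconds
print(next_run_delay("error"))  # shorter interval while in error
```

Because the monitor runs five times as often while it is in error in this example, any alert tied to each error run would also be evaluated five times as often.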

Schedule
By default, SiteScope monitors are enabled every day of the week. You may, however, schedule your monitors to run only on certain days or on a fixed schedule. Click the Edit schedule link to create or edit a monitor schedule. For more information about working with monitor schedules, see the section on Schedule Preferences for Monitoring.

Monitor Description
Enter additional information about this monitor. The Monitor Description can include HTML tags such as the <BR>, <HR>, and <B> tags to control display format and style. The description will appear on the Monitor Detail page.

Report Description
Enter an optional description for this monitor that will make it easier to understand what the monitor does, for example "network traffic" or "main server response time". This description will be displayed with each bar chart and graph in Management Reports and appended to the tool-tip displayed when you pass the mouse cursor over the status icon for this monitor on the monitor detail page.

Depends On
To make the running of this monitor dependent on the status of another monitor or monitor group, use the drop-down list to select the monitor on which this monitor is dependent. Select None to remove any dependency.

Depends Condition
If you choose to make the running of this monitor dependent on the status of another monitor, select the status condition that the other monitor or monitor group should have in order for the current monitor to run normally. The current monitor will be run normally as long as the monitor on which it depends reports the condition selected in this option.

List Order
By default, new monitors are listed last on the Monitor Detail page. You can use this drop-down list to choose a different placement for this monitor.

Setting Monitor Status Thresholds

SiteScope Application Monitors allow you to set multiple threshold conditions to determine the status reported by each monitor. The individual conditions are combined with a logical OR, so that when one or more of the conditions for a status (for example, any of the conditions for Error if) are met, the monitor status is set to that status. If conditions are met for more than one status (such as conditions for both error and warning), the monitor is set to the highest valued status. Thus a match of an error condition and a warning condition is reported as an error status, error being the highest value, warning the next highest, and good the lowest.

Error if
Use one or more of the selection boxes in this item to define one or more error conditions for this monitor. Use the drop-down lists in these items to change error threshold(s) relative to the counters you have selected to check with this monitor. After choosing a counter or parameter, use the comparison operator drop-down list to specify an error threshold such as: >= (greater than or equal to), != (not equal to), or < (less than) and enter a comparison value in the box provided. Comparison values should be entered as whole numbers.

Warning if
Use one or more of the selection boxes in this item to define one or more warning conditions for this monitor. Use the drop-down lists in these items to change warning threshold(s) relative to the counters you have selected to check with this monitor. Set these values relative to those you set for the error threshold in the Error if item.

Good if
You can set this monitor to return a good status for certain conditions. You may define those conditions here. Complete this item as you would for the Error if and Warning if items.
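The way threshold conditions combine (a logical OR within each status, with error outranking warning and warning outranking good) can be sketched in Python as follows. The function evaluate_status, the counter names, and the sample values are assumptions for illustration. The guide does not state which status results when no condition matches, so that fallback is left as a parameter here.

```python
import operator
from typing import Dict, List, Tuple

# Comparison operators of the kind offered in the threshold drop-down lists.
OPERATORS = {
    ">=": operator.ge, ">": operator.gt, "==": operator.eq,
    "!=": operator.ne, "<": operator.lt, "<=": operator.le,
}

# Highest valued status first: error, then warning, then good.
SEVERITY = ("error", "warning", "good")

Condition = Tuple[str, str, int]  # (counter name, operator, comparison value)


def evaluate_status(counters: Dict[str, int],
                    conditions: Dict[str, List[Condition]],
                    fallback: str = "good") -> str:
    """Return the highest valued status that has at least one matching condition."""
    for status in SEVERITY:
        for name, op, threshold in conditions.get(status, []):
            # Conditions within one status are ORed: the first match decides.
            if name in counters and OPERATORS[op](counters[name], threshold):
                return status
    return fallback  # assumption: behavior when nothing matches is not documented


counters = {"countThreadsIdle": 0, "countMisses": 150}
conditions = {
    "error": [("countThreadsIdle", "<", 1)],
    "warning": [("countMisses", ">=", 100)],
}
print(evaluate_status(counters, conditions))
```

In this example both the error and the warning condition match, so the sketch reports error, matching the precedence described above.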

SunONE Server Counters

After you have specified the target SunONE server to monitor, click the Browse Counters button. The stats-xml file you specified will be retrieved and parsed for all metrics listed in the file, and a browse tree will be displayed. See the Browsable Counters Utility help page for instructions on how to navigate this hierarchy tree and select your counters of interest. You will notice that certain counter names listed are "qualified" with an '@' sign. The reason is that these counters can occur in multiple instances, and without qualifying them by an attribute like "id" or "pid", the multiple counter instances would be indistinguishable.

Note: At this time there is limited support for servers with multiple running processes (that is, multiple occurring <process> elements in the returned stats-xml file). Due to the dynamic nature of processes and process IDs, the scheme mentioned above cannot be used to disambiguate process elements. Therefore only the first <process> element encountered in the stats-xml file will be monitored.
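The two behaviors described above (qualifying repeated elements by an attribute, and reading only the first <process> element) can be sketched in Python as follows. The sample XML is a made-up, simplified fragment rather than the real stats-xml schema, and the "element@id" naming used here is an assumed format for illustration, not necessarily the one SiteScope displays.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Hypothetical, simplified stats-xml content with two <process> elements.
SAMPLE = """<stats>
  <server id="https-test">
    <process pid="101">
      <thread-pool-bucket id="NativePool" countThreadsIdle="3" countThreads="8"/>
      <thread-pool-bucket id="CgiPool" countThreadsIdle="1" countThreads="2"/>
    </process>
    <process pid="202">
      <thread-pool-bucket id="NativePool" countThreadsIdle="0" countThreads="8"/>
    </process>
  </server>
</stats>"""


def collect_counters(xml_text: str) -> Dict[str, str]:
    """Map qualified counter names to values for the first <process> only."""
    root = ET.fromstring(xml_text)
    process = root.find(".//process")  # first <process> in document order
    counters: Dict[str, str] = {}
    if process is None:
        return counters
    for child in process:
        qualifier = child.get("id")
        # Qualify the element name so repeated elements stay distinguishable.
        label = f"{child.tag}@{qualifier}" if qualifier else child.tag
        for attr, value in child.attrib.items():
            if attr != "id":
                counters[f"process/{label}/{attr}"] = value
    return counters


for name, value in collect_counters(SAMPLE).items():
    print(name, "=", value)
```

Without the qualifier, the two thread-pool-bucket elements in the first process would produce identical counter names, which is the ambiguity the '@' qualification avoids.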

Derived Counters

If you expand the stats node of the browse counters hierarchy, you will see a Derived Counters node. When you expand this node you should see a list of "virtual" counters whose values are derived from stats-xml counters. These derived counters are provided for your convenience, and are defined as follows:

Cache table utilization
This counter may help determine the efficiency of your file cache, and is defined as: the number of entries currently in the file cache divided by the maximum number of file cache entries allowed. That is, process/cache-bucket/countEntries / process/cache-bucket/maxEntries

Cache heap utilization
This counter may help determine the efficiency of your file content cache, and is defined as: the current size of the file content cache heap divided by the maximum file content cache heap size. That is, process/cache-bucket/sizeHeapCache / process/cache-bucket/maxHeapCacheSize

Percent file cache hits
This counter may help determine the efficiency of your file cache, and is defined as: the number of successful file cache lookups divided by the total number of file cache lookup attempts. That is, process/cache-bucket/countHits / (process/cache-bucket/countMisses + process/cache-bucket/countHits)

Percent idle threads
This counter may help determine the efficiency of your thread pool, and is defined as: the number of request-processing threads currently idle divided by the total number of request-processing threads that currently exist on the system. That is, process/thread-pool-bucket/countThreadsIdle / process/thread-pool-bucket/countThreads

DNS cache utilization
This counter may help determine the efficiency of your DNS cache, and is defined as: the number of entries currently in DNS cache divided by the maximum number of entries that the cache can accommodate. That is, process/dns-bucket/countCacheEntries / process/dns-bucket/maxCacheEntries

Percent DNS cache hits
This counter may help determine the efficiency of your DNS cache, and is defined as: the number of successful DNS cache lookups divided by the total number of DNS cache lookup attempts. That is, process/dns-bucket/countCacheHits / (process/dns-bucket/countCacheMisses + process/dns-bucket/countCacheHits)

Percent DNS cache misses
This counter may help determine the efficiency of your DNS cache, and is defined as: the number of unsuccessful DNS cache lookups divided by the total number of DNS cache lookup attempts. That is, process/dns-bucket/countCacheMisses / (process/dns-bucket/countCacheMisses + process/dns-bucket/countCacheHits)

Cache memory utilization
This counter may help determine the efficiency of your memory mapped file content cache, and is defined as: the amount of address space currently used by the memory mapped file content cache divided by the maximum amount of address space that the file cache uses for memory mapped file content. That is, process/cache-bucket/sizeMmapCache / process/cache-bucket/maxMmapCacheSize

Percent file info cache hits
This counter may help determine the efficiency of your file information cache, and is defined as: the number of successful file information lookups divided by the total number of file information lookup attempts. That is, process/cache-bucket/countInfoHits / (process/cache-bucket/countInfoMisses + process/cache-bucket/countInfoHits)

Percent file content cache hits
This counter may help determine the efficiency of your file content cache, and is defined as: the number of successful file content lookups divided by the total number of file content lookup attempts. That is, process/cache-bucket/countContentHits / (process/cache-bucket/countContentMisses + process/cache-bucket/countContentHits)

At this time there is no support for adding new derived counter definitions.
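For reference, the arithmetic behind three of the derived counters defined above can be sketched in Python as follows. The sample values are invented, and the ratio helper returns None for a zero denominator as an assumption of this sketch. The definitions in this guide are plain ratios, so the sketch returns fractions; whether SiteScope scales them to percentages for display is not stated here.

```python
from typing import Dict, Optional


def ratio(numerator: float, denominator: float) -> Optional[float]:
    """Divide, returning None when the denominator is zero (sketch assumption)."""
    return numerator / denominator if denominator else None


def derived_counters(c: Dict[str, float]) -> Dict[str, Optional[float]]:
    """Compute a few derived counters from raw stats-xml counter values."""
    return {
        # process/cache-bucket/countEntries / process/cache-bucket/maxEntries
        "Cache table utilization": ratio(c["countEntries"], c["maxEntries"]),
        # countHits / (countMisses + countHits)
        "Percent file cache hits": ratio(c["countHits"],
                                         c["countMisses"] + c["countHits"]),
        # countThreadsIdle / countThreads
        "Percent idle threads": ratio(c["countThreadsIdle"], c["countThreads"]),
    }


# Invented sample values for illustration only.
sample = {
    "countEntries": 250, "maxEntries": 1000,
    "countHits": 900, "countMisses": 100,
    "countThreadsIdle": 3, "countThreads": 12,
}
for name, value in derived_counters(sample).items():
    print(name, "=", value)
```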