Monitoring your ArsDigita Community System installation

by Tracy Adams and Jon Salz

The Big Picture

The ArsDigita Community System has an integrated set of monitoring tools.

Parameters

Monitoring parameters are centralized in the monitoring section of the .ini file. Add a new PersontoNotify line for each person who should receive monitoring alerts.
[ns/server/yourservername/acs/monitoring]
; People to email for alerts
PersontoNotify=nerd1@yourservicename.com
;PersontoNotify=nerd2@yourservicename.com
; location of the watchdog perl script
WatchDogParser=/web/yourservicename/bin/aolserver-errors.pl
; watchdog frequency in minutes
WatchDogFrequency=15
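
To illustrate how repeated PersontoNotify lines add up to a list of recipients, here is a minimal Python sketch that reads a section like the one above. It is only an illustration: the ACS reads these values through the server's own configuration facilities, and the function name `read_monitoring_params` and the parsing rules used here (a leading `;` marks a comment, keys are compared case-insensitively) are assumptions made for this example.

```python
def read_monitoring_params(ini_text, section="ns/server/yourservername/acs/monitoring"):
    """Collect key/value pairs from one section, keeping repeated keys as lists."""
    params = {}
    in_section = False
    for raw_line in ini_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines and commented-out entries
        if line.startswith("[") and line.endswith("]"):
            in_section = (line[1:-1] == section)
            continue
        if in_section and "=" in line:
            key, value = line.split("=", 1)
            # Assumption: a repeated key adds another value instead of replacing it.
            params.setdefault(key.strip().lower(), []).append(value.strip())
    return params


SAMPLE = """\
[ns/server/yourservername/acs/monitoring]
; People to email for alerts
PersontoNotify=nerd1@yourservicename.com
PersontoNotify=nerd2@yourservicename.com
WatchDogFrequency=15
"""

params = read_monitoring_params(SAMPLE)
recipients = params.get("persontonotify", [])
frequency = int(params["watchdogfrequency"][0])
print(recipients, frequency)
```

A line that is still commented out with `;` contributes nothing, so uncommenting a second PersontoNotify line is all it takes to add a recipient in this sketch.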

Current page requests - monitor.tcl

The "current page request" section (linked from /admin/monitoring/) will produce a report like the following.

There are a total of 8 requests being served right now (to 8 distinct IP addresses). Note that this number seems to include only the larger requests. Smaller requests, e.g., for .html files and in-line images, seem to come and go too fast for this program to catch.
conn #  client IP        state    method  url                                         n seconds  bytes
17899   212.252.145.38   running  GET     /photo/pcd3255/chappy-store-31.4.jpg        59         158544
18185   38.27.213.213    running  GET     /wtr/thebook/html.html                      21         0
18247   171.210.228.91   running  GET     /photo/nikon/nikon-reviews.html             15         0
18367   209.86.54.190    running  GET     /bboard/image.tcl                           8          34228
18454   199.174.160.135  running  GET     /photo/pcd1669/treptower-big-view-51.4.jpg  1          34376
18464   207.100.29.220   running  ?       ?                                           1          0
18468   216.214.210.53   running  GET     /chat/js-refresh.tcl                        0          0
18481   216.34.106.252   running  GET     /monitor.tcl                                0          0

This report will inform you which users are waiting on pages from your server. In the report above, users asking for large images or pages are waiting. This is normal because some users have very slow connections.

If you see the same .tcl or .adp file often, especially with the longest wait times, it is likely that the script is extremely slow or is hogging database handles.

If you see a large number of requests from the same IP address, it is likely that a poorly designed spider is hammering your web service. To stop it, ban that IP address from your system.
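
The two checks described above (a slow script that keeps showing up, and many requests from one IP address) can be sketched in a few lines of Python. This is not part of monitor.tcl; the function name `summarize_requests`, the thresholds, and the sample rows are all made up for illustration.

```python
from collections import Counter


def summarize_requests(requests, ip_threshold=3, slow_seconds=10):
    """Return (busy_ips, slow_scripts) for a list of request dicts."""
    per_ip = Counter(r["ip"] for r in requests)
    # IP addresses with at least ip_threshold simultaneous requests.
    busy_ips = sorted(ip for ip, n in per_ip.items() if n >= ip_threshold)
    # .tcl/.adp pages that have been running for slow_seconds or longer.
    slow_scripts = sorted({
        r["url"] for r in requests
        if r["url"].endswith((".tcl", ".adp")) and r["seconds"] >= slow_seconds
    })
    return busy_ips, slow_scripts


# Made-up sample rows using documentation-range IP addresses, not real report data.
sample = [
    {"ip": "192.0.2.10", "url": "/bboard/image.tcl", "seconds": 42},
    {"ip": "192.0.2.10", "url": "/photo/index.html", "seconds": 1},
    {"ip": "192.0.2.10", "url": "/chat/js-refresh.tcl", "seconds": 0},
    {"ip": "198.51.100.7", "url": "/photo/big.jpg", "seconds": 59},
]
busy_ips, slow_scripts = summarize_requests(sample)
print(busy_ips, slow_scripts)
```

The large image that has been downloading for a long time is deliberately not flagged, since a slow client connection is the normal explanation for that case.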

Cassandracle (Oracle)

Cassandracle is a Web-based monitor for an Oracle installation. The goal is that, at a glance, a novice Oracle DBA ought to be able to identify problems and find pointers to relevant reference materials.

To use Cassandracle in your installation, you will need to give the web service's database user read access to some core Oracle tables.

  1. Log in to Oracle via sqlplus.
  2. Execute:
    SQL> connect internal
  3. Run the commands in /sql/cassandracle.sql
  4. Execute the following, where username is the web service's database user:
    SQL> grant ad_cassandracle to username;

Configuration

This is a simple section with information about the current machine and connection. The information provided is pretty sparse and should expand in the future.

WatchDog (Error log)

Every WatchDogFrequency minutes, the service's error logs will be scanned. If errors are found, they will be emailed to everyone configured as a PersontoNotify. The administration pages also have a tool to search the error log for errors.
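
The scan-then-notify cycle can be sketched in Python as follows. This is not the Perl script named by WatchDogParser and makes no claim about how that script works; the log line shape, the sample log lines, and the function names `scan_new_errors` and `build_alert` are assumptions for illustration only.

```python
import re

# Assumed log line shape: "[timestamp][pid.tid][thread] Severity: message"
ERROR_LINE = re.compile(r"^\[[^\]]*\]\[[^\]]*\]\[[^\]]*\] Error: ")


def scan_new_errors(log_text, start_offset=0):
    """Return (error_lines, new_offset) for the part of the log after start_offset."""
    new_part = log_text[start_offset:]
    errors = [line for line in new_part.splitlines() if ERROR_LINE.match(line)]
    return errors, len(log_text)


def build_alert(errors, recipients):
    """Return (recipients, subject, body), or None when there is nothing to report."""
    if not errors:
        return None
    subject = f"{len(errors)} error(s) in server log"
    return recipients, subject, "\n".join(errors)


# Invented sample log content.
log = (
    "[08/Mar/2000:12:00:01][123.4][-conn0-] Notice: starting\n"
    "[08/Mar/2000:12:00:02][123.4][-conn1-] Error: database handle timeout\n"
)
errors, offset = scan_new_errors(log)
alert = build_alert(errors, ["nerd1@yourservicename.com"])
later_errors, _ = scan_new_errors(log, offset)  # nothing new since the last scan
print(alert, later_errors)
```

Remembering the offset between runs is one simple way to avoid mailing the same error twice; a real watchdog would also need to handle log rotation.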

Registered Filters and Scheduled Procedures

The ad_register_filter and ad_schedule_proc procs are wrappers around the corresponding ns_ calls, which allow us to more carefully track what's happening on the server and when. /admin/monitoring/filters.tcl shows which filters are called for which URLs and methods, and /admin/monitoring/scheduled-procs.tcl shows which procedures are scheduled to be called in the future.
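
The wrapper idea can be shown with a small Python sketch: the wrapper records each registration in a list that a monitoring page could later display, then hands the call on to the underlying routine. The names `tracked_register_filter`, `low_level_register`, and `check_login` are hypothetical and do not correspond to the actual ACS or server procs.

```python
registered_filters = []  # bookkeeping that a monitoring page could display


def low_level_register(kind, method, url_pattern, proc):
    """Hypothetical stand-in for the server's own registration call."""
    return True


def tracked_register_filter(kind, method, url_pattern, proc):
    """Record the registration, then delegate to the low-level call."""
    registered_filters.append(
        {"kind": kind, "method": method, "url": url_pattern, "proc": proc.__name__}
    )
    return low_level_register(kind, method, url_pattern, proc)


def check_login(request):
    """Hypothetical filter procedure."""
    return "ok"


tracked_register_filter("preauth", "GET", "/admin/*", check_login)
tracked_register_filter("preauth", "POST", "/admin/*", check_login)

for entry in registered_filters:
    print(entry["method"], entry["url"], entry["proc"])
```

Because every registration passes through one function, the list always reflects what has been registered, which is what makes a report page over it possible.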
teadams@arsdigita.com, jsalz@mit.edu