Wednesday, March 26, 2008

OS Watcher (OSW) and Lite Onboard Monitor (LTOM)

While searching for something else, I came across Metalink Note 370936.1 (Previous Announcements from New in the Knowledge Base). What captured my attention was this:

December 12, 2007 - Oracle's Center of Expertise Releases New Documents

The following four new white papers have just been released by Oracle's Center of Expertise:

I checked the second and third white papers, both of which are written by Roger Snyde from Oracle Support's Center of Expertise. These white papers describe a tool called OS Watcher (OSW). Oracle Support's Center of Expertise developed OSWatcher, a script-based tool for Unix and Linux systems that runs a number of operating system monitoring utilities, such as vmstat, top, iostat, mpstat and ps, and archives their output.

OSWatcher is available from Metalink as note 301137.1. It is a shell-script tool that runs on Unix and Linux servers. It operates as a background process, running the native operating system utilities at a user-settable interval (30 seconds by default) and retaining an archive of the output for a user-settable period (48 hours by default). The retention period may be increased in order to keep more information when evaluating performance, and to capture baseline information during important cycle-end periods.
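To make the collect-and-retain cycle described above a bit more concrete, here is a minimal Python sketch of the same idea. It is not OSWatcher's actual code (OSWatcher is a set of shell scripts), and the function names `collect_once`, `purge_old` and `watch`, the hourly `.dat` file naming, and the timestamp marker are all my own assumptions for illustration. Only the defaults of 30 seconds and 48 hours come from the note.

```python
import subprocess
import time
from pathlib import Path


def collect_once(command, archive_dir, now=None):
    """Run one utility and append its timestamped output to an hourly file.

    The file naming scheme here is hypothetical, not OSWatcher's.
    """
    now = time.time() if now is None else now
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    hour_stamp = time.strftime("%Y%m%d_%H", time.localtime(now))
    out_file = archive_dir / f"{Path(command[0]).name}_{hour_stamp}.dat"
    result = subprocess.run(command, capture_output=True, text=True)
    with out_file.open("a") as fh:
        # Mark each sample so samples can be told apart later.
        fh.write("*** " + time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now)) + "\n")
        fh.write(result.stdout)
    return out_file


def purge_old(archive_dir, retention_hours, now=None):
    """Delete archive files last modified before the retention window."""
    now = time.time() if now is None else now
    cutoff = now - retention_hours * 3600
    removed = []
    for path in Path(archive_dir).glob("*.dat"):
        if path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)


def watch(commands, archive_dir, interval_seconds=30, retention_hours=48, cycles=None):
    """Sample every command each interval; cycles=None means run until stopped."""
    done = 0
    while cycles is None or done < cycles:
        for command in commands:
            collect_once(command, archive_dir)
        purge_old(archive_dir, retention_hours)
        done += 1
        if cycles is None or done < cycles:
            time.sleep(interval_seconds)
```

Something like `watch([["vmstat"], ["iostat"]], "archive")` would then sample both utilities every 30 seconds and keep two days of output, assuming those utilities are installed on the server. Raising `retention_hours` corresponds to the longer retention the note suggests for baselining.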

Oracle recommends customers download and install OSWatcher on all production and test servers that need to be monitored.

While going through note 301137.1, I found a mention of another tool called LTOM (the embedded Lite Onboard Monitor): To collect database metrics in addition to OS metrics, consider running LTOM. The Lite Onboard Monitor (LTOM) is a Java program designed as a real-time diagnostic platform for deployment to a customer site. LTOM differs from other support tools in that it is proactive rather than reactive. LTOM provides real-time automatic problem detection and data collection. It runs on the customer's UNIX server, is tightly integrated with the host operating system, and provides an integrated solution for detecting system performance issues and collecting trace files for them. The ability to detect problems and collect data in real time will hopefully reduce the time it takes to solve problems and reduce customer downtime.

Both OSW and LTOM now provide a graphing utility for the collected data. This greatly reduces the need to inspect all the output files manually.

Sample graph from OSW:

Sample graphs from LTOM:

