Friday, September 23, 2011

Automated nmon collection

I’m working with a customer and I get this question:
“My system is running slow, do you know why?”
“How do you know it’s running slow?”, I ask.
“I don't know, it just seems like it’s taking longer to do <whatever>”, they say.
“Do you have any trending that we can look at?”
“No – it was running fine. Why would I need that if everything was fine?”, they ask.
I shake my head and die a little inside…
After a few conversations like this I decided that, rather than leaving it up to the customer to collect performance statistics on a regular basis, I needed to do it for them. This gives me a few bits of information to use when they finally come to me with the inevitable performance problems and/or questions:
  • We have a baseline to work from!!!
  • We can determine what changed, if anything
  • It may only be in your head – we need to prove it’s not! (Some crazy people work on computers)
  • If there really is a problem, where do we even start looking?
I whipped up this simple script to make it a lot easier to collect that baseline. Nmon is now part of the AIX operating system and there is really no reason why you shouldn’t be using it to collect data. It’s a very good “whole system health” type monitoring tool that you can get down and dirty with if needed.
  1. Change the variables to valid values for you!
    1. REPORT_RCPT – Who is going to receive this analysis?
    2. COMPANY – If you are collecting this for multiple companies (I am) put the company name here so the reports make sense when you get them.
    3. REPORTS_TO_KEEP – How many days of reports are we going to keep on the local system (not in your email!)
  2. Schedule the script in cron.
    1. A lot of people like to schedule this for midnight, but that’s also when a lot of people schedule maintenance or backups, so a midnight start splits those periods of high activity across two reports. Consider scheduling it for a regular “slow” period instead, like when people leave work (17:00 to 19:00) or around 06:00, after the nightly processes have finished and before people come in and hit the system.
  3. Collect the reports it sends to you – DON’T DELETE THEM! Copy them to your system if you need to get them out of email. Remember this is data collection not data collect and toss!!!
  4. Analyze periodically so you have quantitative values and an idea of what your system is doing – how often depends on your environment.
  5. When you have a performance problem later – reference earlier reports to determine what changed.
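
For step 2, here is a minimal sketch of what the cron entry might look like. The script path `/usr/local/bin/nmon_collect.sh` is a hypothetical location (use wherever you saved the script), and 06:00 is just the "slow" period suggested above.

```shell
# Hypothetical install location for the collection script; adjust to your system.
SCRIPT=/usr/local/bin/nmon_collect.sh

# Fields: minute hour day-of-month month day-of-week command
# This runs the script every day at 06:00 and discards its output.
CRON_LINE="0 6 * * * $SCRIPT >/dev/null 2>&1"
echo "$CRON_LINE"

# To install it, append the line to the current user's crontab, for example:
#   (crontab -l 2>/dev/null; echo "$CRON_LINE") | crontab -
```

Keeping the entry in a variable makes it easy to eyeball before it goes anywhere near the real crontab.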


#!/usr/bin/ksh
export DATE=`date +%m%d%Y`
export CURRENT_DIR=`pwd`

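export REPORT_RCPT=""
export COMPANY="MyCompany"
# Days of reports to keep on the local system (example value - change it for you!)
export REPORTS_TO_KEEP=30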

# if the directory doesn't exist - create it

if [[ ! -d "/tmp/nmon" ]]; then
 mkdir /tmp/nmon
fi
cd /tmp/nmon

# Now let's get the one from yesterday and email it to where it needs to be
export NMON_FILES=`ls -ctl | awk '{print $9}' | grep nmon | grep -v gz`
for i in $NMON_FILES; do
 export NEW_FILENAME=`echo $COMPANY ${i} | awk '{print $1 "_" $2}'`
 sort $i > $NEW_FILENAME
 tar cvf - $NEW_FILENAME | gzip -c > $NEW_FILENAME.tar.gz
 # we have the file we need, now let's email it to whoever needs it
 uuencode $NEW_FILENAME.tar.gz $NEW_FILENAME.tar.gz | mail -s "$COMPANY nmon report for $DATE" $REPORT_RCPT
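 # Clean up just a bit - remove the raw file and the sorted copy so they
 # aren't picked up again tomorrow; only the .tar.gz stays behind
 rm -f $i $NEW_FILENAME
done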

find /tmp/nmon -type f -mtime +$REPORTS_TO_KEEP -exec rm -f {} \;

# Start NMON for the next day!
nmon -f -s 60 -c 1440

# Just in case you run it interactively - return to where you started
cd $CURRENT_DIR
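
The `-s 60 -c 1440` in the nmon line is one snapshot every 60 seconds, 1440 times, which works out to exactly 24 hours. If you want a different interval, here is a quick sketch of the arithmetic; `INTERVAL` and `WINDOW_HOURS` are illustrative names of mine, not anything nmon knows about.

```shell
# Work out how many snapshots nmon needs to cover a capture window.
INTERVAL=60        # seconds between snapshots (the -s value)
WINDOW_HOURS=24    # how long one capture should run

# snapshots = window in seconds / interval in seconds
COUNT=$(( WINDOW_HOURS * 3600 / INTERVAL ))

# Print the resulting command line rather than running it
echo "nmon -f -s $INTERVAL -c $COUNT"
```

With the values above this prints `nmon -f -s 60 -c 1440`, the same line the script uses. Drop `INTERVAL` to 30 and the count doubles to 2880 for the same 24 hours, at the cost of a bigger file to email.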

