Re: [redhat] utilization
- On Tue, Feb 24, 2009 at 10:21 AM, Scott <scottro@...> wrote:
> Jeff, I believe he means that they can't install software on the
> machines. I assumed he meant because of company rules.
> However, if you DO decide to go the nagios route, look at the CentOS
> wiki article on it by Max Hetrick. (Or Max's original articles on his
> The wiki version is here.
> Max's original is
> (He also has a good article on his site about nrpe, which can work with
> nagios to monitor Windows machines.)
Yeah, I know, but I thought I'd suggest it anyway. Nagios was a
lifesaver many times when I was an admin... mostly because the VP kept
the monitoring station at her house, so every time her cable went out
(frequently), I'd get a barrage of pages at 2 and 3 AM that made it
sound like our NOC was on fire and every server was gone... After
driving in to the office at 3 am TWICE (a 45 mile drive each way), I
found Nagios and never went back to the office outside my normal hours.
BUT that being said, I gave him a better way to get just the CPU and
MEM info from top as well, plus, if "can't install packages" means "We
can't install any 3rd party apps on our production servers" and not
"we can't install ANYTHING on our production servers" he should still
be able to install sysstat, at least on the RHEL 4 and RHEL 5 boxes as
it is part of the base distro (and really SHOULD have been
installed anyway in the first place).
Even if they don't want to use Nagios or some other centralized
monitoring software, installing sysstat on a server in a production
environment seems kinda critical to me, but that's just my humble
opinion, and I fully understand the insanity that is often handed down
by the people who don't actually have to maintain these things...
"Huh? What do you mean you have to reboot the server."
Well, there was a major security update to the kernel, so we need to
boot the new kernel.
"Kernel? What's that? No, we need uptime, you can't reboot the
servers to suit your whim."
Then later on...
"What do you mean the server was compromised? Don't we pay you to
keep that kind of thing from happening?"
*the previous re-enactment was real, the personal data was changed to
protect the innocent... [cue Dragnet theme]
And also, there is probably a FAR better way of extracting the data
from top than just using grep like I did, but that would require a bit
of awk and shell scripting that I don't want to worry about right now.
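For anyone who does want the awk version, here is a rough sketch of what it might look like. It assumes top prints header lines of the form "Cpu(s): ... 82.7%id, ..." and "Mem: ... 2016692k used, ..."; older top versions format these lines differently, so check the output on each box first. The function name summarize_top is made up for illustration.

```shell
#!/bin/sh
# Sketch: condense `top -b` output into one "cpu_idle% mem_used_kB" line
# per sample. Assumes "Cpu(s): ... NN.N%id, ..." and "Mem: ... NNNNk used, ..."
# header lines; summarize_top is a hypothetical helper name.
summarize_top() {
    awk '
        /^Cpu\(s\):/ {
            idle = ""
            # find the field that ends in "%id" and strip the suffix
            for (i = 2; i <= NF; i++)
                if ($i ~ /%id,?$/) { idle = $i; sub(/%id,?$/, "", idle) }
        }
        /^Mem:/ {
            used = ""
            # the value sits in the field just before the word "used"
            for (i = 2; i < NF; i++)
                if ($(i + 1) ~ /^used,?$/) { used = $i; sub(/k$/, "", used) }
            print idle, used
        }
    '
}

# Possible usage (runs for 3 hours, so not executed here):
#   top -b -d 5 -n 2160 | summarize_top > /tmp/topsummary.txt
```

Since it keeps only two numbers per sample, the result is easy to load into a spreadsheet or plot later.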
>> Hi Group,
>> Please suggest how can i check my cpu and ram utilization b/w 18:00 to
>> these are production server on rhel 4 rhel 3 and redhat 9,
>> we can not install any package on these machines
>> sar mpstat and iostat is not working
> Hi Jitin,
> You can try top command to record it into a file for this period like:-
> #top -b -d 5 -n 2160 >/tmp/loadreport.txt
> you can schedule this command to run at 18:00,
> Explanation of the options:-
> -b option forces your top command to run in batch mode
> -d 5 will run the top command each 5 seconds
> -n 2160 this will run the top command 2160 times that is equal to 3 hrs.
> Now you need to analyze this /tmp/loadreport.txt file.
> Rajveer Singh
AHEM.... fixed the top posting. Please do not top post (I am not a
moderator, but I play one on other groups...)
The top option works well, but you also mentioned sar, iostat and
mpstat and said that they are not working... what do you mean that
they are not working?
Do they error out when you run them, or do you just not have them
installed? All three of those commands are part of the sysstat
package which is available at least on RHEL 5, though you may not have
it installed.
sysstat also provides sadf and the sa1/sa2 collection scripts.
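A quick way to tell "not installed" apart from "installed but erroring out" might be something like the sketch below. The function name check_sysstat is made up; it only uses `command -v` and `rpm -q`, and it assumes an RPM-based box.

```shell
#!/bin/sh
# Sketch: report whether the sysstat tools are on the PATH and whether
# the sysstat package is installed. check_sysstat is a hypothetical name.
check_sysstat() {
    for cmd in sar iostat mpstat; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd: found at $(command -v "$cmd")"
        else
            echo "$cmd: not found in PATH"
        fi
    done
    # rpm -q prints the package version, or says it is not installed
    if command -v rpm >/dev/null 2>&1; then
        rpm -q sysstat || true
    fi
    return 0
}

# Possible usage:
#   check_sysstat
```

If the commands are found but still fail, the error message they print is the next thing to look at.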
As for the top command listed above, you can do that, but the problem
is you'll end up with the info you want, and possibly a few thousand
lines of extra info depending on how many processes are running. Top
will dump the utilization as well as info for every PID active.
If you want to go the top route, and don't need the extra data, you
probably will want to trim the output a bit like this:
top -b -d 5 -n 2160 |egrep -A 2 Cpu > /tmp/topreport.txt
which should give you something like this in your report file:
Cpu(s): 14.6%us, 1.9%sy, 0.0%ni, 82.7%id, 0.3%wa, 0.1%hi, 0.3%si, 0.0%st
Mem: 2066664k total, 2016692k used, 49972k free, 220760k buffers
Swap: 3132632k total, 32296k used, 3100336k free, 617068k cached
Cpu(s): 24.6%us, 2.0%sy, 0.0%ni, 71.4%id, 0.0%wa, 1.5%hi, 0.5%si, 0.0%st
Mem: 2066664k total, 2017244k used, 49420k free, 220760k buffers
Swap: 3132632k total, 32296k used, 3100336k free, 617084k cached
Cpu(s): 12.0%us, 1.5%sy, 0.0%ni, 86.5%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 2066664k total, 2017236k used, 49428k free, 220760k buffers
Swap: 3132632k total, 32296k used, 3100336k free, 617088k cached
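Once you have a report file in that format, a small awk pass can boil it down to a sample count, an average, and a peak. This is only a sketch: it treats "busy" as 100 minus the %id figure, and cpu_stats is a made-up name.

```shell
#!/bin/sh
# Sketch: summarize the "Cpu(s):" lines of a trimmed top report.
# "busy" is taken as 100 - idle; cpu_stats is a hypothetical helper name.
cpu_stats() {
    awk '
        /^Cpu\(s\):/ {
            for (i = 2; i <= NF; i++)
                if ($i ~ /%id,?$/) {
                    # awk uses the leading number of a field like "82.7%id,"
                    busy = 100 - $i
                    sum += busy
                    n++
                    if (busy > peak) peak = busy
                }
        }
        END {
            if (n) printf "samples=%d avg=%.1f peak=%.1f\n", n, sum / n, peak
        }
    ' "$@"
}

# Possible usage:
#   cpu_stats /tmp/topreport.txt
```

The same idea would work for the Mem: and Swap: lines by matching on "used" instead of "%id".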
A much more elegant solution may be for you to use Nagios to monitor
You can use Nagios to check CPU load, memory usage, disk usage, and a
WHOLE LOT of other things on remote systems, PLUS it gives you pretty
It's used by a lot of people for monitoring server farms and is very
configurable, and in the past I've found it to be quite reliable.
PLUS, given your 1800 - 2100 timeframe, you can also use it to check
other things like network usage and individual services to see if you
are getting increased traffic during those times, having an inordinate
number of web servers spawn during those hours, etc...
Hope that helps
Thanks dear, I could resolve this issue with your help only.
As suggested, I used the below script in cron:
top -b -d 5 -n 2160 |egrep -A 2 Cpu > /tmp/`"date"`
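One possible refinement: the default output of date contains spaces, which makes for an awkward file name. A small wrapper script could build a space-free, sortable name instead. This is only a sketch; the script path /usr/local/bin/topreport.sh, the "run" argument, and the helper name report_path are all made up for illustration.

```shell
#!/bin/sh
# Sketch of a wrapper script (hypothetical path: /usr/local/bin/topreport.sh).

# Print a space-free, sortable report path inside directory $1,
# in the form <dir>/topreport-YYYYMMDD-HHMM.txt
report_path() {
    echo "$1/topreport-$(date +%Y%m%d-%H%M).txt"
}

# Collect only when called with "run", so the file can be sourced safely.
if [ "${1:-}" = "run" ]; then
    top -b -d 5 -n 2160 | egrep -A 2 Cpu > "$(report_path /tmp)"
fi

# Possible crontab entry (every day at 18:00):
#   0 18 * * * /usr/local/bin/topreport.sh run
```

Keeping the date format inside the script also avoids having to escape the % characters, which cron treats specially in crontab lines.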