Re: [redhat] utilization

  • Jeff Lane
    Message 1 of 6 , Feb 24, 2009
      On Tue, Feb 24, 2009 at 10:21 AM, Scott <scottro@...> wrote:

      > Jeff, I believe he means that they can't install software on the
      > machines. I assumed he meant because of company rules.
      >
      > However, if you DO decide to go the nagios route, look at the CentOS
      > wiki article on it by Max Hetrick. (Or Max's original articles on his
      > pages.)
      >
      > The wiki version is here.
      > http://wiki.centos.org/HowTos/Nagios
      >
      > Max's original is
      >
      > http://www.maxsworld.org/index.php/how-tos/nagios
      >
      > (He also has a good article on his site about nrpe, which can work with
      > nagios to monitor Windows machines.)

      Yeah, I know, but I thought I'd suggest it anyway. Nagios was a
      lifesaver many times when I was an admin... mostly because the VP kept
      the monitoring station at her house, so every time her cable went out
      (frequently), I'd get a barrage of pages at 2 and 3 AM that made it
      sound like our NOC was on fire and every server was gone... After
      driving in to the office at 3 am TWICE (a 45 mile drive each way), I
      found Nagios and never went back to the office outside my normal hours
      again ;-)

      BUT that being said, I gave him a better way to get just the CPU and
      MEM info from top as well, plus, if "can't install packages" means "We
      can't install any 3rd party apps on our production servers" and not
      "we can't install ANYTHING on our production servers" he should still
      be able to install sysstat, at least on the RHEL 4 and RHEL 5 boxes, as
      it is part of the base distro (and really SHOULD have been installed
      in the first place).

      Even if they don't want to use Nagios or some other centralized
      monitoring software, installing sysstat on a server in a production
      environment seems kinda critical to me, but that's just my humble
      opinion, and I fully understand the insanity that is often handed down
      by the people who don't actually have to maintain these things...

      "Huh? What do you mean you have to reboot the server."
      Well, there was a major security update to the kernel, so we need to
      boot the new kernel.
      "Kernel? What's that? No, we need uptime, you can't reboot the
      servers to suit your whim."
      Ummm....

      Then later on...
      "What do you mean the server was compromised? Don't we pay you to
      keep that kind of thing from happening?"

      *the previous re-enactment was real, the personal data was changed to
      protect the innocent... [cue Dragnet theme]

      And also, there is probably a FAR better way of extracting the data
      from top than just using grep like I did, but that would require a bit
      of awk and shell scripting that I don't want to worry about right now
      ;-)
    • jitin
      Message 2 of 6 , Mar 2, 2009
        >> Hi Group,
        >>
        >> Please suggest how can i check my cpu and ram utilization b/w 18:00 to
        >> 21:00
        >>
        >> these are production server on rhel 4 rhel 3 and redhat 9,
        >>
        >> we cannot install any packages on these machines
        >>
        >> sar, mpstat and iostat are not working
        >>
        >> jitin

        > Hi Jitin,
        >
        > You can try top command to record it into a file for this period like:-
        >
        > #top -b -d 5 -n 2160 >/tmp/loadreport.txt
        > you can schedule this command to run at 18:00,
        >
        > Explanation of the options:-
        > -b runs top in batch mode, so its output can be redirected to a file
        > -d 5 sets the delay between samples to 5 seconds
        > -n 2160 takes 2160 samples, which at 5 seconds each covers 3 hrs.
        >
        > Now you need to analyze this /tmp/loadreport.txt file.
        >
        > Thanks,
        > Rajveer Singh
        >

        AHEM.... fixed the top posting. Please do not top post (I am not a
        moderator, but I play one on other groups...)

        The top option works well, but you also mentioned sar, iostat and
        mpstat and said that they are not working... what do you mean that
        they are not working?

        Do they error out when you run them, or do you just not have them
        installed? All three of those commands are part of the sysstat
        package which is available at least on RHEL 5, though you may not have
        it installed.

        sysstat also provides sadf, the sadc collector and the sa1/sa2 scripts.
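
A quick way to tell "not installed" apart from "installed but failing" is to check whether the commands are on the PATH at all. The sketch below is only an illustration: the function name `report_tools` is made up for this example, and the `rpm -q` step assumes an RPM-based system such as the RHEL boxes discussed here.

```shell
#!/bin/sh
# report_tools (hypothetical helper): for each command name given,
# print whether it can be found on the PATH.
report_tools() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: found"
        else
            echo "$tool: missing"
        fi
    done
}

# The three commands mentioned in the thread.
report_tools sar iostat mpstat

# On an RPM-based system, also ask the package database directly.
if command -v rpm >/dev/null 2>&1; then
    rpm -q sysstat || echo "sysstat package is not installed"
fi
```

If the tools are present and data collection has been running, something like `sar -u -s 18:00:00 -e 21:00:00` should limit the CPU report to the 18:00 to 21:00 window, though the exact options available depend on the sysstat version shipped with each release.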

        As for the top command listed above, you can do that, but the problem
        is you'll end up with the info you want, and possibly a few thousand
        lines of extra info depending on how many processes are running. Top
        will dump the utilization as well as info for every PID active.

        If you want to go the top route, and don't need the extra data, you
        probably will want to trim the output a bit like this:

        top -b -d 5 -n 2160 |egrep -A 2 Cpu > /tmp/topreport.txt

        which should give you something like this in your report file:

        Cpu(s): 14.6%us, 1.9%sy, 0.0%ni, 82.7%id, 0.3%wa, 0.1%hi, 0.3%si, 0.0%st
        Mem: 2066664k total, 2016692k used, 49972k free, 220760k buffers
        Swap: 3132632k total, 32296k used, 3100336k free, 617068k cached
        --
        Cpu(s): 24.6%us, 2.0%sy, 0.0%ni, 71.4%id, 0.0%wa, 1.5%hi, 0.5%si, 0.0%st
        Mem: 2066664k total, 2017244k used, 49420k free, 220760k buffers
        Swap: 3132632k total, 32296k used, 3100336k free, 617084k cached
        --
        Cpu(s): 12.0%us, 1.5%sy, 0.0%ni, 86.5%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
        Mem: 2066664k total, 2017236k used, 49428k free, 220760k buffers
        Swap: 3132632k total, 32296k used, 3100336k free, 617088k cached
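
A report in this shape can be boiled down further with a little awk. What follows is a rough sketch, not a tested tool: it assumes the exact line layout shown above (`14.6%us`, `2016692k used`), and older top versions, such as those on RHEL 3 or Red Hat 9, may print these fields differently and need other patterns. The function name `summarize_top` and the CSV column names are made up for this example.

```shell
#!/bin/sh
# summarize_top (hypothetical helper): read a report produced by
#   top -b -d 5 -n 2160 | egrep -A 2 Cpu
# and print one CSV row per sample.
summarize_top() {
    awk '
        # Return the text matched by re, minus its last "tail" characters.
        function grab(re, tail) {
            if (match($0, re)) return substr($0, RSTART, RLENGTH - tail)
            return ""
        }
        BEGIN { print "user_pct,sys_pct,idle_pct,mem_used_kb" }
        # Remember the CPU figures until the matching Mem line arrives.
        /Cpu\(s\):/ {
            us = grab("[0-9.]+%us", 3)
            sy = grab("[0-9.]+%sy", 3)
            id = grab("[0-9.]+%id", 3)
        }
        # The Mem line completes one sample, so emit the row here.
        /Mem:/ {
            print us "," sy "," id "," grab("[0-9]+k used", 6)
        }
    ' "$@"
}
```

It reads from the files named on the command line, or from standard input if none are given, for example `summarize_top /tmp/topreport.txt > /tmp/topreport.csv`.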

        A much more elegant solution may be for you to use Nagios to monitor
        your systems.

        You can use Nagios to check CPU load, memory usage, disk usage, and a
        WHOLE LOT of other things on remote systems, PLUS it gives you pretty
        output too.

        http://nagios.sourceforge.net

        It's used by a lot of people for monitoring server farms and is very
        configurable and in the past I've found it to be quite reliable.

        PLUS, given your 1800 - 2100 timeframe, you can also use it to check
        other things like network usage and individual services to see if you
        are getting increased traffic during those times, having an inordinate
        number of web servers spawn during those hours, etc...

        Hope that helps

        Jeff


        Hi Jeff


        Thanks, I was able to resolve this issue with your help.
        As suggested, I used the script below in cron:


        top -b -d 5 -n 2160 |egrep -A 2 Cpu > /tmp/`"date"`
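
One thing to watch with a file name built from plain `date` output is that the default format contains spaces and colons, which is awkward in a redirect target. A date format such as `+%Y%m%d` avoids that. Note also that `%` has a special meaning inside a crontab line, so it is usually easier to put the command in a small script and have cron call the script. The following is a sketch under those assumptions; the function names, the `topreport-YYYYMMDD.txt` naming scheme and the script path in the comment are all hypothetical.

```shell
#!/bin/sh
# report_path (hypothetical helper): build a per-day report file name,
# e.g. /tmp/topreport-20090302.txt. The directory defaults to /tmp.
report_path() {
    echo "${1:-/tmp}/topreport-$(date +%Y%m%d).txt"
}

# record_top (hypothetical helper): sample top every 5 seconds,
# 2160 times (3 hours), keeping only each Cpu line and the two
# lines that follow it. grep -A is a GNU grep option.
record_top() {
    top -b -d 5 -n 2160 | grep -A 2 Cpu > "$(report_path)"
}

# Only start recording when asked to, so the file can also be sourced.
if [ "${1:-}" = "--run" ]; then
    record_top
fi

# Example crontab entry to start at 18:00 every day
# (the script location is an assumption):
# 0 18 * * * /usr/local/bin/record_top.sh --run
```

Each day then gets its own file, so one evening's run does not overwrite the previous one.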



        jitin