
CPU Utilization at 100% with Postfix running

  • Patrick - South Valley Internet
    Message 1 of 11, Oct 4, 2006
      Hi all,

      Starting around Saturday afternoon, our mail server has been having some
      issues. It seems our mail server has been getting around 400 emails a
      minute, which I thought wasn't bad at all. Because of this, or because
      of something else, CPU utilization on the server has been maxed out to
      100% (~8% system, 92% user). If I stop Postfix, of course the system
      goes to under 30% utilization.

      We are running AIX 5.2 on this machine, an RS6000 with 4 GB of RAM.
      No configuration changes have been made at all.

      I've run iostat, pstat -s, and svmon, and everything looks fine. There
      doesn't appear to be anything wrong with the disks, and there are
      no errors in the error log. The only thing I can find is that topas
      (top) shows the server taking anywhere between 800 and 1100 page
      faults.

      Is 400 emails a minute a reasonable amount of email for a mail server to
      handle?
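
For scale, the quoted rate can be converted to per-second and per-day figures. This is plain arithmetic on the 400 messages per minute mentioned above; the variable names are only illustrative.

```python
# Convert the rate quoted in the post into per-second and per-day figures.
MESSAGES_PER_MINUTE = 400  # figure from the post

per_second = MESSAGES_PER_MINUTE / 60
per_day = MESSAGES_PER_MINUTE * 60 * 24

print(f"{per_second:.1f} messages/second")
print(f"{per_day:,} messages/day")
```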

      Does anyone have any suggestions as to what I can do? Here's our main.cf:

      --------------------------------------------------------------------------------------------
      queue_directory = /spool-mqueue
      command_directory = /usr/local/postfix/bin
      daemon_directory = /usr/local/postfix/bin
      mail_owner = postfix
      myhostname = xxx.xxx.com
      myorigin = $mydomain
      inet_interfaces = all
      mydestination = $myhostname, localhost.$mydomain, /etc/postfix/local.domains
      unknown_local_recipient_reject_code = 550
      mynetworks = xxx.xxx.xxx.xxx/24, xxx.xxx.xxx.xxx/18

      virtual_alias_maps = dbm:/etc/postfix/virtual
      virtual_alias_domains = $virtual_alias_maps

      transport_maps = dbm:/etc/postfix/transport
      alias_maps = dbm:/etc/postfix/aliases, dbm:/etc/postfix/majordomo.aliases
      alias_database = dbm:/etc/postfix/aliases,
          dbm:/etc/postfix/majordomo.aliases
      recipient_delimiter = +

      mail_spool_directory = /var/spool/mail

      fast_flush_domains = $relay_domains, msn.com

      smtpd_banner = $myhostname ESMTP -- NO UCE ALLOWED

      local_destination_concurrency_limit = 2
      default_destination_concurrency_limit = 50
      default_destination_recipient_limit = 50

      debug_peer_level = 2

      debugger_command =
          PATH=/bin:/usr/bin:/usr/local/bin:/usr/X11R6/bin
          xxgdb $daemon_directory/$process_name $process_id & sleep 5

      sendmail_path = /usr/local/postfix/bin

      newaliases_path = /usr/local/postfix/bin

      mailq_path = /usr/local/postfix/bin

      setgid_group = mail

      manpage_directory = /usr/local/man

      sample_directory = /usr/local/postfix/sample

      readme_directory = no


      enable_sasl_authentication = no
      smtpd_recipient_restrictions =
          reject_invalid_hostname,
          reject_non_fqdn_hostname,
          reject_non_fqdn_sender,
          reject_unknown_sender_domain,
          reject_unknown_recipient_domain,
          permit_mynetworks,
          reject_unauth_destination,
          check_recipient_access dbm:/etc/postfix/smtp-rcpt-access,
          # check_helo_access dbm:/etc/postfix/helo_checks,
          # check_sender_access dbm:/etc/postfix/sender_checks,
          # check_client_access dbm:/etc/postfix/client_checks,
          # check_client_access pcre:/etc/postfix/client_checks.pcre,
          reject_rbl_client relays.ordb.org,
          reject_rbl_client list.dsbl.org,
          reject_rbl_client sbl.spamhaus.org,
          reject_rbl_client cbl.abuseat.org,
          reject_rbl_client dul.dnsbl.sorbs.net,
          permit

      smtpd_data_restrictions =
          reject_unauth_pipelining,
          permit

      allow_mail_to_files = alias,forward,include


      allow_mail_to_commands = alias,forward,include

      bounce_size_limit = 100000

      command_time_limit = 10000

      default_process_limit = 300

      deliver_lock_attempts = 5

      deliver_lock_delay = 1s

      duplicate_filter_limit = 1000

      fork_attempts = 5

      fork_delay = 1s

      ipc_idle = 1000s

      ipc_timeout = 3600s

      header_size_limit = 102400

      header_address_token_limit = 10240

      line_length_limit = 2048

      message_size_limit = 25600000

      qmgr_message_active_limit = 10000

      qmgr_message_recipient_limit = 10000

      queue_minfree = 38400000

      stale_lock_time = 500s

      transport_retry_time = 60s

      smtpd_helo_required = yes

      disable_vrfy_command = yes



      smtpd_sender_restrictions = reject_unknown_sender_domain,
          check_sender_access dbm:/etc/postfix/spammers,
          #reject_unverified_sender,
          reject_non_fqdn_sender



      smtpd_sasl_auth_enable = no
      broken_sasl_auth_clients=yes
      smtpd_sasl_security_options = noanonymous

      smtpd_sasl_local_domain = $mydomain

      smtpd_use_tls = yes
      smtpd_tls_key_file = /etc/postfix/xxx.xxx.com.key
      smtpd_tls_cert_file = /etc/postfix/xxx.xxx.com.crt

      body_checks = regexp:/etc/postfix/body_checks.regexp
      body_checks_size_limit = 100000

      virtual_mailbox_lock = dotlock

      swap_bangpath = yes

      address_verify_sender = postmaster+verify3@...
      address_verify_poll_count = 2
      address_verify_poll_delay = 10s
      address_verify_map = btree:/work/postfix/verify
      address_verify_positive_expire_time = 2d
      address_verify_positive_refresh_time = 2d
      address_verify_negative_cache = no
      address_verify_negative_expire_time = 1h
      address_verify_negative_refresh_time = 1h
      address_verify_transport_maps = $transport_maps
      address_verify_relayhost = $relayhost
      address_verify_default_transport = $default_transport
      address_verify_relay_transport = $relay_transport
      address_verify_virtual_transport = $virtual_transport
      address_verify_local_transport = $local_transport

      smtpd_client_connection_count_limit = 300
      smtpd_client_connection_rate_limit = 0
      smtpd_client_connection_limit_exceptions = $mynetworks

      maximal_queue_lifetime = 3d

      smtp_mx_session_limit=100
      smtp_destination_concurrency_limit=300
      maximal_backoff_time = 1000s
      minimal_backoff_time = 300s


      mailbox_size_limit = 73000000

      bounce_queue_lifetime=0

      --------------------------------------------------------------------------------------------

      Thanks in advance.

      Patrick
    • Wietse Venema
      Message 2 of 11, Oct 4, 2006
        Patrick - South Valley Internet:
        > Is 400 emails a minute a reasonable amount of email for a mail server to
        > handle?
        ...
        > body_checks = regexp:/etc/postfix/body_checks.regexp
        > body_checks_size_limit = 100000

        Depends on the patterns in that file.

        Why did you increase the body_checks_size_limit setting? It is
        a great CPU sink.

        Wietse
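
To illustrate why a larger body_checks_size_limit costs CPU: body_checks patterns are applied line by line to the inspected part of each message, so the work grows with both the number of patterns and the amount of text inspected. The sketch below is a simplified model, not Postfix code. The function name, the 20 never-matching patterns, and the synthetic 80-character lines are all made up for illustration. 51200 is the Postfix default for body_checks_size_limit, and 100000 is the value from the posted main.cf.

```python
import re

def count_regex_evaluations(body, patterns, size_limit):
    """Count pattern-vs-line match attempts over the first size_limit characters.

    Simplified model: each line is tried against the patterns in order,
    and the first pattern that matches ends the search for that line.
    """
    compiled = [re.compile(p) for p in patterns]
    evaluations = 0
    for line in body[:size_limit].splitlines():
        for rx in compiled:
            evaluations += 1
            if rx.search(line):
                break  # first match wins for this line
    return evaluations

# Hypothetical input: 2000 lines of 80 characters (79 "x" plus a newline),
# checked against 20 hypothetical patterns that never match.
body = ("x" * 79 + "\n") * 2000
patterns = [f"spam-phrase-{i}" for i in range(20)]

default_cost = count_regex_evaluations(body, patterns, 51200)
posted_cost = count_regex_evaluations(body, patterns, 100000)
print(default_cost, posted_cost)
```

In this model the match attempts per message roughly double when the limit goes from 51200 to 100000. Real cost also depends on how expensive each pattern is, which is why the answer "depends on the patterns in that file".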
      • Patrick - South Valley Internet
        Message 3 of 11, Oct 4, 2006
          I didn't - the person before me did.

          What should I change this to? Should I leave it at the default?

          Did you notice anything else wrong with my setup?

          Thanks for the response Wietse.

          Patrick




        • Wietse Venema
          Message 4 of 11, Oct 4, 2006
            Patrick - South Valley Internet:
            > Is 400 emails a minute a reasonable amount of email for a mail server to
            > handle?
            > ...
            > body_checks = regexp:/etc/postfix/body_checks.regexp
            > body_checks_size_limit = 100000

            Wietse:
            > Depends on the patterns in that file.
            >
            > Why did you increase the body_checks_size_limit setting? It is
            > a great CPU sink.

            Patrick - South Valley Internet:
            > I didn't - the person before me did.
            >
            > What should I change this to? Should I leave it at the default?

            If your "top" says that cleanup processes use up all the CPU, then
            that would definitely help.

            > Did you notice anything else wrong with my setup?

            I did not look further. Excessive use of body checks is a common
            cause of CPU load.

            Wietse
          • Patrick - South Valley Internet
            Message 5 of 11, Oct 4, 2006
              Thanks again Wietse. Unfortunately, that didn't solve the problem.

              We were running an older version of Postfix - 2.0.18. I upgraded it to
              2.3.3, and I saw the CPU usage drop by half, but it's still being
              overloaded.

              Someone here at work mentioned that a few years back someone changed an
              entry in the /etc/passwd file which caused Postfix to freak out like
              this. I don't see how that could do anything, so I haven't pursued this
              road.

              Could something else aside from Postfix be causing Postfix to eat up all
              the available CPU resources?

              Patrick





            • Wietse Venema
              Message 6 of 11, Oct 4, 2006
                Patrick - South Valley Internet:
                > Thanks again Wietse. Unfortunately, that didn't solve the problem.
                >
                > We were running an older version of Postfix - 2.0.18. I upgraded it to
                > 2.3.3, and I saw the CPU usage drop by half, but it's still being
                > overloaded.
                >
                > Someone here at work mentioned that a few years back someone changed an
                > entry in the /etc/passwd file which caused Postfix to freak out like
                > this. I don't see how that could do anything, so I haven't pursued this
                > road.
                >
                > Could something else aside from Postfix be causing Postfix to eat up all
                > the available CPU resources?

                Pardon my rudeness, but have you considered doing more detailed
                measurements than the total system CPU load?

                You could try to find out which Postfix processes eat up the CPU.
                Postfix is not a monolith like Sendmail.

                Wietse
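
One way to act on this suggestion is to total CPU usage per command name rather than looking at the system-wide figure. The sketch below assumes a `ps` that can print a two-column listing of CPU percentage and command name (for example via POSIX-style `ps -eo pcpu,comm`; check your platform's man page). The function name and the sample listing are hypothetical and are not output from the server in this thread.

```python
from collections import defaultdict

def cpu_by_command(ps_output):
    """Sum the %CPU column per command name from a two-column ps listing."""
    totals = defaultdict(float)
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split(None, 1)
        if len(fields) != 2:
            continue  # ignore malformed rows
        try:
            pcpu = float(fields[0])
        except ValueError:
            continue  # ignore rows without a numeric CPU figure
        totals[fields[1].strip()] += pcpu
    return dict(totals)

# Hypothetical sample listing, for illustration only.
SAMPLE = """\
%CPU COMMAND
 2.5 smtpd
 3.0 smtpd
 2.0 proxymap
 2.1 named
 2.4 qpopper
"""

usage = cpu_by_command(SAMPLE)
for command, pcpu in sorted(usage.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{pcpu:5.1f}  {command}")
```

Many instances at 2-3% each can add up to most of the CPU, which a per-process view hides and a per-command total makes visible.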
              • Patrick - South Valley Internet
                Message 7 of 11, Oct 4, 2006
                  It appears that disk utilization bounces anywhere between 40% and
                  70% while Postfix is running.

                  As far as processes go, I have the following processes running, with
                  multiple instances of them, all taking up ~2-3% CPU:

                  named
                  qpopper
                  proxymap
                  smtpd
                  imapd

                  This is a dual-CPU machine, and both CPUs show 80-95% utilization.

                  There is 91% free paging space.

                  This machine has 4 GB of RAM, with ~67% utilized by 'Client'.

                  I get roughly 700 - 1000 page faults.

                  I rebooted the machine this morning, which didn't fix it. Upgrading
                  from 2.0.18 to 2.3.3 didn't fix it either.

                  Someone around the office mentioned that something like this happened a
                  few years back. They said it might have something to do with somebody
                  editing the /etc/passwd file and not using smitty. That doesn't make
                  much sense, but I thought I would throw it in there in case you might
                  have any insight on this.

                  Thanks again for the help.

                  Patrick


                  Wietse Venema wrote:
                  > What is the bottleneck?
                  >
                  > What has 100% utilization?
                  >
                  > Is it the disk?
                  >
                  > Is it memory?
                  >
                  > Is it the CPU?
                  >
                  > If CPU load is an issue:
                  >
                  >
                  >>> You could try to find out which Postfix processes eat up the CPU.
                  >>> Postfix is not a monolith like Sendmail.
                  >>>
                  >
                  > If it is none of the above, perhaps something is waiting for stuff
                  > to come across a network.
                  >
                  > Wietse
                  >
                  >
                  >
                  >
                • Wietse Venema
                  Message 8 of 11, Oct 4, 2006
                    Patrick - South Valley Internet:
                    > It appears as if the hard disk bounces anywhere between 40% and 70%
                    > while Postfix is running.

                    That would certainly contribute to the problem.

                    If this is file I/O, then you need to reduce the mail load or
                    increase the speed of the disk.

                    If this is swapping I/O, then you need to reduce the number of
                    processes (main.cf:default_process_limit) or increase the amount
                    of memory.

                    Wietse
                  • Patrick - South Valley Internet
                    Message 9 of 11, Oct 5, 2006
                      I just wanted to update everyone on this problem.

                      We were having issues with our mail server, with the CPU at 100%
                      utilization. The box is an RS6000 running AIX 5.2 with 4 GB of RAM.

                      The problem was that someone manually edited the /etc/passwd file
                      instead of using SMIT. AIX keeps /etc/passwd in a database. When you
                      edit the file directly, the database is marked as invalid and the
                      system falls back to parsing the text file to find users. This
                      causes CPU load to skyrocket.

                      To fix this, we issued a 'mkpasswd -f' and everything was fine.
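
As a rough illustration of why the fallback is expensive, the sketch below compares a linear scan of passwd-style text with an indexed lookup. It is a generic model, not a description of AIX internals, which this thread does not detail. The helper names and the generated entries are hypothetical; the count of 4000 matches the number of accounts reported later in the thread.

```python
def make_passwd_lines(count):
    """Generate hypothetical passwd-style entries: user0, user1, ..."""
    return [f"user{i}:!:{1000 + i}:100::/home/user{i}:/bin/ksh" for i in range(count)]

def lookup_by_scan(lines, name):
    """Find a user by scanning every line; returns (entry, lines_examined)."""
    for examined, line in enumerate(lines, start=1):
        if line.split(":", 1)[0] == name:
            return line, examined
    return None, len(lines)

def build_index(lines):
    """Build a name -> entry mapping once, so later lookups avoid the scan."""
    return {line.split(":", 1)[0]: line for line in lines}

lines = make_passwd_lines(4000)
index = build_index(lines)

entry, examined = lookup_by_scan(lines, "user3999")
print(f"scan examined {examined} lines")
print(f"index lookup found the same entry: {index['user3999'] == entry}")
```

With a scan, every user lookup touches up to 4000 lines, and a busy mail server performs such lookups constantly. With an index, each lookup costs about the same no matter how many accounts exist.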

                      Thanks to all who emailed me and helped. You guys are awesome!

                      Patrick




                    • lst_hoe01@kwsoft.de
                      Message 10 of 11, Oct 5, 2006
                        Quoting Patrick - South Valley Internet <patrickm@...>:

                        > I just wanted to update everyone on this problem.
                        >
                        > We were having issues with our mail server with the CPU being
                        > utilized 100%. The box is an RS6000 running AIX 5.2 with 4gb ram.
                        >
                        > The problem was that someone manually edited the /etc/passwd file
                        > instead of using SMIT. AIX keeps /etc/passwd in a database. When you
                        > edit the file then the database is marked as invalid and the system
                        > does text file parsing to find users. This will cause CPU load to
                        > skyrocket.
                        >
                        > To fix this, we issued a 'mkpasswd -f' and everything was fine.


                        Hmm. I guess this is only true if you have a lot of accounts in your
                        passwd file, isn't it? Parsing a text file with some 100 accounts
                        should not be that expensive.

                        Just curious

                        Andreas
                      • Patrick - South Valley Internet
                        Message 11 of 11, Oct 5, 2006
                          Yes, we have about 4000 accounts on that machine.

                          Patrick


