Re: performance problems

  • Jeremie CEINTREY
    Message 1 of 19 , Apr 1, 2012
      Thank you very much for your explanations.

      I'm going to test with smtpd_client_connection_count_limit = 1

      Three days ago I added smtpd_client_connection_rate_limit = 10, which limits the number of connections from a client to 10 per time unit; the time unit is 60s by default.
      I noticed that it works well and slows down big mailers. As you wrote, when a mailing list campaign was in progress, I could see hundreds of mails arriving from a single domain with tail -f /var/log/mail.log | grep cleanup:

      tail -f /var/log/mail.log | grep 'postfix/cleanup.*@domain_of_big_mailer'
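
A rough way to turn that live view into a per-domain tally, as a minimal sketch. It assumes syslog-style Postfix log lines on stdin in which cleanup records a message-id=<...> field. The function name count_cleanup_domains is made up for illustration, and the Message-ID domain is only a proxy for the sending domain.

```shell
# Hypothetical helper: tally postfix/cleanup log lines by Message-ID domain.
# Reads log lines on stdin and prints "count domain", busiest first.
count_cleanup_domains() {
    grep 'postfix/cleanup' |
        sed -n 's/.*message-id=<[^>]*@\([^>]*\)>.*/\1/p' |
        sort | uniq -c | sort -rn |
        awk '{ print $1, $2 }'
}
```

Possible usage: count_cleanup_domains < /var/log/mail.log | head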

      Still, I'm going to test with smtpd_client_connection_count_limit = 1, which looks similar to the smtpd_client_connection_rate_limit and smtpd_client_message_(rate|count)_limit parameters.

      I will report back in about a week.





      From: "Stan Hoeppner" <stan@...>
      To: "Jeremie CEINTREY" <jeremie.ceintrey@...>
      Cc: postfix-users@...
      Sent: Saturday, March 31, 2012 19:16:57
      Subject: Re: performance problems

      On 3/30/2012 2:40 AM, Jeremie CEINTREY wrote:

      Due to lack of any evidence I'll have to make some coarse educated
      guesses here.

      > I often encounter performance problems with Postfix and mailing lists.

      Mailing list servers.  That implies rapid and likely parallel delivery
      over a short interval.

      > I have a Postfix mail relay filtering mail traffic for about 12,000 mailboxes. The mailboxes are hosted on other mail servers.
      >
      > I also use amavisd-new with SpamAssassin and ClamAV.
      > The Postfix version is 2.8, and I use it with a postscreen configuration.

      This is likely not the problem.

      > My problem is that I often see the mail queue grow to 7000 messages, and it takes about 2 hours to deliver them. I don't understand why it is so slow. When I look at the senders with qshape -s active, I see who is sending mail, and it is always mailing lists, not spam traffic.

      Let's assume you don't have any obvious Postfix mis-configuration.

      So let's think this through logically, step by step.  The first and
      obvious issue is that mail is coming in faster than it can go out which
      is why the queue is filling up.  So we need to identify the cause of the
      slow outbound queue.  A couple of possible causes:

      1.  Downstream servers are limiting your delivery rate
      2.  Your storage system doesn't have enough IOPS performance to handle
          both the incoming write load *and* outgoing read load

      #1 should be identifiable by lots of premature disconnects and/or 4xx
      rejections from downstream servers.  If this is indeed the problem you
      can have admins of those servers make necessary changes, or you can
      create individual relay transports configured with delays.
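
One way to check for cause #1 in the logs, as a minimal sketch. It assumes syslog-style Postfix log lines on stdin in which the smtp delivery agent records relay=... and status=deferred fields. The function name count_deferred_by_relay is made up for illustration. A deferred status covers temporary 4xx replies as well as lost or refused connections, so a relay with a high count is only a candidate for throttling, not proof of it.

```shell
# Hypothetical helper: count deferred outbound deliveries per downstream relay.
# Reads log lines on stdin and prints "count relay", worst relay first.
count_deferred_by_relay() {
    grep 'postfix/smtp\[' |
        grep 'status=deferred' |
        sed -n 's/.*relay=\([^,]*\),.*/\1/p' |
        sort | uniq -c | sort -rn |
        awk '{ print $1, $2 }'
}
```

Possible usage: count_deferred_by_relay < /var/log/mail.log | head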

      #2 can be fixed by replacing the queue disk with a faster device with
      more IOPS capability, either a disk with higher RPM, an SSD, or multiple
      disks in a striped RAID configuration, accomplished either with a
      software RAID driver or with a hardware RAID controller.

      Both #1 and #2 can usually easily be fixed by limiting the rate of
      incoming mail, specifically by reducing the number of allowed parallel
      connections.  Both mailing lists and bulk mailers tend to send a large
      volume of mail in a short period of time.  The sending MTAs tend to do
      this by opening multiple connections to your MTA, Postfix in this case,
      which allows 50 connections per client by default.  To limit the number
      of inbound parallel connections, you would change the value of the
      following parameter

      smtpd_client_connection_count_limit

      from the default of 50 to something much lower, such as 1.  This will
      limit the number of connections from mailing list servers, bulk mailers,
      and spammers, while having no negative impact on normal inbound mail
      flow.  Regular/normal mail delivery is usually accomplished with one
      email being sent over a single connection which is then closed upon
      successful delivery.  With 50 open connections, a sending MTA such as a
      mailing list server, could potentially send hundreds of messages per
      second.  If your disk isn't fast enough, which is likely your problem,
      it will spend all of its IOPS writing the inbound mail to the queue,
      which takes precedence over reads, causing read starvation and thus slow
      delivery.  This is likely why your queue piles up and takes so long to
      drain.

      The net effect is that instead of messages piling up in your queue,
      they wait in the outbound queue of the sending MTA.  You receive them
      at a sane rate, which keeps your own queue from backing up and
      delaying delivery.
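
For reference, the change described above might look like the following main.cf fragment. This is a sketch only: the file path is the common default and may differ on your system, and a value of 1 is the suggestion made in this thread, not a universal recommendation.

```
# /etc/postfix/main.cf (assumed default location; adjust for your install)
# Allow at most one simultaneous connection per remote SMTP client.
# The Postfix default is 50.
smtpd_client_connection_count_limit = 1
```

After editing, "postfix reload" makes Postfix pick up the new value, and "postconf smtpd_client_connection_count_limit" shows the value now in main.cf.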

      An almost identical question came up on this list about a month or so
      ago.  Setting smtpd_client_connection_count_limit=1 solved the OP's
      queue problem.  It'll probably solve yours as well.

      Hope this information is helpful.  Please let us know if this fixes your
      problem.

      --
      Stan

    • Stan Hoeppner
      Message 2 of 19 , Apr 2, 2012
        On 4/2/2012 1:51 AM, Jeremie CEINTREY wrote:
        > Thank you very much for your explanations.
        >
        > I'm going to test with smtpd_client_connection_count_limit = 1
        >
        > Three days ago I added smtpd_client_connection_rate_limit = 10, which limits the number of connections from a client to 10 per time unit; the time unit is 60s by default.
        > I noticed that it works well and slows down big mailers. As you wrote, when a mailing list campaign was in progress, I could see hundreds of mails arriving from a single domain with tail -f /var/log/mail.log | grep cleanup:
        >
        > tail -f /var/log/mail.log | grep 'postfix/cleanup.*@domain_of_big_mailer'
        >
        > Still, I'm going to test with smtpd_client_connection_count_limit = 1, which looks similar to the smtpd_client_connection_rate_limit and smtpd_client_message_(rate|count)_limit parameters.

        smtpd_client_connection_count_limit tends to only slow down bulk mailers
        and not 'normal' non-bulk mailers, which is why I recommended it.

        smtpd_client_connection_rate_limit and
        smtpd_client_message_(rate|count)_limit will delay delivery from
        'normal' mailers on occasion, possibly very frequently. This is a
        negative side effect most would want to avoid. This type of restriction
        should be configured only on a domain or IP subnet basis so you only
        affect the bulk mailers. Postfix doesn't have an inbuilt way to do so.
        These settings are global. Thus, if you want to use this type of rate
        delay you would want to use an add-on policy daemon. The policy daemon
        method has a downside: it requires an smtpd process for each connection
        to be delayed, eating extra system resources.

        Setting smtpd_client_connection_count_limit also sets
        postscreen_client_connection_count_limit if you're using postfix 2.8 and
        postscreen. Thus the limit is enforced before connections are handed to
        smtpd processes, so you don't needlessly eat up additional smtpds.

        Thus, it's much simpler and more effective to use
        smtpd_client_connection_count_limit to achieve your goal, without
        multiple unwanted side effects.

        --
        Stan
      • Wietse Venema
        Message 3 of 19 , Apr 3, 2012
          Stan Hoeppner:
          > Setting smtpd_client_connection_count_limit also sets
          > postscreen_client_connection_count_limit if you're using postfix 2.8 and
          > postscreen. Thus the limit is enforced before connections are handed to
          > smtpd processes, so you don't needlessly eat up additional smtpds.

          Note that postscreen either blocks a client or hands it off to a
          Postfix SMTP server process. The connection count limit in postscreen
          applies only to the SMTP clients that are (not yet) handed off to
          an SMTP server process. Once the hand-off is done, postscreen does
          not know when an SMTP session ends, so the session no longer counts
          towards the postscreen connection count limit. The code was tricky
          enough that I did not want to introduce a postscreen-to-anvil
          dependency.

          The postscreen connection count limit is still effective for "hit
          and run" spambots that make a burst of connections at approximately
          the same time. Such clients will exceed the connection limit while
          waiting for the pregreet timer to expire, or for DNS[BW]L lookups
          to complete.

          Wietse
        • Stan Hoeppner
          Message 4 of 19 , Apr 3, 2012
            On 4/3/2012 10:27 AM, Wietse Venema wrote:
            > Stan Hoeppner:
            >> Setting smtpd_client_connection_count_limit also sets
            >> postscreen_client_connection_count_limit if you're using postfix 2.8 and
            >> postscreen. Thus the limit is enforced before connections are handed to
            >> smtpd processes, so you don't needlessly eat up additional smtpds.
            >
            > Note that postscreen either blocks a client or hands it off to a
            > Postfix SMTP server process. The connection count limit in postscreen
            > applies only to the SMTP clients that are (not yet) handed off to
            > an SMTP server process. Once the hand-off is done, postscreen does
            > not know when an SMTP session ends, so the session no longer counts
            > towards the postscreen connection count limit. The code was tricky
            > enough that I did not want to introduce a postscreen-to-anvil
            > dependency.

            Ahh, thanks for the clarification Wietse. The
            smtpd_client_connection_count_limit is still enforced against post hand
            off client connections though, correct?

            > The postscreen connection count limit is still effective for "hit
            > and run" spambots that make a burst of connections at approximately
            > the same time. Such clients will exceed the connection limit while
            > waiting for the pregreet timer to expire, or for DNS[BW]L lookups
            > to complete.

            So the postscreen connection limit is good for slowing bots, no surprise
            since bots are the postscreen target, but the smtpd connection limit is
            still appropriate/needed for slowing legit bulk mailer clients, assuming
            one chooses to use it vs the other anvil based restrictions.

            --
            Stan
          • Wietse Venema
            Message 5 of 19 , Apr 3, 2012
              Stan Hoeppner:
              > So the postscreen connection limit is good for slowing bots, no surprise
              > since bots are the postscreen target, but the smtpd connection limit is
              > still appropriate/needed for slowing legit bulk mailer clients, assuming
              > one chooses to use it vs the other anvil based restrictions.

              Correct. postscreen by design has no effect on known, non-bot, clients.

              Wietse