smtp_connection_cache_on_demand

  • Peter Rabbitson
    Message 1 of 11 , May 1, 2007
      Hi,

      I am experiencing 421 errors between my secondary and primary MXes, and
      it seems they are caused by a lack of connection caching.
      http://www.postfix.org/postconf.5.html#smtp_connection_cache_on_demand
      does not explain what "high volume of mail in the active queue" means.
      When exactly is connection caching activated?

      Thanks

      Peter

      P.S. I know that I can permanently enable connection caching on the
      secondary MX when talking to the primary but I am trying to keep the
      configuration concise.
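
      For reference, a minimal sketch of the explicit caching mentioned in
      the P.S. It assumes no relayhost is configured, in which case
      postconf(5) describes the entries of smtp_connection_cache_destinations
      as recipient domain names. The two domains below are hypothetical
      placeholders, not values from this thread.

      ```
      # /etc/postfix/main.cf (sketch; example.com and example.net are
      # placeholders for the relayed domains)
      smtp_connection_cache_on_demand = yes
      smtp_connection_cache_destinations = example.com, example.net
      ```

      Run "postfix reload" afterwards so the change takes effect.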
    • Wietse Venema
      Message 2 of 11 , May 1, 2007
        Peter Rabbitson:
        > Hi,
        >
        > I am experiencing 421 errors between my secondary and primary MXes, and
        > it seems it is cause by lack of connection caching.

        What is the error message?

        > http://www.postfix.org/postconf.5.html#smtp_connection_cache_on_demand
        > misses to explain what is "high volume of mail in the active queue".
        > When is exactly connection caching activated?

        Roughly, it is activated when the active queue contains another
        message before the current delivery is completed.

        Wietse

        > Thanks
        >
        > Peter
        >
        > P.S. I know that I can permanently enable connection caching on the
        > secondary MX when talking to the primary but I am trying to keep the
        > configuration concise.
        >
        >
      • Peter Rabbitson
        Message 3 of 11 , May 1, 2007
          Wietse Venema wrote:
          > Peter Rabbitson:
          >> Hi,
          >>
          >> I am experiencing 421 errors between my secondary and primary MXes, and
          >> it seems it is cause by lack of connection caching.
          >
          > What is the error message?
          >

          There is no error message as such, see below.

          >> http://www.postfix.org/postconf.5.html#smtp_connection_cache_on_demand
          >> misses to explain what is "high volume of mail in the active queue".
          >> When is exactly connection caching activated?
          >
          > Roughly, it is activated when the active queue contains another
          > message before the current delivery is completed.
          >

          If the primary MX is down for an extended period of time and a large
          queue accumulates on the backup, all messages are rushed to the primary
          MX in what seem to be separate SMTP connections. At least I was able to
          count as many smtp processes in `ps` as 2/3 of the number of queued
          messages, right after I issued `postfix flush`. If I specify explicit
          caching for the particular mx host, things work as expected. I guess
          there is not enough time for the caching on demand to activate when
          doing a flush or having enough queued messages to simulate one.

          Peter
        • Wietse Venema
          Message 4 of 11 , May 1, 2007
            Peter Rabbitson:
            > Wietse Venema wrote:
            > > Peter Rabbitson:
            > >> Hi,
            > >>
            > >> I am experiencing 421 errors between my secondary and primary MXes, and
            > >> it seems it is cause by lack of connection caching.
            > >
            > > What is the error message?
            >
            > There is no error message as such, see below.

            Any reasonable SMTP server sends 421 followed by some text that
            explains why it hangs up.

            Having looked at the text below, I think your problem is that
            you are making an insane number of SIMULTANEOUS connections
            to the primary MX host.

            > >> http://www.postfix.org/postconf.5.html#smtp_connection_cache_on_demand
            > >> misses to explain what is "high volume of mail in the active queue".
            > >> When is exactly connection caching activated?
            > >
            > > Roughly, it is activated when the active queue contains another
            > > message before the current delivery is completed.
            >
            > If the primary MX is down for an extended period of time and a large
            > queue accumulates on the backup, all messages are rushed to the primary
            > MX in what it seems separate smtp connections. At least I was able to
            > count as many smtp processes in `ps` as 2/3 of the number of queued
            > messages, right after I issue `postfix flush`. If I specify explicit
            > caching for the particular mx host, things work as expected. I guess
            > there is not enough time for the caching on demand to activate when
            > doing a flush or having enough queued messages to simulate one.

            This is not a surprise.

            If the number of SIMULTANEOUS connections is 2/3 the number of
            queued messages, then most connections will never be reused because
            the mail is already delivered.

            I suggest that you revert to no more than 10-20 SIMULTANEOUS
            connections to the primary MX (or to any machine).

            /etc/postfix/main.cf:
            smtp_destination_concurrency_limit=20
            relay_destination_concurrency_limit=20

            If you do that, not only will the primary MX perform better, you
            will also see connection reuse happen automatically.

            Wietse
          • Noel Jones
            Message 5 of 11 , May 1, 2007
              At 02:12 PM 5/1/2007, Peter Rabbitson wrote:

              >If the primary MX is down for an extended period of time and a large
              >queue accumulates on the backup, all messages are rushed to the primary
              >MX in what it seems separate smtp connections. At least I was able to
              >count as many smtp processes in `ps` as 2/3 of the number of queued
              >messages, right after I issue `postfix flush`. If I specify explicit
              >caching for the particular mx host, things work as expected. I guess
              >there is not enough time for the caching on demand to activate when
              >doing a flush or having enough queued messages to simulate one.
              >
              >Peter

              Sounds as if you have a large *_destination_concurrency_limit set on
              the backup, and the primary isn't able to gracefully handle that many
              connections. Don't do that.
              http://www.postfix.org/TUNING_README.html#rope

              --
              Noel Jones
            • Peter Rabbitson
              Message 6 of 11 , May 5, 2007
                Wietse Venema wrote:
                > Peter Rabbitson:
                >> Wietse Venema wrote:
                >>> Peter Rabbitson:
                >>>> Hi,
                >>>>
                >>>> I am experiencing 421 errors between my secondary and primary MXes, and
                >>>> it seems it is cause by lack of connection caching.
                >>> What is the error message?
                >> There is no error message as such, see below.
                >
                > Any reasonable SMTP server sends 421 followed by some text that
                > explains why it hangs up.

                I apologize, I thought that 421 in the MTA world was as self-explanatory
                as, say, 403 in the HTTP world.

                > Having looked at the text below, I think your problem is that
                > you are making an insane number of SIMULTANEOUS connections
                > to the primary MX host.

                This is correct.

                >>>> http://www.postfix.org/postconf.5.html#smtp_connection_cache_on_demand
                >>>> misses to explain what is "high volume of mail in the active queue".
                >>>> When is exactly connection caching activated?
                >>> Roughly, it is activated when the active queue contains another
                >>> message before the current delivery is completed.
                >> If the primary MX is down for an extended period of time and a large
                >> queue accumulates on the backup, all messages are rushed to the primary
                >> MX in what it seems separate smtp connections. At least I was able to
                >> count as many smtp processes in `ps` as 2/3 of the number of queued
                >> messages, right after I issue `postfix flush`. If I specify explicit
                >> caching for the particular mx host, things work as expected. I guess
                >> there is not enough time for the caching on demand to activate when
                >> doing a flush or having enough queued messages to simulate one.
                >
                > This is not a surprise.
                >
                > If the number of SIMULTANEOUS connections is 2/3 the number of
                > queued messages, then most connections will never be reused because
                > the mail is already delivered.
                >
                > I suggest that you revert to no more than 10-20 SIMULTANEOUS
                > connections to the primary MX (or to any machine).
                >
                > /etc/postfix/main.cf:
                > smtp_destination_concurrency_limit=20
                > relay_destination_concurrency_limit=20

                I never changed the defaults for those (postconf -n follows at the end
                of the message)

                > If you do that, not only will the primary MX perform better, you
                > will also see connection reuse happen automatically.
                >

                I did more testing, using explicit smtp_connection_cache_destinations
                and I still had the same experience. Rereading the documentation for the
                n-th time I noticed the following in several places:

                (in reference to *_destination_recipient_limit)
                Setting this parameter to a value of 1 changes the meaning of
                *_destination_concurrency_limit from concurrency per domain into
                concurrency per recipient.

                Does this by chance mean that *_destination_concurrency_limit refers to
                individual _domains_ and not individual MTAs? I am relaying mail for 6
                domains, all having the same primary MX (which is the one getting badly
                hammered after being down for a while).

                Thanks for the help

                -------

                postconf -n
                Arx:/etc/postfix# postconf -n
                address_verify_map = btree:/var/cache/postfix/verify.db
                address_verify_negative_cache = yes
                address_verify_negative_expire_time = 1d
                address_verify_negative_refresh_time = 1h
                address_verify_poll_count = 2
                address_verify_poll_delay = 2s
                address_verify_positive_expire_time = 31d
                address_verify_positive_refresh_time = 7d
                alias_database = $alias_maps
                alias_maps = hash:/etc/aliases
                append_dot_mydomain = no
                backwards_bounce_logfile_compatibility = no
                biff = no
                bounce_queue_lifetime = 12h
                bounce_size_limit = 20000
                config_directory = /etc/postfix
                hash_queue_depth = 1
                hash_queue_names = ''
                in_flow_delay = 0
                inet_interfaces = all
                inet_protocols = ipv4
                mailbox_command = procmail -a "$EXTENSION"
                mailbox_size_limit = 0
                maximal_queue_lifetime = 7d
                message_size_limit = 0
                minimal_backoff_time = 15m
                mydestination = $mydomain, localhost.$mydomain, localhost
                mydomain = rabbit.us
                myhostname = arx.rabbit.us
                mynetworks = 127.0.0.0/8 192.168.13.0/24 10.0.13.0/24 $inet_interfaces
                myorigin = $mydomain
                queue_directory = /var/spool/postfix
                queue_minfree = 1000000
                recipient_delimiter = +
                relay_domains = <6 relayed domains withheld, all with same primary MX>
                smtp_bind_address = 68.251.127.6
                smtp_connect_timeout = 5s
                smtp_connection_reuse_time_limit = 5m
                smtp_helo_timeout = 1m
                smtp_mail_timeout = 1m
                smtp_mx_address_limit = 0
                smtp_quit_timeout = 10s
                smtp_skip_quit_response = yes
                smtpd_authorized_verp_clients = $mynetworks
                smtpd_banner = $myhostname ESMTP $mail_name (Debian/GNU)
                smtpd_delay_reject = yes
                smtpd_error_sleep_time = 3s
                smtpd_hard_error_limit = 20
                smtpd_junk_command_limit = 20
                smtpd_recipient_limit = 200
                smtpd_recipient_restrictions = permit_mynetworks
                    reject_unauth_destination reject_unknown_recipient_domain
                    reject_unverified_recipient
                smtpd_sender_restrictions = reject_unknown_sender_domain
                smtpd_soft_error_limit = 5
                smtpd_timeout = 30s
                syslog_name = postfix
                Arx:/etc/postfix#
              • Wietse Venema
                Message 7 of 11 , May 5, 2007
                  Peter Rabbitson:
                  > Wietse Venema wrote:
                  > > Peter Rabbitson:
                  > >> Wietse Venema wrote:
                  > >>> Peter Rabbitson:
                  > >>>> Hi,
                  > >>>>
                  > >>>> I am experiencing 421 errors between my secondary and primary MXes, and
                  > >>>> it seems it is cause by lack of connection caching.
                  > >>> What is the error message?
                  > >> There is no error message as such, see below.
                  > >
                  > > Any reasonable SMTP server sends 421 followed by some text that
                  > > explains why it hangs up.
                  >
                  > I apologize, I thought that 421 in the MTA world is as self-explanatory
                  > as say 403 in the http world.

                  What is the output of

                  grep concurrency_limit /etc/postfix/main.cf
                  grep recipient_limit /etc/postfix/main.cf

                  Wietse
                • Wietse Venema
                  Message 8 of 11 , May 5, 2007
                    Peter Rabbitson:
                    > > Having looked at the text below, I think your problem is that
                    > > you are making an insane number of SIMULTANEOUS connections
                    > > to the primary MX host.
                    >
                    > This is correct.

                    OK, if you are sending lotsa different domains to the same primary
                    MX, try reducing the process limit for the Postfix relay transport
                    in master.cf to say 20 and then "postfix reload".

                    That limits the total number of backup-to-primary connections for
                    the domains combined.

                    Wietse
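
                    A minimal sketch of what that master.cf change might look
                    like, assuming the stock "relay" entry that runs the smtp
                    client. The chroot column and any -o overrides differ
                    between distributions, so only the maxproc column (the
                    seventh field) is meant to be illustrative here.

                    ```
                    # /etc/postfix/master.cf (sketch; keep your existing values
                    # in the other columns and change only maxproc)
                    # service type  private unpriv  chroot  wakeup  maxproc command + args
                    relay     unix  -       -       -       -       20      smtp
                    ```

                    After editing, "postfix reload" applies the new limit.
                    Because all relayed domains share the relay transport,
                    this caps their combined number of delivery processes,
                    whereas *_destination_concurrency_limit applies to each
                    domain separately.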
                  • Peter Rabbitson
                    Message 9 of 11 , May 5, 2007
                      Wietse Venema wrote:
                      > Peter Rabbitson:
                      >> Wietse Venema wrote:
                      >>> Peter Rabbitson:
                      >>>> Wietse Venema wrote:
                      >>>>> Peter Rabbitson:
                      >>>>>> Hi,
                      >>>>>>
                      >>>>>> I am experiencing 421 errors between my secondary and primary MXes, and
                      >>>>>> it seems it is cause by lack of connection caching.
                      >>>>> What is the error message?
                      >>>> There is no error message as such, see below.
                      >>> Any reasonable SMTP server sends 421 followed by some text that
                      >>> explains why it hangs up.
                      >> I apologize, I thought that 421 in the MTA world is as self-explanatory
                      >> as say 403 in the http world.
                      >
                      > What is the output of
                      >
                      > grep concurrency_limit /etc/postfix/main.cf
                      > grep recipient_limit /etc/postfix/main.cf
                      >
                      > Wietse

                      Arx:~# grep concurrency_limit /etc/postfix/main.cf
                      Arx:~#
                      Arx:~# grep recipient_limit /etc/postfix/main.cf
                      smtpd_recipient_limit = 200
                      Arx:~#
                      Arx:~# grep _limit /etc/postfix/main.cf
                      bounce_size_limit = 20000
                      smtp_connection_cache_reuse_limit = 100
                      smtp_connection_reuse_time_limit = 5m
                      smtp_mx_address_limit = 0
                      smtpd_soft_error_limit = 5
                      smtpd_hard_error_limit = 20
                      smtpd_junk_command_limit = 20
                      smtpd_recipient_limit = 200
                      mailbox_size_limit = 0
                      message_size_limit = 0
                      Arx:~#
                    • Peter Rabbitson
                      Message 10 of 11 , May 5, 2007
                        Wietse Venema wrote:
                        > Peter Rabbitson:
                        >>> Having looked at the text below, I think your problem is that
                        >>> you are making an insane number of SIMULTANEOUS connections
                        >>> to the primary MX host.
                        >> This is correct.
                        >
                        > OK, if you are sending lotsa different domains to the same primary
                        > MX, try reducing the process limit for the Postfix relay transport
                        > in master.cf to say 20 and then "postfix reload".
                        >
                        > That limits the total number of backup-to-primary connections for
                        > the domains combined.
                        >

                        Understood. Are there plans to make all concurrency settings mx-aware
                        instead of domain-based as it is now? Thanks for the help!

                        Peter
                      • Wietse Venema
                        Message 11 of 11 , May 5, 2007
                          Peter Rabbitson:
                          > Wietse Venema wrote:
                          > > Peter Rabbitson:
                          > >>> Having looked at the text below, I think your problem is that
                          > >>> you are making an insane number of SIMULTANEOUS connections
                          > >>> to the primary MX host.
                          > >> This is correct.
                          > >
                          > > OK, if you are sending lotsa different domains to the same primary
                          > > MX, try reducing the process limit for the Postfix relay transport
                          > > in master.cf to say 20 and then "postfix reload".
                          > >
                          > > That limits the total number of backup-to-primary connections for
                          > > the domains combined.
                          > >
                          >
                          > Understood. Are there plans to make all concurrency settings mx-aware
                          > instead of domain-based as it is now? Thanks for the help!

                          I have 14MB of plans sitting in the inbox. It's unlikely they will all
                          be completed (or that all of them should be).

                          Wietse