smtpd processes congregating at the pub

  • Stan Hoeppner
    Message 1 of 16, Jan 28, 2010
      Based on purely visual non-scientific observation (top), it seems my smtpd
      processes on my MX hang around much longer in (Debian) 2.5.5 than they did in
      (Debian) 2.3.8. In 2.3.8 Master seemed to build them and tear them down very
      quickly after the transaction was complete. An smtpd process' lifespan was
      usually 10 seconds or less on my 2.3.8. In 2.5.5 smtpd's seem to hang around
      for up to 30 secs to a minute.

      Local shows very speedy delivery. Is this "long" smtpd process lifespan normal
      for 2.5.5 or did I do something screwy/wrong in my config?

      relay=local, delay=2.2, delays=2.2/0/0/0.01, dsn=2.0.0, status=sent
      relay=local, delay=0.32, delays=0.29/0.02/0/0, dsn=2.0.0, status=sent
      relay=local, delay=0.77, delays=0.75/0.03/0/0, dsn=2.0.0, status=sent
      relay=local, delay=0.26, delays=0.25/0/0/0.01, dsn=2.0.0, status=sent
      relay=local, delay=0.64, delays=0.62/0.03/0/0, dsn=2.0.0, status=sent
      relay=local, delay=0.26, delays=0.25/0/0/0, dsn=2.0.0, status=sent
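
      Each delays=a/b/c/d field above breaks the total delay into stages
      (this decomposition is documented for Postfix 2.3 and later): a = time
      before the queue manager (including time in smtpd), b = time in the
      queue manager, c = connection setup, d = message transmission. A quick
      way to pull the breakdown out of a log (sample lines inlined here in
      place of a real mail log):

      ```shell
      # Extract the delays= breakdown from "status=sent" log lines.
      # The two sample lines stand in for real mail.log entries.
      printf '%s\n' \
        'relay=local, delay=2.2, delays=2.2/0/0/0.01, dsn=2.0.0, status=sent' \
        'relay=local, delay=0.32, delays=0.29/0.02/0/0, dsn=2.0.0, status=sent' |
      awk -F'delays=' '/status=sent/ { split($2, f, ","); print f[1] }'
      # prints: 2.2/0/0/0.01 and 0.29/0.02/0/0
      ```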

      --
      Stan
    • Stan Hoeppner
      Message 2 of 16, Jan 28, 2010
        Stan Hoeppner put forth on 1/29/2010 12:27 AM:
        > Based on purely visual non-scientific observation (top), it seems my smtpd
        > processes on my MX hang around much longer in (Debian) 2.5.5 than they did in
        > (Debian) 2.3.8. In 2.3.8 Master seemed to build them and tear them down very
        > quickly after the transaction was complete. An smtpd process' lifespan was
        > usually 10 seconds or less on my 2.3.8. In 2.5.5 smtpd's seem to hang around
        > for up to 30 secs to a minute.
        >
        > Local shows very speedy delivery. Is this "long" smtpd process lifespan normal
        > for 2.5.5 or did I do something screwy/wrong in my config?
        >
        > relay=local, delay=2.2, delays=2.2/0/0/0.01, dsn=2.0.0, status=sent
        > relay=local, delay=0.32, delays=0.29/0.02/0/0, dsn=2.0.0, status=sent
        > relay=local, delay=0.77, delays=0.75/0.03/0/0, dsn=2.0.0, status=sent
        > relay=local, delay=0.26, delays=0.25/0/0/0.01, dsn=2.0.0, status=sent
        > relay=local, delay=0.64, delays=0.62/0.03/0/0, dsn=2.0.0, status=sent
        > relay=local, delay=0.26, delays=0.25/0/0/0, dsn=2.0.0, status=sent

        I think I found it:

        max_idle = x

        The default is 100 on my system. I changed it to 10 and that seems to have had
        an effect.

        Did this setting exist in 2.3.8? I didn't see a version note next to max_idle
        in my 2.5.5 man smtpd. If so, was the default something insanely low like 1, or
        0? Like I said, smtpd's seemed to come and go in a hurry on 2.3.8.
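
        For reference, the knob in question is a single main.cf line; the
        value below is the one tried here, and the stock default is 100s
        (max_idle governs most Postfix daemon processes, not just smtpd):

        ```
        # /etc/postfix/main.cf
        max_idle = 10s
        ```

        followed by "postfix reload".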

        --
        Stan
      • Wietse Venema
        Message 3 of 16, Jan 29, 2010
          Stan Hoeppner:
          > Based on purely visual non-scientific observation (top), it seems my smtpd
          > processes on my MX hang around much longer in (Debian) 2.5.5 than they did in
          > (Debian) 2.3.8. In 2.3.8 Master seemed to build them and tear them down very

          Perhaps Debian changed this:
          http://www.postfix.org/postconf.5.html#max_idle

          The Postfix default is 100s.

          I don't really see why anyone would shorten this - that's a waste
          of CPU cycles. In particular, stopping Postfix daemons after 10s
          means that people don't have a clue about what they are doing.
          The fact that it's now increased to 30s confirms my suspicion.

          Technical correctness: the Postfix master does not terminate
          processes. Processes terminate voluntarily.

          Wietse
        • Noel Jones
          Message 4 of 16, Jan 29, 2010
            On 1/29/2010 1:37 AM, Stan Hoeppner wrote:
            > Stan Hoeppner put forth on 1/29/2010 12:27 AM:
            >> Based on purely visual non-scientific observation (top), it seems my smtpd
            >> processes on my MX hang around much longer in (Debian) 2.5.5 than they did in
            >> (Debian) 2.3.8. In 2.3.8 Master seemed to build them and tear them down very
            >> quickly after the transaction was complete. An smtpd process' lifespan was
            >> usually 10 seconds or less on my 2.3.8. In 2.5.5 smtpd's seem to hang around
            >> for up to 30 secs to a minute.
            >>
            >> Local shows very speedy delivery. Is this "long" smtpd process lifespan normal
            >> for 2.5.5 or did I do something screwy/wrong in my config?
            >>
            >> relay=local, delay=2.2, delays=2.2/0/0/0.01, dsn=2.0.0, status=sent
            >> relay=local, delay=0.32, delays=0.29/0.02/0/0, dsn=2.0.0, status=sent
            >> relay=local, delay=0.77, delays=0.75/0.03/0/0, dsn=2.0.0, status=sent
            >> relay=local, delay=0.26, delays=0.25/0/0/0.01, dsn=2.0.0, status=sent
            >> relay=local, delay=0.64, delays=0.62/0.03/0/0, dsn=2.0.0, status=sent
            >> relay=local, delay=0.26, delays=0.25/0/0/0, dsn=2.0.0, status=sent
            >
            > I think I found it:
            >
            > max_idle = x
            >
            > The default is 100 on my system. I changed it to 10 and that seems to have had
            > an effect.
            >
            > Did this setting exist in 2.3.8? I didn't see a version note next to max_idle
            > in my 2.5.5 man smtpd. If so, was the default something insanely low like 1, or
            > 0? Like I said, smtpd's seemed to come and go in a hurry on 2.3.8.
            >


            Nitpick: you talk about smtpd, then show log snips from smtp.
            But no matter, they both honor max_idle and will behave in a
            similar manner.

            The max_idle default has been 100s pretty much forever. The
            idea is that an idle postfix process will be reused to do more
            work rather than starting a new process every time. This
            makes postfix *far* more efficient than one process per job.

            Although the 100s default is somewhat arbitrary, I have
            trouble imagining a situation where a shorter max_idle makes
            sense. On a very lightly loaded system where processes are
            seldom reused, a shorter max_idle might not hurt anything, but
            it won't help anything either.

            -- Noel Jones
          • Stan Hoeppner
            Message 5 of 16, Jan 29, 2010
              Wietse Venema put forth on 1/29/2010 6:15 AM:
              > Stan Hoeppner:
              >> Based on purely visual non-scientific observation (top), it seems my smtpd
              >> processes on my MX hang around much longer in (Debian) 2.5.5 than they did in
              >> (Debian) 2.3.8. In 2.3.8 Master seemed to build them and tear them down very
              >
              > Perhaps Debian changed this:
              > http://www.postfix.org/postconf.5.html#max_idle
              >
              > The Postfix default is 100s.

              Yes, I confirmed this on my system.

              > I don't really see why anyone would shorten this - that's a waste
              > of CPU cycles. In particular, stopping Postfix daemons after 10s
              > means that people don't have a clue about what they are doing.
              > The fact that it's now increased to 30s confirms my suspicion.

              Think of a lightly loaded (smtp connects/min) vanity domain server that
              functions as a Postfix MX with local delivery, a Dovecot IMAP, a
              Lighty+Roundcube, a Samba server, and a dns resolver serving local requests and
              one remote workstation. The system is also used interactively (via SSH/BASH)
              for a number of things including an occasional kernel compile. The machine only
              has 384MB of RAM. My smtp load is low enough that having an smtpd process or
              two hanging around for 100 seconds just wastes 13-18MB of memory per smtpd for
              80-90 of those 100 seconds. This system regularly goes 5 minutes or more
              between smtp connects. Sometimes two come in simultaneously, and I end up with
              two smtpd processes hanging around for 100 seconds, eating over 30MB RAM with no
              benefit. Thus, for me, it makes more sense to have the smtpd's exit as soon as
              possible to free up memory that can be (better) used for something else. Yes, I
              guess I'm a maniac. ;)

              In this scenario, with very infrequent smtpd reuse, do you still think I should
              let them idle for 100 seconds, or at all? From my perspective, that 18-30MB+
              can often be better utilized during that time.

              --
              Stan
            • Wietse Venema
              Message 6 of 16, Jan 30, 2010
                Stan Hoeppner:
                > Wietse Venema put forth on 1/29/2010 6:15 AM:
                > > Stan Hoeppner:
                > >> Based on purely visual non-scientific observation (top), it seems my smtpd
                > >> processes on my MX hang around much longer in (Debian) 2.5.5 than they did in
                > >> (Debian) 2.3.8. In 2.3.8 Master seemed to build them and tear them down very
                > >
                > > Perhaps Debian changed this:
                > > http://www.postfix.org/postconf.5.html#max_idle
                > >
                > > The Postfix default is 100s.
                >
                > Yes, I confirmed this on my system.
                >
                > > I don't really see why anyone would shorten this - that's a waste
                > > of CPU cycles. In particular, stopping Postfix daemons after 10s

                Allow me to present a tutorial on Postfix and operating system basics.

                Postfix reuses processes for the same reasons that Apache does;
                however, Apache always runs a fixed minimum amount of daemons,
                whereas Postfix will dynamically shrink to zero smtpd processes
                over time.

                Therefore, people who believe that Postfix processes should not be
                running in the absence of client requests, should also terminate
                their Apache processes until a connection arrives. No-one does that.

                If people believe that each smtpd process uses 15MB of RAM, and
                that two smtpd processes use 30MB of RAM, then that would have been
                correct had Postfix been running on MS-DOS.

                First, the physical memory footprint of a process (called resident
                memory size) is smaller than the virtual memory footprint (which
                comprises all addressable memory including the executable, libraries,
                data, heap and stack). With FreeBSD 8.0 I see an smtpd VSZ/RSS of
                6.9MB/4.8MB; with Fedora Core 11, 4.2MB/1.8MB; and with FreeBSD
                4.1 it's 1.8MB/1.4MB. Ten years of system library bloat.

                Second, when multiple processes execute the same executable file
                and libraries, those processes will share a single memory copy of
                the code and constants of that executable file and libraries.
                Therefore, a large portion of their resident memory sizes will
                actually map onto the same physical memory pages. 15+15 != 30.

                Third, some code uses mmap() to allocate memory that is mapped from
                a file. This adds to the virtual memory footprint of each process,
                but of course only the pages that are actually accessed will add
                to the resident memory size. In the case of Postfix, this mechanism
                is used by Berkeley DB to allocate a 16MB shared-memory read buffer.

                There are some other tricks that allow for further savings (such
                as copy-on-write, which allows sharing of a memory page until a
                process attempts to write to it) but in the case of Postfix, those
                savings will be modest.
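
                VSZ/RSS pairs like the ones above can be read off with ps,
                which reports both sizes in KiB. Shown here against the
                current shell ($$) for the sake of a runnable example; on a
                mail host, substitute an smtpd PID (e.g. from "pgrep smtpd"):

                ```shell
                # Virtual (VSZ) vs. resident (RSS) size, in KiB, of one PID.
                # The -o keywords pid, vsz, rss and comm are POSIX-standard.
                ps -o pid,vsz,rss,comm -p "$$"
                ```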

                Wietse
              • Stan Hoeppner
                Message 7 of 16, Jan 30, 2010
                  Wietse Venema put forth on 1/30/2010 9:03 AM:

                  > Allow me to present a tutorial on Postfix and operating system basics.

                  Thank you Wietse. I'm always eager to learn. :)

                  > Postfix reuses processes for the same reasons that Apache does;
                  > however, Apache always runs a fixed minimum amount of daemons,
                  > whereas Postfix will dynamically shrink to zero smtpd processes
                  > over time.

                  Possibly not the best reference example, as I switched to Lighty mainly due to
                  the Apache behavior you describe, but also due to Apache resource hogging in
                  general. But I understand your point. It's better to keep one or two processes
                  resident to service the next inbound requests than to constantly tear down and
                  then rebuild processes, which causes significant overhead and performance issues
                  on busy systems.

                  > Therefore, people who believe that Postfix processes should not be
                  > running in the absence of client requests, should also terminate
                  > their Apache processes until a connection arrives. No-one does that.

                  Wouldn't that really depend on the purpose of the server? How about a web admin
                  daemon running on a small network device? I almost do this with Lighty
                  currently. I have a single daemon instance that handles all requests, max
                  processes=1. It's a very lightly loaded server, and a single instance is more
                  than enough. In fact, given the load, I might possibly look into running Lighty
                  from inetd, if possible, as I do Samba.

                  > If people believe that each smtpd process uses 15MB of RAM, and
                  > that two smtpd processes use 30MB of RAM, then that would have been
                  > correct had Postfix been running on MS-DOS.
                  >
                  > First, the physical memory footprint of a process (called resident
                  > memory size) is smaller than the virtual memory footprint (which
                  > comprises all addressable memory including the executable, libraries,
                  > data, heap and stack). With FreeBSD 8.0 I see an smtpd VSZ/RSS of
                  > 6.9MB/4.8MB; with Fedora Core 11, 4.2MB/1.8MB; and with FreeBSD
                  > 4.1 it's 1.8MB/1.4MB. Ten years of system library bloat.

                  Debian 5.0.3, kernel 2.6.31
                  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
                  29242 postfix 20 0 22408 18m 2268 S 0 4.9 0:00.58 smtpd
                  29251 postfix 20 0 17264 13m 2208 S 0 3.6 0:00.48 smtpd

                  > Second, when multiple processes execute the same executable file
                  > and libraries, those processes will share a single memory copy of
                  > the code and constants of that executable file and libraries.
                  > Therefore, a large portion of their resident memory sizes will
                  > actually map onto the same physical memory pages. 15+15 != 30.

                  I was of the understanding that top's SHR column described memory shareable with
                  other processes. In the real example above from earlier today, it would seem
                  that my two smtpd processes can only share ~2.2MB of code, data structures, etc.

                  man top:
                    t: SHR -- Shared Mem size (kb)
                       The amount of shared memory used by a task. It simply
                       reflects memory that could be potentially shared with
                       other processes.

                  Am I missing something, or reading my top output incorrectly?
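
                  One Linux-specific way to check (a sketch, demonstrated on
                  the current shell since it needs a live PID): top's SHR only
                  counts memory that *could* be shared, while
                  /proc/<pid>/smaps records what actually is. Shared_Clean
                  and Shared_Dirty lines count pages mapped by more than one
                  process, and Pss divides each shared page among its users.
                  Substitute an smtpd PID on the mail host:

                  ```shell
                  # Sum actual sharing and proportional set size from smaps
                  # (Linux only; all values are in KiB).
                  awk '/^Shared/ { shr += $2 } /^Pss:/ { pss += $2 }
                       END { printf "shared %d KiB, pss %d KiB\n", shr, pss }' \
                      "/proc/$$/smaps"
                  ```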

                  > Third, some code uses mmap() to allocate memory that is mapped from
                  > a file. This adds to the virtual memory footprint of each process,
                  > but of course only the pages that are actually accessed will add
                  > to the resident memory size. In the case of Postfix, this mechanism
                  > is used by Berkeley DB to allocate a 16MB shared-memory read buffer.

                  Is this 16MB buffer also used for hash and/or cidr tables, and is this
                  shareable? AFAIK I don't use Berkeley DB tables, only hash (small,few) and cidr
                  (very large, a handful).

                  > There are some other tricks that allow for further savings (such
                  > as copy-on-write, which allows sharing of a memory page until a
                  > process attempts to write to it) but in the case of Postfix, those
                  > savings will be modest.

                  I must be screwing something up somewhere then. According to my top output, I'm
                  only sharing ~2.2MB between smtpd processes, yet I've seen them occupy anywhere
                  from 11-18MB RES. If the top output is correct, there is a huge amount of
                  additional sharing that "should" be occurring, no?

                  Debian runs Postfix in a chroot by default. I know very little about chroot
                  environments. Could this have something to do with the tiny amount of shared
                  memory between the smtpds?

                  Thanks for taking interest in this Wietse. I'm sure I've probably done
                  something screwy that is easily fixable, and will get that shared memory count
                  up where it should be.

                  --
                  Stan
                • Wietse Venema
                  Message 8 of 16, Jan 30, 2010
                    Stan Hoeppner:
                    > AFAIK I don't use Berkeley DB tables, only hash (small,few) and cidr
                    > (very large, a handful).

                    hash (and btree) == Berkeley DB.

                    If you have big CIDR tables, you can save lots of memory by using
                    proxy:cidr: instead of cidr: (and running "postfix reload").
                    Effectively, this turns all that private memory into something that
                    can be shared via the proxy: protocol.
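
                    As a sketch (table path hypothetical), the switch is only
                    a prefix change plus authorizing the table for proxy
                    reads; the default proxy_read_maps entries should be kept
                    when extending the list:

                    ```
                    # main.cf -- change the lookup prefix:
                    #   before: check_client_access cidr:/etc/postfix/countries
                    #   after:  check_client_access proxy:cidr:/etc/postfix/countries
                    # and append the table to the (kept) default proxy_read_maps list:
                    #   proxy_read_maps = <default list> proxy:cidr:/etc/postfix/countries
                    # then run "postfix reload"
                    ```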

                    The current CIDR implementation is optimized to make it easy to
                    verify for correctness, and is optimized for speed when used with
                    limited lists of netblocks (mynetworks, unassigned address blocks,
                    reserved address blocks, etc.).

                    If you want to list large portions of Internet address space such
                    as entire countries the current implementation starts burning CPU
                    time (it examines all CIDR patterns in order; with a bit of extra
                    up-front work during initialization, address lookups could skip
                    over a lot of patterns, but the implementation would of course be
                    harder to verify for correctness), and it wastes 24 bytes per CIDR
                    rule when Postfix is compiled with IPv6 support (this roughly
                    doubles the amount of memory that is used by CIDR tables).
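
                    The per-rule waste is easy to bound. For a hypothetical
                    table of 12,000 CIDR rules:

                    ```shell
                    # 24 wasted bytes per rule across 12,000 rules:
                    echo "$((12000 * 24)) bytes"   # 288000 bytes, well under 1 MB
                    ```

                    so even a country-sized table costs far more CPU than
                    memory under this implementation.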

                    Wietse
                  • Stan Hoeppner
                    Message 9 of 16, Jan 30, 2010
                      Wietse Venema put forth on 1/30/2010 7:14 PM:
                      > Stan Hoeppner:
                      >> AFAIK I don't use Berkeley DB tables, only hash (small,few) and cidr
                      >> (very large, a handful).
                      >
                      > hash (and btree) == Berkeley DB.

                      Ahh, good to know. I'd thought only btree used Berkeley DB and that hash tables
                      used something else.

                      > If you have big CIDR tables, you can save lots of memory by using
                      > proxy:cidr: instead of cidr: (and running "postfix reload").
                      > Effectively, this turns all that private memory into something that
                      > can be shared via the proxy: protocol.

                      I implemented proxymap but it doesn't appear to have changed the memory
                      footprint of smtpd much at all, if any. I reloaded once, and restarted once
                      just in case.

                      PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
                      4554 postfix 20 0 20828 17m 2268 S 0 4.5 0:00.46 smtpd
                      4560 postfix 20 0 20036 16m 2268 S 0 4.3 0:00.47 smtpd
                      4555 postfix 20 0 6812 3056 1416 S 0 0.8 0:00.10 proxymap

                      > The current CIDR implementation is optimized to make it easy to
                      > verify for correctness, and is optimized for speed when used with
                      > limited lists of netblocks (mynetworks, unassigned address blocks,
                      > reserved address blocks, etc.).

                      Understood.

                      > If you want to list large portions of Internet address space such
                      > as entire countries the current implementation starts burning CPU
                      > time (it examines all CIDR patterns in order; with a bit of extra
                      > up-front work during initialization, address lookups could skip
                      > over a lot of patterns, but the implementation would of course be
                      > harder to verify for correctness), and it wastes 24 bytes per CIDR
                      > rule when Postfix is compiled with IPv6 support (this roughly
                      > doubles the amount of memory that is used by CIDR tables).

                      I don't really notice much CPU burn on any postfix processes with these largish
                      CIDRs, never have. I've got 12,212 CIDRs in 3 files, 11,148 of them in just the
                      "countries" file alone. After implementing proxymap, I'm not seeing much
                      reduction in smtpd RES size, maybe 1MB if that. SHR is almost identical to
                      before. If it's not the big tables bloating smtpd, I wonder what is? Or, have
                      I not implemented proxymap correctly? Following are the relevant parts of
                      my postconf -n and main.cf.

                      alias_maps = hash:/etc/aliases
                      append_dot_mydomain = no
                      biff = no
                      config_directory = /etc/postfix
                      disable_vrfy_command = yes
                      header_checks = pcre:/etc/postfix/header_checks
                      inet_interfaces = all
                      message_size_limit = 10240000
                      mime_header_checks = pcre:/etc/postfix/mime_header_checks
                      mydestination = hardwarefreak.com
                      myhostname = greer.hardwarefreak.com
                      mynetworks = 192.168.100.0/24
                      myorigin = hardwarefreak.com
                      parent_domain_matches_subdomains = debug_peer_list smtpd_access_maps
                      proxy_interfaces = 65.41.216.221
                      proxy_read_maps = $local_recipient_maps $mydestination $virtual_alias_maps
                      $virtual_alias_domains $virtual_mailbox_maps $virtual_mailbox_domains
                      $relay_recipient_maps $relay_domains $canonical_maps $sender_canonical_maps
                      $recipient_canonical_maps $relocated_maps $transport_maps $mynetworks
                      $sender_bcc_maps $recipient_bcc_maps $smtp_generic_maps $lmtp_generic_maps
                      proxy:${cidr}/countries proxy:${cidr}/spammer proxy:${cidr}/misc-spam-srcs
                      readme_directory = /usr/share/doc/postfix
                      recipient_bcc_maps = hash:/etc/postfix/recipient_bcc
                      relay_domains =
                      smtpd_banner = $myhostname ESMTP Postfix
                      smtpd_helo_required = yes
                      smtpd_recipient_restrictions = permit_mynetworks
                      reject_unauth_destination check_recipient_access
                      hash:/etc/postfix/whitelist check_sender_access hash:/etc/postfix/whitelist
                      check_client_access hash:/etc/postfix/whitelist check_client_access
                      hash:/etc/postfix/blacklist check_client_access
                      regexp:/etc/postfix/fqrdns.regexp check_client_access
                      pcre:/etc/postfix/ptr-tld.pcre check_client_access proxy:${cidr}/countries
                      check_client_access proxy:${cidr}/spammer check_client_access
                      proxy:${cidr}/misc-spam-srcs reject_unknown_reverse_client_hostname
                      reject_non_fqdn_sender reject_non_fqdn_helo_hostname
                      reject_invalid_helo_hostname reject_unknown_helo_hostname
                      reject_unlisted_recipient reject_rbl_client zen.spamhaus.org
                      check_policy_service inet:127.0.0.1:60000
                      strict_rfc821_envelopes = yes
                      virtual_alias_maps = hash:/etc/postfix/virtual

                      /etc/postfix/main.cf snippet

                      cidr=cidr:/etc/postfix/cidr_files

                      proxy_read_maps = $local_recipient_maps $mydestination $virtual_alias_maps
                      $virtual_alias_domains $virtual_mailbox_maps $virtual_mailbox_domains
                      $relay_recipient_maps $relay_domains $canonical_maps $sender_canonical_maps
                      $recipient_canonical_maps $relocated_maps $transport_maps $mynetworks
                      $sender_bcc_maps $recipient_bcc_maps $smtp_generic_maps $lmtp_generic_maps
                      proxy:${cidr}/countries proxy:${cidr}/spammer proxy:${cidr}/misc-spam-srcs

                      check_client_access proxy:${cidr}/countries
                      check_client_access proxy:${cidr}/spammer
                      check_client_access proxy:${cidr}/misc-spam-srcs

                      --
                      Stan
                    • Stan Hoeppner
                      Message 10 of 16, Jan 30, 2010
                        Sorry for top posting. Forgot to add something earlier: Proxymap seems to be
                        exiting on my system immediately after servicing requests. It does not seem to
                        be obeying $max_use or $max_idle which are both set to 100. It did this even
                        before I added cidr lists to proxymap a few hours ago. Before that, afaik, it
                        was only being called for local alias verification, and it exited immediately in
                        that case as well.

                        --
                        Stan


                        Stan Hoeppner put forth on 1/30/2010 11:13 PM:
                        > Wietse Venema put forth on 1/30/2010 7:14 PM:
                        >> Stan Hoeppner:
                        >>> AFAIK I don't use Berkeley DB tables, only hash (small,few) and cidr
                        >>> (very large, a handful).
                        >>
                        >> hash (and btree) == Berkeley DB.
                        >
                        > Ahh, good to know. I'd thought only btree used Berkeley DB and that hash tables
                        > used something else.
                        >
                        >> If you have big CIDR tables, you can save lots of memory by using
                        >> proxy:cidr: instead of cidr: (and running "postfix reload").
                        >> Effectively, this turns all that private memory into something that
                        >> can be shared via the proxy: protocol.
                        >
                        > I implemented proxymap but it doesn't appear to have changed the memory
                        > footprint of smtpd much at all, if any. I reloaded once, and restarted once
                        > just in case.
                        >
                        > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
                        > 4554 postfix 20 0 20828 17m 2268 S 0 4.5 0:00.46 smtpd
                        > 4560 postfix 20 0 20036 16m 2268 S 0 4.3 0:00.47 smtpd
                        > 4555 postfix 20 0 6812 3056 1416 S 0 0.8 0:00.10 proxymap
                        >
                        >> The current CIDR implementation is optimized to make it easy to
                        >> verify for correctness, and is optimized for speed when used with
                        >> limited lists of netblocks (mynetworks, unassigned address blocks,
                        >> reserved address blocks, etc.).
                        >
                        > Understood.
                        >
                        >> If you want to list large portions of Internet address space such
                        >> as entire countries the current implementation starts burning CPU
                        >> time (it examines all CIDR patterns in order; with a bit of extra
                        >> up-front work during initialization, address lookups could skip
                        >> over a lot of patterns, but the implementation would of course be
                        >> harder to verify for correctness), and it wastes 24 bytes per CIDR
                        >> rule when Postfix is compiled with IPv6 support (this roughly
                         >> doubles the amount of memory that is used by CIDR tables).
                        >
                        > I don't really notice much CPU burn on any postfix processes with these largish
                        > CIDRs, never have. I've got 12,212 CIDRs in 3 files, 11,148 of them in just the
                        > "countries" file alone. After implementing proxymap, I'm not seeing much
                        > reduction in smtpd RES size, maybe 1MB if that. SHR is almost identical to
                        > before. If it's not the big tables bloating smtpd, I wonder what is? Or, have
                         > I not implemented proxymap correctly? The relevant parts of my postconf -n
                         > output and main.cf follow.
                        >
                        > alias_maps = hash:/etc/aliases
                        > append_dot_mydomain = no
                        > biff = no
                        > config_directory = /etc/postfix
                        > disable_vrfy_command = yes
                        > header_checks = pcre:/etc/postfix/header_checks
                        > inet_interfaces = all
                        > message_size_limit = 10240000
                        > mime_header_checks = pcre:/etc/postfix/mime_header_checks
                        > mydestination = hardwarefreak.com
                        > myhostname = greer.hardwarefreak.com
                        > mynetworks = 192.168.100.0/24
                        > myorigin = hardwarefreak.com
                        > parent_domain_matches_subdomains = debug_peer_list smtpd_access_maps
                        > proxy_interfaces = 65.41.216.221
                        > proxy_read_maps = $local_recipient_maps $mydestination $virtual_alias_maps
                        > $virtual_alias_domains $virtual_mailbox_maps $virtual_mailbox_domains
                        > $relay_recipient_maps $relay_domains $canonical_maps $sender_canonical_maps
                        > $recipient_canonical_maps $relocated_maps $transport_maps $mynetworks
                        > $sender_bcc_maps $recipient_bcc_maps $smtp_generic_maps $lmtp_generic_maps
                        > proxy:${cidr}/countries proxy:${cidr}/spammer proxy:${cidr}/misc-spam-srcs
                        > readme_directory = /usr/share/doc/postfix
                        > recipient_bcc_maps = hash:/etc/postfix/recipient_bcc
                        > relay_domains =
                        > smtpd_banner = $myhostname ESMTP Postfix
                        > smtpd_helo_required = yes
                        > smtpd_recipient_restrictions = permit_mynetworks
                        > reject_unauth_destination check_recipient_access
                        > hash:/etc/postfix/whitelist check_sender_access hash:/etc/postfix/whitelist
                        > check_client_access hash:/etc/postfix/whitelist check_client_access
                        > hash:/etc/postfix/blacklist check_client_access
                        > regexp:/etc/postfix/fqrdns.regexp check_client_access
                        > pcre:/etc/postfix/ptr-tld.pcre check_client_access proxy:${cidr}/countries
                        > check_client_access proxy:${cidr}/spammer check_client_access
                        > proxy:${cidr}/misc-spam-srcs reject_unknown_reverse_client_hostname
                        > reject_non_fqdn_sender reject_non_fqdn_helo_hostname
                        > reject_invalid_helo_hostname reject_unknown_helo_hostname
                        > reject_unlisted_recipient reject_rbl_client zen.spamhaus.org
                        > check_policy_service inet:127.0.0.1:60000
                        > strict_rfc821_envelopes = yes
                        > virtual_alias_maps = hash:/etc/postfix/virtual
                        >
                        > /etc/postfix/main.cf snippet
                        >
                        > cidr=cidr:/etc/postfix/cidr_files
                        >
                        > proxy_read_maps = $local_recipient_maps $mydestination $virtual_alias_maps
                        > $virtual_alias_domains $virtual_mailbox_maps $virtual_mailbox_domains
                        > $relay_recipient_maps $relay_domains $canonical_maps $sender_canonical_maps
                        > $recipient_canonical_maps $relocated_maps $transport_maps $mynetworks
                        > $sender_bcc_maps $recipient_bcc_maps $smtp_generic_maps $lmtp_generic_maps
                        > proxy:${cidr}/countries proxy:${cidr}/spammer proxy:${cidr}/misc-spam-srcs
                        >
                        > check_client_access proxy:${cidr}/countries
                        > check_client_access proxy:${cidr}/spammer
                        > check_client_access proxy:${cidr}/misc-spam-srcs
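For context, each file referenced via the ${cidr} shorthand above is an ordinary cidr: lookup table in the format documented in cidr_table(5): one netblock pattern and an action per line. A minimal sketch (the netblocks here are illustrative RFC 5737 documentation ranges, not entries from the real lists):

```
# /etc/postfix/cidr_files/countries -- sketch only; the real file
# contains 11148 entries, one CIDR pattern and action per line
192.0.2.0/24      REJECT Mail not accepted from this network
198.51.100.0/24   REJECT Mail not accepted from this network
```

Lookups walk the patterns in order, which is why Wietse notes that very large lists start costing CPU time.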
                        >
                      • Stan Hoeppner
                        ... Making a little more progress on this, slowly. I'd forgotten that I have a regexp table that's rather large, containing 1626 expressions. I added it to
                        Message 11 of 16 , Jan 31, 2010
                          Stan Hoeppner put forth on 1/31/2010 12:04 AM:
                          > Sorry for top posting. Forgot to add something earlier: Proxymap seems to be
                          > exiting on my system immediately after servicing requests. It does not seem to
                          > be obeying $max_use or $max_idle which are both set to 100. It did this even
                          > before I added cidr lists to proxymap a few hours ago. Before that, afaik, it
                          > was only being called for local alias verification, and it exited immediately in
                          > that case as well.

                          Making a little more progress on this, slowly. I'd forgotten that I have a
                          regexp table that's rather large, containing 1626 expressions.

                          I added it to proxymap, and this action dropped the size of my smtpd processes
                          dramatically, by about a factor of 5. Apparently, even though this regexp table
                          has only 1626 lines, it requires far more memory than my big 'countries' cidr
                          table which has 11148 lines.

                          PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
                          14411 postfix 20 0 20276 16m 1480 S 8 4.3 0:00.51 proxymap
                          14410 postfix 20 0 6704 3368 2208 S 0 0.9 0:00.04 smtpd

                          This is making good progress. Seeing the smtpd's memory footprint drop so
                          dramatically is fantastic. However, I'm still curious as to why proxymap
                          doesn't appear to be honoring $max_idle or $max_use. Maybe my understanding of
                          $max_use is not correct? It's currently set to 100, the default. Watching top
                          while sending a test message through, I see proxymap launch but then exit within
                          5 seconds, while smtpd honors max_idle. Is there some other setting I need to
                          change to keep proxymap around longer?

                          --
                          Stan
                        • Wietse Venema
                          ... Short answer (workaround for low-traffic sites): set ipc_idle=$max_idle to approximate the expected behavior. This keeps the smtpd-to-proxymap connection
                          Message 12 of 16 , Jan 31, 2010
                            Stan Hoeppner:
                            > This is making good progress. Seeing the smtpd's memory footprint
                            > drop so dramatically is fantastic. However, I'm still curious as
                            > to why proxymap doesn't appear to be honoring $max_idle or $max_use.
                            > Maybe my understanding of $max_use is not correct? It's currently
                            > set to 100, the default. Watching top while sending a test message
                            > through, I see proxymap launch but then exit within 5 seconds,
                            > while smtpd honors max_idle. Is there some other setting I need
                            > to change to keep proxymap around longer?

                            Short answer (workaround for low-traffic sites): set ipc_idle=$max_idle
                            to approximate the expected behavior. This keeps the smtpd-to-proxymap
                            connection open for as long as smtpd runs. Then, proxymap won't
                            terminate before its clients terminate.
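Concretely, the workaround described above amounts to a single line in main.cf (ipc_idle and max_idle are stock Postfix parameters; both default to 100s):

```
# main.cf -- low-traffic workaround: keep the smtpd-to-proxymap
# connection open for as long as the smtpd process itself may live
ipc_idle = $max_idle
```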

                            Better: apply the long-term solution, in the form of the patch below.
                            This undoes the max_idle override (a workaround that I introduced
                            with Postfix 2.3). I already introduced the better solution with
                            Postfix 2.4 while solving a different problem.

                            Long answer: in ancient times, all Postfix daemons except qmgr
                            implemented the well-known max_idle=100s and max_use=100, as well
                            as the lesser-known ipc_idle=100s (see "short answer" for the effect
                            of that parameter).

                            While this worked fine for single-client servers such as smtpd, it
                            was not so great for multi-client servers such as proxymap or
                            trivial-rewrite. This problem was known, and the idea was that it
                            would be solved over time.

                            Theoretically, smtpd could run for up to $max_idle * $max_use = 3
                            hours, while proxymap and trivial-rewrite could run for up to
                            $max_idle * $max_use * $max_use = 12 days on low-traffic systems
                            (one SMTP client every 100s, or a little under 900 SMTP clients a
                            day), and it would run forever on systems with a steady mail flow.
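The worst-case lifetimes above follow from simple arithmetic; a sketch using the default max_idle=100s and max_use=100:

```python
# Worst-case process lifetimes under the default Postfix limits.
max_idle_s = 100  # seconds a process may wait for the next client
max_use = 100     # clients a process may serve before retiring

# Single-client server (smtpd): at most one idle wait per client served.
smtpd_s = max_idle_s * max_use
print(f"smtpd: up to {smtpd_s / 3600:.1f} hours")        # ~2.8 h ("3 hours")

# Multi-client server (proxymap, trivial-rewrite): it politely waits for
# its clients, each of which may itself live max_idle * max_use seconds.
proxymap_s = max_idle_s * max_use * max_use
print(f"proxymap: up to {proxymap_s / 86400:.1f} days")  # ~11.6 d ("12 days")
```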

                            This was a problem. The point of max_use is to limit the impact of
                            bugs such as memory or file handle leaks, by retiring a process
                            after doing a limited amount of work. I can test Postfix itself
                            with tools such as Purify and Valgrind, but I can't do those tests
                            with every version of everyone's system libraries.

                            If a proxymap or trivial-rewrite server can run for 11 days even
                            on systems with a minuscule load, then max_use isn't working as
                            intended.

                            The main cause is that the proxymap etc. clients reuse a connection
                            to improve efficiency. Therefore, the proxymap etc. server politely
                            waits until all its clients have disconnected before checking the
                            max_use counter. While this politeness thing can't be changed
                            easily, it is relatively easy to play with the proxymap etc. server's
                            max_idle value, and with the smtpd etc. ipc_ttl value.

                            Postfix 2.3 reduced the proxymap etc. max_idle to a fixed 1s value
                            to make those processes go away sooner when idle. I think that
                            this was a mistake, because it makes processes terminate too soon,
                            and thereby worsens the low-traffic behavior. Instead, we should
                            speed up the proxymap etc. server's max_use counter.

                            Postfix 2.4 reduced ipc_ttl to 5s. This was done for a different
                            purpose: to allow proxymap etc. clients to switch to the least-loaded
                            proxymap etc. server. But, I think that this was also the right way
                            to deal with long-lived proxymap etc. processes, because it speeds
                            up the proxymap etc. max_use counter.

                            The patch below keeps the reduced ipc_ttl from Postfix 2.4, and
                            removes the max_idle overrides from Postfix 2.3.

                            Wietse

                             *** ./src/proxymap/proxymap.c-  Thu Jan 10 09:03:55 2008
                             --- ./src/proxymap/proxymap.c   Sun Jan 31 10:52:50 2010
                             ***************
                             *** 594,605 ****
                                   myfree(saved_filter);
                               
                                   /*
                             -      * This process is called by clients that already enforce the max_idle
                             -      * time, so we don't have to do it another time.
                             -      */
                             -     var_idle_limit = 1;
                             - 
                             -     /*
                                    * Never, ever, get killed by a master signal, as that could corrupt a
                                    * persistent database when we're in the middle of an update.
                                    */
                             --- 594,599 ----
                             *** ./src/trivial-rewrite/trivial-rewrite.c-  Wed Dec 9 18:39:51 2009
                             --- ./src/trivial-rewrite/trivial-rewrite.c   Sun Jan 31 10:53:01 2010
                             ***************
                             *** 565,576 ****
                                   if (resolve_verify.transport_info)
                                       transport_post_init(resolve_verify.transport_info);
                                   check_table_stats(0, (char *) 0);
                             - 
                             -     /*
                             -      * This process is called by clients that already enforce the max_idle
                             -      * time, so we don't have to do it another time.
                             -      */
                             -     var_idle_limit = 1;
                               }
                               
                               MAIL_VERSION_STAMP_DECLARE;
                             --- 565,570 ----
                          • Stan Hoeppner
                             ... Wietse, thank you for the very thorough and thoughtful response. For a few reasons, including the fact I don't trust myself working with source in this
                            Message 13 of 16 , Jan 31, 2010
                              Wietse Venema put forth on 1/31/2010 10:38 AM:
                              > Stan Hoeppner:
                              >> This is making good progress. Seeing the smtpd's memory footprint
                              >> drop so dramatically is fantastic. However, I'm still curious as
                              >> to why proxymap doesn't appear to be honoring $max_idle or $max_use.
                              >> Maybe my understanding of $max_use is not correct? It's currently
                              >> set to 100, the default. Watching top while sending a test message
                              >> through, I see proxymap launch but then exit within 5 seconds,
                              >> while smtpd honors max_idle. Is there some other setting I need
                              >> to change to keep proxymap around longer?
                              >
                              > Short answer (workaround for low-traffic sites): set ipc_idle=$max_idle
                              > to approximate the expected behavior. This keeps the smtpd-to-proxymap
                              > connection open for as long as smtpd runs. Then, proxymap won't
                              > terminate before its clients terminate.

                              Wietse, thank you for the very thorough and thoughtful response. For a few
                              reasons, including the fact I don't trust myself working with source in this
                              case, and that I'd rather not throw monkey wrenches into my distro's package
                              management, I'm going to go with the short answer workaround above. All factors
                              being taken into account, I think it best fits my needs, skills, and usage profile.

                              > Better: apply the long-term solution, in the form of the patch below.
                              > This undoes the max_idle override (a workaround that I introduced
                              > with Postfix 2.3). I already introduced the better solution with
                              > Postfix 2.4 while solving a different problem.

                              I'm not sure if I fully understand this. I'm using 2.5.5, so shouldn't I
                              already have the 2.4 solution mentioned above? I must not be reading this
                              correctly.

                              > Long answer: in ancient times, all Postfix daemons except qmgr
                              > implemented the well-known max_idle=100s and max_use=100, as well
                              > as the lesser-known ipc_idle=100s (see "short answer" for the effect
                              > of that parameter).
                              >
                              > While this worked fine for single-client servers such as smtpd, it
                              > was not so great for multi-client servers such as proxymap or
                              > trivial-rewrite. This problem was known, and the idea was that it
                              > would be solved over time.
                              >
                              > Theoretically, smtpd could run for up to $max_idle * $max_use = 3
                              > hours, while proxymap and trivial-rewrite could run for up to
                              > $max_idle * $max_use * $max_use = 12 days on low-traffic systems
                              > (one SMTP client every 100s, or a little under 900 SMTP clients a
                              > day), and it would run forever on systems with a steady mail flow.
                              >
                              > This was a problem. The point of max_use is to limit the impact of
                              > bugs such as memory or file handle leaks, by retiring a process
                              > after doing a limited amount of work. I can test Postfix itself
                              > with tools such as Purify and Valgrind, but I can't do those tests
                              > with every version of everyone's system libraries.

                              This is a very smart design philosophy. Just one more reason I feel privileged
                              to use Postfix.

                              > If a proxymap or trivial-rewrite server can run for 11 days even
                              > on systems with a minuscule load, then max_use isn't working as
                              > intended.
                              >
                              > The main cause is that the proxymap etc. clients reuse a connection
                              > to improve efficiency. Therefore, the proxymap etc. server politely
                              > waits until all its clients have disconnected before checking the
                              > max_use counter. While this politeness thing can't be changed
                              > easily, it is relatively easy to play with the proxymap etc. server's
                              > max_idle value, and with the smtpd etc. ipc_ttl value.
                              >
                              > Postfix 2.3 reduced the proxymap etc. max_idle to a fixed 1s value
                              > to make those processes go away sooner when idle. I think that
                              > this was a mistake, because it makes processes terminate too soon,
                              > and thereby worsens the low-traffic behavior. Instead, we should
                              > speed up the proxymap etc. server's max_use counter.

                              > Postfix 2.4 reduced ipc_ttl to 5s. This was done for a different
                              > purpose: to allow proxymap etc. clients to switch to the least-loaded
                              > proxymap etc. server. But, I think that this was also the right way
                              > to deal with long-lived proxymap etc. processes, because it speeds
                              > up the proxymap etc. max_use counter.

                              Absolutely fascinating background information Wietse. Thank you for sharing
                              this. It's always nice to learn how/why some things work "under the hood";
                              things that often can't easily be found in any official documentation.

                              --
                              Stan
                            • Wietse Venema
                              ... The patch undoes the Postfix 2.3 change that is responsible for the shorter-than-expected proxymap lifetimes that you observed on low-traffic systems. With
                              Message 14 of 16 , Jan 31, 2010
                                Stan Hoeppner:
                                > > Better: apply the long-term solution, in the form of the patch below.
                                > > This undoes the max_idle override (a workaround that I introduced
                                > > with Postfix 2.3). I already introduced the better solution with
                                > > Postfix 2.4 while solving a different problem.
                                >
                                > I'm not sure if I fully understand this. I'm using 2.5.5, so shouldn't I
                                > already have the 2.4 solution mentioned above? I must not be reading this
                                > correctly.

                                The patch undoes the Postfix 2.3 change that is responsible for
                                the shorter-than-expected proxymap lifetimes that you observed
                                on low-traffic systems.

                                With that change backed out, the reduced ipc_idle change from
                                Postfix 2.4 will finally get a chance to fix the excessive lifetime
                                of proxymap and trivial-rewrite processes on high-traffic systems.

                                Wietse
                              • Stan Hoeppner
                                ... So, if I understand correctly, these changes made in 2.3 and 2.4 were to get more desirable behavior from proxymap and trivial-rewrite on high traffic
                                Message 15 of 16 , Jan 31, 2010
                                  Wietse Venema put forth on 1/31/2010 7:34 PM:
                                  > Stan Hoeppner:
                                  >>> Better: apply the long-term solution, in the form of the patch below.
                                  >>> This undoes the max_idle override (a workaround that I introduced
                                  >>> with Postfix 2.3). I already introduced the better solution with
                                  >>> Postfix 2.4 while solving a different problem.
                                  >>
                                  >> I'm not sure if I fully understand this. I'm using 2.5.5, so shouldn't I
                                  >> already have the 2.4 solution mentioned above? I must not be reading this
                                  >> correctly.
                                  >
                                  > The patch undoes the Postfix 2.3 change that is responsible for
                                  > the shorter-than-expected proxymap lifetimes that you observed
                                  > on low-traffic systems.
                                  >
                                  > With that change backed out, the reduced ipc_idle change from
                                  > Postfix 2.4 will finally get a chance to fix the excessive lifetime
                                  > of proxymap and trivial-rewrite processes on high-traffic systems.

                                   So, if I understand correctly, these changes made in 2.3 and 2.4 were to get
                                   more desirable behavior from proxymap and trivial-rewrite on high-traffic
                                   systems, which in turn caused this (very minor) problem on low-traffic systems?
                                   The patch resolves the low-traffic issue, basically reverting to the older code
                                   used before said 2.3 changes?

                                  And these changes have, through 2.7, given the desired behavior on high-traffic
                                  systems? Or no? Your statement "will finally get a chance to..." is future
                                  tense. Does this mean the desired behavior for high-traffic systems has not
                                  been seen to date? I apologize if this seems a stupid question. The future
                                  tense in your statement confuses me. If that _is_ what you mean, future tense,
                                  does this mean I have inadvertently played a tiny role in helping you identify a
                                  long standing problem/issue? ;)

                                  --
                                  Stan
                                • Stan Hoeppner
                                   ... Maybe I could have worded that more clearly Noel. Those snippets are from postfix/local not smtp. smtp doesn't normally relay to local, afaik. ;) I
                                  Message 16 of 16 , Jan 31, 2010
                                    Noel Jones put forth on 1/29/2010 8:44 AM:
                                    > On 1/29/2010 1:37 AM, Stan Hoeppner wrote:

                                    >>> Local shows very speedy delivery. Is this "long" smtpd process
                                    >>> lifespan normal
                                    >>> for 2.5.5 or did I do something screwy/wrong in my config?
                                    >>>
                                    >>> relay=local, delay=2.2, delays=2.2/0/0/0.01, dsn=2.0.0, status=sent
                                    >>> relay=local, delay=0.32, delays=0.29/0.02/0/0, dsn=2.0.0, status=sent
                                    >>> relay=local, delay=0.77, delays=0.75/0.03/0/0, dsn=2.0.0, status=sent
                                    >>> relay=local, delay=0.26, delays=0.25/0/0/0.01, dsn=2.0.0, status=sent
                                    >>> relay=local, delay=0.64, delays=0.62/0.03/0/0, dsn=2.0.0, status=sent
                                    >>> relay=local, delay=0.26, delays=0.25/0/0/0, dsn=2.0.0, status=sent

                                    > Nitpick: you talk about smtpd, then show log snips from smtp. But no
                                    > matter, they both honor max_idle and will behave in a similar manner.

                                     Maybe I could have worded that more clearly, Noel. Those snippets are from
                                     postfix/local, not smtp. smtp doesn't normally relay to local, afaik. ;) I
                                     included these snippets in an attempt to show that inbound delivery is very
                                     fast. Not understanding the smtpd process behavior at the time, wrt max_idle, I
                                     assumed that fast delivery would equal smtpd exiting quickly. smtpd doesn't log
                                     delays afaict, so I included the local information instead.

                                    My apologies for the confusion.

                                    --
                                    Stan