Re: Initial 220 greeting timeout

  • Wietse Venema
    Message 1 of 15 , Nov 20, 2012
      Alex:
      > Nov 19 20:39:03 mail01 postfix/smtpd[19820]: lost connection after
      > CONNECT from listserver.translateplanmulti.info[198.41.120.7]

      Your server is too slow, so that connections pile up in front of
      it.

      Find out where the bottleneck is by SYSTEMATICALLY MEASURING latency
      (not: manual "telnet to port 25" tests): name lookups, header checks,
      body checks, file system, CPU, memory, and so on. If you can't
      figure it out then hire a professional.

      Wietse
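Editor's sketch: one way to make "systematically measuring latency" concrete is to time the same operation repeatedly and compare the spread, rather than eyeballing a single telnet session. The helper names below (`measure_latency`, `summarize`) are hypothetical and not part of Postfix; this is only a minimal illustration, assuming the operation under test can be called from Python.

```python
import statistics
import time


def measure_latency(func, *args, repeats=5):
    """Call func(*args) `repeats` times; return wall-clock seconds per call."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        try:
            func(*args)
        except OSError:
            # A failed lookup still costs time, and that time is what we measure.
            pass
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples):
    """Reduce a list of timings to min / median / max, in seconds."""
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "max": max(samples),
    }
```

For name lookups, for example, `summarize(measure_latency(socket.gethostbyaddr, "198.41.120.7"))` (after `import socket`) would time reverse lookups of the client address from the log line above. Repeated calls may be answered from a resolver cache, so the first sample is often the most telling one.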
    • Stan Hoeppner
      Message 2 of 15 , Nov 21, 2012
        On 11/20/2012 6:06 AM, Wietse Venema wrote:
        > Alex:
        >> Nov 19 20:39:03 mail01 postfix/smtpd[19820]: lost connection after
        >> CONNECT from listserver.translateplanmulti.info[198.41.120.7]
        >
        > Your server is too slow, so that connections pile up in front of
        > it.

        translateplanmulti.info is a snowshoe domain, 198.41.120.0/24, and
        everyone should be blocking it. Postscreen by default stops bots, not
        snowshoe ratware. Alex's smtpds and amavis/SA may be getting hammered by
        snowshoe connections, though this one log entry alone doesn't prove it.

        > Find out where the bottleneck is by SYSTEMATICALLY MEASURING latency
        > (not: manual "telnet to port 25" tests): name lookups, header checks,
        > body checks, file system, CPU, memory, and so on. If you can't
        > figure it out then hire a professional.

        Before expending the effort on full bottleneck analysis, I'd recommend
        Alex should first concentrate on blocking more spam outright before it
        reaches his smtpds and thus his content filters. My logs back to the
        18th show the same snowshoe spammer, with BRBL blocking every connection:

        Nov 18 16:57:41 greer postfix/smtpd[9352]: NOQUEUE: reject: RCPT from
        listserver.translateplanmulti.info[198.41.120.7]: 554 5.7.1 Service
        unavailable; Client host [198.41.120.7] blocked using b.barracudacentral.org
        ....
        Nov 20 15:50:17 greer postfix/smtpd[19255]: NOQUEUE: reject: RCPT from
        openview.translateplanmulti.info[198.41.120.9]: 554 5.7.1 Service
        unavailable; Client host [198.41.120.9] blocked using b.barracudacentral.org

        Alex has BRBL in his postscreen config, but apparently his scoring setup
        is not triggering a rejection. Fixing that will stop this particular
        snowshoe spammer, and many others, from reaching his smtpds and content
        filters. Reducing that load will likely help this delay issue. This
        can also be done with a local block list such as a CIDR table for
        subnets, or an indexed table for domains, though this involves much more
        labor than getting DNSBLs to do the work for you.

        In addition, the Linux kernel SYN flood warning indicates Alex needs a
        firewall in front of this host, or a better configuration on it, or that
        he should implement packet filtering, or better packet filtering, on the
        host itself, via iptables rules.

        If due to policy most spam needs to go through the content filters for
        flagging, a 4x1TB 7.2K RAID5 setup may be insufficient to sink the load
        due to parity RAID's horrible random write throughput. A four disk
        RAID10 will typically yield 3x or more write throughput for a mail
        workload. This of course depends on your RAID controller, its cache
        size and configuration, etc. If you're using md/RAID5 your random IO
        will be pretty bad. You may want to analyze IO throughput and latency
        on your queue filesystem/device. If you're using EXTx for the queue
        filesystem, switching to XFS will yield a decent increase in queue
        throughput as well.

        --
        Stan
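Editor's sketch: the local block list idea above amounts to asking whether a client address falls inside a listed subnet. Postfix does this itself with a cidr: lookup table; the Python below only illustrates the matching logic. `BLOCKED_NETWORKS` and `is_blocked` are hypothetical names, and the single entry is the /24 named in this thread.

```python
import ipaddress

# Hypothetical local block list; the only entry is the /24 from this thread.
BLOCKED_NETWORKS = [ipaddress.ip_network("198.41.120.0/24")]


def is_blocked(client_ip, networks=BLOCKED_NETWORKS):
    """Return True if client_ip falls inside any network on the list."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in networks)
```

With this list, `is_blocked("198.41.120.7")` is true and an address in a neighbouring /24 is not, which is why one subnet entry covers every sending IP the spammer rotates through in that block.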
      • Alex
        Message 3 of 15 , Nov 21, 2012
          Hi,

          >>> Nov 19 20:39:03 mail01 postfix/smtpd[19820]: lost connection after
          >>> CONNECT from listserver.translateplanmulti.info[198.41.120.7]
          >>
          >> Your server is too slow, so that connections pile up in front of
          >> it.
          >
          > translateplanmulti.info is a snowshoe domain, 198.41.120.0/24, and
          > everyone should be blocking it. Postscreen by default stops bots, not
          > snowshoe ratware. Alex's smtpds and amavis/SA may be getting hammered by
          > snowshoe connections, though this one log entry alone doesn't prove it.

          I pulled the IPs out of the logs for these 'lost connection' errors
          over the last 24hrs, and it does appear that there are multiple IPs in
          the same network losing the connection. This also doesn't really prove
          much, but there are cases where there are dozens of consecutive 'lost
          connection' errors for a single IP.

          I'm sure by now it's in the PBL or SBL.

          > Before expending the effort on full bottleneck analysis, I'd recommend
          > Alex should first concentrate on blocking more spam outright before it
          > reaches his smtpds and thus his content filters. My logs back to the
          > 18th show the same snowshoe spammer, with BRBL blocking every connection:
          >
          > Nov 18 16:57:41 greer postfix/smtpd[9352]: NOQUEUE: reject: RCPT from
          > listserver.translateplanmulti.info[198.41.120.7]: 554 5.7.1 Service
          > unavailable; Client host [198.41.120.7] blocked using b.barracudacentral.org

          I searched for 198.41.120 and found hundreds of entries such as these:

          Nov 21 19:39:14 mail01 postfix/postscreen[9111]: PASS OLD [198.41.120.10]:47864

          They were later all tagged as spam, but it would definitely be nice to
          be blocking these outright with postscreen. I've now added an iptables
          rule manually, but I wish there was a way to build in some
          intelligence to automate it, such as with fail2ban.

          Are you suggesting I increase the weight of the BRBL with postscreen?

          postscreen_dnsbl_sites = myhost.zen.dq.spamhaus.net*2
          bl.spamcop.net*1 b.barracudacentral.org*1 psbl.surriel.com*1

          This IP is now listed in barracuda, but wasn't when it was received.
          It only hit the URIBL list.

          > filters. Reducing that load will likely help this delay issue. This
          > can also be done with a local block list such as a CIDR table for
          > subnets, or an indexed table for domains, though this involves much more
          > labor than getting DNSBLs to do the work for you.

          Yes, I wish I had the resources for this.

          > In addition, the Linux kernel SYN flood warning indicates Alex needs a
          > firewall in front of this host, or a better configuration on it, or that
          > he should implement packet filtering, or better packet filtering, on the
          > host itself, via iptables rules.

          It sounds like I should at least be doing some QoS work with tc to
          throttle the number of connections from a single IP. This looks like a
          good article to read through before a turkey dinner tomorrow:

          http://www.techrepublic.com/blog/10things/10-iptables-rules-to-help-secure-your-linux-box/539

          > If due to policy most spam needs to go through the content filters for
          > flagging, a 4x1TB 7.2K RAID5 setup may be insufficient to sink the load
          > due to parity RAID's horrible random write throughput.

          Yes, I definitely know that the disks in my configuration are a
          bottleneck. If I had to do it over I would use RAID10.

          There is some amount of CPU iowait, but the majority of the time the
          processors are more idle than fully utilized.

          Thanks so much for your help. Really appreciate it.
          Alex
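Editor's sketch: pulling the IPs out of the 'lost connection' log entries and grouping them by /24 can be scripted. The regular expression below is modelled on the smtpd log line quoted in this thread; other Postfix versions or syslog formats may need a different pattern, and the function name `count_lost_connections` is hypothetical.

```python
import ipaddress
import re
from collections import Counter

# Modelled on: "lost connection after CONNECT from host.example[192.0.2.1]"
LOST_RE = re.compile(
    r"lost connection after CONNECT from \S+\[(\d{1,3}(?:\.\d{1,3}){3})\]"
)


def count_lost_connections(lines):
    """Count 'lost connection after CONNECT' events per IP and per /24."""
    per_ip = Counter()
    per_net = Counter()
    for line in lines:
        match = LOST_RE.search(line)
        if not match:
            continue
        ip = match.group(1)
        per_ip[ip] += 1
        # strict=False lets a host address stand in for its enclosing /24.
        net = ipaddress.ip_network(f"{ip}/24", strict=False)
        per_net[str(net)] += 1
    return per_ip, per_net
```

Comparing the two counters shows whether the errors come from one noisy address or are spread thinly across a subnet; the second pattern is what a per-IP view tends to hide.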
        • Wietse Venema
          Message 4 of 15 , Nov 21, 2012
            Alex:
            > Nov 21 19:39:14 mail01 postfix/postscreen[9111]: PASS OLD [198.41.120.10]:47864
            >
            > They were later all tagged as spam, but it would definitely be nice to
            > be blocking these outright with postscreen. I've now added an iptables

            Postscreen "forgets" DNSBL lookup results after one hour. At least
            that has been the default for a long time. You can set it shorter
            if you think it would help, but it makes no sense to set it much
            shorter than the time between DNSBL updates.

            Wietse
          • Stan Hoeppner
            Message 5 of 15 , Nov 22, 2012
              On 11/21/2012 7:01 PM, Alex wrote:

              > I pulled the IPs out of the logs for these 'lost connection' errors
              > over the last 24hrs, and it does appear that there are multiple IPs in
              > the same network losing the connection. This also doesn't really prove
              > much, but there are cases where there are dozens of consecutive 'lost
              > connection' errors for a single IP.
              >
              > I'm sure by now it's in the PBL or SBL.

              This is a bad assumption. The PBL lists dynamics/etc, not snowshoe IPs.
              The SBL lists some snowshoe IPs, but you likely won't see a lot of
              overlap with other IP based DNSBLs. Snowshoe is notorious for avoiding
              traps that feed IP based DNSBLs.

              > I searched for 198.41.120 and found hundreds of entries such as these:
              >
              > Nov 21 19:39:14 mail01 postfix/postscreen[9111]: PASS OLD [198.41.120.10]:47864
              >
              > They were later all tagged as spam, but it would definitely be nice to
              > be blocking these outright with postscreen. I've now added an iptables
              > rule manually, but I wish there was a way to build in some
              > intelligence to automate it, such as with fail2ban.

              Unfortunately fail2ban doesn't work for snowshoe. The rate is
              intentionally low, which is why snowshoe avoids most trap driven DNSBLs
              as well.

              > Are you suggesting I increase the weight of the BRBL with postscreen?

              I don't use postscreen. I block outright in SMTPD on any DNSBL hit.
              I.e. I don't use weighting. With any of the reputable DNSBLs you should
              probably outright block, not score. So set postscreen weighting so any
              hit causes a rejection. If you are FP averse, simply duplicate your
              postscreen DNSBL config in SMTPD with 'WARN_IF_REJECT' and do a log
              comparison to see what additional clients would be rejected. If you're
              not seeing warnings on ham, go live.

              > postscreen_dnsbl_sites = myhost.zen.dq.spamhaus.net*2
              > bl.spamcop.net*1 b.barracudacentral.org*1 psbl.surriel.com*1
              >
              > This IP is now listed in barracuda, but wasn't when it was received.
              > It only hit the URIBL list.

              Since Nov 18 I've blocked every connection from this /24 with a
              combination of BRBL and DBL. The first attempt here from 198.41.120.10
              (which you reference above) at Nov 21 17:38:30 was blocked by DBL. The
              first attempt from this /24 was nailed by BRBL at Nov 18 16:57:41.
              postscreen doesn't handle domains, only IPs. Your main.cf parameters
              show you're not rejecting snowshoe domains via RHSBLs. You should be
              using something like this

              smtpd_recipient_restrictions =
              ...
              reject_rhsbl_reverse_client dbl.spamhaus.org
              reject_rhsbl_sender dbl.spamhaus.org
              reject_rhsbl_helo dbl.spamhaus.org
              ...

              And in fact you asked about DNSBLS in April 2010
              http://comments.gmane.org/gmane.mail.postfix.user/208344

              and were given all of this information then, by Ralf and myself. You
              can also use multi.uribl.com and multi.surbl.org here, requiring a total
              of 9 parameter entries.

              I just noticed you don't require HELO. So you need this as well:

              smtpd_helo_required = yes

              And in fact, your current HELO based restrictions are having no effect
              if clients don't send HELO/EHLO:

              check_helo_access pcre:/etc/postfix/helo_checks.pcre
              reject_invalid_helo_hostname

              > Yes, I wish I had the resources for this.

              Check out: http://dnsbl.invaluement.com/ivmsip/

              > It sounds like I should at least be doing some QoS work with tc to
              > throttle the number of connections from a single IP.

              Your kernel is shutting down the SYN flood, which is what that kernel
              message tells you, so Postfix isn't being affected, and it's probably
              not doing any harm to your system, performance or otherwise. I simply
              feel it would be better to shut this down at an upstream firewall.

              > Yes, I definitely know that the disks in my configuration are a
              > bottleneck. If I had to do it over I would use RAID10.

              For gateway/AS appliance queue duty with no local mailboxes you should
              seriously look at using two smallish good quality (Intel) SSDs mirrored
              (with a hot spare) with a low end hardware RAID controller (allows boot,
              /, and queue on the mirror device). It's cheaper and 100s of times
              faster than rust in RAID10, and you simply never have to worry about an
              IO bottleneck. Something like:

              http://www.newegg.com/Product/Product.aspx?Item=N82E16816103229
              http://www.newegg.com/Product/Product.aspx?Item=N82E16820167120

              > There is some amount of CPU iowait, but the majority of the time the
              > processors are more idle than fully utilized.

              They always will be mostly idle. You have 8 cores and 8GB RAM for an
              SMTP/filter workload. Such a system with a few SSDs or a very large
              spindle count rusty RAID and a GbE connection could handle many hundreds
              of msgs/sec, if configured properly. The performance key to the SMTP
              workload has always been, and always will be, random IO throughput
              to/from the queue directory and/or mailboxes.

              > Thanks so much for your help. Really appreciate it.

              Always glad to help "Alex". ;)

              --
              Stan
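Editor's sketch: the distinction drawn above between IP-based DNSBLs and domain-based RHSBLs shows up in the DNS name that gets queried. An IP list is asked about the client's octets in reverse order, while a domain list is asked about the domain itself. The function names below are hypothetical; the sketch only builds the query names and performs no lookups.

```python
import ipaddress


def dnsbl_query_name(client_ip, zone):
    """Build the name queried for an IPv4 client on an IP-based DNSBL."""
    addr = ipaddress.IPv4Address(client_ip)  # raises ValueError if not IPv4
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def rhsbl_query_name(domain, zone):
    """Build the name queried for a domain on a domain-based (RHS) list."""
    return f"{domain.rstrip('.').lower()}.{zone}"
```

For the client in this thread, the BRBL query name would be `7.120.41.198.b.barracudacentral.org` and the DBL query name `translateplanmulti.info.dbl.spamhaus.org`. By the usual DNSBL convention, a listed entry answers with an address in 127.0.0.0/8 and an unlisted one does not resolve.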
            • Jamie Paul Griffin
              Message 6 of 15 , Nov 22, 2012
                [ Stan Hoeppner Wrote On Thu 22.Nov'12 at 8:19:21 GMT ]

                > On 11/21/2012 7:01 PM, Alex wrote:
                >
                > > I pulled the IPs out of the logs for these 'lost connection' errors
                > > over the last 24hrs, and it does appear that there are multiple IPs in
                > > the same network losing the connection. This also doesn't really prove
                > > much, but there are cases where there are dozens of consecutive 'lost
                > > connection' errors for a single IP.
                > >
                > > I'm sure by now it's in the PBL or SBL.
                >
                > This is a bad assumption. The PBL lists dynamics/etc, not snowshoe IPs.
                > The SBL lists some snowshoe IPs, but you likely won't see a lot of
                > overlap with other IP based DNSBLs. Snowshoe is notorious for avoiding
                > traps that feed IP based DNSBLs.

                Hi Stan, can I ask: what is a "Snowshoe" domain?
              • Michael P. Demelbauer
                Message 7 of 15 , Nov 22, 2012
                  On Thu, Nov 22, 2012 at 08:24:16AM +0000, Jamie Paul Griffin wrote:
                  > Hi Stan, can I ask: what is a "Snowshoe" domain?

                  http://blog.wordtothewise.com/2009/10/spamhaus-vs-snowshoe-spammers/
                  --
                  Michael P. Demelbauer
                  Systemadministration
                  WSR
                  Arsenal, Objekt 20
                  1030 Wien
                  -------------------------------------------------------------------------------
                  Don't go around saying the world owes you a living.
                  The world owes you nothing. It was here first.
                  -- Mark Twain
                • Stan Hoeppner
                  Message 8 of 15 , Nov 22, 2012
                    On 11/22/2012 2:24 AM, Jamie Paul Griffin wrote:

                    > Hi Stan, can I ask: what is a "Snowshoe" domain?

                    It's a domain that wears these:

                    http://blog.mlive.com/outdoors_impact/2009/03/large_1snowshoe07.jpg

                    --
                    Stan
                  • Alex
                    Message 9 of 15 , Nov 22, 2012
                      Hi,

                      >> I'm sure by now it's in the PBL or SBL.
                      >
                      > This is a bad assumption. The PBL lists dynamics/etc, not snowshoe IPs.

                      Right, that makes sense. A spammer wouldn't have access to a
                      consecutive block of dynamic IPs, like from a cable company or
                      Verizon. It still could mean that it's listed in the PBL by now,
                      though.

                      >> They were later all tagged as spam, but it would definitely be nice to
                      >> be blocking these outright with postscreen. I've now added an iptables
                      >> rule manually, but I wish there was a way to build in some
                      >> intelligence to automate it, such as with fail2ban.
                      >
                      > Unfortunately fail2ban doesn't work for snowshoe. The rate is
                      > intentionally low, which is why snowshoe avoids most trap driven DNSBLs
                      > as well.

                      I have fail2ban working with dnsblog. It may not necessarily work for
                      snowshoe, but it works well for repeated attempts. Just to confirm my
                      understanding, dnsblog does the lookup and logging, then rejects based
                      on the policy, correct? So it wouldn't be necessary to filter on
                      postscreen entries because it's the same IP log info as with dnsblog?

                      >> Are you suggesting I increase the weight of the BRBL with postscreen?
                      >
                      > I don't use postscreen. I block outright in SMTPD on any DNSBL hit.
                      > I.e. I don't use weighting. With any of the reputable DNSBLs you should
                      > probably outright block, not score. So set postscreen weighting so any

                      Okay, I've set the postscreen threshold to 1, so any hit is a reject.
                      It's already dramatically increased the number of rejects.

                      I've also added the reject_rhsbl_reverse_client and other rhsbl
                      statements you've recommended. I decided not to bother with
                      warn_if_reject and trust the DNSBLs. I realize it's doing twice as
                      many DNS lookups for now. I'll also have to whitelist any false
                      positive IPs in multiple places for now too.

                      When I was working on this in 2010 (how the hell did you remember
                      that?), my system was so old that it not only didn't support
                      warn_if_reject, it didn't support any of the rhsbl statements in
                      smtpd_recipient_restrictions. It was certainly pre-2.0 release I was
                      using, so I wasn't able to implement any of the suggestions.

                      > smtpd_recipient_restrictions =
                      > ...
                      > reject_rhsbl_reverse_client dbl.spamhaus.org
                      > reject_rhsbl_sender dbl.spamhaus.org
                      > reject_rhsbl_helo dbl.spamhaus.org
                      > ...
                      >
                      > And in fact you asked about DNSBLS in April 2010
                      > http://comments.gmane.org/gmane.mail.postfix.user/208344
                      >
                      > and were given all of this information then, by Ralf and myself. You
                      > can also use multi.uribl.com and multi.surbl.org here, requiring a total
                      > of 9 parameter entries.

                      For now I've just added the spamhaus.org entries. I've added them
                      after reject_unknown_recipient_domain and before check_helo_access. Is
                      that correct?

                      How about barracuda? I'm currently using it with postscreen.

                      I think I like postscreen better than the rhsbl statements because of
                      the additional features of postscreen.

                      > I just noticed you don't require HELO. So you need this as well:
                      >
                      > smtpd_helo_required = yes
                      >
                      > And in fact, your current HELO based restrictions are having no effect
                      > if clients don't send HELO/EHLO:
                      >
                      > check_helo_access pcre:/etc/postfix/helo_checks.pcre
                      > reject_invalid_helo_hostname

                      Okay, awesome, I've added that. I didn't even think it was possible to
                      send mail without that.

                      Headed off for some turkey, so for now I'll just say thanks and great
                      advice about the SSD system. I'm definitely interested in building an
                      SSD system, and planned on doing that early next year, once I have the
                      resources from the customer.

                      Thanks again,
                      Alex
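Editor's sketch: the effect of lowering the postscreen threshold can be shown with a little arithmetic. Postscreen adds up the weights of the DNSBL sites that list a client and blocks when the total reaches postscreen_dnsbl_threshold. The weights below are copied from the postscreen_dnsbl_sites line quoted earlier in the thread; the earlier threshold of 2 used in the note after the code is an assumption, since the thread never states it.

```python
# Weights copied from the postscreen_dnsbl_sites line quoted in this thread.
SITE_WEIGHTS = {
    "myhost.zen.dq.spamhaus.net": 2,
    "bl.spamcop.net": 1,
    "b.barracudacentral.org": 1,
    "psbl.surriel.com": 1,
}


def dnsbl_score(listed_on, weights=SITE_WEIGHTS):
    """Sum the weights of the sites that currently list the client."""
    return sum(weights[site] for site in listed_on)


def would_reject(listed_on, threshold, weights=SITE_WEIGHTS):
    """True when the combined score reaches the (inclusive) threshold."""
    return dnsbl_score(listed_on, weights) >= threshold
```

A client listed only on BRBL scores 1, so `would_reject(["b.barracudacentral.org"], threshold=2)` is false under the assumed old setting, while the same call with `threshold=1` is true. That matches the jump in rejects reported above.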
                    • Stan Hoeppner
                      Message 10 of 15 , Nov 23, 2012
                        On 11/22/2012 12:39 PM, Alex wrote:
                        > Hi,
                        >
                        >>> I'm sure by now it's in the PBL or SBL.
                        >>
                        >> This is a bad assumption. The PBL lists dynamics/etc, not snowshoe IPs.
                        >
                        > Right, that makes sense. A spammer wouldn't have access to a
                        > consecutive block of dynamic IPs, like from a cable company or
                        > Verizon. It still could mean that it's listed in the PBL by now,
                        > though.

                        Again, the IP in question will never be listed in the PBL. SBL maybe,
                        PBL no. Might be time to brush up on Spamhaus' various lists and their
                        criteria.

                        >>> They were later all tagged as spam, but it would definitely be nice to
                        >>> be blocking these outright with postscreen. I've now added an iptables
                        >>> rule manually, but I wish there was a way to build in some
                        >>> intelligence to automate it, such as with fail2ban.
                        >>
                        >> Unfortunately fail2ban doesn't work for snowshoe. The rate is
                        >> intentionally low, which is why snowshoe avoids most trap driven DNSBLs
                        >> as well.
                        >
                        > I have fail2ban working with dnsblog. It may not necessarily work for
                        > snowshoe, but it works well for repeated attempts.

                        Fail2ban doesn't stop spam. It merely shifts the burden of rejection
                        from Postfix to the IP stack. And it won't work for snowshoe because
                        you're never going to detect snowshoe with Postscreen, or any Postfix
                        controls.

                        > Just to confirm my
                        > understanding, dnsblog does the lookup and logging, then rejects based
                        > on the policy, correct? So it wouldn't be necessary to filter on
                        > postscreen entries because it's the same IP log info as with dnsblog?

                        Someone else will need to answer this.

                        >>> Are you suggesting I increase the weight of the BRBL with postscreen?
                        >>
                        >> I don't use postscreen. I block outright in SMTPD on any DNSBL hit.
                        >> I.e. I don't use weighting. With any of the reputable DNSBLs you should
                        >> probably outright block, not score. So set postscreen weighting so any
                        >
                        > Okay, I've set the postscreen threshold to 1, so any hit is a reject.
                        > It's already dramatically increased the number of rejects.

                        And decreased the load on your content filters as well, I presume, and
                        likely decreased or eliminated your 220 delay issue.

                        > I've also added the reject_rhsbl_reverse_client and other rhsbl
                        > statements you've recommended. I decided not to bother with
                        > warn_if_reject and trust the DNSBLs. I realize it's doing twice as
                        > many DNS lookups for now.

                        You're using SA which makes all of these same DNSBL lookups. So you're
                        not doing any extra lookups, just doing them sooner in the cycle. If
                        mail reaches SA its lookups are now local to your resolver, which speeds
                        up SA as it doesn't have to wait for remote DNS server responses.

                        > I'll also have to whitelist any false
                        > positive IPs in multiple places for now too.

                        RHSBL restrictions trigger on domains, not IPs. Domains that end up on
                        domain based block lists or URI block lists rarely, if ever, send legit
                        mail.

                        > When I was working on this in 2010 (how the hell did you remember
                        > that?), my system was so old that it not only didn't support
                        > warn_if_reject, it didn't support any of the rhsbl statements in
                        > smtpd_recipient_restrictions. It was certainly pre-2.0 release I was
                        > using, so I wasn't able to implement any of the suggestions.

                        It's obviously always best to stay close to current.

                        >> smtpd_recipient_restrictions =
                        >> ...
                        >> reject_rhsbl_reverse_client dbl.spamhaus.org
                        >> reject_rhsbl_sender dbl.spamhaus.org
                        >> reject_rhsbl_helo dbl.spamhaus.org
                        >> ...
                        >>
                        >> And in fact you asked about DNSBLS in April 2010
                        >> http://comments.gmane.org/gmane.mail.postfix.user/208344
                        >>
                        >> and were given all of this information then, by Ralf and myself. You
                        >> can also use multi.uribl.com and multi.surbl.org here, requiring a total
                        >> of 9 parameter entries.
                        >
                        > For now I've just added the spamhaus.org entries. I've added them
                        > after reject_unknown_recipient_domain and before check_helo_access. Is
                        > that correct?

                        I always put my least expensive restrictions first and most expensive
                        last. So inbuilt Postfix checks go first, then local table checks, then
                        DNSBL lookups, then content filters.

                        > How about barracuda? I'm currently using it with postscreen.

                        Set it to reject outright (which I believe you already have) and you're
                        done.

                        > I think I like postscreen better than the rhsbl statements because of
                        > the additional features of postscreen.

                        Fuzzy dice hang'n on your mirror don't make the car go faster. If you
                        find that you *need* weighting of RHS domain rejection decisions due to
                        high FPs (which I doubt), then you can use postfwd or policyd for
                        weighting. Keep in mind policy servers are much slower than Postfix
                        smtpd restrictions, but faster than content filters. Thus it's always
                        best to reject with inbuilt Postfix restrictions if you can, on a busy
                        server.

                        >> I just noticed you don't require HELO. So you need this as well:
                        >>
                        >> smtpd_helo_required = yes
                        >>
                        >> And in fact, your current HELO based restrictions are having no effect
                        >> if clients don't send HELO/EHLO:
                        >>
                        >> check_helo_access pcre:/etc/postfix/helo_checks.pcre
                        >> reject_invalid_helo_hostname
                        >
                        > Okay, awesome, I've added that. I didn't even think it was possible to
                        > send mail without that.

                        I'm not sure if the latest SMTP RFC requires HELO/EHLO or not.
                        Regardless, you should enforce it or your HELO checks may be worthless.

                        > Headed off for some turkey,

                        Hope you didn't gain 10 pounds like many of us. :)

                        > so for now I'll just say thanks and great
                        > advice about the SSD system. I'm definitely interested in building an
                        > SSD system, and planned on doing that early next year, once I have the
                        > resources from the customer.

                        Prices should be a little lower by then as well, at least for the SSDs.
                        The RAID card prices may not move much. SSD simply makes so much
                        sense for a mail gateway. You never have to worry about a queue IO
                        bottleneck again.

                        > Thanks again,

                        Always glad to help.

                        --
                        Stan
                      • Alex
                        Stan, ... Yes, I didn't fully understand that dynamics aren't listed in the PBL. ... I thought the IP layer would be more efficient than filtering it at the
                        Message 11 of 15 , Nov 27, 2012
                          Stan,

                          >> Right, that makes sense. A spammer wouldn't have access to a
                          >> consecutive block of dynamic IPs, like from a cable company or
                          >> Verizon. It still could mean that it's listed in the PBL by now,
                          >> though.
                          >
                          > Again, the IP in question will never be listed in the PBL. SBL maybe,
                          > PBL no. Might be time to brush up on Spamhaus' various lists and their
                          > criteria.

                          Yes, I didn't fully understand that dynamics aren't listed in the PBL.

                          >> I have fail2ban working with dnsblog. It may not necessarily work for
                          >> snowshoe, but it works well for repeated attempts.
                          >
                          > Fail2ban doesn't stop spam. It merely shifts the burden of rejection
                          > from Postfix to the IP stack. And it won't work for snowshoe because
                          > you're never going to detect snowshoe with Postscreen, or any Postfix
                          > controls.

                          I thought the IP layer would be more efficient than filtering it at
                          the postfix application layer, and also would then not have to
                          specifically worry about whether it was part of a snowshoe botnet or a
                          single hacked IP.

                          >> Okay, I've set the postscreen threshold to 1, so any hit is a reject.
                          >> It's already dramatically increased the number of rejects.
                          >
                          > And decreased the load on your content filters as well, I presume, and
                          > likely decreased or eliminated your 220 delay issue.

                          Yes, I haven't seen any further indication of 220 delay issues. I also
                          still have a few too many header checks that I'll be purging.

                          >> I've also added the reject_rhsbl_reverse_client and other rhsbl
                          >> statements you've recommended. I decided not to bother with
                          >> warn_if_reject and trust the DNSBLs. I realize it's doing twice as
                          >> many DNS lookups for now.
                          >
                          > You're using SA which makes all of these same DNSBL lookups. So you're
                          > not doing any extra lookups, just doing them sooner in the cycle. If
                          > mail reaches SA its lookups are now local to your resolver, which speeds
                          > up SA as it doesn't have to wait for remote DNS server responses.

                          I meant that if I kept postscreen running, I would have the lookups there too.

                          >> I think I like postscreen better than the rhsbl statements because of
                          >> the additional features of postscreen.
                          >
                          > Fuzzy dice hang'n on your mirror don't make the car go faster. If you
                          > find that you *need* weighting of RHS domain rejection decisions due to
                          > high FPs (which I doubt), then you can use postfwd or policyd for
                          > weighting. Keep in mind policy servers are much slower than Postfix
                          > smtpd restrictions, but faster than content filters. Thus it's always
                          > best to reject with inbuilt Postfix restrictions if you can, on a busy
                          > server.

                          I meant that postscreen has extra functionality such as the protocol
                          tests before and after the 220 greeting.

                          > Prices should be a little lower by then as well, at least for the SSDs.
                          > The RAID card prices may not move much. SSD simply makes so much
                          > sense for a mail gateway. You never have to worry about a queue IO
                          > bottleneck again.

                          I've actually given more thought to doing this sooner on the secondary
                          box. However, I can't find any 3.5" SSD SATA disks that will fit in my
                          existing 1U SATA chassis. Any ideas? Here's the newegg link you sent:

                          http://www.newegg.com/Product/Product.aspx?Item=N82E16820167120

                          I thought I could use the existing SATA controller on board with the
                          Linux md RAID5.

                          Thanks,
                          Alex
                        • Stan Hoeppner
                          ... Dynamics are 95+% of what's listed in the PBL: I am an ISP. I don't want my dynamic/dial-up users sending spam via infection. I contact Spamhaus and say
                          Message 12 of 15 , Nov 27, 2012
                            On 11/27/2012 2:51 PM, Alex wrote:
                            > Stan,
                            >
                            >>> Right, that makes sense. A spammer wouldn't have access to a
                            >>> consecutive block of dynamic IPs, like from a cable company or
                            >>> Verizon. It still could mean that it's listed in the PBL by now,
                            >>> though.
                            >>
                            >> Again, the IP in question will never be listed in the PBL. SBL maybe,
                            >> PBL no. Might be time to brush up on Spamhaus' various lists and their
                            >> criteria.
                            >
                            > Yes, I didn't fully understand that dynamics aren't listed in the PBL.

                            Dynamics are 95+% of what's listed in the PBL:

                            I am an ISP. I don't want my dynamic/dial-up users sending spam via
                            infection. I contact Spamhaus and say "Please add this /16 to the PBL.
                            These IPs should never send direct SMTP mail." This is why it's called
                            the "Policy Block List". It's network owner policy that decides what is
                            listed. In some cases Spamhaus adds entries without being contacted by
                            network operators when it is clear a network is dynamic and bot spam is
                            spewing.

                            The SBL is where you'll find snowshoe IPs listed by Spamhaus.
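
As a rough illustration of how one Zen lookup distinguishes these lists, here is a small Python sketch. It builds the query name (reversed octets plus the zone) and maps the 127.0.0.x answer to a component list. The code-to-list mapping reflects the Spamhaus return codes as commonly documented; treat it as an assumption and check the current Spamhaus documentation before relying on it. Both function names are hypothetical, and no DNS query is actually sent.

```python
import ipaddress

def zen_query_name(ip, zone="zen.spamhaus.org"):
    """Build the DNSBL query name: reversed IPv4 octets followed by the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone

def classify_zen_answer(answer):
    """Map a 127.0.0.x Zen answer to its component list (assumed code ranges)."""
    last = int(answer.split(".")[-1])
    if last in (2, 3):
        return "SBL"    # spam sources, including snowshoe ranges
    if 4 <= last <= 7:
        return "XBL"    # exploited or infected hosts
    if last in (10, 11):
        return "PBL"    # policy listings: ranges that should not send direct SMTP
    return "other"      # any code this sketch does not cover

print(zen_query_name("198.41.120.7"))
print(classify_zen_answer("127.0.0.10"))
```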

                            >>> I have fail2ban working with dnsblog. It may not necessarily work for
                            >>> snowshoe, but it works well for repeated attempts.
                            >>
                            >> Fail2ban doesn't stop spam. It merely shifts the burden of rejection
                            >> from Postfix to the IP stack. And it won't work for snowshoe because
                            >> you're never going to detect snowshoe with Postscreen, or any Postfix
                            >> controls.
                            >
                            > I thought the IP layer would be more efficient than filtering it at
                            > the postfix application layer, and also would then not have to
                            > specifically worry about whether it was part of a snowshoe botnet or a
                            > single hacked IP.

                            Dropping SMTP packets should be done with care. If you FP on an email
                            to the CEO and he comes asking, giving you a sender address, you have no
                            way to track it down in logs. The CPU overhead of rejecting with
                            Postscreen or smtpd is absolutely tiny, especially compared to a run
                            through SA, and in the big picture is not much more than dropping
                            packets. Postfix rejections occur in a few microseconds to milliseconds
                            on modern hardware.

                            If you're going to drop packets make sure you know the IP sends nothing
                            but spam. Case in point, many SOHOs and small biz have their Exchange
                            server and PCs behind the same NAT'd IP and don't do egress filtering on
                            TCP 25. If they get a bot infection and you autoban the NAT'd IP you're
                            now killing all their legit mail. Again, do SMTP packet dropping with
                            care. This means doing it manually for known bad hosts/networks. Or,
                            don't do it at all.

                            >>> Okay, I've set the postscreen threshold to 1, so any hit is a reject.
                            >>> It's already dramatically increased the number of rejects.
                            >>
                            >> And decreased the load on your content filters as well, I presume, and
                            >> likely decreased or eliminated your 220 delay issue.
                            >
                            > Yes, I haven't seen any further indication of 220 delay issues. I also
                            > still have a few too many header checks that I'll be purging.

                            Header checks aren't usually an overhead problem unless you have many
                            hundreds or thousands of them. Body checks are more of a concern.
                            Neither eats CPU anything like SA.

                            >>> I've also added the reject_rhsbl_reverse_client and other rhsbl
                            >>> statements you've recommended. I decided not to bother with
                            >>> warn_if_reject and trust the DNSBLs. I realize it's doing twice as
                            >>> many DNS lookups for now.
                            >>
                            >> You're using SA which makes all of these same DNSBL lookups. So you're
                            >> not doing any extra lookups, just doing them sooner in the cycle. If
                            >> mail reaches SA its lookups are now local to your resolver, which speeds
                            >> up SA as it doesn't have to wait for remote DNS server responses.
                            >
                            > I meant that if I kept postscreen running, I would have the lookups there too.

                            Again, adding reject_rhsbl_foo_client adds no additional queries because
                            you are using Spamassassin, which also performs these same queries. The
                            dnsbl queries you are doing in Postscreen are also duplicated by SA. So
                            again, you have no additional net queries.

                            >>> I think I like postscreen better than the rhsbl statements because of
                            >>> the additional features of postscreen.
                            >>
                            >> Fuzzy dice hang'n on your mirror don't make the car go faster. If you
                            >> find that you *need* weighting of RHS domain rejection decisions due to
                            >> high FPs (which I doubt), then you can use postfwd or policyd for
                            >> weighting. Keep in mind policy servers are much slower than Postfix
                            >> smtpd restrictions, but faster than content filters. Thus it's always
                            >> best to reject with inbuilt Postfix restrictions if you can, on a busy
                            >> server.
                            >
                            > I meant that postscreen has extra functionality such as the protocol
                            > tests before and after the 220 greeting.

                            You seem to be mistakenly looking at these sets of Postfix features as
                            competitive instead of cooperative. They are a layered defense, and
                            Postscreen is the point man. The features you mention here are
                            zombie/bot specific, and thus only exist in Postscreen. RHSBL tests
                            only appear in SMTPD. There is some feature overlap in Postscreen due
                            to user demand.

                            I think it would greatly benefit you to actually read the Postscreen
                            documentation which explains all of this in detail:
                            http://www.postfix.org/POSTSCREEN_README.html

                            >> Prices should be a little lower by then as well, at least for the SSDs.
                            >> The RAID card prices may not move much. SSD simply makes so much
                            >> sense for a mail gateway. You never have to worry about a queue IO
                            >> bottleneck again.
                            >
                            > I've actually given more thought to doing this sooner on the secondary
                            > box. However, I can't find any 3.5" SSD SATA disks that will fit in my
                            > existing 1U SATA chassis. Any ideas? Here's the newegg link you sent:

                            Yes. You use the included 2.5 to 3.5 adapter clearly shown in the pictures.

                            > http://www.newegg.com/Product/Product.aspx?Item=N82E16820167120
                            >
                            > I thought I could use the existing SATA controller on board with the
                            > Linux md RAID5.

                            Using the onboard Southbridge SATA controller and software vs hardware
                            RAID is up to you. I prefer hardware RAID solutions for many reasons
                            that have been covered by myself and others many times. Either way your
                            desire to use RAID5 baffles me, and likely most others here. Parity
                            RAID has ZERO upside and a long list of downsides for transactional
                            workloads. A single SSD gives you the IOPS and capacity you need. A
                            mirror adds redundancy.

                            Using RAID5 simply adds the cost of a 3rd drive, slows down your IO due
                            to RMW cycles, increases initialization and rebuild time, etc, etc. And
                            with md/RAID5, a sufficiently high IO rate on only 3 SSDs can eat one
                            entire CPU core in overhead because the writer is a single kernel thread
                            (patches to fix this threading bottleneck have been submitted but won't
                            be available in distro kernels for many months to years).

                            Whether you do hardware or software RAID, do yourself a favor and use 2
                            mirrored SSDs. You can't need more than 60GB for a queue device, if even
                            1/3rd of that. And you'll avoid many potential headaches of RAID5.
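
The RMW penalty can be put into numbers with the usual textbook model for small random writes: a mirror issues 2 device writes per logical write, while RAID5 issues 4 device I/Os (read old data, read old parity, write new data, write new parity). The Python sketch below applies that model; the per-drive IOPS figure is an invented round number, not a benchmark, and the model ignores caches, full-stripe writes, and CPU overhead.

```python
def backend_ios_per_small_write(level):
    """Device I/Os needed for one small random write in the textbook model."""
    if level == "raid1":
        return 2  # write the block to both halves of the mirror
    if level == "raid5":
        return 4  # read old data + old parity, write new data + new parity
    raise ValueError("unsupported RAID level: " + level)

def effective_write_iops(level, drives, iops_per_drive):
    """Aggregate small-write IOPS the array can sustain under this model."""
    return drives * iops_per_drive // backend_ios_per_small_write(level)

IOPS_PER_SSD = 20000  # hypothetical figure for illustration only

print("2-drive RAID1:", effective_write_iops("raid1", 2, IOPS_PER_SSD))
print("3-drive RAID5:", effective_write_iops("raid5", 3, IOPS_PER_SSD))
```

Under these assumptions the 3-drive RAID5 sustains fewer small writes than the 2-drive mirror, despite the extra drive.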

                            --
                            Stan
                          • Alex
                            Hi, ... Thanks for the explanation. Trying to do too many things at once. You probably think I'm an idiot by now. ... I meant as it relates to blocking by
                            Message 13 of 15 , Dec 2, 2012
                              Hi,

                              >>> Again, the IP in question will never be listed in the PBL. SBL maybe,
                              >>> PBL no. Might be time to brush up on Spamhaus' various lists and their
                              >>> criteria.
                              >>
                              >> Yes, I didn't fully understand that dynamics aren't listed in the PBL.
                              ...
                              > The SBL is where you'll find snowshoe IPs listed by Spamhaus.

                              Thanks for the explanation. Trying to do too many things at once. You
                              probably think I'm an idiot by now.

                              > Dropping SMTP packets should be done with care. If you FP on an email
                              > to the CEO and he comes asking, giving you a sender address, you have no

                              I meant as it relates to blocking by spamhaus or barracuda. From an FP
                              perspective, that doesn't affect it either way. Perhaps the audit
                              trail is a little better with just letting postscreen continue to
                              block it rather than fail2ban, however.

                              > If you're going to drop packets make sure you know the IP sends nothing
                              > but spam. Case in point, many SOHOs and small biz have their Exchange

                              Yes, I'm only using it on IPs that have already been confirmed to be
                              blacklisted, of course.

                              > care. This means doing it manually for known bad hosts/networks. Or,
                              > don't do it at all.

                              I've got stories to tell there. Another day, though.

                              > Header checks aren't usually an overhead problem unless you have many
                              > hundreds or thousands of them. Body checks are more of a concern.
                              > Neither eats CPU anything like SA.

                              I'd say I had a few hundred. Dumped some of the old ones, and had them
                              there instead of SA for just that reason.

                              >> I meant that if I kept postscreen running, I would have the lookups there too.
                              >
                              > Again, adding reject_rhsbl_foo_client adds no additional queries because
                              > you are using Spamassassin, which also performs these same queries. The
                              > dnsbl queries you are doing in Postscreen are also duplicated by SA. So
                              > again, you have no additional net queries.

                              Assuming the rhsbl fails in postfix, then SA processes it, isn't
                              that two queries for the same rhsbl entry?

                              > I think it would greatly benefit you to actually read the Postscreen
                              > documentation which explains all of this in detail:
                              > http://www.postfix.org/POSTSCREEN_README.html

                              I've read it again, but my confusion was with reading about
                              reject_rhsbl statements, and forgetting they're domain, not IP based.

                              >> I've actually given more thought to doing this sooner on the secondary
                              >> box. However, I can't find any 3.5" SSD SATA disks that will fit in my
                              >> existing 1U SATA chassis. Any ideas? Here's the newegg link you sent:
                              >
                              > Yes. You use the included 2.5 to 3.5 adapter clearly shown in the pictures.

                              Unfortunately it's not that easy. You would think it would be that
                              easy, but somehow I knew there would be complications. I investigated
                              this, and it's just a tray to make it fit in a 3.5" bay, such as what
                              you'd find in a desktop PC.

                              Looking at the picture, it just didn't seem right. I actually called
                              Intel, and I explained to them I had a 1U chassis and needed to be
                              able to put this 2.5" disk in the tray where normally a 3.5" disk is
                              used. He told me what you thought -- that the tray would work for
                              that, even though I knew there was no way it would.

                              I ordered two of the 60GB 520 series disks instead of the ones you
                              mentioned -- better warranty and faster. They arrived on Friday, and
                              sure enough, it's just a metal frame to put it in a desktop, not a 1U
                              chassis.

                              So, considering they generate no heat and take up no space, I'm
                              thinking of using Velcro inside the case. We'll see how that goes.

                              > Using the onboard Southbridge SATA controller and software vs hardware
                              > RAID is up to you. I prefer hardware RAID solutions for many reasons
                              > that have been covered by myself and others many times. Either way your

                              I would never use the onboard SATA controller. That's crap.

                              I have great faith in Neil Brown and his md code :-) I've lost a few
                              software arrays over the years, but have always found them reliable
                              and better supported in Linux. I've also used the battery-backed
                              hardware RAID in the past, which is nice too.

                              > Using RAID5 simply adds the cost of a 3rd drive, slows down your IO due
                              > to RMW cycles, increases initialization and rebuild time, etc, etc. And
                              > with md/RAID5, a sufficiently high IO rate on only 3 SSDs can eat one
                              > entire CPU core in overhead because the writer is a single kernel thread

                              Yes, there's no debating an SSD would be preferred in all situations.
                              When this was built, we used four SATA3 disks with 64MB cache and
                              RAID5 because the system was so fast already that the extra expense
                              wasn't necessary.

                              I also wasn't as familiar and hadn't as extensively tested the RAID10
                              config, but will sure use it for the mailstore system I'm building
                              next.

                              Thanks again,
                              Alex
                            • Stan Hoeppner
                              ... You're welcome. I understand that completely. No, not at all. ... Yeah, with fail2ban you don't really have no audit trail at all. Keep in mind that
                              Message 14 of 15 , Dec 3, 2012
                                On 12/2/2012 1:20 PM, Alex wrote:

                                > Thanks for the explanation. Trying to do too many things at once. You
                                > probably think I'm an idiot by now.

                                You're welcome. I understand that completely. No, not at all.

                                >> Dropping SMTP packets should be done with care. If you FP on an email
                                >> to the CEO and he comes asking, giving you a sender address, you have no
                                >
                                > I meant as it relates to blocking by spamhaus or barracuda. From an FP
                                > perspective, that doesn't affect it either way. Perhaps the audit
                                > trail is a little better with just letting postscreen continue to
                                > block it rather than fail2ban, however.

                                Yeah, with fail2ban you don't really have any audit trail at all. Keep
                                in mind that legit senders can wind up on trap driven lists. Both Zen
                                and BRBL expire listings when the IPs are no longer emitting for X
                                period of time. So if you feed DNSBL rejected IPs to it you should set
                                some kind of expiry period.
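
The expiry idea can be sketched in a few lines of Python. This is not how fail2ban is implemented; `ExpiringBanList` is a hypothetical class that only shows the principle of forgetting a ban after a fixed period, so that an IP which has since been delisted is not blocked forever. The one-hour TTL and the 192.0.2.1 documentation address are arbitrary example values.

```python
import time

class ExpiringBanList:
    """Remember banned IPs, but only for ttl_seconds after each ban."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._expiry = {}  # ip -> time (epoch seconds) at which the ban lapses

    def ban(self, ip, now=None):
        now = time.time() if now is None else now
        self._expiry[ip] = now + self.ttl

    def is_banned(self, ip, now=None):
        now = time.time() if now is None else now
        expiry = self._expiry.get(ip)
        if expiry is None:
            return False
        if now >= expiry:
            del self._expiry[ip]  # ban has aged out; forget the IP
            return False
        return True

bans = ExpiringBanList(ttl_seconds=3600)
bans.ban("192.0.2.1")
print(bans.is_banned("192.0.2.1"))
```

The optional `now` argument exists only so the expiry logic can be exercised without waiting for real time to pass.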

                                > Yes, I'm only using it on IPs that have already been confirmed to be
                                > blacklisted, of course.

                                See above.

                                > I'd say I had a few hundred. Dumped some of the old ones, and had them
                                > there instead of SA for just that reason.

                                This is a double edged sword: if you hit with header_checks you save SA
                                CPU time. If you miss you've added extra CPU time on top of SA. But
                                given you have a 4 core machine, IIRC, CPU shouldn't be much concern,
                                even with the relatively high msg rate you have.
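
The trade-off can be written as simple arithmetic. In the Python sketch below, every message pays the header_checks cost, and only the messages that are not rejected there go on to pay the SA cost. The cost units and the sample values are invented for illustration; only the shape of the calculation matters.

```python
def expected_cost(header_cost, sa_cost, hit_rate):
    """Average per-message CPU cost when header_checks run in front of SA."""
    return header_cost + (1.0 - hit_rate) * sa_cost

def break_even_hit_rate(header_cost, sa_cost):
    """Hit rate above which running header_checks first is a net saving."""
    return header_cost / sa_cost

HEADER_COST = 1.0   # hypothetical cost units per message
SA_COST = 100.0     # hypothetical cost units per message

print(expected_cost(HEADER_COST, SA_COST, hit_rate=0.0))  # all misses: pure overhead
print(expected_cost(HEADER_COST, SA_COST, hit_rate=0.5))  # half rejected before SA
print(break_even_hit_rate(HEADER_COST, SA_COST))
```

With these sample numbers the header checks pay for themselves as soon as they reject more than about one message in a hundred.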

                                > Assuming the rhsbl fails in postfix, then SA processes it, isn't
                                > that two queries for the same rhsbl entry?

                                When I say "query" I'm talking about those that come with cost, i.e.
                                external to your network, with latency. The 2nd DNS query in this case,
                                SA, is answered by your local caching resolver, thus no cost.
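
A toy model of this in Python: the first lookup for a name goes upstream (the costly part), and the repeat lookup is served from the local cache. `CachingResolver` and `fake_upstream` are hypothetical stand-ins, the queried name is a made-up example, and the sketch ignores TTLs and negative caching, which a real caching resolver handles.

```python
class CachingResolver:
    """Answer repeated lookups locally; count how many go upstream."""

    def __init__(self, upstream):
        self.upstream = upstream      # callable: name -> answer (the remote query)
        self.cache = {}
        self.upstream_queries = 0

    def lookup(self, name):
        if name not in self.cache:
            self.upstream_queries += 1          # only a cache miss costs latency
            self.cache[name] = self.upstream(name)
        return self.cache[name]

def fake_upstream(name):
    """Stand-in for a remote DNSBL server; always returns a placeholder answer."""
    return "127.0.0.2"

resolver = CachingResolver(fake_upstream)
resolver.lookup("example.com.dbl.spamhaus.org")  # first query, e.g. from smtpd
resolver.lookup("example.com.dbl.spamhaus.org")  # repeat query, e.g. from SA
print(resolver.upstream_queries)
```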

                                > I've read it again, but my confusion was with reading about
                                > reject_rhsbl statements, and forgetting they're domain, not IP based.

                                Yes, postscreen has but a fraction of the rejection parameters/types
                                available in SMTPD. All of the domain based rejections are SMTPD only.

                                > Unfortunately it's not that easy. You would think it would be that
                                > easy, but somehow I knew there would be complications. I investigated
                                > this, and it's just a tray to make it fit in a 3.5" bay, such as what
                                > you'd find in a desktop PC.
                                >
                                > Looking at the picture, it just didn't seem right. I actually called
                                > Intel, and I explained to them I had a 1U chassis and needed to be
                                > able to put this 2.5" disk in the tray where normally a 3.5" disk is
                                > used. He told me what you thought -- that the tray would work for
                                > that, even though I knew there was no way it would.
                                >
                                > I ordered two of the 60GB 520 series disks instead of the ones you
                                > mentioned -- better warranty and faster. They arrived on Friday, and
                                > sure enough, it's just a metal frame to put it in a desktop, not a 1U
                                > chassis.
                                >
                                > So, considering they generate no heat and take up no space, I'm
                                > thinking of using Velcro inside the case. We'll see how that goes.

                                Don't do that. Send me the make/model# and/or a picture or link to the
                                manufacturer's product page. Is this a tier one chassis? I.e. HP, Dell,
                                IBM, etc? Once I see the drive cage arrangement I can point you to
                                exactly what you need. However, if the chassis has hot swap 3.5" SATA
                                bays then the adapter should allow you to mount the SSD in the carrier with
                                perfect interface mating to the backplane.

                                > I would never use the onboard SATA controller. That's crap.

                                Not all of them are crap. Many yes, but not all. This depends a bit on
                                what capabilities you need. If you don't need expander or PMP support
                                the list of decent ones is larger. If your board has an integrated
                                LSISAS 1064/68/78 or 2008/2108 then you're golden for mdraid.

                                > I have great faith in Neil Brown and his md code :-) I've lost a few
                                > software arrays over the years, but have always found them reliable
                                > and better supported in Linux. I've also used the battery-backed
                                > hardware RAID in the past, which is nice too.

                                There's nothing inherently wrong with mdraid. As long as one knows its
                                limitations and works around them, or doesn't have a configuration or
                                workload that will bump into said limitations, then you should be fine.

                                > Yes, there's no debating an SSD would be preferred in all situations.
                                > When this was built, we used four SATA3 disks with 64MB cache and
                                > RAID5 because the system was so fast already that the extra expense
                                > wasn't necessary.

                                And this is one of the disadvantages of mdraid in the absence of a BBWC
                                (battery-backed write cache) controller. For filesystem and data
                                safety, drive caches should never be enabled, which murders mdraid
                                write performance, especially RAID5/6.
                                If your UPS burps, or with some kernel panic/crash scenarios, you lose
                                the contents of the write cache in the drives, possibly resulting in
                                filesystem corruption and lost data. Mounting a filesystem with write
                                barriers enabled helps a bit. If you use a BBWC controller you only
                                lose what's in flight in the Linux buffer cache. In this case, with a
                                journaling FS, the filesystem won't be corrupted. With mdraid, vanilla
                                controller, and drive caches enabled, you'll almost certainly get some
                                FS corruption.

                                > I also wasn't as familiar and hadn't as extensively tested the RAID10
                                > config, but will sure use it for the mailstore system I'm building
                                > next.

                                With SSD, if you use anything other than 2-drive mirroring (RAID1) you're
                                wasting money and needlessly increasing overhead, unless you need more
                                capacity than a mirror pair provides. In that case you'd concatenate
                                mirror pairs, as this method is infinitely expandable, whereas mdraid0/10
                                cannot be expanded, and concat gives better random IO performance than
                                striping for small-file (mail) workloads. Note these comments are
                                SSD-specific and filesystem-agnostic.

                                The concat of mirrors only yields high performance with rust (spinning
                                disks) if you're using XFS, and have precisely calculated allocation
                                group size and
                                number of allocation groups so that each is wholly contained on a single
                                mirror pair. This XFS concat setup is a bit of a black art. If you
                                want to know more about it I can instruct you off list.
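
To illustrate the arithmetic only (this is not the full recipe hinted at above), here is a minimal Python sketch. It assumes equal-size mirror pairs, a 4 KiB filesystem block size, and an XFS allocation group size limit of just under 1 TiB. The function names and the default of 4 AGs per pair are hypothetical, and real sizes must come from the actual devices.

```python
# Sketch only: pick an AG count/size so that no allocation group
# straddles two mirror pairs in a linear concat. All names are
# hypothetical; sizes are in bytes.

TIB = 1024 ** 4  # assumed upper bound on XFS allocation group size


def ag_layout(num_pairs, pair_size, ags_per_pair=4, block_size=4096):
    """Return (agcount, agsize_in_blocks) for equal-size mirror pairs."""
    if num_pairs < 1 or ags_per_pair < 1:
        raise ValueError("need at least one pair and one AG per pair")
    if pair_size % block_size:
        raise ValueError("pair size must be a whole number of blocks")
    pair_blocks = pair_size // block_size
    if pair_blocks % ags_per_pair:
        raise ValueError("pair must divide evenly into ags_per_pair AGs")
    agsize_blocks = pair_blocks // ags_per_pair
    if agsize_blocks * block_size >= TIB:
        raise ValueError("AG too large; use more AGs per pair")
    return num_pairs * ags_per_pair, agsize_blocks


def pair_of_ag(ag_index, ags_per_pair):
    """Which mirror pair (0-based) holds a given allocation group."""
    return ag_index // ags_per_pair


if __name__ == "__main__":
    agcount, agsize = ag_layout(num_pairs=4, pair_size=512 * 1024 ** 3)
    print(f"agcount={agcount} agsize={agsize} blocks")
```

The point of the check is that the pair size must be an exact multiple of the AG size. Otherwise some AG spans the boundary between two mirrors and the per-pair isolation is lost.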

                                Now if you're using rust, as in a mail store where SSD is cost
                                prohibitive given capacity needs, then a properly configured RAID10 is
                                generally the best option for most people (and with any FS other than
                                XFS). It gives the best random write performance, good random read
                                performance, and far lower rebuild time than parity arrays. Most people
                                don't appreciate this last point until they have to rebuild, say, an 8x
                                7.2K 1TB drive RAID6 array on a busy production mail store server and it
                                takes half a day or longer, increasing latency for all users. With a
                                concat of mirrors and XFS, when rebuilding a failed drive, only those
                                users whose mailboxes reside on that mirror pair will have increased
                                latency, because this setup automatically spreads all mailboxes
                                relatively evenly across all mirrors. Think of it as file level
                                striping across disks, or more correctly, directory level striping.
                                Such a setup is perfect for maildir mailboxes or Dovecot m/dbox.
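
A toy Python model of that last point, under a loud simplification: it assumes each new mailbox directory lands in the next allocation group in strict rotation, which only approximates the "relatively even" spread described above. The function name and parameters are hypothetical.

```python
# Toy model: how many mailboxes see slower IO while one mirror pair
# rebuilds, if mailbox directories rotate across allocation groups.
# Strict round-robin placement is an assumption made for illustration.

def affected_mailboxes(num_mailboxes, num_pairs, ags_per_pair, failed_pair):
    """Count mailboxes whose directory lives on the rebuilding pair."""
    agcount = num_pairs * ags_per_pair
    affected = 0
    for mailbox in range(num_mailboxes):
        ag = mailbox % agcount               # assumed rotation across AGs
        if ag // ags_per_pair == failed_pair:
            affected += 1
    return affected


if __name__ == "__main__":
    hit = affected_mailboxes(1600, num_pairs=4, ags_per_pair=4, failed_pair=2)
    print(f"{hit} of 1600 mailboxes on the rebuilding pair")
```

With four mirror pairs roughly a quarter of the mailboxes sit on the rebuilding pair; the rest are untouched by the rebuild, unlike a parity array where every user shares the pain.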

                                --
                                Stan