need help with regexp in header_checks

  • naser sonbaty
    Message 1 of 18, Nov 13, 2013
      Hi,

      I need help with a Postfix regexp in header_checks.
      I want to discard all email from admin@ (at any domain).

      I use the following:
      /^(To|From|Cc|Reply-To): admin@(.*)/        DISCARD

      but it's not working.

      Thanks for the help.
    • Stan Hoeppner
      Message 2 of 18, Nov 13, 2013
        On 11/13/2013 2:34 AM, naser sonbaty wrote:
        > Hi,
        >
        > I need help with postfix regexp in header_checks.
        > I want discard all emails(any domain) from admin@
        >
        > I use following:
        > /^(To|From|Cc|Reply-To): admin@(.*)/ DISCARD
        >
        > but its not working

        Tests fine here:

        $ cat test.regexp
        /^(To|From|Cc|Reply-To): admin@(.*)/ DISCARD

        $ postmap -q "blah: admin@..." regexp:./test.regexp

        $ postmap -q "From: admin@..." regexp:./test.regexp
        DISCARD

        $ postmap -q "To: admin@..." regexp:./test.regexp
        DISCARD

        $ postmap -q "CC: admin@..." regexp:./test.regexp
        DISCARD

        $ postmap -q "Reply-To: admin@..." regexp:./test.regexp
        DISCARD

        If these tests work but header_checks isn't working then you need to
        execute "postfix reload" to load your new/modified regexp table.

        Also, note that the caret (^) anchor isn't necessary. The header fields
        you're testing for are in the leftmost position. Thus no reason to
        left-anchor your expression.

        --
        Stan
      • Noel Jones
        Message 3 of 18, Nov 13, 2013
          On 11/13/2013 2:34 AM, naser sonbaty wrote:
          > Hi,
          >
          > I need help with postfix regexp in header_checks.
          > I want discard all emails(any domain) from admin@
          >
          > I use following:
          > /^(To|From|Cc|Reply-To): admin@(.*)/ DISCARD
          >
          > but its not working
          >
          > thx for help


          WARNING: This looks like a really bad idea. Use at your own risk.
          In particular, discarding mail should be a last resort, especially
          for a broad expression like this.

          Anyway, this should match better:
          /^(To|From|Cc|Reply-To): .*[" <]admin@/ DISCARD
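
The difference between the two expressions can be sketched outside Postfix. The snippet below is only an approximation: Python's re module stands in for the regexp table engine, re.IGNORECASE mirrors the case-insensitive default of regexp tables, and the sample headers and the example.com domain are made up for illustration.

```python
import re

# Python's re stands in for Postfix's regexp engine here (an assumption);
# re.IGNORECASE mirrors the case-insensitive default of regexp tables.
ORIGINAL = re.compile(r'^(To|From|Cc|Reply-To): admin@(.*)', re.IGNORECASE)
BROADER = re.compile(r'^(To|From|Cc|Reply-To): .*[" <]admin@', re.IGNORECASE)

# Hypothetical sample headers; example.com is a placeholder domain.
headers = [
    'From: admin@example.com',
    'From: System admin <admin@example.com>',
    'From: "admin@example.com" <admin@example.com>',
]

for header in headers:
    print(f"{header!r}: original={bool(ORIGINAL.search(header))} "
          f"broader={bool(BROADER.search(header))}")
```

In this sketch the original rule only catches the bare-address form, while the broader rule catches the display-name forms but not the bare one, because it requires a quote, space, or < between the space after the colon and admin@. Anyone relying on it may want to test both forms with postmap -q.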



          -- Noel Jones
        • moparisthebest
          Message 4 of 18, Nov 13, 2013
            Agreed. Why would you want to discard my emails? :(

            Is there something wrong with having an email named admin?

            On 11/13/2013 10:01 AM, Noel Jones wrote:
            > On 11/13/2013 2:34 AM, naser sonbaty wrote:
            >> Hi,
            >>
            >> I need help with postfix regexp in header_checks.
            >> I want discard all emails(any domain) from admin@
            >>
            >> I use following:
            >> /^(To|From|Cc|Reply-To): admin@(.*)/ DISCARD
            >>
            >> but its not working
            >>
            >> thx for help
            >
            >
            > WARNING: This looks like a really bad idea. Use at your own risk.
            > In particular, discarding mail should be a last resort, especially
            > for a broad expression like this.
            >
            > Anyway, this should match better:
            > /^(To|From|Cc|Reply-To): .*[" <]admin@/ DISCARD
            >
            >
            >
            > -- Noel Jones
            >
          • Jan P. Kessler
            Message 5 of 18, Nov 13, 2013
              > Also, note that the carat (^) anchor isn't necessary. The header fields
              > you're testing for are in the left most position. Thus no reason to
              > left anchor your expression.

              Of course there is.

              - Anchored expressions are executed faster (the parser has to check the
              pattern only against the beginning of the line).

              - If I write an e-mail with the following subject, the OP would get a
              false-positive:

              Subject: Wrote an e-mail to: admin@...
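
A minimal sketch of that false positive, again using Python's re module as a stand-in for Postfix's regexp engine (an assumption), with example.com filled in as a placeholder domain:

```python
import re

# The same rule with and without the ^ anchor.
anchored = re.compile(r'^(To|From|Cc|Reply-To): admin@', re.IGNORECASE)
unanchored = re.compile(r'(To|From|Cc|Reply-To): admin@', re.IGNORECASE)

# Placeholder address; the subject text follows the example above.
subject = 'Subject: Wrote an e-mail to: admin@example.com'

print(bool(unanchored.search(subject)))  # True: "to: admin@" is found mid-line
print(bool(anchored.search(subject)))    # False: the line starts with "Subject"
```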
            • Bill Cole
              Message 6 of 18, Nov 13, 2013
                On 13 Nov 2013, at 6:39, Stan Hoeppner wrote:

                > On 11/13/2013 2:34 AM, naser sonbaty wrote:
                >> Hi,
                >>
                >> I need help with postfix regexp in header_checks.
                >> I want discard all emails(any domain) from admin@
                >>
                >> I use following:
                >> /^(To|From|Cc|Reply-To): admin@(.*)/ DISCARD
                >>
                >> but its not working
                >
                > Tests fine here:
                >
                > $ cat test.regexp
                > /^(To|From|Cc|Reply-To): admin@(.*)/ DISCARD
                >
                > $ postmap -q "blah: admin@..." regexp:./test.regexp
                >
                > $ postmap -q "From: admin@..." regexp:./test.regexp
                > DISCARD
                >
                > $ postmap -q "To: admin@..." regexp:./test.regexp
                > DISCARD
                >
                > $ postmap -q "CC: admin@..." regexp:./test.regexp
                > DISCARD
                >
                > $ postmap -q "Reply-To: admin@..." regexp:./test.regexp
                > DISCARD
                >
                > If these tests work but header_checks isn't working then you need to
                > execute "postfix reload" to load your new/modified regexp table.
                >
                > Also, note that the carat (^) anchor isn't necessary. The header
                > fields
                > you're testing for are in the left most position. Thus no reason to
                > left anchor your expression.

                There absolutely ARE reasons to anchor RE's in header_checks:

                1. Performance. In recent years email has developed a sort of header
                cancer: new, often proprietary, and often opaque headers that routinely
                have logical lengths of hundreds of characters. Not anchoring a header
                check to the start of the header when you only want to check a few
                specific headers wastes effort scanning for a match anywhere in a
                header, potentially taking hundreds of times longer to confirm a
                non-match.

                2. Matching unanticipated headers. Except for the very few headers with
                tightly defined structure (e.g. Date), *ANY* header could potentially
                include any string that would match "(To|From|Cc|Reply-To): " starting
                somewhere other than the start of the line. e.g. "Subject: I'm naive
                enough to think I want to discard all mail with To: admin@ in a header"
              • Bill Cole
                Message 7 of 18, Nov 13, 2013
                  On 13 Nov 2013, at 3:34, naser sonbaty wrote:

                  > Hi,
                  >
                  > I need help with postfix regexp in header_checks.

                  Start by reading the documentation, including the man pages for
                  header_checks and regexp_table and Postfix's BUILTIN_FILTER_README.
                  Also: following the advice in the last ~50 lines of the DEBUG_README
                  would help.

                  > I want discard all emails(any domain) from admin@

                  I trust that you believe this to be true, but I suspect that you would
                  eventually regret successfully implementing that goal. As has already
                  been demonstrated by an earlier respondent, you have no way of knowing
                  what or who might use the 'admin' address in all domains.

                  With that noted, I will assume that you actually know what you are doing
                  and that this is for some special-function mail system that will never
                  be offered unanticipated but desirable mail...

                  > I use following:
                  > /^(To|From|Cc|Reply-To): admin@(.*)/ DISCARD
                  >
                  > but its not working
                  >
                  > thx for help

                  To provide real help, we would need you to provide something more
                  substantial than "its not working." Repeating: following the advice in
                  the last ~50 lines of the DEBUG_README would help.

                  In this case, a sample of the mail you want your check to catch that it
                  is not catching would be a minimum requirement, along with the output of
                  'postconf -n header_checks' showing that you have the feature enabled.
                  Also useful: confirmation that the file your configuration specifies
                  exists, is readable, and has your pattern in it. If the whole file isn't
                  too long and holds no secrets it could even be useful to share it all,
                  since there are potential rule-ordering pitfalls.
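
One way to look for a rule-ordering pitfall before sharing the file is to check which rule fires first for a sample header. The helper below is a rough, hypothetical sketch, not a Postfix tool: first_match, the table contents, and the addresses are all invented for illustration, it understands only plain "/pattern/ ACTION" lines, and it uses Python's re module rather than Postfix's own engine. For real answers, postmap -q against the actual file remains the reference.

```python
import re

def first_match(table_text, header):
    """Return (line_number, action) for the first rule matching header, else None.

    Simplified: handles only '/pattern/ ACTION' lines, '#' comments and blank
    lines. Real regexp tables also support flags, negation and if/endif.
    """
    for lineno, line in enumerate(table_text.splitlines(), start=1):
        line = line.strip()
        if not line or line.startswith('#'):
            continue
        rule = re.match(r'/(.*)/\s+(\S.*)$', line)
        if rule is None:
            continue  # syntax this sketch does not understand
        pattern, action = rule.groups()
        if re.search(pattern, header, re.IGNORECASE):
            return lineno, action
    return None

# Hypothetical header_checks content: the earlier DUNNO rule wins for
# example.com senders, so the DISCARD rule never sees them.
TABLE = r"""
# hypothetical header_checks file
/^From: .*@example\.com/    DUNNO
/^(To|From|Cc|Reply-To): admin@(.*)/    DISCARD
"""

print(first_match(TABLE, 'From: admin@example.com'))
print(first_match(TABLE, 'To: admin@example.org'))
```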
                • Stan Hoeppner
                  Message 8 of 18, Nov 13, 2013
                    On 11/13/2013 9:50 AM, Bill Cole wrote:
                    > On 13 Nov 2013, at 6:39, Stan Hoeppner wrote:

                    >> Also, note that the carat (^) anchor isn't necessary. The header fields
                    >> you're testing for are in the left most position. Thus no reason to
                    >> left anchor your expression.
                    >
                    > There absolutely ARE reasons to anchor RE's in header_checks:
                    >
                    > 1. Performance. In recent years email has developed a sort of header
                    > cancer: new, often proprietary, and often opaque headers that routinely
                    > have logical lengths of hundreds of characters. Not anchoring a header
                    > check to the start of the header when you only want to check a few
                    > specific headers wastes effort scanning for a match anywhere in a
                    > header, potentially taking hundreds of times longer to confirm a non-match

                    In recent years CPUs have become so blindingly fast it makes no
                    difference. Any excess cycles burned by a non anchored regex were idle
                    cycles anyway. There are good arguments for anchoring expressions, but
                    saving CPU cycles is simply no longer one of them, not for years now.

                    I used to make your argument here, but again, it no longer applies.

                    > 2. Matching unanticipated headers. Except for the very few headers with
                    > tightly defined structure (e.g. Date), *ANY* header could potentially
                    > include any string that would match "(To|From|Cc|Reply-To): " starting
                    > somewhere other than the start of the line. e.g. "Subject: I'm naive
                    > enough to think I want to discard all mail with To: admin@ in a header"

                    This is a stronger argument, though I'm not sure how realistic a
                    scenario this is, with the email address in the subject. A better
                    argument would be that without anchoring the expression would also match
                    headers such as

                    X-Original-To: admin@...
                    Delivered-To: admin@...

                    in which case I'd agree he should anchor. I didn't take these into
                    account previously.
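
The two headers above make a handy test case. A small sketch, assuming Python's re module behaves like the regexp table engine for these simple patterns and using example.com as a placeholder domain:

```python
import re

unanchored = re.compile(r'(To|From|Cc|Reply-To): admin@', re.IGNORECASE)
anchored = re.compile(r'^(To|From|Cc|Reply-To): admin@', re.IGNORECASE)

# Headers a delivery agent might add; example.com is a placeholder.
trace_headers = [
    'X-Original-To: admin@example.com',
    'Delivered-To: admin@example.com',
]

# Collect the headers each variant of the rule would act on.
hits_unanchored = [h for h in trace_headers if unanchored.search(h)]
hits_anchored = [h for h in trace_headers if anchored.search(h)]

print(hits_unanchored)  # both headers: "To: admin@" appears inside each name
print(hits_anchored)    # []
```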

                    --
                    Stan
                  • Viktor Dukhovni
                    Message 9 of 18, Nov 13, 2013
                      On Thu, Nov 14, 2013 at 12:32:45AM -0600, Stan Hoeppner wrote:

                      > In recent years CPUs have become so blindingly fast it makes no
                      > difference. Any excess cycles burned by a non anchored regex were idle
                      > cycles anyway. There are good arguments for anchoring expressions, but
                      > saving CPU cycles is simply no longer one of them, not for years now.

                      Mere excuse for sloppiness. Always anchor, then when possible
                      discard leading "^.*" and trailing ".*$".
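
The second half of that advice can be checked mechanically: for a single header line, padding such as a leading "^.*", a trailing ".*$", or the trailing "(.*)" in the original rule does not change whether the pattern matches. A small sketch, with Python's re module as a stand-in for Postfix's engine and invented sample headers:

```python
import re

# Each pair: a pattern with redundant padding, and the trimmed equivalent.
pairs = [
    (r'^(To|From|Cc|Reply-To): admin@(.*)', r'^(To|From|Cc|Reply-To): admin@'),
    (r'^.*admin@.*$', r'admin@'),
]

# Hypothetical single-line headers; the domains are placeholders.
samples = [
    'From: admin@example.com',
    'To: "Site admin" <admin@example.org>',
    'Subject: hello',
]

for padded, trimmed in pairs:
    for header in samples:
        same = bool(re.search(padded, header)) == bool(re.search(trimmed, header))
        print(f'{padded!r} vs {trimmed!r} on {header!r}: same verdict = {same}')
```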

                      --
                      Viktor.
                    • tejas sarade
                      Message 10 of 18, Nov 13, 2013

                        I think .* will match everything.

                        On Nov 13, 2013 8:32 PM, "Noel Jones" <njones@...> wrote:
                      • Stan Hoeppner
                        Message 11 of 18, Nov 13, 2013
                          On 11/14/2013 12:41 AM, Viktor Dukhovni wrote:
                          > On Thu, Nov 14, 2013 at 12:32:45AM -0600, Stan Hoeppner wrote:
                          >
                          >> In recent years CPUs have become so blindingly fast it makes no
                          >> difference. Any excess cycles burned by a non anchored regex were idle
                          >> cycles anyway. There are good arguments for anchoring expressions, but
                          >> saving CPU cycles is simply no longer one of them, not for years now.
                          >
                          > Mere excuse for sloppiness.

                          I find that offensive, Viktor. There is a huge difference between
                          arguing a point of fact and arguing a position. Above is an example of
                          the former, and is a correct statement.

                          > Always anchor, then when possible
                          > discard leading "^.*" and trailing ".*$".

                          Yes, for people who have the time and dedication to "do it right", such
                          as ourselves. Others can take shortcuts and get the job done, just as
                          PHP/Perl/Java/etc heretics don't use C. It seemed to me in this case to
                          offer the OP a shortcut. That may have been incorrect. Tar and feather
                          me for that if you like, but do not accuse me of practicing or promoting
                          sloppiness, as that is simply not true. My work speaks for itself. But
                          apparently you've never even looked at it, despite it being mentioned
                          here dozens or hundreds of times over the past few years. You've formed
                          an opinion and are making untrue statements based solely on my few words
                          in this thread. Look at it:

                          http://www.hardwarefreak.com/fqrdns.pcre.txt

                          Do you consider these regexes sloppy?

                          I could remove the anchoring and they would still work in the targeted
                          use case. And the additional CPU burn wouldn't be noticeable, if even
                          measurable. But I started with fully qualified expressions years ago,
                          hence the name of the table, and I've stuck with them, even though I
                          don't really need to. Tell me that's what a sloppy person would do.

                          --
                          Stan
                        • Viktor Dukhovni
                          Message 12 of 18, Nov 14, 2013
                            On Thu, Nov 14, 2013 at 01:35:39AM -0600, Stan Hoeppner wrote:

                            > > Mere excuse for sloppiness.
                            >
                            > I find that offensive Viktor. There is a huge difference between
                            > arguing a point of fact and arguing a position. Above is an example of
                            > the former, and is a correct statement.

                            Sorry to hear that. Just because you're posting excuses for
                            sloppiness does not mean that your work is not valuable. Both are
                            true at the same time. The impact of the sloppiness may be minor
                            to insignificant, and that's what makes it mere sloppiness rather
                            than, say, negligence or incompetence, which are not in question here.

                            Making fewer mistakes is not mere luck, it is the result of meticulous
                            habits. CPU efficiency has nothing to do with my comment. Anchored
                            expressions yield fewer surprises, and not using them habitually
                            is sloppy. Make anchored regular expressions a habit.

                            This is analogous to always putting shell "${variable}" expansions
                            in double quotes (except on rare occasions when you want word-splitting)
                            and various other ways of generally staying out of trouble.

                            I could mention using "set -e" in shell scripts to avoid undetected
                            command failures, or using:

                            sendmail -f "${sender}" ...

                            instead of:

                            sendmail -f"${sender}" ...

                            because the latter misbehaves when "${sender}" is empty.
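
Why the attached form goes wrong can be sketched with Python's getopt module, which follows the usual Unix getopt() conventions. That sendmail parses -f this way is an assumption of the sketch, and parse() and the recipient address are made up for illustration.

```python
import getopt

def parse(argv):
    """Parse a sendmail-like argument list where -f takes a value."""
    opts, recipients = getopt.getopt(argv, 'f:')
    return dict(opts).get('-f'), recipients

sender = ''  # the troublesome case: an empty envelope sender

# sendmail -f "${sender}" rcpt  ->  '-f' and '' arrive as two arguments
print(parse(['-f', sender, 'rcpt@example.com']))
# sendmail -f"${sender}" rcpt   ->  the shell passes the single word '-f'
print(parse(['-f' + sender, 'rcpt@example.com']))
```

With the separate form the empty sender is preserved and the recipient stays a recipient; with the attached form the option swallows the recipient as its value.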

                            > > Always anchor, then when possible
                            > > discard leading "^.*" and trailing ".*$".
                            >
                            > Yes, for people who have the time and dedication to "do it right", such
                            > as ourselves. Others can take shortcuts and get the job done, just as
                            > PHP/Perl/Java/etc heretics don't use C.

                            Don't underestimate the rest of humanity; teach them.

                            --
                            Viktor.
                          • Noel Jones
                            Message 13 of 18, Nov 14, 2013
                              On 11/14/2013 1:07 AM, tejas sarade wrote:
                              > I think .* will match everythig.
                              >
                              > On Nov 13, 2013 8:32 PM, "Noel Jones" <njones@...

                              The expression I posted is correct.
                              /^(To|From|Cc|Reply-To): .*[" <]admin@/ DISCARD

                              This should match headers such as
                              From: System admin <admin@...>
                              or other variations.


                              -- Noel Jones
                            • Bill Cole
                              Message 14 of 18, Nov 14, 2013
                                On 14 Nov 2013, at 1:32, Stan Hoeppner wrote:

                                > On 11/13/2013 9:50 AM, Bill Cole wrote:
                                >> On 13 Nov 2013, at 6:39, Stan Hoeppner wrote:
                                >
                                >>> Also, note that the carat (^) anchor isn't necessary. The header
                                >>> fields
                                >>> you're testing for are in the left most position. Thus no reason to
                                >>> left anchor your expression.
                                >>
                                >> There absolutely ARE reasons to anchor RE's in header_checks:
                                >>
                                >> 1. Performance. In recent years email has developed a sort of header
                                >> cancer: new, often proprietary, and often opaque headers that
                                >> routinely
                                >> have logical lengths of hundreds of characters. Not anchoring a
                                >> header
                                >> check to the start of the header when you only want to check a few
                                >> specific headers wastes effort scanning for a match anywhere in a
                                >> header, potentially taking hundreds of times longer to confirm a
                                >> non-match
                                >
                                > In recent years CPUs have become so blindingly fast it makes no
                                > difference. Any excess cycles burned by a non anchored regex were
                                > idle
                                > cycles anyway. There are good arguments for anchoring expressions,
                                > but
                                > saving CPU cycles is simply no longer one of them, not for years now.
                                >
                                > I used to make your argument here, but again, it no longer applies.

                                I think it might surprise you to learn how many mail servers run on
                                systems constrained by CPU and RAM. This used to be a consequence of old
                                hardware being repurposed to utility service and ambushed by the need to
                                filter mail (a relative novelty) but today it is often the result of
                                virtualization being used to maximize utilization of all those cheap and
                                abundant resources. If your mail server is running on dedicated recent
                                but not bleeding-edge hardware you may not care about CPU, but if it is
                                running on a VPS capped at 300MHz or billed by real CPU usage, you do.
                              • Michael P. Demelbauer
                                Message 15 of 18, Nov 14, 2013
                                  On Thu, Nov 14, 2013 at 08:19:52AM -0600, Noel Jones wrote:
                                  > On 11/14/2013 1:07 AM, tejas sarade wrote:
                                  > > I think .* will match everythig.
                                  > >
                                  > > On Nov 13, 2013 8:32 PM, "Noel Jones" <njones@...
                                  >
                                  > The expression I posted is correct.
                                  > /^(To|From|Cc|Reply-To): .*[" <]admin@/ DISCARD
                                  >
                                  > This should match headers such as
                                  > From: System admin <admin@...>
                                  > or other variations.
                                  >
                                  >
                                  > -- Noel Jones

                                  Hello Noel,

                                  this might be off topic here, but I've been wondering about the regexp
                                  since yesterday.

                                  How will this match "<admin@....>", a variant I've already seen in some
                                  clients? If I understand the bracket expression correctly, it searches for
                                  a double quote, a blank, or < directly followed by admin@. What's my mistake?

                                  Many thanks, and sorry for the OT,
                                  --
                                  Michael P. Demelbauer
                                  Systemadministration
                                  WSR
                                  Arsenal, Objekt 20
                                  1030 Wien
                                  -------------------------------------------------------------------------------
                                  Memory is like an orgasm, it's a lot better,
                                  if you don't have to fake it.
                                  -- Linux fortunes
                                • Wietse Venema
                                  Message 16 of 18, Nov 14, 2013
                                    Stan, your contributions are appreciated but please do not criticize
                                    those who suggest improvements.

                                    Anchoring regular expressions (that don't start with a wildcard) is
                                    a must to avoid false matches. This is a correctness issue. Matching
                                    "To:" just because it appears in a Subject: is wrong.

                                    Savings in CPU cycles come second. The best way to save cycles is
                                    to group patterns with the same prefix under IF/ELSE/ENDIF.

                                    Wietse
                                  • Noel Jones
                                    Message 17 of 18, Nov 14, 2013
                                      On 11/14/2013 9:27 AM, Michael P. Demelbauer wrote:
                                      > On Thu, Nov 14, 2013 at 08:19:52AM -0600, Noel Jones wrote:
                                      >> On 11/14/2013 1:07 AM, tejas sarade wrote:
                                      >>> I think .* will match everythig.
                                      >>>
                                      >>> On Nov 13, 2013 8:32 PM, "Noel Jones" <njones@...
                                      >>
                                      >> The expression I posted is correct.
                                      >> /^(To|From|Cc|Reply-To): .*[" <]admin@/ DISCARD
                                      >>
                                      >> This should match headers such as
                                      >> From: System admin <admin@...>
                                      >> or other variations.
                                      >>
                                      >>
                                      >> -- Noel Jones
                                      >
                                      > Hallo Noel,
                                      >
                                      > this might be off topic here, but I'm wondering about the regexp since
                                      > yesterday.
                                      >
                                      > How will this match "<admin@....>" a variant I've already seen in some
                                      > clients. If I understand the alternation correctly it searches for "
                                      > Blank or < directly followed by admin@. What's my mistake?
                                      >
                                      > Many thx and sorry for OT,
                                      >


                                      Given the case of "<admin@....>", the .* will match the " and the
                                      grouping will match the < followed by admin@.
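
That split can be made visible by adding capture groups around the two pieces. This is a sketch only: Python's re module stands in for Postfix's regexp engine, the extra parentheses are added purely for inspection, and example.com is a placeholder domain.

```python
import re

# The pattern from above with capture groups added around the pieces of interest.
traced = re.compile(r'^(To|From|Cc|Reply-To): (.*)([" <])admin@', re.IGNORECASE)

# Quoted variant: .* takes the double quote, the bracket expression takes <.
quoted = traced.search('From: "<admin@example.com>"')
print(repr(quoted.group(2)), repr(quoted.group(3)))

# Unquoted variant: .* matches nothing at all, the bracket expression takes <.
plain = traced.search('From: <admin@example.com>')
print(repr(plain.group(2)), repr(plain.group(3)))
```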



                                      -- Noel Jones
                                    • LuKreme
                                      Message 18 of 18, Nov 15, 2013
                                        On Nov 13, 2013, at 8:01, Noel Jones <njones@...> wrote:
                                        > Anyway, this should match better:
                                        > /^(To|From|Cc|Reply-To): .*[" <]admin@/ DISCARD

                                        Besides the discussion on the need to anchor the regex (you do), I'm trying to wrap my head around why one would want to discard mail from admin@?

                                        I mean, I reject mails that claim to come from LOCAL admin accounts, but in general?