Re: need help with regexp in header_checks

  Stan Hoeppner
    Nov 13, 2013
      On 11/13/2013 9:50 AM, Bill Cole wrote:
      > On 13 Nov 2013, at 6:39, Stan Hoeppner wrote:

      >> Also, note that the caret (^) anchor isn't necessary. The header fields
      >> you're testing for are in the leftmost position. Thus there is no
      >> reason to left-anchor your expression.
      > There absolutely ARE reasons to anchor RE's in header_checks:
      > 1. Performance. In recent years email has developed a sort of header
      > cancer: new, often proprietary, and often opaque headers that routinely
      > have logical lengths of hundreds of characters. Not anchoring a header
      > check to the start of the header when you only want to check a few
      > specific headers wastes effort scanning for a match anywhere in a
      > header, potentially taking hundreds of times longer to confirm a non-match.

      In recent years CPUs have become so blindingly fast that it makes no
      difference. Any excess cycles burned by a non-anchored regex were idle
      cycles anyway. There are good arguments for anchoring expressions, but
      saving CPU cycles is simply no longer one of them, and hasn't been for
      years now.

      I used to make your argument here, but again, it no longer applies.
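
      For anyone who wants to measure rather than argue, here is a rough
      timing sketch. It is Python, not Postfix, so the regex engine is not
      the one header_checks uses, and the long header is made up for
      illustration. Treat the numbers as a ballpark for your own hardware.

```python
import re
import timeit

# Hypothetical long, opaque header of the kind described above.
long_header = "X-Opaque-Vendor-Data: " + "a1b2c3d4" * 100

# Pattern fragment taken from this thread, with and without the ^ anchor.
unanchored = re.compile(r"(To|From|Cc|Reply-To): ", re.IGNORECASE)
anchored = re.compile(r"^(To|From|Cc|Reply-To): ", re.IGNORECASE)

# Neither pattern matches this header; the question is how long each
# takes to confirm the non-match.
unanchored_result = unanchored.search(long_header)
anchored_result = anchored.search(long_header)

runs = 10000
t_unanchored = timeit.timeit(lambda: unanchored.search(long_header), number=runs)
t_anchored = timeit.timeit(lambda: anchored.search(long_header), number=runs)

print(f"unanchored: {t_unanchored / runs * 1e6:.2f} microseconds per non-match")
print(f"anchored:   {t_anchored / runs * 1e6:.2f} microseconds per non-match")
```

      Whatever ratio you see, compare the absolute per-header cost with
      your actual mail volume before deciding whether it matters.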

      > 2. Matching unanticipated headers. Except for the very few headers with
      > tightly defined structure (e.g. Date), *ANY* header could potentially
      > include any string that would match "(To|From|Cc|Reply-To): " starting
      > somewhere other than the start of the line. e.g. "Subject: I'm naive
      > enough to think I want to discard all mail with To: admin@ in a header"

      This is a stronger argument, though I'm not sure how realistic the
      scenario is, with the email address in the subject. A better argument
      would be that, without anchoring, the expression would also match
      headers such as

      X-Original-To: admin@...
      Delivered-To: admin@...

      in which case I'd agree he should anchor. I didn't take these into
      account previously.
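
      A minimal sketch of the difference, using Python's re module as a
      stand-in for a header_checks regexp table. The "admin@" part of the
      pattern and the example.com headers are assumptions for illustration,
      not the original poster's actual rule. Matching is made
      case-insensitive because that is the default for Postfix regexp
      tables.

```python
import re

# Hypothetical pattern modeled on the "(To|From|Cc|Reply-To): " fragment
# quoted in this thread; the sample headers are illustrative only.
UNANCHORED = re.compile(r"(To|From|Cc|Reply-To): .*admin@", re.IGNORECASE)
ANCHORED = re.compile(r"^(To|From|Cc|Reply-To): .*admin@", re.IGNORECASE)

headers = [
    "To: admin@example.com",
    "X-Original-To: admin@example.com",
    "Delivered-To: admin@example.com",
    "Subject: discard all mail with To: admin@ in a header",
]

def matching(pattern, lines):
    """Return the header lines that the pattern matches anywhere."""
    return [line for line in lines if pattern.search(line)]

unanchored_hits = matching(UNANCHORED, headers)
anchored_hits = matching(ANCHORED, headers)

print("unanchored:", unanchored_hits)
print("anchored:  ", anchored_hits)
```

      The unanchored pattern matches all four sample headers, including
      X-Original-To, Delivered-To, and the Subject line. The anchored one
      matches only the real To: header.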

