
Race condition in postmap?

  • Richard Cooper
    Message 1 of 7 , Feb 12, 2010
      Hi All,

      I'm using postfix as an MX server which delivers email to the final recipient using virtual aliases. The version number according to rpm is postfix-2.3.3-2.1.el5_2, which is the version supplied in the main CentOS 5 yum repositories. This is working perfectly except for one problem. Occasionally postfix will reject an email with the following error:

      Feb 12 00:41:24 mail1 postfix/smtpd[24782]: NOQUEUE: reject: RCPT from unknown[111.111.111.111]: 550 5.1.1 <recipient@...>: Recipient address rejected: User unknown in virtual alias table; from=<sender@...> to=< recipient@... > proto=SMTP helo=<example.org>

      This is a very intermittent and short lived error. Emails to recipient@... were working before the error and start working again a few seconds after it.

      Based on my debugging it seems that this error is related to me running postmap to rebuild the virtual_alias table. This is despite the fact that the recipient@... address is correctly configured in both the old and new virtual_aliases. Here is a log of what was happening at the same time as the above error:

      2010-02-12 00:40:39,345 - 24597 - DEBUG - Writing virtual_aliases and virtual_domains
      2010-02-12 00:41:20,496 - 24597 - DEBUG - Done writing virtual_aliases and virtual_domains.
      2010-02-12 00:41:20,506 - 24597 - DEBUG - Running postmap virtual_domains.
      2010-02-12 00:41:23,555 - 24597 - DEBUG - Done running postmap virtual_domains.
      2010-02-12 00:41:23,556 - 24597 - DEBUG - Running postmap virtual_aliases.
      2010-02-12 00:41:24,107 - 24597 - DEBUG - Done running postmap virtual_aliases.

      Maillog doesn't have millisecond precision, so I can't see exactly when the "User unknown in virtual alias table" error was logged, but it happens either while "postmap virtual_aliases" is running or very shortly (within a second) afterwards. The same pattern repeats itself in other cases of the same error. The error always seems to happen within a second or so of "postmap virtual_aliases" finishing.
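The timing argument above can be checked mechanically. The following is only a sketch: the function names and the `year` parameter are illustrative assumptions (maillog lines carry no year), and it assumes the debug log uses the `YYYY-MM-DD HH:MM:SS,mmm` prefix shown in the excerpt. Because maillog has one-second resolution, the rejection is treated as a one-second window and tested for overlap with the postmap run.

```python
from datetime import datetime, timedelta

DEBUG_FORMAT = "%Y-%m-%d %H:%M:%S,%f"  # prefix of the debug log lines above


def parse_debug_ts(line):
    """Parse the 23-character timestamp that starts a debug log line."""
    return datetime.strptime(line[:23], DEBUG_FORMAT)


def maillog_second(stamp, year):
    """Parse a maillog stamp such as 'Feb 12 00:41:24'; the year is supplied by the caller."""
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def may_coincide(reject_stamp, start_line, end_line, year, slack=timedelta(0)):
    """Return True if the rejection second overlaps the postmap run (plus optional slack)."""
    window_start = maillog_second(reject_stamp, year)
    window_end = window_start + timedelta(seconds=1)
    run_start = parse_debug_ts(start_line)
    run_end = parse_debug_ts(end_line) + slack
    return window_start <= run_end and run_start < window_end


start = "2010-02-12 00:41:23,556 - 24597 - DEBUG - Running postmap virtual_aliases."
end = "2010-02-12 00:41:24,107 - 24597 - DEBUG - Done running postmap virtual_aliases."
overlaps = may_coincide("Feb 12 00:41:24", start, end, year=2010)
```

An overlap only shows that the two events could have coincided; it does not prove that the rebuild caused the rejection.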

      So my questions are:

      1. Does my analysis seem correct?
      2. Is this a known problem? Are there any known race conditions in reloading the virtual aliases config by running "postmap virtual_aliases"?
      3. Is there any way I can fix or work around this problem? Would upgrading help?
      4. If I switched to using a SQL or LDAP backend for the table lookups would this problem go away?

      Thanks in advance,

      - Richard
    • Eray Aslan
      Message 2 of 7 , Feb 12, 2010
        On 12.02.2010 13:25, Richard Cooper wrote:
        > Feb 12 00:41:24 mail1 postfix/smtpd[24782]: NOQUEUE: reject: RCPT from unknown[111.111.111.111]: 550 5.1.1 <recipient@...>: Recipient address rejected: User unknown in virtual alias table; from=<sender@...> to=< recipient@... > proto=SMTP helo=<example.org>
        >
        > This is a very intermittent and short lived error. Emails to recipient@... were working before the error and start working again a few seconds after it.

        http://www.postfix.org/DATABASE_README.html#safe_db

        --
        Eray
      • LuKreme
        Message 3 of 7 , Feb 12, 2010
          On 12-Feb-2010, at 04:25, Richard Cooper wrote:
          >
          > to=< recipient@... > proto=SMTP helo=<example.org>
          >
          > This is a very intermittent and short lived error. Emails to recipient@... were working before the error and start working again a few seconds after it.

          The email is not to recipient@..., it is to " recipient@..."

          Didn't we just cover this in the last week?

          --
          "I've just learned about his illness. Let's hope it's nothing
          trivial." Irvin S. Cobb
        • Richard Cooper
          Message 4 of 7 , Feb 12, 2010
            On 12 Feb 2010, at 12:12, Eray Aslan wrote:

            > On 12.02.2010 13:25, Richard Cooper wrote:
            >> Feb 12 00:41:24 mail1 postfix/smtpd[24782]: NOQUEUE: reject: RCPT from unknown[111.111.111.111]: 550 5.1.1 <recipient@...>: Recipient address rejected: User unknown in virtual alias table; from=<sender@...> to=< recipient@... > proto=SMTP helo=<example.org>
            >>
            >> This is a very intermittent and short lived error. Emails to recipient@... were working before the error and start working again a few seconds after it.
            >
            > http://www.postfix.org/DATABASE_README.html#safe_db


            I'm not sure that applies to my case. That page says "If the update fails in the middle [because the disk is full or because something else happens] then you have no usable database, and Postfix will stop working". In my case the update completes without error, correctly writes virtual_aliases.db, and postfix continues working. The only visible error is that during the update Postfix "forgets" some of the lookup table for a short period of time.

            Nonetheless, it's a good suggestion for the next thing for me to test. I will give it a try and see if it fixes the problem. Thank you.

            - Richard

            PS: Apologies to Eray for the off-list reply
          • Richard Cooper
            Message 5 of 7 , Feb 12, 2010
              On 12 Feb 2010, at 12:21, LuKreme wrote:
              > On 12-Feb-2010, at 04:25, Richard Cooper wrote:
              >>
              >> to=< recipient@... > proto=SMTP helo=<example.org>
              >>
              >> This is a very intermittent and short lived error. Emails to recipient@... were working before the error and start working again a few seconds after it.
              >
              > The email is not to recipient@..., it is to " recipient@..."

              Sorry. That was a typo I introduced while anonymizing. There were no extraneous spaces in the original.

              - Richard

              PS: Apologies to LuKreme for the off-list reply
            • Eray Aslan
              Message 6 of 7 , Feb 12, 2010
                On 12.02.2010 14:47, Richard Cooper wrote:
                > On 12 Feb 2010, at 12:12, Eray Aslan wrote:
                >> On 12.02.2010 13:25, Richard Cooper wrote:
                >>> Feb 12 00:41:24 mail1 postfix/smtpd[24782]: NOQUEUE: reject: RCPT from unknown[111.111.111.111]: 550 5.1.1 <recipient@...>: Recipient address rejected: User unknown in virtual alias table; from=<sender@...> to=< recipient@... > proto=SMTP helo=<example.org>
                >>>
                >>> This is a very intermittent and short lived error. Emails to recipient@... were working before the error and start working again a few seconds after it.
                >>
                >> http://www.postfix.org/DATABASE_README.html#safe_db
                >
                >
                > I'm not sure that applies to my case. That page says "If the update fails in the middle [because the disk is full or because something else happens] then you have no usable database, and Postfix will stop working". In my case the update completes without error, correctly writes virtual_aliases.db, and postfix continues working. The only visible error is that during the update Postfix "forgets" some of the lookup table for a short period of time.

                You might also want to try CDB. Its updates are atomic. Recommended
                instead of Berkeley DB.

                http://www.postfix.org/CDB_README.html

                --
                Eray
              • Victor Duchovni
                Message 7 of 7 , Feb 12, 2010
                  On Fri, Feb 12, 2010 at 11:25:05AM +0000, Richard Cooper wrote:

                  > Based on my debugging it seems that this error is related to me running postmap to rebuild the virtual_alias table. This is despite the fact that the recipient@... address is correctly configured in both the old and new virtual_aliases. Here is a log of what was happening at the same time as the above error:

                  The original Berkeley DB (version 1.8x) which was available at the time
                  that "hash" and "btree" table support were added to Postfix was a simple
                  indexed file format and library. In that version of Berkeley DB there
                  were no memory mapped page pools, transaction logs, ...

                  If you are using Berkeley DB on a BSD system with version 1.8x, then postmap
                  is race-free due to the Postfix locking protocol for Berkeley DB files.

                  Newer, much more feature-rich versions of Berkeley DB are no longer race-free
                  with the Postfix locking protocol, and you need to atomically create/rename
                  a newly built table.
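A minimal sketch of the create/rename approach described above, in the spirit of the procedure in DATABASE_README: build the new table under a temporary name, then rename it over the live file. The function name, the `.in` staging suffix, and the injectable `run` parameter are illustrative assumptions, not part of Postfix; the sketch also assumes a `hash:` table and that the staging file sits on the same filesystem as the live `.db` file, which the atomic rename requires.

```python
import os
import shutil
import subprocess
from pathlib import Path


def rebuild_map_atomically(source, run=subprocess.run):
    """Build <source>.db under a staging name, then rename it into place.

    `run` defaults to subprocess.run and is only a parameter so the
    function can be exercised without postmap installed (an assumption
    of this sketch).
    """
    source = Path(source)
    staging = source.with_name(source.name + ".in")        # e.g. virtual_aliases.in
    staging_db = staging.with_name(staging.name + ".db")   # e.g. virtual_aliases.in.db
    live_db = source.with_name(source.name + ".db")        # e.g. virtual_aliases.db

    shutil.copyfile(source, staging)
    # postmap writes the staging .db file; the live table is not touched here.
    run(["postmap", f"hash:{staging}"], check=True)
    # Within one filesystem, the rename swaps in the complete new table in
    # one step, so readers see either the old file or the new one.
    os.replace(staging_db, live_db)
    staging.unlink()
    return live_db
```

Called as `rebuild_map_atomically("/etc/postfix/virtual_aliases")` (path assumed for illustration), this leaves the existing `virtual_aliases.db` in place until the replacement has been fully written.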

                  I don't use Berkeley DB for multi-reader tables, I strongly recommend
                  CDB for that purpose. I only use Berkeley DB for single reader/writer
                  tables such as TLS session caches, address verification caches, ...

                  --
                  Viktor.

                  P.S. Morgan Stanley is looking for a New York City based, Senior Unix
                  system/email administrator to architect and sustain our perimeter email
                  environment. If you are interested, please drop me a note.