
Re: smtpd crashes

  • Wietse Venema
    Message 1 of 15, Jan 2, 2010
      Ralf Hildebrandt:
      > * Ralf Hildebrandt <Ralf.Hildebrandt@...>:
      >
      > > I updated to postfix-2.7-20100101 these minutes; maybe something
      > > changes.
      >
      > postfix-2.7-20091228-nonprod was the old version that caused all the
      > logentries.

      Introduced 20091227, fixed 20091230 (dangling pointer in event manager).

      Sometimes non-production software has a defect.

      Wietse
    • Len Conrad
      Message 2 of 15, Jan 2, 2010
        >Dec 31 20:05:54 mail-ausfall kernel: [876822.781710] smtpd[27410] general protection ip:80813d8 sp:bf9c2d68 error:0 in smtpd[8048000+53000]

        ...I have none of these.

        >Jan 1 20:19:41 mail-ausfall postfix/verify[26329]: fatal: close database /var/lib/postfix/verify.db: No such file or directory

        I don't use verify anymore, but we get these each time we stop postfix:


        >Jan 2 10:17:55 mail-ausfall postfix/postscreen[17151]: fatal: close database
        >/var/lib/postfix/ps_cache.db: No such file or directory


        Jan 2 09:26:06 mx7 postfix/master[32550]: terminating on signal 15
        Jan 2 09:26:06 mx7 postfix/postscreen[32558]: fatal: close database /var/db/postfix/ps_cache.db: No such file or directory

        mx7# egrep -i "postscreen" /var/log/maillog | awk '{ print $6}' | sort -f | uniq -ic | sort -rfn
        404335 PASS
        347830 BLACKLISTED
        60258 PREGREET
        2124 HANGUP
        593 warning:
        193 WHITELISTED
        35 fatal: <<<<<<<<<
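For readers who want the same tally from a program rather than a shell pipeline, the `awk '{ print $6 }'` step above can be sketched in C as follows. This is a minimal sketch, not part of Postfix: the function name `nth_field` is hypothetical, and it assumes the syslog line layout shown in this message, where the sixth whitespace-separated field holds the postscreen outcome (PASS, PREGREET, fatal:, and so on).

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy the n-th (1-based) whitespace-separated field
 * of line into out. Returns 0 on success, -1 if the line has fewer than n
 * fields or the field does not fit in out (including the terminator).
 */
static int nth_field(const char *line, int n, char *out, size_t outlen)
{
    const char *p = line;
    int i;

    for (i = 1;; i++) {
        size_t len;

        p += strspn(p, " \t\n");        /* skip leading separators */
        if (*p == '\0')
            return -1;                  /* ran out of fields */
        len = strcspn(p, " \t\n");      /* length of the current field */
        if (i == n) {
            if (len >= outlen)
                return -1;              /* field too long for the buffer */
            memcpy(out, p, len);
            out[len] = '\0';
            return 0;
        }
        p += len;
    }
}
```

Counting the extracted values (the `sort | uniq -c` part) is left out; any small table keyed on the field text would do.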

        mx7# postconf mail_version
        mail_version = 2.7-20091209

        Wietse gave a patch but I'm waiting for it to be fixed in a snapshot.

        Len
      • Wietse Venema
        Message 3 of 15, Jan 2, 2010
          Len Conrad:
          >
          > >Dec 31 20:05:54 mail-ausfall kernel: [876822.781710] smtpd[27410] general protection ip:80813d8 sp:bf9c2d68 error:0 in smtpd[8048000+53000]
          >
          > ...I have none of these.
          >
          > >Jan 1 20:19:41 mail-ausfall postfix/verify[26329]: fatal: close database /var/lib/postfix/verify.db: No such file or directory
          >
          > I don't use verify anymore, but we get these each time we stop postfix:
          >
          > >Jan 2 10:17:55 mail-ausfall postfix/postscreen[17151]: fatal: close database
          > >/var/lib/postfix/ps_cache.db: No such file or directory

          That is that stupid but HARMLESS Berkeley DB bug where they
          barf when a process closes a database after it fork()s.

          I'm running a DB version that does not do such things, so it
          did not show up in my testing.

          I'll fix this in the next release.

          Wietse
        • Wietse Venema
          Message 4 of 15, Jan 2, 2010
            >Jan 1 20:19:41 mail-ausfall postfix/verify[26329]: fatal: close database /var/lib/postfix/verify.db: No such file or directory

            Does not reproduce on Ubuntu 9.10-server with the default Berkeley DB 4.7.

            Can you check if this warning (and the warning for postscreen) goes
            away when automatic cache cleanup is turned off?

            address_verify_cache_cleanup_interval = 0
            postscreen_cache_cleanup_interval = 0

            This can't be the same bug as discussed last month with "close
            database after fork", because verify(8) does not fork. Also, Postfix
            does not close the same database twice (I wipe the database handle
            after close to prevent that from happening).

            The warning is harmless because Postfix flushes database buffers
            with each postscreen/verify database update. With synchronous
            database updates it would make no sense if the database failed to
            report update errors immediately and delayed those error reports
            until the database is closed.

            Wietse
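The two habits described in the message above (flush after every update so errors surface immediately, and wipe the handle after close so the same database cannot be closed twice) can be illustrated with a small sketch. This is not Postfix code and does not use Berkeley DB: `cache_handle`, `cache_open`, `cache_put`, and `cache_close` are hypothetical names, and a plain stdio file stands in for the database. Note that `fflush` only pushes data to the operating system, whereas a database sync would go further.

```c
#include <stdio.h>

/* Hypothetical stand-in for a database handle (not a Postfix or DB type). */
typedef struct {
    FILE *fp;                           /* NULL once the handle is closed */
} cache_handle;

/* Open the cache file for appending, creating it if needed. 0 on success. */
static int cache_open(cache_handle *h, const char *path)
{
    h->fp = fopen(path, "a+");
    return h->fp ? 0 : -1;
}

/*
 * Write one record and flush right away, so that a write error is
 * reported by this call instead of being deferred until close.
 */
static int cache_put(cache_handle *h, const char *record)
{
    if (h->fp == NULL)
        return -1;                      /* handle already closed */
    if (fputs(record, h->fp) == EOF || fputc('\n', h->fp) == EOF)
        return -1;
    return fflush(h->fp) == 0 ? 0 : -1;
}

/*
 * Close the handle and wipe it. Returns 1 if this call closed the file,
 * 0 if it was already closed (the second close is a no-op), -1 on error.
 */
static int cache_close(cache_handle *h)
{
    int status;

    if (h->fp == NULL)
        return 0;
    status = fclose(h->fp);
    h->fp = NULL;                       /* wipe, so a repeat close is harmless */
    return status == 0 ? 1 : -1;
}
```

With every update flushed as it is made, an error that first appears at close time says nothing new about the data already written, which is the reasoning behind calling the warning harmless.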
          • Wietse Venema
            Message 5 of 15, Jan 2, 2010
              Wietse Venema:
              > >Jan 1 20:19:41 mail-ausfall postfix/verify[26329]: fatal: close database /var/lib/postfix/verify.db: No such file or directory

              Testing the same bogus error with postscreen:

              > Does not reproduce on Ubuntu 9.10-server with the default Berkeley DB 4.7.
              Also not on Fedora Core 11 with the default Berkeley DB 4.7.25.

              Wietse

              > Can you check if this warning (and the warning for postscreen) goes
              > away when automatic cache cleanup is turned off?
              >
              > address_verify_cache_cleanup_interval = 0
              > postscreen_cache_cleanup_interval = 0
              >
              > This can't be the same bug as discussed last month with "close
              > database after fork", because verify(8) does not fork. Also, Postfix
              > does not close the same database twice (I wipe the database handle
              > after close to prevent that from happening).
              >
              > The warning is harmless because Postfix flushes database buffers
              > with each postscreen/verify database update. With synchronous
              > database updates it would make no sense if the database failed to
              > report update errors immediately and delayed those error reports
              > until the database is closed.
              >
              > Wietse
              >
              >
            • Ralf Hildebrandt
              Message 6 of 15, Jan 2, 2010
                * Wietse Venema <wietse@...>:

                > Introduced 20091227, fixed 20091230 (dangling pointer in event manager).

                Yes, I'm not seeing those in 20100101

                > Sometimes non-production software has a defect.

                That's why I'm reporting them...

                --
                Ralf Hildebrandt
                Geschäftsbereich IT | Abteilung Netzwerk
                Charité - Universitätsmedizin Berlin
                Campus Benjamin Franklin
                Hindenburgdamm 30 | D-12203 Berlin
                Tel. +49 30 450 570 155 | Fax: +49 30 450 570 962
                ralf.hildebrandt@... | http://www.charite.de
              • Ralf Hildebrandt
                Message 7 of 15, Jan 2, 2010
                  * Wietse Venema <wietse@...>:
                  > >Jan 1 20:19:41 mail-ausfall postfix/verify[26329]: fatal: close database /var/lib/postfix/verify.db: No such file or directory
                  >
                  > Does not reproduce on Ubuntu 9.10-server with the default Berkeley DB 4.7.
                  >
                  > Can you check if this warning (and the warning for postscreen) goes
                  > away when automatic cache cleanup is turned off?
                  >
                  > address_verify_cache_cleanup_interval = 0
                  > postscreen_cache_cleanup_interval = 0

                  It never occurred BEFORE the automatic cache cleanup was introduced.

                  > This can't be the same bug as discussed last month with "close
                  > database after fork", because verify(8) does not fork. Also, Postfix
                  > does not close the same database twice (I wipe the database handle
                  > after close to prevent that from happening).
                  >
                  > The warning is harmless because Postfix flushes database buffers
                  > with each postscreen/verify database update. With synchronous
                  > database updates it would make no sense if the database failed to
                  > report update errors immediately and delayed those error reports
                  > until the database is closed.

                  My system used 4.7.25-8, now I've switched to 4.8.24-1 (debian version
                  numbers). Let's see what happens.

                  If I'm still getting the errors, I'll turn off the automatic cache
                  cleanup.

                  --
                  Ralf Hildebrandt
                • Wietse Venema
                  Message 8 of 15, Jan 2, 2010
                    Ralf Hildebrandt:
                    > * Wietse Venema <wietse@...>:
                    > > >Jan 1 20:19:41 mail-ausfall postfix/verify[26329]: fatal: close database /var/lib/postfix/verify.db: No such file or directory
                    > >
                    > > Does not reproduce on Ubuntu 9.10-server with the default Berkeley DB 4.7.
                    > >
                    > > Can you check if this warning (and the warning for postscreen) goes
                    > > away when automatic cache cleanup is turned off?
                    > >
                    > > address_verify_cache_cleanup_interval = 0
                    > > postscreen_cache_cleanup_interval = 0
                    >
                    > It never occured BEFORE the automatic cache cleanup was introduced.

                    New errors, bogus or not, happen after a program is changed so that
                    it executes code paths that it did not execute before.

                    I am going to take a very pragmatic decision. Having established
                    that this is a bogus error, I am going to log it as a non-error.

                    If someone can figure out how to reliably reproduce this, I am
                    mildly interested.

                    Wietse

                    *** ./dict_db.c- Thu Jan 4 09:06:07 2007
                    --- ./dict_db.c Sat Jan 2 16:28:08 2010
                    ***************
                    *** 535,542 ****
                      #endif
                          if (DICT_DB_SYNC(dict_db->db, 0) < 0)
                              msg_fatal("flush database %s: %m", dict_db->dict.name);
                          if (DICT_DB_CLOSE(dict_db->db) < 0)
                    !         msg_fatal("close database %s: %m", dict_db->dict.name);
                          if (dict_db->key_buf)
                              vstring_free(dict_db->key_buf);
                          if (dict_db->val_buf)
                    --- 535,553 ----
                      #endif
                          if (DICT_DB_SYNC(dict_db->db, 0) < 0)
                              msg_fatal("flush database %s: %m", dict_db->dict.name);
                    +
                    +     /*
                    +      * With some Berkeley DB implementations, close fails with a bogus ENOENT
                    +      * error, while it reports no errors with put+sync, no errors with
                    +      * del+sync, and no errors with the sync operation just before this
                    +      * comment. This happens in programs that never fork and that never share
                    +      * the database with other processes. The bogus close error has been
                    +      * reported for programs that use the first/next iterator. Instead of
                    +      * making Postfix look bad because it reports errors that other programs
                    +      * ignore, I'm going to report the bogus error as a non-error.
                    +      */
                          if (DICT_DB_CLOSE(dict_db->db) < 0)
                    !         msg_info("close database %s: %m", dict_db->dict.name);
                          if (dict_db->key_buf)
                              vstring_free(dict_db->key_buf);
                          if (dict_db->val_buf)
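The policy in the patch above (a failed sync stays fatal, a failed close is only logged) can be tried outside Postfix with a small stand-alone sketch. Everything here is hypothetical scaffolding: `toy_db`, `toy_db_shutdown`, and the four stub functions are not Postfix or Berkeley DB names, the function pointers merely stand in for DICT_DB_SYNC and DICT_DB_CLOSE, and `strerror(errno)` plays the role of the `%m` format. Unlike `msg_fatal`, the sketch returns a status instead of terminating the process.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical database interface standing in for the dict_db operations. */
typedef struct {
    const char *name;
    int (*sync)(void);                  /* returns < 0 and sets errno on error */
    int (*close)(void);                 /* returns < 0 and sets errno on error */
} toy_db;

enum { TOY_SYNC_FAILED = -1, TOY_OK = 0, TOY_CLOSE_LOGGED = 1 };

/*
 * Shut down a database: a sync failure is treated as a real error, while
 * a close failure is logged at "info" level and otherwise ignored, since
 * all earlier operations were already error checked.
 */
static int toy_db_shutdown(const toy_db *db, FILE *log)
{
    if (db->sync() < 0) {
        fprintf(log, "fatal: flush database %s: %s\n",
                db->name, strerror(errno));
        return TOY_SYNC_FAILED;
    }
    if (db->close() < 0) {
        fprintf(log, "info: close database %s: %s\n",
                db->name, strerror(errno));
        return TOY_CLOSE_LOGGED;
    }
    return TOY_OK;
}

/* Stubs that simulate the behaviours discussed in this thread. */
static int sync_ok(void)      { return 0; }
static int sync_fail(void)    { errno = EIO; return -1; }
static int close_ok(void)     { return 0; }
static int close_enoent(void) { errno = ENOENT; return -1; }
```

Pairing `sync_ok` with `close_enoent` reproduces the situation reported here: the data is safely flushed, the close reports ENOENT, and the caller carries on after an informational log line.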
                  • Wietse Venema
                    Message 9 of 15, Jan 2, 2010
                      Wietse Venema:
                      > Ralf Hildebrandt:
                      > > * Wietse Venema <wietse@...>:
                      > > > >Jan 1 20:19:41 mail-ausfall postfix/verify[26329]: fatal: close database /var/lib/postfix/verify.db: No such file or directory
                      > > >
                      > > > Does not reproduce on Ubuntu 9.10-server with the default Berkeley DB 4.7.
                      > > >
                      > > > Can you check if this warning (and the warning for postscreen) goes
                      > > > away when automatic cache cleanup is turned off?
                      > > >
                      > > > address_verify_cache_cleanup_interval = 0
                      > > > postscreen_cache_cleanup_interval = 0
                      > >
                      > > It never occured BEFORE the automatic cache cleanup was introduced.
                      >
                      > New errors, bogus or not, happen after a program is changed so that
                      > it executes code paths that it did not execute before.
                      >
                      > I am going to take a very pragmatic decision. Having established
                      > that this is a bogus error, I am going to log it as a non-error.

                      Also released as postfix-2.7-20100102, with HISTORY file entry:

                      Workaround: don't report bogus Berkeley DB close errors as
                      fatal errors. All operations before close are already error
                      checked, so the data is known to be safe. File: util/dict_db.c.

                      Having spent the better part of today on bogus DB errors, I am now
                      going to spend the rest of this break on non-Postfix things.

                      Wietse

                      > If someone can figure out how to reliably reproduce this, I am
                      > mildly interested.
                      >
                      > Wietse
                      >
                      > *** ./dict_db.c- Thu Jan 4 09:06:07 2007
                      > --- ./dict_db.c Sat Jan 2 16:28:08 2010
                      > ***************
                      > *** 535,542 ****
                      > #endif
                      > if (DICT_DB_SYNC(dict_db->db, 0) < 0)
                      > msg_fatal("flush database %s: %m", dict_db->dict.name);
                      > if (DICT_DB_CLOSE(dict_db->db) < 0)
                      > ! msg_fatal("close database %s: %m", dict_db->dict.name);
                      > if (dict_db->key_buf)
                      > vstring_free(dict_db->key_buf);
                      > if (dict_db->val_buf)
                      > --- 535,553 ----
                      > #endif
                      > if (DICT_DB_SYNC(dict_db->db, 0) < 0)
                      > msg_fatal("flush database %s: %m", dict_db->dict.name);
                      > +
                      > + /*
                      > + * With some Berkeley DB implementations, close fails with a bogus ENOENT
                      > + * error, while it reports no errors with put+sync, no errors with
                      > + * del+sync, and no errors with the sync operation just before this
                      > + * comment. This happens in programs that never fork and that never share
                      > + * the database with other processes. The bogus close error has been
                      > + * reported for programs that use the first/next iterator. Instead of
                      > + * making Postfix look bad because it reports errors that other programs
                      > + * ignore, I'm going to report the bogus error as a non-error.
                      > + */
                      > if (DICT_DB_CLOSE(dict_db->db) < 0)
                      > ! msg_info("close database %s: %m", dict_db->dict.name);
                      > if (dict_db->key_buf)
                      > vstring_free(dict_db->key_buf);
                      > if (dict_db->val_buf)
                      >
                      >
                    • Alejandro Esteban Galvez
                      Message 10 of 15, Jan 2, 2010
                        Hi! I have a Postfix+Dovecot+Ldap system, and this work is OK, but I need to run the
                        quotes user using dovecot+ldap system. Any reply or idea for here?

                        ------------------------------
                        Infomed - Red de Salud de Cuba
                        http://www.sld.cu/


                        --

                        This message reached you through the e-mail service that Infomed provides to support the missions of the National Health System. The person sending this e-mail undertakes to use the service for those purposes and to comply with the established regulations.

                        Infomed: http://www.sld.cu/
                      • Wietse Venema
                        Message 11 of 15, Jan 2, 2010
                          Alejandro Esteban Galvez:
                          > Hi! I have a Postfix+Dovecot+Ldap system, and this work is OK,
                          > but I need to run the quotes user using dovecot+ldap system. Any
                          > reply or idea for here?

                          Perhaps you mean quotas?

                          Wietse
                        • Alejandro Esteban Galvez
                          Message 12 of 15, Jan 2, 2010
                            quotas ok

                            Quoting Wietse Venema <wietse@...>:

                            --- Alejandro Esteban Galvez:
                            --- > Hi! I have a Postfix+Dovecot+Ldap system, and this work is OK,
                            --- > but I need to run the quotes user using dovecot+ldap system. Any
                            --- > reply or idea for here?
                            ---
                            --- Perhaps you mean quotas?
                            ---
                            --- Wietse
                            ---




                          • Ramiro Blanco
                            Message 13 of 15, Jan 5, 2010
                              You could use dovecot lda with quota plugin if you want quota on deliver.
                              More info: http://wiki.dovecot.org/Quota



                              On January 2, 2010 at 23:49, Alejandro Esteban Galvez <alejandro@...> wrote:
                              quotas ok

                              Quoting Wietse Venema <wietse@...>:

                              --- Alejandro Esteban Galvez:
                              --- > Hi! I have a Postfix+Dovecot+Ldap system, and this work is OK,
                              --- > but I need to run the quotes user using dovecot+ldap system. Any
                              --- > reply or idea for here?
                              ---
                              --- Perhaps you mean quotas?
                              ---
                              ---     Wietse
                              ---







                              --
                              Ramiro Blanco