Re: smtpd crashes

  • Wietse Venema
    Message 1 of 15, Jan 2, 2010
      Len Conrad:
      >
      > >Dec 31 20:05:54 mail-ausfall kernel: [876822.781710] smtpd[27410] general protection ip:80813d8 sp:bf9c2d68 error:0 in smtpd[8048000+53000]
      >
      > ...I have none of these.
      >
      > >Jan 1 20:19:41 mail-ausfall postfix/verify[26329]: fatal: close database /var/lib/postfix/verify.db: No such file or directory
      >
      > I don't use verify anymore, but we get these each time we stop postfix:
      >
      > >Jan 2 10:17:55 mail-ausfall postfix/postscreen[17151]: fatal: close database
      > >/var/lib/postfix/ps_cache.db: No such file or directory

      That is that stupid but HARMLESS Berkeley DB bug where they
      barf when a process closes a database after it fork()s.

      I'm running a DB version that does not do such things, so it
      did not show up in my testing.

      I'll fix this in the next release.

      Wietse
    • Wietse Venema
      Message 2 of 15, Jan 2, 2010
        >Jan 1 20:19:41 mail-ausfall postfix/verify[26329]: fatal: close database /var/lib/postfix/verify.db: No such file or directory

        Does not reproduce on Ubuntu 9.10-server with the default Berkeley DB 4.7.

        Can you check if this warning (and the warning for postscreen) goes
        away when automatic cache cleanup is turned off?

        address_verify_cache_cleanup_interval = 0
        postscreen_cache_cleanup_interval = 0

        This can't be the same bug as discussed last month with "close
        database after fork", because verify(8) does not fork. Also, Postfix
        does not close the same database twice (I wipe the database handle
        after close to prevent that from happening).
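
A minimal sketch of the "wipe the handle after close" idea, assuming a hypothetical `cache_handle` wrapper around a stdio stream. The names `cache_open_tmp` and `cache_close` are illustrative only and are not taken from Postfix's dict_db.c:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for a database wrapper; not Postfix's DICT_DB. */
typedef struct {
    FILE *fp;                   /* NULL once the handle has been closed */
} cache_handle;

/* Open a scratch file; returns 0 on success, -1 on failure. */
static int cache_open_tmp(cache_handle *h)
{
    assert(h != NULL);
    h->fp = tmpfile();
    return (h->fp != NULL ? 0 : -1);
}

/*
 * Close the underlying stream at most once. Returns the fclose() result,
 * or 0 when the handle was already closed.
 */
static int cache_close(cache_handle *h)
{
    int status = 0;

    assert(h != NULL);
    if (h->fp != NULL) {
        status = fclose(h->fp);
        h->fp = NULL;           /* wipe the handle so a repeat call is a no-op */
    }
    return status;
}
```

With this pattern a second `cache_close()` call on the same handle finds a NULL pointer and returns 0 without touching the underlying stream again.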

        The warning is harmless because Postfix flushes database buffers
        with each postscreen/verify database update. With synchronous
        database updates it would make no sense if the database failed to
        report update errors immediately and delayed those error reports
        until the database is closed.
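
The reasoning above can be sketched in C, assuming a plain POSIX file in place of Berkeley DB. The helpers `cache_open`, `cache_update`, and `cache_close_lenient` are hypothetical names used for illustration only. The point is that every update is written and synced with its own error check, so a failure that shows up only at close time can be logged without being treated as data loss:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: open (or create) an append-only cache file. */
static int cache_open(const char *path)
{
    assert(path != NULL);
    return open(path, O_WRONLY | O_CREAT | O_APPEND, 0600);
}

/*
 * Hypothetical helper: append one record and flush it to stable storage.
 * Returns 0 on success, -1 on error with errno set. Errors surface here,
 * at update time, not later at close time.
 */
static int cache_update(int fd, const char *record)
{
    size_t  len;
    ssize_t n;

    assert(record != NULL);
    len = strlen(record);
    while (len > 0) {
        n = write(fd, record, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before writing; retry */
            return -1;
        }
        record += n;
        len -= (size_t) n;
    }
    return fsync(fd);
}

/*
 * Close the file. A failure is logged as informational instead of fatal,
 * because all earlier updates were already synced and checked.
 */
static int cache_close_lenient(int fd, const char *name)
{
    int status = close(fd);

    if (status < 0)
        fprintf(stderr, "info: close database %s: %s\n",
                name, strerror(errno));
    return status;
}
```

A caller would treat a non-zero result from `cache_update()` as a real error, and would carry on after a non-zero result from `cache_close_lenient()`.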

        Wietse
      • Wietse Venema
        Message 3 of 15, Jan 2, 2010
          Wietse Venema:
          > >Jan 1 20:19:41 mail-ausfall postfix/verify[26329]: fatal: close database /var/lib/postfix/verify.db: No such file or directory

          Testing the same bogus error with postscreen:

          > Does not reproduce on Ubuntu 9.10-server with the default Berkeley DB 4.7.
          Also not on Fedora Core 11 with the default Berkeley DB 4.7.25.

          Wietse

        • Ralf Hildebrandt
          Message 4 of 15, Jan 2, 2010
            * Wietse Venema <wietse@...>:

            > Introduced 20091227, fixed 20091230 (dangling pointer in event manager).

            Yes, I'm not seeing those in 20100101

            > Sometimes non-production software has a defect.

            That's why I'm reporting them...

            --
            Ralf Hildebrandt
            Geschäftsbereich IT | Abteilung Netzwerk
            Charité - Universitätsmedizin Berlin
            Campus Benjamin Franklin
            Hindenburgdamm 30 | D-12203 Berlin
            Tel. +49 30 450 570 155 | Fax: +49 30 450 570 962
            ralf.hildebrandt@... | http://www.charite.de
          • Ralf Hildebrandt
            Message 5 of 15, Jan 2, 2010
              * Wietse Venema <wietse@...>:
              > >Jan 1 20:19:41 mail-ausfall postfix/verify[26329]: fatal: close database /var/lib/postfix/verify.db: No such file or directory
              >
              > Does not reproduce on Ubuntu 9.10-server with the default Berkeley DB 4.7.
              >
              > Can you check if this warning (and the warning for postscreen) goes
              > away when automatic cache cleanup is turned off?
              >
              > address_verify_cache_cleanup_interval = 0
              > postscreen_cache_cleanup_interval = 0

              It never occurred BEFORE the automatic cache cleanup was introduced.

              > This can't be the same bug as discussed last month with "close
              > database after fork", because verify(8) does not fork. Also, Postfix
              > does not close the same database twice (I wipe the database handle
              > after close to prevent that from happening).
              >
              > The warning is harmless because Postfix flushes database buffers
              > with each postscreen/verify database update. With synchronous
              > database updates it would make no sense if the database failed to
              > report update errors immediately and delayed those error reports
              > until the database is closed.

              My system used 4.7.25-8; I've now switched to 4.8.24-1 (Debian version
              numbers). Let's see what happens.

              If I'm still getting the errors, I'll turn off the automatic cache
              cleanup.

              --
              Ralf Hildebrandt
            • Wietse Venema
                Message 6 of 15, Jan 2, 2010
                Ralf Hildebrandt:
                > * Wietse Venema <wietse@...>:
                > > >Jan 1 20:19:41 mail-ausfall postfix/verify[26329]: fatal: close database /var/lib/postfix/verify.db: No such file or directory
                > >
                > > Does not reproduce on Ubuntu 9.10-server with the default Berkeley DB 4.7.
                > >
                > > Can you check if this warning (and the warning for postscreen) goes
                > > away when automatic cache cleanup is turned off?
                > >
                > > address_verify_cache_cleanup_interval = 0
                > > postscreen_cache_cleanup_interval = 0
                >
                > It never occurred BEFORE the automatic cache cleanup was introduced.

                New errors, bogus or not, happen after a program is changed so that
                it executes code paths that it did not execute before.

                I am going to take a very pragmatic decision. Having established
                that this is a bogus error, I am going to log it as a non-error.

                If someone can figure out how to reliably reproduce this, I am
                mildly interested.

                Wietse

                *** ./dict_db.c- Thu Jan 4 09:06:07 2007
                --- ./dict_db.c Sat Jan 2 16:28:08 2010
                ***************
                *** 535,542 ****
                  #endif
                      if (DICT_DB_SYNC(dict_db->db, 0) < 0)
                          msg_fatal("flush database %s: %m", dict_db->dict.name);
                      if (DICT_DB_CLOSE(dict_db->db) < 0)
                !         msg_fatal("close database %s: %m", dict_db->dict.name);
                      if (dict_db->key_buf)
                          vstring_free(dict_db->key_buf);
                      if (dict_db->val_buf)
                --- 535,553 ----
                  #endif
                      if (DICT_DB_SYNC(dict_db->db, 0) < 0)
                          msg_fatal("flush database %s: %m", dict_db->dict.name);
                +
                +     /*
                +      * With some Berkeley DB implementations, close fails with a bogus ENOENT
                +      * error, while it reports no errors with put+sync, no errors with
                +      * del+sync, and no errors with the sync operation just before this
                +      * comment. This happens in programs that never fork and that never share
                +      * the database with other processes. The bogus close error has been
                +      * reported for programs that use the first/next iterator. Instead of
                +      * making Postfix look bad because it reports errors that other programs
                +      * ignore, I'm going to report the bogus error as a non-error.
                +      */
                      if (DICT_DB_CLOSE(dict_db->db) < 0)
                !         msg_info("close database %s: %m", dict_db->dict.name);
                      if (dict_db->key_buf)
                          vstring_free(dict_db->key_buf);
                      if (dict_db->val_buf)
              • Wietse Venema
                Message 7 of 15, Jan 2, 2010
                  Wietse Venema:
                  > Ralf Hildebrandt:
                  > > * Wietse Venema <wietse@...>:
                  > > > >Jan 1 20:19:41 mail-ausfall postfix/verify[26329]: fatal: close database /var/lib/postfix/verify.db: No such file or directory
                  > > >
                  > > > Does not reproduce on Ubuntu 9.10-server with the default Berkeley DB 4.7.
                  > > >
                  > > > Can you check if this warning (and the warning for postscreen) goes
                  > > > away when automatic cache cleanup is turned off?
                  > > >
                  > > > address_verify_cache_cleanup_interval = 0
                  > > > postscreen_cache_cleanup_interval = 0
                  > >
                  > > It never occurred BEFORE the automatic cache cleanup was introduced.
                  >
                  > New errors, bogus or not, happen after a program is changed so that
                  > it executes code paths that it did not execute before.
                  >
                  > I am going to take a very pragmatic decision. Having established
                  > that this is a bogus error, I am going to log it as a non-error.

                  Also released as postfix-2.7-20100102, with HISTORY file entry:

                  Workaround: don't report bogus Berkeley DB close errors as
                  fatal errors. All operations before close are already error
                  checked, so the data is known to be safe. File: util/dict_db.c.

                  Having spent the better part of today on bogus DB errors, I am now
                  going to spend the rest of this break on non-Postfix things.

                  Wietse

                • Alejandro Esteban Galvez
                  Message 8 of 15, Jan 2, 2010
                    Hi! I have a Postfix+Dovecot+LDAP system that works fine, but I need to
                    set up user quotes using Dovecot+LDAP. Any replies or ideas?

                    ------------------------------
                    Infomed - Red de Salud de Cuba
                    http://www.sld.cu/


                    --

                    This message has reached you through the e-mail service that Infomed provides to support the missions of the National Health System. The person sending this e-mail undertakes to use the service for those purposes and to comply with the established regulations.

                    Infomed: http://www.sld.cu/
                  • Wietse Venema
                    Message 9 of 15, Jan 2, 2010
                      Alejandro Esteban Galvez:
                      > Hi! I have a Postfix+Dovecot+Ldap system, and this work is OK,
                      > but I need to run the quotes user using dovecot+ldap system. Any
                      > reply or idea for here?

                      Perhaps you mean quotas?

                      Wietse
                    • Alejandro Esteban Galvez
                      Message 10 of 15, Jan 2, 2010
                        quotas ok

                        Quoting Wietse Venema <wietse@...>:

                        --- Alejandro Esteban Galvez:
                        --- > Hi! I have a Postfix+Dovecot+Ldap system, and this work is OK,
                        --- > but I need to run the quotes user using dovecot+ldap system. Any
                        --- > reply or idea for here?
                        ---
                        --- Perhaps you mean quotas?
                        ---
                        --- Wietse
                        ---

                      • Ramiro Blanco
                        Message 11 of 15, Jan 5, 2010
                          You could use the Dovecot LDA with the quota plugin if you want quotas
                          enforced at delivery. More info: http://wiki.dovecot.org/Quota



                          On January 2, 2010 at 23:49, Alejandro Esteban Galvez <alejandro@...> wrote:
                          quotas ok

                          Quoting Wietse Venema <wietse@...>:

                          --- Alejandro Esteban Galvez:
                          --- > Hi! I have a Postfix+Dovecot+Ldap system, and this work is OK,
                          --- > but I need to run the quotes user using dovecot+ldap system. Any
                          --- > reply or idea for here?
                          ---
                          --- Perhaps you mean quotas?
                          ---
                          ---     Wietse
                          ---


                          --
                          Ramiro Blanco