smtpd crashes

  • Ralf Hildebrandt
    Message 1 of 15, Jan 2, 2010
      Today I found:

      Dec 31 20:05:54 mail-ausfall kernel: [876822.781710] smtpd[27410] general protection ip:80813d8 sp:bf9c2d68 error:0 in smtpd[8048000+53000]
      Jan 1 21:22:23 mail-ausfall kernel: [967812.555067] smtpd[1590] general protection ip:80813d8 sp:bfbebe28 error:0 in smtpd[8048000+53000]
      Jan 2 04:36:23 mail-ausfall kernel: [993852.201068] smtpd[5253] general protection ip:80813d8 sp:bfd2aa38 error:0 in smtpd[8048000+53000]

      but these are not backed by any "error" or "fatal" entries in the log;
      instead I found these:

      Jan 1 20:19:41 mail-ausfall postfix/verify[26329]: fatal: close database /var/lib/postfix/verify.db: No such file or directory
      Jan 1 21:52:04 mail-ausfall postfix/verify[31780]: fatal: close database /var/lib/postfix/verify.db: No such file or directory
      Jan 2 04:09:40 mail-ausfall postfix/verify[2919]: fatal: close database /var/lib/postfix/verify.db: No such file or directory
      Jan 2 04:42:40 mail-ausfall postfix/verify[4901]: fatal: close database /var/lib/postfix/verify.db: No such file or directory
      Jan 2 10:17:55 mail-ausfall postfix/postscreen[17151]: fatal: close database /var/lib/postfix/ps_cache.db: No such file or directory

      Which also baffle me, since:

      # ls -l /var/lib/postfix/verify.db
      -rw-r--r-- 1 postfix postfix 20844544 2. Jan 10:28 /var/lib/postfix/verify.db
      # ls -l /var/lib/postfix/ps_cache.db
      -rw------- 1 postfix postfix 6131712 2. Jan 10:24 /var/lib/postfix/ps_cache.db

      I just updated to postfix-2.7-20100101; maybe that changes
      something.
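The three kernel lines at the top of this message can be decoded a little further. As a rough sketch (the regular expression and the name `parse_gp_line` are illustrative, not part of any Postfix or kernel tooling), the following extracts the instruction pointer and the mapping base from such a line and computes the offset of the faulting instruction inside the smtpd image. All three reports share ip:80813d8, which suggests the same code location each time.

```python
import re

# Pattern for the "general protection" report format quoted in this thread.
# Other kernel versions may word the line differently (assumption).
GP_RE = re.compile(
    r"(?P<prog>\w+)\[(?P<pid>\d+)\] general protection "
    r"ip:(?P<ip>[0-9a-f]+) sp:(?P<sp>[0-9a-f]+) error:(?P<err>\d+) "
    r"in (?P<image>\w+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_gp_line(line):
    """Return pid, ip and the offset of ip inside the mapped image, or None."""
    m = GP_RE.search(line)
    if m is None:
        return None
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    size = int(m["size"], 16)
    return {
        "pid": int(m["pid"]),
        "ip": ip,
        "offset": ip - base,  # position of the faulting instruction in the image
        "inside_mapping": base <= ip < base + size,
    }
```

For the first line this yields an offset of 0x393d8 inside a mapping of size 0x53000. Assuming the binary is not position independent, the absolute ip value is the address one would hand to a symbol lookup tool.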

      --
      Ralf Hildebrandt
      Geschäftsbereich IT | Abteilung Netzwerk
      Charité - Universitätsmedizin Berlin
      Campus Benjamin Franklin
      Hindenburgdamm 30 | D-12203 Berlin
      Tel. +49 30 450 570 155 | Fax: +49 30 450 570 962
      ralf.hildebrandt@... | http://www.charite.de
    • Ralf Hildebrandt
      Message 2 of 15, Jan 2, 2010
        * Ralf Hildebrandt <Ralf.Hildebrandt@...>:

        > I updated to postfix-2.7-20100101 these minutes; maybe something
        > changes.

        postfix-2.7-20091228-nonprod was the old version that caused all the
        log entries.

      • Wietse Venema
        Message 3 of 15, Jan 2, 2010
          Ralf Hildebrandt:
          > * Ralf Hildebrandt <Ralf.Hildebrandt@...>:
          >
          > > I updated to postfix-2.7-20100101 these minutes; maybe something
          > > changes.
          >
          > postfix-2.7-20091228-nonprod was the old version that caused all the
          > logentries.

          Introduced 20091227, fixed 20091230 (dangling pointer in event manager).

          Sometimes non-production software has a defect.

          Wietse
        • Len Conrad
          Message 4 of 15, Jan 2, 2010
            >Dec 31 20:05:54 mail-ausfall kernel: [876822.781710] smtpd[27410] general protection ip:80813d8 sp:bf9c2d68 error:0 in smtpd[8048000+53000]

            ...I have none of these.

            >Jan 1 20:19:41 mail-ausfall postfix/verify[26329]: fatal: close database /var/lib/postfix/verify.db: No such file or directory

            I don't use verify anymore, but we get these each time we stop postfix:


            >Jan 2 10:17:55 mail-ausfall postfix/postscreen[17151]: fatal: close database
            >/var/lib/postfix/ps_cache.db: No such file or directory


            Jan 2 09:26:06 mx7 postfix/master[32550]: terminating on signal 15
            Jan 2 09:26:06 mx7 postfix/postscreen[32558]: fatal: close database /var/db/postfix/ps_cache.db: No such file or directory

            mx7# egrep -i "postscreen" /var/log/maillog | awk '{ print $6}' | sort -f | uniq -ic | sort -rfn
            404335 PASS
            347830 BLACKLISTED
            60258 PREGREET
            2124 HANGUP
            593 warning:
            193 WHITELISTED
            35 fatal: <<<<<<<<<
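For readers who want the same tally without the shell pipeline, here is a small sketch. The function name `tally_postscreen` is made up for illustration, and it assumes the usual syslog layout where the sixth whitespace-separated field is the postscreen action word. Unlike `uniq -i` it does not fold case when counting, which only matters if the same word appears in different spellings.

```python
from collections import Counter


def tally_postscreen(lines):
    """Count the sixth field of every log line that mentions postscreen."""
    counts = Counter()
    for line in lines:
        # Same filter as: egrep -i "postscreen"
        if "postscreen" not in line.lower():
            continue
        fields = line.split()
        # Same field as: awk '{ print $6 }' (lines that are too short are skipped)
        if len(fields) >= 6:
            counts[fields[5]] += 1
    # Highest count first, like the final: sort -rn
    return counts.most_common()


# Possible use, assuming the log path from the command above:
#   with open("/var/log/maillog", errors="replace") as log:
#       for word, count in tally_postscreen(log):
#           print(count, word)
```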

            mx7# postconf mail_version
            mail_version = 2.7-20091209

            Wietse gave a patch but I'm waiting for it to be fixed in a snapshot.

            Len
          • Wietse Venema
            Message 5 of 15, Jan 2, 2010
              Len Conrad:
              >
              > >Dec 31 20:05:54 mail-ausfall kernel: [876822.781710] smtpd[27410] general protection ip:80813d8 sp:bf9c2d68 error:0 in smtpd[8048000+53000]
              >
              > ...I have none of these.
              >
              > >Jan 1 20:19:41 mail-ausfall postfix/verify[26329]: fatal: close database /var/lib/postfix/verify.db: No such file or directory
              >
              > I don't use verify anymore, but we get these each time we stop postfix:
              >
              > >Jan 2 10:17:55 mail-ausfall postfix/postscreen[17151]: fatal: close database
              > >/var/lib/postfix/ps_cache.db: No such file or directory

              That is that stupid but HARMLESS Berkeley DB bug where they
              barf when a process closes a database after it fork()s.

              I'm running a DB version that does not do such things, so it
              did not show up in my testing.

              I'll fix this in the next release.

              Wietse
            • Wietse Venema
              Message 6 of 15, Jan 2, 2010
                >Jan 1 20:19:41 mail-ausfall postfix/verify[26329]: fatal: close database /var/lib/postfix/verify.db: No such file or directory

                Does not reproduce on Ubuntu 9.10-server with the default Berkeley DB 4.7.

                Can you check if this warning (and the warning for postscreen) goes
                away when automatic cache cleanup is turned off?

                address_verify_cache_cleanup_interval = 0
                postscreen_cache_cleanup_interval = 0

                This can't be the same bug as discussed last month with "close
                database after fork", because verify(8) does not fork. Also, Postfix
                does not close the same database twice (I wipe the database handle
                after close to prevent that from happening).

                The warning is harmless because Postfix flushes database buffers
                with each postscreen/verify database update. With synchronous
                database updates it would make no sense if the database failed to
                report update errors immediately and delayed those error reports
                until the database is closed.
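The reasoning in the last two paragraphs can be illustrated outside Postfix. This is a minimal sketch, not Postfix code: the class `SyncedLog` and its tab-separated record format are invented for illustration, and plain POSIX file calls stand in for Berkeley DB. Each update is written and flushed before `put` returns, so an I/O failure surfaces at the update itself, and the handle is wiped on close so that a second close cannot touch it again.

```python
import os


class SyncedLog:
    """Append-only record file that is flushed to disk on every update."""

    def __init__(self, path):
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)

    def put(self, key, value):
        if self.fd is None:
            raise ValueError("store is closed")
        record = f"{key}\t{value}\n".encode()
        written = os.write(self.fd, record)  # raises OSError on failure
        if written != len(record):
            raise OSError("short write")
        os.fsync(self.fd)  # raises OSError if the data cannot be flushed

    def close(self):
        if self.fd is None:  # handle already wiped: nothing to do
            return
        fd, self.fd = self.fd, None  # wipe the handle so it is never closed twice
        os.close(fd)
```

With this shape, every record has already been error checked by the time `close` runs, which is why an error reported only at close carries no new information about the data.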

                Wietse
              • Wietse Venema
                Message 7 of 15, Jan 2, 2010
                  Wietse Venema:
                  > >Jan 1 20:19:41 mail-ausfall postfix/verify[26329]: fatal: close database /var/lib/postfix/verify.db: No such file or directory

                  Testing the same bogus error with postscreen:

                  > Does not reproduce on Ubuntu 9.10-server with the default Berkeley DB 4.7.
                  Also not on Fedora Core 11 with the default Berkeley DB 4.7.25.

                  Wietse

                • Ralf Hildebrandt
                  Message 8 of 15, Jan 2, 2010
                    * Wietse Venema <wietse@...>:

                    > Introduced 20091227, fixed 20091230 (dangling pointer in event manager).

                    Yes, I'm not seeing those in 20100101

                    > Sometimes non-production software has a defect.

                    That's why I'm reporting them...

                  • Ralf Hildebrandt
                    Message 9 of 15, Jan 2, 2010
                      * Wietse Venema <wietse@...>:
                      > >Jan 1 20:19:41 mail-ausfall postfix/verify[26329]: fatal: close database /var/lib/postfix/verify.db: No such file or directory
                      >
                      > Does not reproduce on Ubuntu 9.10-server with the default Berkeley DB 4.7.
                      >
                      > Can you check if this warning (and the warning for postscreen) goes
                      > away when automatic cache cleanup is turned off?
                      >
                      > address_verify_cache_cleanup_interval = 0
                      > postscreen_cache_cleanup_interval = 0

                      It never occurred BEFORE the automatic cache cleanup was introduced.

                      > This can't be the same bug as discussed last month with "close
                      > database after fork", because verify(8) does not fork. Also, Postfix
                      > does not close the same database twice (I wipe the database handle
                      > after close to prevent that from happening).
                      >
                      > The warning is harmless because Postfix flushes database buffers
                      > with each postscreen/verify database update. With synchronous
                      > database updates it would make no sense if the database failed to
                      > report update errors immediately and delayed those error reports
                      > until the database is closed.

                      My system used 4.7.25-8; now I've switched to 4.8.24-1 (Debian version
                      numbers). Let's see what happens.

                      If I'm still getting the errors, I'll turn off the automatic cache
                      cleanup.

                    • Wietse Venema
                      Message 10 of 15, Jan 2, 2010
                        Ralf Hildebrandt:
                        > * Wietse Venema <wietse@...>:
                        > > >Jan 1 20:19:41 mail-ausfall postfix/verify[26329]: fatal: close database /var/lib/postfix/verify.db: No such file or directory
                        > >
                        > > Does not reproduce on Ubuntu 9.10-server with the default Berkeley DB 4.7.
                        > >
                        > > Can you check if this warning (and the warning for postscreen) goes
                        > > away when automatic cache cleanup is turned off?
                        > >
                        > > address_verify_cache_cleanup_interval = 0
                        > > postscreen_cache_cleanup_interval = 0
                        >
                        > It never occured BEFORE the automatic cache cleanup was introduced.

                        New errors, bogus or not, happen after a program is changed so that
                        it executes code paths that it did not execute before.

                        I am going to take a very pragmatic decision. Having established
                        that this is a bogus error, I am going to log it as a non-error.

                        If someone can figure out how to reliably reproduce this, I am
                        mildly interested.

                        Wietse

                        *** ./dict_db.c- Thu Jan 4 09:06:07 2007
                        --- ./dict_db.c Sat Jan 2 16:28:08 2010
                        ***************
                        *** 535,542 ****
                          #endif
                              if (DICT_DB_SYNC(dict_db->db, 0) < 0)
                                  msg_fatal("flush database %s: %m", dict_db->dict.name);
                              if (DICT_DB_CLOSE(dict_db->db) < 0)
                        !         msg_fatal("close database %s: %m", dict_db->dict.name);
                              if (dict_db->key_buf)
                                  vstring_free(dict_db->key_buf);
                              if (dict_db->val_buf)
                        --- 535,553 ----
                          #endif
                              if (DICT_DB_SYNC(dict_db->db, 0) < 0)
                                  msg_fatal("flush database %s: %m", dict_db->dict.name);
                        +
                        +     /*
                        +      * With some Berkeley DB implementations, close fails with a bogus ENOENT
                        +      * error, while it reports no errors with put+sync, no errors with
                        +      * del+sync, and no errors with the sync operation just before this
                        +      * comment. This happens in programs that never fork and that never share
                        +      * the database with other processes. The bogus close error has been
                        +      * reported for programs that use the first/next iterator. Instead of
                        +      * making Postfix look bad because it reports errors that other programs
                        +      * ignore, I'm going to report the bogus error as a non-error.
                        +      */
                              if (DICT_DB_CLOSE(dict_db->db) < 0)
                        !         msg_info("close database %s: %m", dict_db->dict.name);
                              if (dict_db->key_buf)
                                  vstring_free(dict_db->key_buf);
                              if (dict_db->val_buf)
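The policy in this patch can be sketched in a few lines. This is a hypothetical Python analogue, not the Postfix code: `shutdown_store` and its `sync` and `close` callables are made up for illustration, `SystemExit` stands in for `msg_fatal` (which logs and terminates the process), and a logger call stands in for `msg_info`.

```python
import logging

logger = logging.getLogger("dict_sketch")  # hypothetical logger name


def shutdown_store(name, sync, close):
    """Flush, then close. A flush failure is fatal; a close failure is only logged."""
    try:
        sync()
    except OSError as exc:
        # Data may not be on disk: treat as fatal, like msg_fatal() in the patch.
        raise SystemExit(f"fatal: flush database {name}: {exc.strerror}")
    try:
        close()
    except OSError as exc:
        # Everything was already flushed and checked: report as a non-error.
        logger.info("close database %s: %s", name, exc.strerror)
        return False
    return True
```

The return value only tells the caller whether close reported anything; in both cases the process carries on, because the preceding flush has already confirmed the data.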
                      • Wietse Venema
                        Message 11 of 15, Jan 2, 2010
                          Wietse Venema:
                          > Ralf Hildebrandt:
                          > > * Wietse Venema <wietse@...>:
                          > > > >Jan 1 20:19:41 mail-ausfall postfix/verify[26329]: fatal: close database /var/lib/postfix/verify.db: No such file or directory
                          > > >
                          > > > Does not reproduce on Ubuntu 9.10-server with the default Berkeley DB 4.7.
                          > > >
                          > > > Can you check if this warning (and the warning for postscreen) goes
                          > > > away when automatic cache cleanup is turned off?
                          > > >
                          > > > address_verify_cache_cleanup_interval = 0
                          > > > postscreen_cache_cleanup_interval = 0
                          > >
                          > > It never occured BEFORE the automatic cache cleanup was introduced.
                          >
                          > New errors, bogus or not, happen after a program is changed so that
                          > it executes code paths that it did not execute before.
                          >
                          > I am going to take a very pragmatic decision. Having established
                          > that this is a bogus error, I am going to log it as a non-error.

                          Also released as postfix-2.7-20100102, with HISTORY file entry:

                          Workaround: don't report bogus Berkeley DB close errors as
                          fatal errors. All operations before close are already error
                          checked, so the data is known to be safe. File: util/dict_db.c.

                          Having spent the better part of today on bogus DB errors, I am now
                          going to spend the rest of this break on non-Postfix things.

                          Wietse

                        • Alejandro Esteban Galvez
                          Message 12 of 15, Jan 2, 2010
                            Hi! I have a Postfix+Dovecot+LDAP system, and it works OK, but I need to set up
                            user quotes with the dovecot+ldap system. Any replies or ideas?

                            ------------------------------
                            Infomed - Red de Salud de Cuba
                            http://www.sld.cu/


                            --

                            This message reached you through the e-mail service that Infomed provides to support the missions of the National Health System. The person sending this message undertakes to use the service for those purposes and to comply with the established regulations.

                            Infomed: http://www.sld.cu/
                          • Wietse Venema
                            Message 13 of 15, Jan 2, 2010
                              Alejandro Esteban Galvez:
                              > Hi! I have a Postfix+Dovecot+Ldap system, and this work is OK,
                              > but I need to run the quotes user using dovecot+ldap system. Any
                              > reply or idea for here?

                              Perhaps you mean quotas?

                              Wietse
                            • Alejandro Esteban Galvez
                              Message 14 of 15, Jan 2, 2010
                                quotas ok

                                Quoting Wietse Venema <wietse@...>:

                                --- Alejandro Esteban Galvez:
                                --- > Hi! I have a Postfix+Dovecot+Ldap system, and this work is OK,
                                --- > but I need to run the quotes user using dovecot+ldap system. Any
                                --- > reply or idea for here?
                                ---
                                --- Perhaps you mean quotas?
                                ---
                                --- Wietse
                                ---




                              • Ramiro Blanco
                                Message 15 of 15, Jan 5, 2010
                                  You could use dovecot lda with quota plugin if you want quota on deliver.
                                  More info: http://wiki.dovecot.org/Quota



                                  On January 2, 2010 at 23:49, Alejandro Esteban Galvez <alejandro@...> wrote:
                                  quotas ok

                                  Quoting Wietse Venema <wietse@...>:

                                  --- Alejandro Esteban Galvez:
                                  --- > Hi! I have a Postfix+Dovecot+Ldap system, and this work is OK,
                                  --- > but I need to run the quotes user using dovecot+ldap system. Any
                                  --- > reply or idea for here?
                                  ---
                                  --- Perhaps you mean quotas?
                                  ---
                                  ---     Wietse
                                  ---







                                  --
                                  Ramiro Blanco