
Message_size_limit issue with postfix v 2.8.8-1 on RHEL 6

  • Nicolas HAHN
    Message 1 of 29, Apr 24, 2013

      Dear all,

      I'm running into an issue with message_size_limit.

      This morning I raised message_size_limit from 20 MB to 120 MB.
      I know that's not a good idea, and we explained to the customer that a messaging system is in no way a file transfer service... But then politics came in, and blah blah blah...

      Immediately after this change and a reload of the Postfix configuration, the number of e-mails in the active queue increased. Several dozen e-mails could not be sent by Postfix, as if their processing had stopped at the qmgr step.

      I then tried lowering the limit to 50 MB. The issue persisted.

      Then I rolled the setting back to 20 MB: the number of active e-mails suddenly dropped and all e-mails were sent. The queue was empty.

      Note: there is enough space (500 GB) on the partition that holds the Postfix spool directory.

      Does anybody know what is happening? Has anybody had this issue?
      Is there a known upper limit for the message_size_limit setting in main.cf?

      Thanks for your help.

      Regards,
      Nicolas


      ----------------------------------------------------------------
      This message was sent using IMP, the Internet Messaging Program.

    • Reindl Harald
      Message 2 of 29, Apr 24, 2013
        On 24.04.2013 14:58, Nicolas HAHN wrote:
        > Does somebody knows what is happening?

        No, because you didn't send any log information.
        Maybe too little memory to process messages of 150 MB.
      • Viktor Dukhovni
        Message 3 of 29, Apr 24, 2013
          On Wed, Apr 24, 2013 at 03:07:05PM +0200, Reindl Harald wrote:

          > On 24.04.2013 14:58, Nicolas HAHN wrote:
          > > Does somebody knows what is happening?
          >
          > no because you missed to send any log-information

          Definitely.

          > maybe to less memory to proceed messages with 150 MB

          Postfix memory usage does not depend on message size. Far more
          likely some filter has message size issues, or perhaps mailbox_size_limit
          has not been raised to match, or the OP failed to reload Postfix, so that
          the message size limit was different in some processes.

          --
          Viktor.
        • Wietse Venema
          Message 4 of 29, Apr 24, 2013
            Nicolas HAHN:
            > I've modified message_size_limit from 20 Mb to 120 Mb this morning.
            > I know that's not something to do and we explained to the customer
            > a messaging system wasn't in any case a file transfer service...
            > But after, politic came in and blablabla...
            >
            > And immediately after this change and reloading postfix configuration,
            > the number of active e-mails in queue increased. Several tens of
            > e-mails were unable to be sent by postfix, as if their processing
            > stopped at the step of the qmgr.

            Postfix logs all delivery attempts, successful or not. What does
            Postfix log when "Several tens of e-mails were unable to be sent"?

            Wietse
          • Nicolas HAHN
            Message 5 of 29, Apr 24, 2013

              OK.

              I ran some tests, and it seems that I cannot set message_size_limit above 40 MB without triggering the issue.

              As requested, here is a set of log lines from during the issue. The e-mails that stay in the growing active queue are the bounce messages (we intercept them to send a copy to postmaster):

              [root@iccpfxor04 postfix]# grep 6B34360BAA /var/log/maillog
              2013-04-24T12:32:01.439701+00:00 iccpfxor04 postfix/cleanup[24423]: 6B34360BAA: message-id=<20130424123201.6B34360BAA@...>
              2013-04-24T12:32:01.442962+00:00 iccpfxor04 postfix/qmgr[24391]: 6B34360BAA: from=<double-bounce@...>, size=8389, nrcpt=1 (queue active)
              2013-04-24T12:32:01.442970+00:00 iccpfxor04 postfix/bounce[26517]: D4B8460078: postmaster non-delivery notification: 6B34360BAA
              2013-04-24T12:36:09.981198+00:00 iccpfxor04 postfix/qmgr[27126]: 6B34360BAA: from=<double-bounce@...>, size=8389, nrcpt=1 (queue active)
              2013-04-24T12:40:44.391001+00:00 iccpfxor04 postfix/qmgr[27707]: 6B34360BAA: from=<double-bounce@...>, size=8389, nrcpt=1 (queue active)


              As you can see in the logs above, the message seems to be stuck in the qmgr process, which keeps logging the same "from=<double-bounce....." line to rsyslog.

              As soon as we rolled back to a message_size_limit of 20 MB, the logs showed that the message was sent out:

              2013-04-24T12:41:14.125840+00:00 iccpfxor04 postfix/local[27791]: 6B34360BAA: to=<bounceparser@...>, orig_to=<bounceparser>, relay=local, delay=553, delays=523/30/0/0.01, dsn=2.0.0, status=sent (delivered to command:  /bin/bash /usr/local/bin/captureNDR)
              2013-04-24T12:41:14.125943+00:00 iccpfxor04 postfix/qmgr[27707]: 6B34360BAA: removed




              Reindl Harald <h.reindl@...> wrote:

              > On 24.04.2013 14:58, Nicolas HAHN wrote:
              >> Does somebody knows what is happening?
              >
              > no because you missed to send any log-information
              > maybe to less memory to proceed messages with 150 MB

            • Nicolas HAHN
              Message 6 of 29, Apr 24, 2013

                > Postfix memory usage does not depend on message size.  Far more
                > likely some filter has message size issues, or perhaps mailbox_size_limit
                > has not been raised to match, or the OP failed to reload Postfix, so that
                > the message size limit was different in some processes.

                Each time, I reloaded (and even restarted) Postfix after a config change.

                Hmm... But you're probably right. I may have forgotten to check that mailbox_size_limit matches message_size_limit... I'll go and check that.


                >
                > --
                >         Viktor.
                >



              • Reindl Harald
                Message 7 of 29, Apr 24, 2013
                  On 24.04.2013 15:22, Nicolas HAHN wrote:
                  > As you wrote, here below is a set of log lines during the issue. The emails staying in the growing active queue are
                  > the bounce messages (we intercept them to send a copy to postmaster):
                  >
                  > [root@iccpfxor04 postfix]# grep 6B34360BAA /var/log/maillog
                  > 2013-04-24T12:32:01.439701+00:00 iccpfxor04 postfix/cleanup[24423]: 6B34360BAA:
                  > message-id=<20130424123201.6B34360BAA@...>
                  > 2013-04-24T12:32:01.442962+00:00 iccpfxor04 postfix/qmgr[24391]: 6B34360BAA:
                  > from=<double-bounce@...>, size=8389, nrcpt=1 (queue active)
                  > 2013-04-24T12:32:01.442970+00:00 iccpfxor04 postfix/bounce[26517]: D4B8460078: postmaster non-delivery
                  > notification: 6B34360BAA
                  > 2013-04-24T12:36:09.981198+00:00 iccpfxor04 postfix/qmgr[27126]: 6B34360BAA:
                  > from=<double-bounce@...>, size=8389, nrcpt=1 (queue active)
                  > 2013-04-24T12:40:44.391001+00:00 iccpfxor04 postfix/qmgr[27707]: 6B34360BAA:
                  > from=<double-bounce@...>, size=8389, nrcpt=1 (queue active)
                  >
                  > As you can see in the logs above, it seems to be blocked in the qmgr process, sending the same
                  > "from=<double-bounce....." to the rsyslog

                  These are the wrong log lines: you can clearly see from the sender
                  that they are NOT from the message, they are from the bounce.
                • Viktor Dukhovni
                  Message 8 of 29, Apr 24, 2013
                    On Wed, Apr 24, 2013 at 03:22:14PM +0200, Nicolas HAHN wrote:

                    > 2013-04-24T12:32:01.442962+00:00 iccpfxor04 postfix/qmgr[24391]: 6B34360BAA: from=<double-bounce@...>, size=8389, nrcpt=1 (queue active)
                    > 2013-04-24T12:36:09.981198+00:00 iccpfxor04 postfix/qmgr[27126]: 6B34360BAA: from=<double-bounce@...>, size=8389, nrcpt=1 (queue active)
                    > 2013-04-24T12:40:44.391001+00:00 iccpfxor04 postfix/qmgr[27707]: 6B34360BAA: from=<double-bounce@...>, size=8389, nrcpt=1 (queue active)

                    All three log entries have a different queue-manager process id.
                    Your queue manager is exiting or being killed. Look for queue-manager
                    fatal errors. (mailbox_size_limit too small, ...)

                    --
                    Viktor.
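
Viktor's observation (the same queue ID logged by three different qmgr process IDs) can also be checked mechanically. The following is only a minimal Python sketch under stated assumptions: it assumes the syslog line layout and short hexadecimal queue IDs seen in this thread, and the names `ENTRY_RE` and `pids_per_queue_id` are hypothetical, not part of Postfix. A changing PID only shows that the daemon was restarted; it does not say why.

```python
import re
from collections import defaultdict

# Daemon name, PID and queue ID as they appear in the log lines of this thread,
# e.g. "postfix/qmgr[24391]: 6B34360BAA: from=<...>".
# Assumption: short, upper-case hexadecimal queue IDs.
ENTRY_RE = re.compile(r"postfix/(?P<daemon>[\w-]+)\[(?P<pid>\d+)\]: (?P<qid>[0-9A-F]+): ")


def pids_per_queue_id(lines, daemon="qmgr"):
    """Map each queue ID to the distinct PIDs of `daemon` that logged it, in log order."""
    pids = defaultdict(list)
    for line in lines:
        match = ENTRY_RE.search(line)
        if match and match.group("daemon") == daemon:
            pid = int(match.group("pid"))
            if pid not in pids[match.group("qid")]:
                pids[match.group("qid")].append(pid)
    return dict(pids)


# Log lines copied from the thread (addresses are elided in the archive).
LOG = [
    "2013-04-24T12:32:01.442962+00:00 iccpfxor04 postfix/qmgr[24391]: 6B34360BAA: from=<double-bounce@...>, size=8389, nrcpt=1 (queue active)",
    "2013-04-24T12:32:01.442970+00:00 iccpfxor04 postfix/bounce[26517]: D4B8460078: postmaster non-delivery notification: 6B34360BAA",
    "2013-04-24T12:36:09.981198+00:00 iccpfxor04 postfix/qmgr[27126]: 6B34360BAA: from=<double-bounce@...>, size=8389, nrcpt=1 (queue active)",
    "2013-04-24T12:40:44.391001+00:00 iccpfxor04 postfix/qmgr[27707]: 6B34360BAA: from=<double-bounce@...>, size=8389, nrcpt=1 (queue active)",
]

# Report every queue ID that was handled by more than one qmgr process.
for qid, pids in pids_per_queue_id(LOG).items():
    if len(pids) > 1:
        print(f"{qid}: logged by {len(pids)} different qmgr processes: {pids}")
```

A queue ID reported here is a hint to look for fatal errors or reloads around the same timestamps, not a diagnosis in itself.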
                  • Wietse Venema
                    Message 9 of 29, Apr 24, 2013
                      Nicolas HAHN:
                      > OK.
                      >
                      > I did some tries and it seems that I cannot go above 40 Mb for message_size_limit to avoid the issue.
                      >
                      > As you wrote, here below is a set of log lines during the issue.
                      > The emails staying in the growing active queue are the bounce
                      > messages (we intercept them to send a copy to postmaster):

                      You mis-interpret the logging.

                      > [root@iccpfxor04 postfix]# grep 6B34360BAA /var/log/maillog
                      > 2013-04-24T12:32:01.439701+00:00 iccpfxor04 postfix/cleanup[24423]: 6B34360BAA: message-id=<20130424123201.6B34360BAA@...>
                      > 2013-04-24T12:32:01.442962+00:00 iccpfxor04 postfix/qmgr[24391]: 6B34360BAA: from=<double-bounce@...>, size=8389, nrcpt=1 (queue active)
                      > 2013-04-24T12:32:01.442970+00:00 iccpfxor04 postfix/bounce[26517]: D4B8460078: postmaster non-delivery notification: 6B34360BAA
                      > 2013-04-24T12:36:09.981198+00:00 iccpfxor04 postfix/qmgr[27126]: 6B34360BAA: from=<double-bounce@...>, size=8389, nrcpt=1 (queue active)
                      > 2013-04-24T12:40:44.391001+00:00 iccpfxor04 postfix/qmgr[27707]: 6B34360BAA: from=<double-bounce@...>, size=8389, nrcpt=1 (queue active)
                      >
                      > As you can see in the logs above, it seems to be blocked in the
                      > qmgr process, sending the same "from=<double-bounce....." to the
                      > rsyslog.

                      You mis-interpret the logging. Nothing is blocked by the queue manager.

                      More likely you have a mailbox size limit smaller than the message
                      size limit.

                      Wietse



                      > As soon as we rolled back to a message_size_limit of 20Mb, we've seen in the logs the message has been sent out:
                      >
                      > 2013-04-24T12:41:14.125840+00:00 iccpfxor04 postfix/local[27791]: 6B34360BAA: to=<bounceparser@...>, orig_to=<bounceparser>, relay=local, delay=553, delays=523/30/0/0.01, dsn=2.0.0, status=sent (delivered to command: /bin/bash /usr/local/bin/captureNDR)
                      > 2013-04-24T12:41:14.125943+00:00 iccpfxor04 postfix/qmgr[27707]: 6B34360BAA: removed
                      >
                      > Reindl Harald <h.reindl@...> wrote:
                      >
                      > > On 24.04.2013 14:58, Nicolas HAHN wrote:
                      > >> Does somebody knows what is happening?
                      > >
                      > > no because you missed to send any log-information
                      > > maybe to less memory to proceed messages with 150 MB
                    • Nicolas HAHN
                      Message 10 of 29, Apr 24, 2013

                        > More likely you have a mailbox size limit smaller than the message
                        > size limit.

                        Yes, that's the reason. I completely forgot to check that the two settings match.

                        The mailbox size limit was set to its default value of 50 MB. After setting it to 150 MB, the issue was fixed.

                        Note: the bounce copies sent to the postmaster were the only messages affected by this issue, because they are the only ones delivered to a local mailbox (the server's role is a pure MTA).

                        Thanks for pointing this out :)

                        Postfix feature request: it would be nice if Postfix could run this kind of basic check itself at startup (or when the configuration is reloaded) across inter-dependent configuration settings, and at least log a warning when such an issue is detected :)

                        Best regards,
                        Nicolas
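
The relationship between the two settings can be illustrated with a small external check. This is a minimal Python sketch, not Postfix code: it assumes input in the "name = value" shape that `postconf` prints, falls back to the documented defaults (message_size_limit = 10240000, mailbox_size_limit = 51200000 bytes), and treats mailbox_size_limit = 0 as "no limit". The function names and the byte values in the example are hypothetical, since the thread does not give the exact values used; the case message_size_limit = 0 is deliberately not handled.

```python
# Documented Postfix defaults, in bytes.
DEFAULTS = {"message_size_limit": 10240000, "mailbox_size_limit": 51200000}


def parse_settings(text):
    """Parse 'name = value' lines into a dict, keeping only plain integer values."""
    settings = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep and value.strip().isdigit():
            settings[name.strip()] = int(value.strip())
    return settings


def size_limit_problems(settings):
    """Return a list of human-readable problems (empty when the limits are consistent)."""
    message = settings.get("message_size_limit", DEFAULTS["message_size_limit"])
    mailbox = settings.get("mailbox_size_limit", DEFAULTS["mailbox_size_limit"])
    problems = []
    # A mailbox limit of 0 means "no limit", so only a non-zero limit can be too small.
    if mailbox != 0 and mailbox < message:
        problems.append(
            f"mailbox_size_limit ({mailbox}) is smaller than message_size_limit ({message})"
        )
    return problems


# Hypothetical values: message limit raised to 120 MiB, mailbox limit left at its default.
before = parse_settings("message_size_limit = 125829120\nmailbox_size_limit = 51200000")
# Hypothetical values after the fix: mailbox limit raised to 150 MiB.
after = parse_settings("message_size_limit = 125829120\nmailbox_size_limit = 157286400")

for label, settings in (("before", before), ("after", after)):
    print(label, size_limit_problems(settings) or "limits are consistent")
```

Such a check could be run by hand or from a deployment script before reloading Postfix; it covers only this one pair of settings.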



                      • Viktor Dukhovni
                        Message 11 of 29, Apr 24, 2013
                          On Wed, Apr 24, 2013 at 03:37:17PM +0200, Nicolas HAHN wrote:

                          > > More likely you have a mailbox size limit smaller than the message
                          > > size limit.
                          >
                          > Yes, that's the reason. I completely forgot to check the matching
                          > between both settings.
                          >
                          > The mailbox size limit was set to its default value of 50 Mb.
                          > After setting it to 150 Mb, the issue was fixed.
                          >
                          > Note: the bounce copies sent to the postmaster were the only one
                          > impacted by this issue, because the only ones delivered to a local
                          > mailbox (server role is a pure MTA).

                          1. Don't send bounces to postmaster, just generate and read log summaries
                          that may highlight aggregate problems with your mail stream.

                          notify_classes =

                          This applies to any MTA handling mail for a large number of users.
                          It is fine to have postmaster notices for a machine with a small
                          handful of users, but beyond that postmaster notices are just a waste
                          of time and focus your attention on the wrong things (reading bounces
                          of other people's mail).

                          The logs, not the postmaster mailbox, are your notices of trouble.

                          2. When problems happen. READ THE LOGS.

                          > Postfix feature request: that would be nice that Postfix be able
                          > to do this kind of basic checks by itself when starting (or when
                          > configuration is reloaded) between various inter-dependent
                          > configuration settings, and display in the logs at least some
                          > warnings when such kind of issue is detected :)

                          http://www.postfix.org/DEBUG_README.html#logging

                          --
                          Viktor.
                        • Nicolas HAHN
                          Message 12 of 29, Apr 24, 2013

                            > 1.  Don't send bounces to postmaster, just generate and read log summaries
                            >     that may highlight aggregate problems with your mail stream.
                            >
                            >         notify_classes =
                            >
                            >     This applies to any MTA handing mail for a large number of users,
                            >     it is fine to have postmaster notices for a machine with a small
                            >     handful of users, but after than postmaster notices are just a waste
                            >     of time and focus your attention on the wrong things (reading bounces
                            >     of other people's mail).
                            >
                            >     The logs not the postmaster mailbox are your notices of trouble.

                            We do it because every NDR is captured by the open source tool I'm currently writing (I sent a mail here yesterday about version 0.9.13). It's a feature of the tool to let any SMTP admin in the United Nations access any NDR generated. That's UN policy as the UN's Internet service provider. This is, again, politics (this kind of reason is given for a lot of things at the UN :-)
                            Furthermore, it's a really good thing to keep every generated NDR at the UN, because then internal UN customers cannot say "Hey! We haven't received any NDR!". Customers themselves have access to the tool to confirm that NDRs are generated and to see their content directly in the tool. Again, politics... Protection... blah blah blah

                            But we are not here to discuss the UN's political choices and the features of our messaging services :x

                            >
                            > 2.  When problems happen. READ THE LOGS.

                            That's what I did (especially as the coder of my real-time Postfix logging tool), but I think I might need a holiday after looking at gigabytes of logs every day for many years... I'm finally starting to miss things :-p

                            On the other hand, no software should allow such a situation to happen: a conflict between two settings that don't match, then qmgr killed... Any software should prevent that and validate its settings for acceptance, conformity, dependencies, ... Hence my feature request :)

                            >
                            >> Postfix feature request: that would be nice that Postfix be able
                            >> to do this kind of basic checks by itself when starting (or when
                            >> configuration is reloaded) between various inter-dependent
                            >> configuration settings, and display in the logs at least some
                            >> warnings when such kind of issue is detected :)
                            >
                            >     http://www.postfix.org/DEBUG_README.html#logging
                            >
                            > --
                            >         Viktor.
                            >



                          • Wietse Venema
                            Message 13 of 29, Apr 24, 2013
                              Nicolas HAHN:
                              > Postfix feature request: that would be nice that Postfix be able
                              > to do this kind of basic checks by itself when starting (or when
                              > configuration is reloaded) between various inter-dependent
                              > configuration settings, and display in the logs at least some
                              > warnings when such kind of issue is detected :)

                              This is already in the logfile. Learn to read it.

                              Wietse
                            • Nicolas HAHN
                              Message 14 of 29, Apr 24, 2013

                                Yes, but if I'm right, the log message is only emitted when an e-mail is processed (by postfix/local, in my case).

                                What I mean is that Postfix should check the configuration for this kind of issue (for example) when it is started or when its configuration is reloaded, and REFUSE to start with "fatal: main.cf configuration error: mailbox_size_limit is smaller than message_size_limit".

                                Don't you think?

                                But I'm learning, I'm learning (well... I'm trying)


                                Wietse Venema <wietse@...> wrote:

                                > Nicolas HAHN:
                                >> Postfix feature request: that would be nice that Postfix be able
                                >> to do this kind of basic checks by itself when starting (or when
                                >> configuration is reloaded) between various inter-dependent
                                >> configuration settings, and display in the logs at least some
                                >> warnings when such kind of issue is detected :)
                                >
                                > This is already in the logfile. Learn to read it.
                                >
                                >         Wietse
                                >



                              • Nicolas HAHN
                                Message 15 of 29, Apr 24, 2013

                                  This story also suddenly makes me think that I should add a feature to my Log Search Tool that catches fatal errors (and not only fatal ones) from the Postfix logs in real time, and alerts the tool's users when a fatal error occurs during e-mail processing. I'll look at that for version 0.9.14 or 0.9.15.

                                  There is always something positive to take away :)

                                  BR.
                                  nicolas

                                  Wietse Venema <wietse@...> wrote:

                                  > Nicolas HAHN:
                                  >> Postfix feature request: that would be nice that Postfix be able
                                  >> to do this kind of basic checks by itself when starting (or when
                                  >> configuration is reloaded) between various inter-dependent
                                  >> configuration settings, and display in the logs at least some
                                  >> warnings when such kind of issue is detected :)
                                  >
                                  > This is already in the logfile. Learn to read it.
                                  >
                                  >         Wietse
                                  >



                                • Wietse Venema
                                  Message 16 of 29, Apr 24, 2013
                                    Nicolas HAHN:
                                    > Yes, but if I'm right, the log message is emitted at the time there
                                    > is an e-mail processed (by postfix/local for my issue).
                                    >
                                    > What I'm speaking about there is that postfix should check the
                                    > configuration for this issue (for example) at the time it is started
                                    > or its configuration is reloaded, and REFUSE to start because of
                                    > "fatal: main.cf configuration error: mailbox_size_limit is smaller
                                    > than message_size_limit".
                                    >
                                    > Don't you think?

                                    If you think about duplicating all configuration tests that are
                                    inside Postfix, into a program that can be run *before* Postfix,
                                    then you can forget about that idea.

                                    Configuration tests will not be duplicated by hand, and they will
                                    not be duplicated by some automated program that extracts them from
                                    Postfix source code. It is not going to happen.

                                    You will have to learn to read logfiles. Get used to it.

                                    Wietse
                                  • Nicolas HAHN
                                    Message 17 of 29, Apr 24, 2013

                                      This is a reply to myself because I'm reviewing the way it works.

                                      > Yes, but if I'm right, the log message is emitted at the time there
                                      > is an e-mail processed (by postfix/local for my issue).

                                      In fact, in my case the fatal error for this issue is written to the logs every minute and 1 second by the postfix/local daemon.




                                    • Viktor Dukhovni
                                      Message 18 of 29, Apr 24, 2013
                                        On Wed, Apr 24, 2013 at 04:47:26PM +0200, Nicolas HAHN wrote:

                                        > This is a reply to myself because I'm reviewing the way it works.
                                        >
                                        > > Yes, but if I'm right, the log message is emitted at the time there
                                        > > is an e-mail processed (by postfix/local for my issue).
                                        >
                                        > In fact, the fatal is written in the logs each minute and 1 second
                                        > for this issue in my case by the postfix/local daemon

                                        Can you post the fatal error messages you found, especially the
                                        messages that log why the queue manager was restarting, as that
                                        is the real problem.

                                        It is clear why local(8) was having issues, it writes mailbox files.
                                        It is not yet clear why qmgr(8) was having issues. Though the need
                                        to generate bounces into a non-working postmaster mailbox is unfortunate,
                                        when the postmaster mailbox can't be written, that should just lead to
                                        double bounces, which then get thrown away, so I don't see why the
                                        queue-manager would restart...

                                        --
                                        Viktor.
                                      • Nicolas HAHN
                                        Message 19 of 29 , Apr 24, 2013

                                          > Can you post the fatal error messages you found, especially the
                                          > messages that log why the queue manager was restarting, as that
                                          > is the real problem.

                                          Here is what I found:

                                          2013-04-24T10:04:38.005665+00:00 iccpfxor04 postfix/local[9370]: fatal: main.cf configuration error: mailbox_size_limit is smaller than message_size_limit
                                          2013-04-24T10:04:39.006185+00:00 iccpfxor04 postfix/master[26402]: warning: process /usr/libexec/postfix/local pid 9370 exit status 1
                                          2013-04-24T10:04:39.006209+00:00 iccpfxor04 postfix/master[26402]: warning: /usr/libexec/postfix/local: bad command startup -- throttling
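
The fatal line above names the two main.cf parameters involved. As a rough illustration only (this is not Postfix code), the rule it states can be sketched in Python. The function name `limits_consistent` is hypothetical. The sketch assumes that mailbox_size_limit was left at its documented default of 51200000 bytes, that "Mb" in this thread means 1024*1024 bytes, and that a mailbox_size_limit of 0 means "no limit". Under those assumptions, 20 MB passes while 50 MB and 120 MB do not, which would be consistent with the symptoms reported earlier in the thread.

```python
# Sketch only: mirrors the rule stated in the fatal log line, not Postfix source.
DEFAULT_MAILBOX_SIZE_LIMIT = 51200000  # documented default; assumed unchanged here


def limits_consistent(message_size_limit: int, mailbox_size_limit: int) -> bool:
    """Return True if the mailbox limit is not smaller than the message limit."""
    if mailbox_size_limit == 0:  # assumption: 0 disables the mailbox limit
        return True
    return mailbox_size_limit >= message_size_limit


def check_thread_values() -> dict:
    """Evaluate the three values tried in the thread (assuming MB = 1024*1024)."""
    return {
        mb: limits_consistent(mb * 1024 * 1024, DEFAULT_MAILBOX_SIZE_LIMIT)
        for mb in (20, 50, 120)
    }


if __name__ == "__main__":
    for mb, ok in check_thread_values().items():
        verdict = "ok" if ok else "smaller mailbox_size_limit, local(8) would refuse"
        print(f"message_size_limit={mb} MB -> {verdict}")
```

If the real values on the server differ from these assumptions, `postconf message_size_limit mailbox_size_limit` shows what is actually in effect.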

                                          For qmgr, there is nothing beyond what I posted previously.

                                          BUT that's normal if you consider this scenario:

                                          When I saw the e-mails for local delivery piling up in the active queue, I changed message_size_limit several times and reloaded the Postfix configuration each time.
                                          The fact that qmgr shows a different PID after each reload can then be considered normal (the processes are presumably stopped and then restarted).

                                          The lesson for me is that even if you write your own log search tool, you should not rely ONLY on it (especially while it is still a work in progress and does not yet parse fatal or warning messages from the Postfix logs). I then searched the logs directly, based on the QIDs of the e-mails that stayed in the active queue. But the warning and fatal lines do not contain a QID (which is normal, because the fatal was written every minute and was not tied to any particular e-mail), so my simple grep filter did not catch them. This combination of things led me to write my first e-mail here about this issue. It's true that I should have investigated the logs more carefully :-X
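
A QID-based filter misses these lines because they carry no queue ID. A minimal sketch of the alternative, filtering on the Postfix severity keyword instead, might look like this in Python. The names `PROBLEM` and `postfix_problems` are hypothetical, and the pattern assumes the syslog layout shown in the log excerpt above (`postfix/<daemon>[<pid>]: <level>: <text>`); other syslog formats may need a different expression.

```python
import re
from typing import Iterable, Iterator, Tuple

# Matches e.g. "postfix/local[9370]: fatal: ..." whether or not a queue ID is present.
PROBLEM = re.compile(
    r"postfix/(?P<daemon>[\w-]+)\[(?P<pid>\d+)\]: "
    r"(?P<level>warning|error|fatal|panic): (?P<text>.*)"
)


def postfix_problems(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (daemon, level, text) for every Postfix warning/error/fatal/panic line."""
    for line in lines:
        match = PROBLEM.search(line)
        if match:
            yield match.group("daemon"), match.group("level"), match.group("text")


# The three log lines quoted in this message.
SAMPLE = [
    "2013-04-24T10:04:38.005665+00:00 iccpfxor04 postfix/local[9370]: fatal: "
    "main.cf configuration error: mailbox_size_limit is smaller than message_size_limit",
    "2013-04-24T10:04:39.006185+00:00 iccpfxor04 postfix/master[26402]: warning: "
    "process /usr/libexec/postfix/local pid 9370 exit status 1",
    "2013-04-24T10:04:39.006209+00:00 iccpfxor04 postfix/master[26402]: warning: "
    "/usr/libexec/postfix/local: bad command startup -- throttling",
]

if __name__ == "__main__":
    for daemon, level, text in postfix_problems(SAMPLE):
        print(f"{level:8} {daemon:8} {text}")
```

Running such a filter alongside a QID search would surface daemon-level failures that are not tied to any single message.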

                                          So the conclusion is that there is nothing wrong with qmgr.

                                          What I still consider abnormal, as I already wrote (and this is only my opinion), is that Postfix should refuse to start when it detects a fatal configuration error in the config files. Instead it starts anyway and writes the fatal to the log file every minute. It should also display a message on the console just before exiting. But there are probably good reasons why it is designed like that.

                                          For this reason I'm going to integrate real-time parsing of warnings and fatals into the next version of my tool. That should help to prevent this kind of behavior from lazy SMTP admins :-D


                                        • Wietse Venema
                                          Message 20 of 29 , Apr 24, 2013
                                            Nicolas HAHN:
                                            > What I consider abnormal, as I already wrote (so it's my opinion),
                                            > is that Postfix should refuse to start when it detects a fatal
                                            > configuration issue in the config files. But it starts anyway and
                                            > displays the fatal in the log file every minute.

                                            If you are thinking about duplicating all configuration tests that
                                            are inside Postfix into a program that can be run *before* Postfix,
                                            then you can forget about that idea.

                                            Configuration tests will not be duplicated by hand, and they will
                                            not be duplicated by some automated program that extracts them from
                                            Postfix source code. It is not going to happen.

                                            You will have to learn to read logfiles. Get used to it.

                                            Wietse
                                          • Nicolas HAHN
                                            Message 21 of 29 , Apr 24, 2013

                                              Yes, thanks, I saw it the first time you posted it.

                                              But that is not a reason for me to change my mind about this.

                                              BR.
                                              nicolas


                                            • /dev/rob0
                                              Message 22 of 29 , Apr 24, 2013
                                                On Wed, Apr 24, 2013 at 05:23:09PM +0200, Nicolas HAHN wrote:
                                                > > Can you post the fatal error messages you found, especially the
                                                > > messages that log why the queue manager was restarting, as that
                                                > > is the real problem.
                                                >
                                                > Here is what I found:
                                                >
                                                > 2013-04-24T10:04:38.005665+00:00 iccpfxor04 postfix/local[9370]:
                                                > fatal: main.cf configuration error: mailbox_size_limit is smaller
                                                > than message_size_limit

                                                Here, local(8) is having a fatal error. This error occurs whenever
                                                local (and probably virtual(8) as well) is invoked for delivery.
                                                Conversely, this error does NOT occur otherwise. The rest of the
                                                system might be fine. (Postfix has a modular design, BTW.)

                                                > 2013-04-24T10:04:39.006185+00:00 iccpfxor04 postfix/master[26402]:
                                                > warning: process /usr/libexec/postfix/local pid 9370 exit status 1
                                                > 2013-04-24T10:04:39.006209+00:00 iccpfxor04 postfix/master[26402]:
                                                > warning: /usr/libexec/postfix/local: bad command startup --
                                                > throttling

                                                And see, master(8) is doing fine, letting you know that there is a
                                                problem with local.

                                                [snip]
                                                > What I consider just abnormal as already written is that for me (so
                                                > it's my opinion), Postfix should refuse to start when it detects a
                                                > fatal about a configuration issue in the config files. But it

                                                This seems to betray a lack of understanding of the architecture. I
                                                think at this point you'd benefit from further study:

                                                http://www.postfix.org/OVERVIEW.html

                                                What about a system with no local/virtual delivery? Suppose delivery
                                                is via pipe(8) or lmtp(8)? Or suppose there are no hosted domains at
                                                all, merely a MSA/relay for other hosts? This fatal error in local
                                                won't ever affect such a system.

                                                > starts anyway and displays the fatal in the log file each minute.

                                                Every time local attempts to deliver.

                                                > It should also display a message on the console just before

                                                There is no console; this is a daemon process detached from the
                                                terminal.

                                                > exit. But there are probably good reasons it is actually designed
                                                > like that.

                                                Yes.

                                                > For this reason I'm going to integrate real-time parsing of
                                                > warnings and fatals into the next version of my tool. That should
                                                > help to prevent this kind of behavior from lazy SMTP admins :-D

                                                It's probably impossible to compensate for a sysadmin who lacks
                                                understanding of the system s/he is running.
                                                --
                                                http://rob0.nodns4.us/ -- system administration and consulting
                                                Offlist GMX mail is seen only if "/dev/rob0" is in the Subject:
                                              • Nicolas HAHN
                                                Message 23 of 29 , Apr 24, 2013

                                                  The "architecture" is not a good excuse for me, I'm sorry. As a coder, I find that allowing software to start despite a FATAL is total nonsense. And being told in the end that it is just one daemon that will not run while the others will, I really don't know how to take it...

                                                  And you can daemonize later, so that is not a good excuse either.

                                                  Take Nagios as an example:

                                                  [root@server nagios]# service nagios restart
                                                  Running configuration check... CONFIG ERROR!  Restart aborted.  Check your Nagios configuration.
                                                  [root@server nagios]#

                                                  And after having fixed the configuration files:

                                                  [root@server nagios]# service nagios restart
                                                  Running configuration check...done.
                                                  Stopping nagios: ...done.
                                                  Starting nagios: done.
                                                  [root@server nagios]#
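
A wrapper in the same spirit as the Nagios init script above can be sketched in Python. This is only an illustration under stated assumptions: `guarded_reload`, `run` and `extra_checks` are hypothetical names, and the sketch assumes that `postfix check` returns a non-zero exit status when it finds a problem. Note that `postfix check` does not necessarily detect errors that an individual daemon such as local(8) only reports when it starts, so site-specific tests would have to be supplied through `extra_checks`.

```python
import subprocess
from typing import Callable, List, Sequence

Runner = Callable[[List[str]], int]


def run(cmd: List[str]) -> int:
    """Run a command and return its exit status."""
    return subprocess.run(cmd).returncode


def guarded_reload(extra_checks: Sequence[Callable[[], bool]] = (),
                   runner: Runner = run) -> bool:
    """Reload Postfix only when `postfix check` and every extra check pass."""
    if runner(["postfix", "check"]) != 0:
        print("CONFIG ERROR! Reload aborted.")
        return False
    if not all(check() for check in extra_checks):
        print("Extra check failed. Reload aborted.")
        return False
    # Only reached when every check above succeeded.
    return runner(["postfix", "reload"]) == 0
```

The `runner` parameter exists so the logic can be exercised without touching a live mail server; in normal use the default `run` calls the real `postfix` command.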


                                                  But OK, I understand your points and will stop posting my blabla.



                                                • Viktor Dukhovni
                                                  Message 24 of 29 , Apr 24, 2013
                                                    On Wed, Apr 24, 2013 at 07:45:52PM +0200, Nicolas HAHN wrote:

                                                    > The "architecture" is not a good excuse for me, I'm sorry.

                                                    You don't yet know enough about Postfix to appreciate the answer;
                                                    this is not your fault, but the design is fine.

                                                    --
                                                    Viktor.
                                                  • Wietse Venema
                                                    Message 25 of 29 , Apr 24, 2013
                                                      Nicolas HAHN:
                                                      > The "architecture" is not a good excuse for me, I'm sorry. As a
                                                      > coder, allowing a software to start despite the fact there is a
                                                      > FATAL is total nonsense. And saying finally that it is just a daemon
                                                      > which will not run but the others will, I really don't know how
                                                      > to take it...

                                                      OK, before things get even more embarrassing, maybe you should come
                                                      back when you have some experience with a multi-programmed system
                                                      of 200k lines of code scattered over 40 different programs.

                                                      Let's not waste more time debating this.

                                                      Wietse
                                                    • Reindl Harald
                                                      Message 26 of 29 , Apr 24, 2013
                                                        On 24.04.2013 19:45, Nicolas HAHN wrote:
                                                        > The "architecture" is not a good excuse for me, I'm sorry. As a coder

                                                        well, that's the difference between "coder" and "developer"

                                                        a "coder" writes something which works for now, and every
                                                        few years everything is thrown away because the architecture
                                                        and software design do not fit growing needs

                                                        look back at how many years postfix has been perfectly maintained
                                                        AND documented like no other software, with nearly zero
                                                        breakage of existing setups, while being as scalable
                                                        as possible for nearly any environment

                                                        sorry, but after following the thread: you are not qualified
                                                        enough to judge the design patterns of a software you do not
                                                        understand well enough
                                                      • Nicolas Hahn
                                                        Message 27 of 29 , Apr 24, 2013
                                                          It's true I don't have your experience, as you are the Postfix coders.

                                                          But it's also true that you don't know what my experience is.

                                                          Considering your political answers since the beginning, embarrassing for whom? You're right, that's enough. I'm going to answer you in private.

                                                        • Nicolas HAHN
                                                          Message 28 of 29 , Apr 24, 2013

                                                            You're right that's enough and I'll answer you in private.



                                                          • Nicolas HAHN
                                                            Message 29 of 29 , Apr 24, 2013

                                                              > sorry, but after following the thread you are not qualified
                                                              > enough to judge design-patterns of a software you do not
                                                              > understand enough

                                                              I totally agree with that. That's why I write to the users mailing list, not the developers mailing list.

                                                              To end this thread, which is now boring for everybody, I will send the rest of my answer in private.



