Re: Postdrop doesn't always stop when "postfix stop" is issued

  • Quanah Gibson-Mount
    Message 1 of 7, Aug 31, 2011
      --On Wednesday, August 31, 2011 7:58 PM -0400 Wietse Venema
      <wietse@...> wrote:

      > Quanah Gibson-Mount:
      >> This is extremely difficult to reproduce, but it does happen
      >> occasionally -- We will tell postfix to stop, and once that is
      >> complete, a "postdrop" process will sometimes remain, and will run
      >> until it is manually killed.
      >>
      >> Is this an expected behavior of postdrop -- That after the master
      >> postfix is stopped, it is expected sometimes that it may continue
      >> running, regardless?
      >
      > This is 100% intentional. The Postfix sendmail command MUST NOT
      > drop mail on the floor while the mail system is down.
      >
      > For example there are programs that run at boot time that rely on
      > the availability of sendmail command-line submission, such as text
      > editors that want to send "how to recover your session" email.
      >
      > Other daemons such as cron may be running while the Postfix daemons
      > are down for whatever reason. Their mail should not be lost, either.

      Hi Wietse,

      Thanks, I think I understand what is happening. This is the Zimbra
      Postfix, not the system one. We generally see this when upgrading Zimbra
      to a newer version. I see that the service stop order has the
      mailbox server (which receives email from postfix over LMTP) stop before
      postfix is stopped. My guess is that postfix is in the middle of trying to
      deliver an email to it when this happens. I'll change the stop order so
      that postfix is stopped long before the mailbox, which should give postdrop
      time to finish any deliveries it needs before the mailbox server is stopped.

      --Quanah

      --

      Quanah Gibson-Mount
      Sr. Member of Technical Staff
      Zimbra, Inc
      A Division of VMware, Inc.
      --------------------
      Zimbra :: the leader in open source messaging and collaboration
    • Victor Duchovni
      Message 2 of 7, Sep 1, 2011
        On Wed, Aug 31, 2011 at 07:58:55PM -0400, Wietse Venema wrote:

        > > This is extremely difficult to reproduce, but it does happen occasionally
        > > -- We will tell postfix to stop, and once that is complete, a "postdrop"
        > > process will sometimes remain, and will run until it is manually killed.
        > >
        > > Is this an expected behavior of postdrop -- That after the master postfix
        > > is stopped, it is expected sometimes that it may continue running,
        > > regardless?
        >
        > This is 100% intentional. The Postfix sendmail command MUST NOT
        > drop mail on the floor while the mail system is down.

        Well, yes, postdrop(1) is expected to reliably enqueue mail, even when
        the mail system is down. This said, it is not really expected to enter
        an infinite loop!

        On Wed, Aug 31, 2011 at 04:36:22PM -0700, Quanah Gibson-Mount wrote:

        > This is extremely difficult to reproduce, but it does happen
        > occasionally -- We will tell postfix to stop, and once that is
        > complete, a "postdrop" process will sometimes remain, and will run
        > until it is manually killed.
        >
        > Is this an expected behavior of postdrop -- That after the master
        > postfix is stopped, it is expected sometimes that it may continue
        > running, regardless?

        Normally, postdrop(1) will enqueue the message and exit, whether the
        mail system is up or not. The only plausible failure reason is inability
        to access the "maildrop" directory, either because the setgid bit has
        been cleared on the postdrop(1) binary, or because the directory has
        been moved, deleted, modified to not allow group write access, ...
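
The two failure causes named above can be checked mechanically. The following is a minimal sketch, not part of Postfix: the helper names are hypothetical, and the checks look only at mode bits via stat(2), not at whether the owning group is the one Postfix expects. The paths to pass in are whatever your instance uses.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: 1 if `path` is a directory with group write and
 * search permission, 0 otherwise (including when it does not exist). */
int dir_group_writable(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return 0;
    if (!S_ISDIR(st.st_mode))
        return 0;
    return (st.st_mode & S_IWGRP) != 0 && (st.st_mode & S_IXGRP) != 0;
}

/* Hypothetical helper: 1 if `path` is a regular file with the
 * set-group-ID bit set, 0 otherwise (including when it does not exist). */
int has_setgid_bit(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return 0;
    return S_ISREG(st.st_mode) && (st.st_mode & S_ISGID) != 0;
}

/* Print one line per failed check; return the number of failed checks. */
int check_submission_setup(const char *maildrop_dir, const char *postdrop_bin)
{
    int failures = 0;

    if (!dir_group_writable(maildrop_dir)) {
        printf("%s: missing or not group-writable\n", maildrop_dir);
        failures++;
    }
    if (!has_setgid_bit(postdrop_bin)) {
        printf("%s: missing or not set-group-ID\n", postdrop_bin);
        failures++;
    }
    return failures;
}
```

A return value of 0 from check_submission_setup() only means both mode-bit checks passed at that moment; it says nothing about a directory that disappears and reappears later.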

        So the question is what is it that is causing postdrop to loop while
        trying to create the queue file?

        /*
         * Create a file with a temporary name that does not collide. The process
         * ID alone is not sufficiently unique: maildrops can be shared via the
         * network. Not that I recommend using a network-based queue, or having
         * multiple hosts write to the same queue, but we should try to avoid
         * losing mail if we can.
         *
         * If someone is racing against us, try to win.
         */
        for (;;) {
            GETTIMEOFDAY(tp);
            vstring_sprintf(temp_path, "%s/%d.%d", queue_name,
                            (int) tp->tv_usec, pid);
            if ((fd = open(STR(temp_path), O_RDWR | O_CREAT | O_EXCL, mode)) >= 0)
                break;
            if (errno == EEXIST || errno == EISDIR)
                continue;
            msg_warn("%s: create file %s: %m", myname, STR(temp_path));
            sleep(10);
        }

        Are the "create file" warnings found in the system log?
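
For experimenting outside Postfix, here is a standalone sketch of the same naming technique (microseconds plus process ID, created with O_CREAT | O_EXCL). It is an illustration under assumptions, not the Postfix implementation: create_unique_file() is a hypothetical name, it uses plain libc instead of the Postfix utility library, and unlike the loop quoted above it returns any non-collision error to the caller instead of warning and sleeping.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Sketch only: create a uniquely named file under `dir` and store its name
 * in `path`. A name collision is retried with a fresh timestamp; any other
 * error is reported as -1 with errno left as set by open(2).
 */
int create_unique_file(const char *dir, char *path, size_t path_len)
{
    struct timeval tv;
    int fd;
    int n;

    for (;;) {
        gettimeofday(&tv, NULL);
        n = snprintf(path, path_len, "%s/%ld.%ld", dir,
                     (long) tv.tv_usec, (long) getpid());
        if (n < 0 || (size_t) n >= path_len) {
            errno = ENAMETOOLONG;
            return -1;
        }
        fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
        if (fd >= 0)
            return fd;                  /* we own a brand-new file */
        if (errno == EEXIST || errno == EISDIR)
            continue;                   /* name taken: try another one */
        return -1;                      /* e.g. ENOENT if `dir` is missing */
    }
}
```

Calling it with a directory that does not exist fails with errno set to ENOENT, which is the condition the quoted loop would log and retry every 10 seconds.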

        --
        Viktor.
      • Quanah Gibson-Mount
        Message 3 of 7, Sep 1, 2011
          --On Thursday, September 01, 2011 2:03 PM -0400 Victor Duchovni
          <Victor.Duchovni@...> wrote:

          > So the question is what is it that is causing postdrop to loop while
          > trying to create the queue file?
          >
          > /*
          >  * Create a file with a temporary name that does not collide. The process
          >  * ID alone is not sufficiently unique: maildrops can be shared via the
          >  * network. Not that I recommend using a network-based queue, or having
          >  * multiple hosts write to the same queue, but we should try to avoid
          >  * losing mail if we can.
          >  *
          >  * If someone is racing against us, try to win.
          >  */
          > for (;;) {
          >     GETTIMEOFDAY(tp);
          >     vstring_sprintf(temp_path, "%s/%d.%d", queue_name,
          >                     (int) tp->tv_usec, pid);
          >     if ((fd = open(STR(temp_path), O_RDWR | O_CREAT | O_EXCL, mode)) >= 0)
          >         break;
          >     if (errno == EEXIST || errno == EISDIR)
          >         continue;
          >     msg_warn("%s: create file %s: %m", myname, STR(temp_path));
          >     sleep(10);
          > }
          >
          > Are the "create file" warnings found in the system log?

          Yes:

          Mar 22 19:24:52 domain postfix/postdrop[3624]: warning: mail_queue_enter:
          create file maildrop/976917.3624: No such file or directory

          for example.

          However, what is odd about this is we have postfix explicitly use a queue
          directory that is always present (/opt/zimbra/data/postfix/spool/), so it
          shouldn't be encountering any errors creating a file. :/

          I was also wrong about the shutdown order -- We shut down postfix first, and
          then the other services.

          --Quanah

          --

          Quanah Gibson-Mount
          Sr. Member of Technical Staff
          Zimbra, Inc
          A Division of VMware, Inc.
          --------------------
          Zimbra :: the leader in open source messaging and collaboration
        • Wietse Venema
          Message 4 of 7, Sep 1, 2011
            Victor Duchovni:
            > On Wed, Aug 31, 2011 at 07:58:55PM -0400, Wietse Venema wrote:
            >
            > > > This is extremely difficult to reproduce, but it does happen occasionally
            > > > -- We will tell postfix to stop, and once that is complete, a "postdrop"
            > > > process will sometimes remain, and will run until it is manually killed.
            > > >
            > > > Is this an expected behavior of postdrop -- That after the master postfix
            > > > is stopped, it is expected sometimes that it may continue running,
            > > > regardless?
            > >
            > > This is 100% intentional. The Postfix sendmail command MUST NOT
            > > drop mail on the floor while the mail system is down.
            >
            > Well, yes, postdrop(1) is expected to reliably enqueue mail, even when
            > the mail system is down. This said, it is not really expected to enter
            > an infinite loop!

            Well, yes, one is not supposed to remove the submission directory and
            ignore postdrop error messages.

            If people use Postfix, then at least they have a chance to re-create
            the missing directory or permissions, and avoid losing mail.

            Wietse
          • Victor Duchovni
            Message 5 of 7, Sep 1, 2011
              On Thu, Sep 01, 2011 at 11:26:48AM -0700, Quanah Gibson-Mount wrote:

              > > msg_warn("%s: create file %s: %m", myname, STR(temp_path));
              > >
              > >Are the "create file" warnings found in the system log?
              >
              > Yes:
              >
              > Mar 22 19:24:52 domain postfix/postdrop[3624]: warning:
              > mail_queue_enter: create file maildrop/976917.3624: No such file or
              > directory
              >
              > for example.

              So, most likely the "maildrop" directory is no longer present, or the
              queue directory itself has been moved, unmounted, ... The postdrop(1)
              process performs a chdir(2) to the queue_directory, so if that is
              replaced, it won't find a maildrop sub-directory...

              > However, what is odd about this is we have postfix explicitly use a
              > queue directory that is always present
              > (/opt/zimbra/data/postfix/spool/), so it shouldn't be encountering
              > any errors creating a file. :/

              Either this claim does not hold, or main.cf was briefly modified to cause
              postdrop(1) to use the wrong directory, ...

              Make sure you are checking the correct instance (generally the default
              one with sendmail/postdrop).

              --
              Viktor.