Re: Postdrop doesn't always stop when "postfix stop" is issued

  • Victor Duchovni
    Message 1 of 7 , Sep 1, 2011
      On Wed, Aug 31, 2011 at 07:58:55PM -0400, Wietse Venema wrote:

      > > This is extremely difficult to reproduce, but it does happen occasionally
      > > -- We will tell postfix to stop, and once that is complete, a "postdrop"
      > > process will sometimes remain, and will run until it is manually killed.
      > >
      > > Is this an expected behavior of postdrop -- That after the master postfix
      > > is stopped, it is expected sometimes that it may continue running,
      > > regardless?
      >
      > This is 100% intentional. The Postfix sendmail command MUST NOT
      > drop mail on the floor while the mail system is down.

      Well, yes, postdrop(1) is expected to reliably enqueue mail, even when
      the mail system is down. This said, it is not really expected to enter
      an infinite loop!

      On Wed, Aug 31, 2011 at 04:36:22PM -0700, Quanah Gibson-Mount wrote:

      > This is extremely difficult to reproduce, but it does happen
      > occasionally -- We will tell postfix to stop, and once that is
      > complete, a "postdrop" process will sometimes remain, and will run
      > until it is manually killed.
      >
      > Is this an expected behavior of postdrop -- That after the master
      > postfix is stopped, it is expected sometimes that it may continue
      > running, regardless?

      Normally, postdrop(1) will enqueue the message and exit, whether the
      mail system is up or not. The only plausible failure reason is inability
      to access the "maildrop" directory, either because the setgid bit has
      been cleared on the postdrop(1) binary, or because the directory has
      been moved, deleted, modified to not allow group write access, ...

      So the question is what is it that is causing postdrop to loop while
      trying to create the queue file?

      /*
       * Create a file with a temporary name that does not collide. The process
       * ID alone is not sufficiently unique: maildrops can be shared via the
       * network. Not that I recommend using a network-based queue, or having
       * multiple hosts write to the same queue, but we should try to avoid
       * losing mail if we can.
       *
       * If someone is racing against us, try to win.
       */
      for (;;) {
          GETTIMEOFDAY(tp);
          vstring_sprintf(temp_path, "%s/%d.%d", queue_name,
                          (int) tp->tv_usec, pid);
          if ((fd = open(STR(temp_path), O_RDWR | O_CREAT | O_EXCL, mode)) >= 0)
              break;
          if (errno == EEXIST || errno == EISDIR)
              continue;
          msg_warn("%s: create file %s: %m", myname, STR(temp_path));
          sleep(10);
      }

      Are the "create file" warnings found in the system log?
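
For readers who want to experiment with this pattern outside Postfix, the following is a self-contained sketch of the same naming scheme using only POSIX calls. It is not the Postfix implementation: the function name `create_unique`, the fixed 0600 mode, and the `max_tries` limit are assumptions of this sketch. In particular, the quoted loop warns, sleeps and retries indefinitely, whereas this version gives up after a bounded number of hard failures so that the failure can be observed.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create "<dir>/<usec>.<pid>" exclusively, retrying on name collisions.
 * Gives up after max_tries failures that are not collisions (an assumption
 * of this sketch). Returns an open descriptor, or -1 with errno left at
 * the value set by the last failed open(2).
 */
int create_unique(const char *dir, int max_tries, char *path, size_t path_len)
{
    struct timeval tv;
    int fd;
    int failures = 0;

    assert(dir != NULL && path != NULL && path_len > 0);
    while (failures < max_tries) {
        gettimeofday(&tv, NULL);
        snprintf(path, path_len, "%s/%d.%d", dir,
                 (int) tv.tv_usec, (int) getpid());
        fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
        if (fd >= 0)
            return fd;
        if (errno == EEXIST || errno == EISDIR)
            continue;           /* name collision: retry with a new timestamp */
        failures++;             /* e.g. ENOENT when the directory is missing */
    }
    return -1;
}
```

When `dir` does not exist, every attempt fails with ENOENT ("No such file or directory"), which is the condition under which the quoted loop would keep logging warnings and sleeping.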

      --
      Viktor.
    • Quanah Gibson-Mount
      Message 2 of 7 , Sep 1, 2011
        --On Thursday, September 01, 2011 2:03 PM -0400 Victor Duchovni
        <Victor.Duchovni@...> wrote:

        > So the question is what is it that is causing postdrop to loop while
        > trying to create the queue file?
        >
        > /*
        >  * Create a file with a temporary name that does not collide. The
        >  * process ID alone is not sufficiently unique: maildrops can be
        >  * shared via the network. Not that I recommend using a network-based
        >  * queue, or having multiple hosts write to the same queue, but we
        >  * should try to avoid losing mail if we can.
        >  *
        >  * If someone is racing against us, try to win.
        >  */
        > for (;;) {
        >     GETTIMEOFDAY(tp);
        >     vstring_sprintf(temp_path, "%s/%d.%d", queue_name,
        >                     (int) tp->tv_usec, pid);
        >     if ((fd = open(STR(temp_path), O_RDWR | O_CREAT | O_EXCL, mode)) >= 0)
        >         break;
        >     if (errno == EEXIST || errno == EISDIR)
        >         continue;
        >     msg_warn("%s: create file %s: %m", myname, STR(temp_path));
        >     sleep(10);
        > }
        >
        > Are the "create file" warnings found in the system log?

        Yes:

        Mar 22 19:24:52 domain postfix/postdrop[3624]: warning: mail_queue_enter:
        create file maildrop/976917.3624: No such file or directory

        for example.

        However, what is odd about this is we have postfix explicitly use a queue
        directory that is always present (/opt/zimbra/data/postfix/spool/), so it
        shouldn't be encountering any errors creating a file. :/

        I was also wrong about the shutdown order -- We shut down postfix first, and
        then the other services.

        --Quanah

        --

        Quanah Gibson-Mount
        Sr. Member of Technical Staff
        Zimbra, Inc
        A Division of VMware, Inc.
        --------------------
        Zimbra :: the leader in open source messaging and collaboration
      • Wietse Venema
        Message 3 of 7 , Sep 1, 2011
          Victor Duchovni:
          > On Wed, Aug 31, 2011 at 07:58:55PM -0400, Wietse Venema wrote:
          >
          > > > This is extremely difficult to reproduce, but it does happen occasionally
          > > > -- We will tell postfix to stop, and once that is complete, a "postdrop"
          > > > process will sometimes remain, and will run until it is manually killed.
          > > >
          > > > Is this an expected behavior of postdrop -- That after the master postfix
          > > > is stopped, it is expected sometimes that it may continue running,
          > > > regardless?
          > >
          > > This is 100% intentional. The Postfix sendmail command MUST NOT
          > > drop mail on the floor while the mail system is down.
          >
          > Well, yes, postdrop(1) is expected to reliably enqueue mail, even when
          > the mail system is down. This said, it is not really expected to enter
          > an infinite loop!

          Well, yes, one is not supposed to remove the submission directory and
          ignore postdrop error messages.

          If people use Postfix, then at least they have a chance to re-create
          the missing directory or permissions, and avoid losing mail.

          Wietse
        • Victor Duchovni
          Message 4 of 7 , Sep 1, 2011
            On Thu, Sep 01, 2011 at 11:26:48AM -0700, Quanah Gibson-Mount wrote:

            > > msg_warn("%s: create file %s: %m", myname, STR(temp_path));
            > >
            > >Are the "create file" warnings found in the system log?
            >
            > Yes:
            >
            > Mar 22 19:24:52 domain postfix/postdrop[3624]: warning:
            > mail_queue_enter: create file maildrop/976917.3624: No such file or
            > directory
            >
            > for example.

            So, most likely the "maildrop" directory is no longer present, or the
            queue directory itself has been moved, unmounted, ... The postdrop(1)
            process performs a chdir(2) to the queue_directory, so if that is
            replaced, it won't find a maildrop sub-directory...
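
The effect can be reproduced with a small sketch, assuming only what is described above: the process changes into a queue directory and then creates files through the relative path "maildrop/...". The function name `probe_maildrop` and the file name `probe.tmp` are hypothetical and not part of Postfix. Note that the function changes the working directory of the calling process.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * chdir(2) into queue_dir, then try to create a file exclusively via the
 * relative path "maildrop/probe.tmp". Returns 0 on success (the probe
 * file is removed again), otherwise the errno of the failing call.
 */
int probe_maildrop(const char *queue_dir)
{
    int fd;

    assert(queue_dir != NULL);
    if (chdir(queue_dir) != 0)
        return errno;
    fd = open("maildrop/probe.tmp", O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return errno;           /* ENOENT when there is no maildrop subdirectory */
    close(fd);
    unlink("maildrop/probe.tmp");
    return 0;
}
```

If the directory the process changed into has no "maildrop" sub-directory, the open fails with ENOENT, which matches the "No such file or directory" text and the relative path in the logged warning.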

            > However, what is odd about this is we have postfix explicitly use a
            > queue directory that is always present
            > (/opt/zimbra/data/postfix/spool/), so it shouldn't be encountering
            > any errors creating a file. :/

            Either this claim is inaccurate, or main.cf was briefly modified to
            cause postdrop(1) to use the wrong directory, ...

            Make sure you are checking the correct instance (generally the default
            one with sendmail/postdrop).

            --
            Viktor.