Loading ...
Sorry, an error occurred while loading the content.

277663Re: internal protocol error from sendmail for messages > message_size_limit

Expand Messages
  • Paul B. Henson
    Jun 2, 2011
      On Wed, Jun 01, 2011 at 05:45:31PM -0700, Wietse Venema wrote:

      > Postfix hasn't changed, as far as I can tell. Perhaps something on
      > your system has changed.

      We did update a number of components, and after viewing postfix
      changelogs and source code without anything popping out, we went back to
      a system with the original config and updated only postfix (and db,
      which the packaging system pulled in), and verified the same change in
      behavior before I posted.

      > I notice that your postdrop aborts before it logs the "File too
      > large" warning. Apparently the program is killed, and that explains
      > why sendmail logs the protocol error.

      The postdrop executable appears to exit normally and successfully send a
      status back to sendmail:

      [pid 24094] close(4) = 0
      [pid 24094] write(1, "status\0001\0reason\0\0\0", 18) = 18
      [pid 24094] exit_group(1) = ?

      For comparison, this is strace output from the older version of postfix:

      [pid 2227] close(4) = 0
      [pid 2227] write(2, "postdrop: warning: uid=0: File t"..., 41postdrop:
      warning: uid=0: File too large
      [pid 2227] gettimeofday({1307042952, 7166}, NULL) = 0
      [pid 2227] sendto(3, "<20>Jun 2 19:29:12 postfix/post"..., 75,
      MSG_NOSIGNAL, NULL, 0) = 75
      [pid 2227] rt_sigaction(SIGINT, {SIG_IGN, [INT],
      SA_RESTORER|SA_RESTART, 0x7fba90647230}, {0x402e50, [INT],
      SA_RESTORER|SA_RESTART, 0x7fba90647230}, 8) = 0
      [pid 2227] rt_sigaction(SIGQUIT, {SIG_IGN, [QUIT],
      SA_RESTORER|SA_RESTART, 0x7fba90647230}, {0x402e50, [QUIT],
      SA_RESTORER|SA_RESTART, 0x7fba90647230}, 8) = 0
      [pid 2227] rt_sigaction(SIGTERM, {SIG_IGN, [TERM],
      SA_RESTORER|SA_RESTART, 0x7fba90647230}, {0x402e50, [TERM],
      SA_RESTORER|SA_RESTART, 0x7fba90647230}, 8) = 0
      [pid 2227] rt_sigaction(SIGHUP, {SIG_IGN, [HUP],
      SA_RESTORER|SA_RESTART, 0x7fba90647230}, {0x402e50, [HUP],
      SA_RESTORER|SA_RESTART, 0x7fba90647230}, 8) = 0
      [pid 2227] unlink("maildrop/088D870065") = 0
      [pid 2227] write(1, "status\0004\0reason\0\0\0", 18) = 18
      [pid 2227] exit_group(4) = ?

      The new version returns a status of 1 rather than 4, and does not print
      out an error or send one to syslog.

      Here's the tail end of the sendmail process with the new version:

      [pid 24078] write(5, "testN\4testN\4testN\4testN\4testN\4te"..., 2964
      <unfinished ...>
      [pid 24078] <... write resumed> ) = 2964
      [pid 24078] --- SIGCHLD (Child exited) @ 0 (0) ---
      [pid 24078] read(5, "status\0001\0reason\0\0\0", 4096) = 18
      [pid 24078] close(5) = 0
      [pid 24078] wait4(24094, [{WIFEXITED(s) && WEXITSTATUS(s) == 1}], 0,
      NULL) = 24094
      [pid 24078] write(2, "sendmail: fatal: root(0): intern"..., 50sendmail:
      fatal: root(0): internal protocol error
      [pid 24078] sendto(3, "<18>Jun 2 06:57:10 postfix/send"..., 85,
      MSG_NOSIGNAL, NULL, 0) = 85
      [pid 24078] exit_group(75) = ?


      And the old:

      [pid 2220] write(5, "testN\4testN\4testN\4testN\4testN\4te"..., 2964
      <unfinished ...>
      [pid 2220] <... write resumed> ) = 2964
      [pid 2220] read(5, <unfinished ...>
      [pid 2220] <... read resumed> "status\0004\0reason\0\0\0", 4096) = 18
      [pid 2220] close(5 <unfinished ...>
      [pid 2220] <... close resumed> ) = 0
      [pid 2220] wait4(2227, [{WIFEXITED(s) && WEXITSTATUS(s) == 4}], 0,
      NULL) = 2227
      [pid 2220] --- SIGCHLD (Child exited) @ 0 (0) ---
      [pid 2220] write(2, "sendmail: fatal: root(0): messag"..., 47sendmail:
      fatal: root(0): message file too big
      [pid 2220] gettimeofday({1307042952, 12857}, NULL) = 0
      [pid 2220] sendto(3, "<18>Jun 2 12:29:12 postfix/send"..., 81,
      MSG_NOSIGNAL, NULL, 0) = 81
      [pid 2220] exit_group(75) = ?


      The postdrop process in the new version does seem to exit before
      sendmail reads the status, as opposed to the old version, but that might
      just be a timing issue with the trace. There's no apparent failure of
      the postdrop process, it seems to be intentionally sending an status of
      1 and voluntarily exiting.

      Any other thoughts? What's the best way to turn on debugging for
      postdrop? The faq mentions updating master.cf for daemons, but I didn't
      see a way to pass a couple -v's to postdrop. Woops, cancel that, looks
      like running sendmail with -v passes that on to postdrop.

      Debug output from new version:

      postdrop: rec_get: type N len 4 data test
      postdrop: rec_get: type X len 0 data
      postdrop: rec_get: type R len 20 data henson@csu
      postdrop: rec_get: type E len 0 data
      postdrop: vstream_fflush_some: fd 4 flush 4096
      postdrop: send attr status = 1
      postdrop: send attr reason =
      postdrop: vstream_fflush_some: fd 1 flush 18
      sendmail: vstream_buf_get_ready: fd 5 got 18
      sendmail: /usr/sbin/postdrop -r -v -v -v -v: wanted attribute: status
      sendmail: input attribute name: status
      sendmail: input attribute value: 1
      sendmail: /usr/sbin/postdrop -r -v -v -v -v: wanted attribute: (list
      terminator)
      sendmail: input attribute name: reason
      sendmail: input attribute value: (end)
      sendmail: /usr/sbin/postdrop -r -v -v -v -v: wanted attribute: (list
      terminator)
      sendmail: input attribute name: (end)
      sendmail: fatal: root(0): internal protocol error

      From old version:

      postdrop: rec_get: type N len 4 data test
      postdrop: rec_get: type X len 0 data
      postdrop: rec_get: type R len 20 data henson@csu
      postdrop: rec_get: type E len 0 data
      postdrop: vstream_fflush_some: fd 4 flush 4096
      postdrop: vstream_fflush_some: fd 4 flush 4096
      postdrop: warning: uid=0: File too large
      postdrop: send attr status = 4
      postdrop: send attr reason =
      postdrop: vstream_fflush_some: fd 1 flush 18
      sendmail: vstream_buf_get_ready: fd 5 got 18
      sendmail: /usr/sbin/postdrop -r -v -v -v: wanted attribute: status
      sendmail: input attribute name: status
      sendmail: input attribute value: 4
      sendmail: /usr/sbin/postdrop -r -v -v -v: wanted attribute: (list
      terminator)
      sendmail: input attribute name: reason
      sendmail: input attribute value: (end)
      sendmail: /usr/sbin/postdrop -r -v -v -v: wanted attribute: (list
      terminator)
      sendmail: input attribute name: (end)
      sendmail: fatal: root(0): message file too big

      The only difference I see is the new version only has 1
      vstream_fflush_some, while the old version has 2. And the new version
      reports a status of 1 rather than 4.

      Thanks...


      --
      Paul B. Henson | (909) 979-6361 | http://www.csupomona.edu/~henson/
      Operating Systems and Network Analyst | henson@...
      California State Polytechnic University | Pomona CA 91768
    • Show all 8 messages in this topic