Re: sporadic bouts of lost connections to exchange 2010 hub transport

  • Stan Hoeppner
    Message 1 of 10, Sep 25, 2012
      On 9/25/2012 8:29 AM, Ralf Hildebrandt wrote:
      > * Mikael Bak <mbak@...>:
      >> Hi Stan,
      >>
      >> On 09/25/2012 08:22 AM, Stan Hoeppner wrote:
      >>>
      >>> Apparently Linux and Windows TCP window scaling doesn't always work
      >>> reliably together. Try disabling TCP window scaling on the Linux box(en):
      >>>
      >> [snip]
      >>
      >> Perhaps off topic, but do you have any links to documents or similar
      >> that prove that there is a problem between the two operating systems
      >> with regard to TCP window scaling? This is the first time I've heard
      >> about this, to be honest.
      >
      > I was wondering about this as well. I mean, it doesn't happen THAT
      > often.

      First, this does seem to be a rare issue. Given the behavior you're
      seeing it seems likely the problem is in the TCP stack. TCP window
      scaling mis-negotiation simply seems a likely culprit. Linux kernels
      have a workaround hack for window scaling issues:

      man 7 tcp

      tcp_workaround_signed_windows (Boolean; default: disabled;
      since Linux 2.6.26)

      If enabled, assume that no receipt of a window-scaling
      option means that the remote TCP is broken and treats
      the window as a signed quantity. If disabled, assume
      that the remote TCP is not broken even if we do not
      receive a window scaling option from it.
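
      Before changing anything it helps to know what the Linux box is
      currently set to. The following is a minimal sketch, assuming the
      standard procfs location /proc/sys/net/ipv4 and the two setting names
      documented in tcp(7); the function name and the `base` parameter are
      only illustrative, not part of any existing tool.

```python
from pathlib import Path

# Standard procfs location for IPv4 sysctls on Linux (assumed to be mounted).
SYSCTL_DIR = Path("/proc/sys/net/ipv4")

# The two settings discussed in this thread, both documented in tcp(7).
SETTINGS = ("tcp_window_scaling", "tcp_workaround_signed_windows")


def read_tcp_settings(base=SYSCTL_DIR):
    """Return {name: int value}, or None where the file is missing or unreadable."""
    result = {}
    for name in SETTINGS:
        try:
            result[name] = int((Path(base) / name).read_text().strip())
        except (OSError, ValueError):
            result[name] = None
    return result


if __name__ == "__main__":
    for name, value in read_tcp_settings().items():
        print(f"net.ipv4.{name} = {value}")
```

      A value of 1 means the setting is enabled and 0 means it is disabled.
      For a test, window scaling can be turned off at runtime as root with
      `sysctl -w net.ipv4.tcp_window_scaling=0`; that change does not survive
      a reboot unless it is also added to the sysctl configuration.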

      To me this seems a partial workaround, not a complete fix, which is why I
      recommended testing with window scaling totally disabled on one side of
      the connection. Since window scaling is designed to maximize throughput
      for streaming data transfer applications such as FTP, disabling it will
      have little, if any, negative impact on SMTP traffic, which is
      transactional and bursty in nature. Disabling window scaling in your
      Postfix/Exchange case should simply limit both hosts to the 64KB maximum
      window that applies without the RFC 1323 window scale option. If the
      problem is window negotiation, disabling it should fix the problem.
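
      To put a number on "little, if any, negative impact", here is a rough
      sketch of the arithmetic. The TCP header carries a 16-bit window field,
      and the window scale option left-shifts it by at most 14 bits. A single
      connection cannot move more than one window per round trip. The 1 ms
      round-trip time below is an assumed LAN figure, not a measurement from
      this setup, and the function names are illustrative.

```python
MAX_UNSCALED_WINDOW = 0xFFFF  # 16-bit window field: 65,535 bytes
MAX_SHIFT = 14                # largest shift count the window scale option allows


def effective_window(window_field, shift=0):
    """Receive window in bytes for a given header field and scale shift."""
    if not 0 <= window_field <= MAX_UNSCALED_WINDOW:
        raise ValueError("window field must fit in 16 bits")
    if not 0 <= shift <= MAX_SHIFT:
        raise ValueError("shift count must be between 0 and 14")
    return window_field << shift


def throughput_ceiling(window_bytes, rtt_seconds):
    """Upper bound in bytes per second: at most one window per round trip."""
    return window_bytes / rtt_seconds


if __name__ == "__main__":
    assumed_rtt = 0.001  # 1 ms, a hypothetical LAN round trip
    unscaled = effective_window(MAX_UNSCALED_WINDOW)
    ceiling = throughput_ceiling(unscaled, assumed_rtt)
    print(f"unscaled window: {unscaled} bytes")
    print(f"ceiling at {assumed_rtt * 1000:.0f} ms RTT: {ceiling / 1e6:.1f} MB/s")
```

      Under that assumption the unscaled ceiling is roughly 65 MB/s per
      connection, far above what typical SMTP transactions need. The ceiling
      drops as the round-trip time grows, so the picture is different on a
      high-latency WAN link.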

      The rarity of manifestation seems to indicate that on occasion you have
      long bursts of traffic between the two hosts--bursts of sufficient
      duration to cause one or both hosts to initiate window scaling to
      increase throughput. When this occurs, and if negotiation fails, you
      may see things break at the application level.

      Regarding docs or links, I couldn't find any official documentation
      describing this issue, only a few scattered forum posts, which is likely
      directly related to the rarity of occurrence.

      You could always put a trace on the Linux ethernet interface to confirm
      the TCP problem. But given the rarity of occurrence, twice in 4 weeks,
      that would yield a rather large file to search.
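
      One way to keep such a trace manageable is to capture headers only,
      restrict the capture to the SMTP conversation with the Exchange host,
      and let tcpdump rotate through a fixed set of files. The sketch below
      only builds the command line; it does not run it. The interface name,
      peer address, and output file name are placeholders, and the helper
      function itself is hypothetical.

```python
import shlex


def build_capture_command(interface, peer, port=25, file_mb=100,
                          file_count=10, out="smtp-trace.pcap"):
    """Build a tcpdump argv that writes a ring buffer of capture files."""
    capture_filter = f"host {peer} and tcp port {port}"
    return [
        "tcpdump",
        "-i", interface,        # interface to listen on
        "-n",                   # do not resolve addresses to names
        "-s", "128",            # snapshot length: headers and TCP options only
        "-C", str(file_mb),     # start a new file after this many million bytes
        "-W", str(file_count),  # keep this many files, then overwrite the oldest
        "-w", out,              # base name for the capture files
        capture_filter,
    ]


if __name__ == "__main__":
    # eth0 and 192.0.2.10 are placeholders for the real interface and peer.
    print(shlex.join(build_capture_command("eth0", "192.0.2.10")))
```

      With the defaults shown, disk usage stays at roughly 1 GB no matter how
      long the capture runs, so it can be left in place until the next
      incident and stopped shortly afterwards.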

      --
      Stan