PATCH (lost connection with domain while...)
- Wietse Venema:
> Are you perhaps behind a NAT gateway? This may expire the connection
> from its tables too early. Such boxes tend to be optimized for
> short-lived http connections which is bad for email.
> Is the remote SMTP server behind a NAT gateway?
> In either case, it may help to turn on keep-alives.
> For example, in FreeBSD:
> sysctl -w net.inet.tcp.keepidle=100000
Linux specifies the interval in seconds:
sysctl -w net.ipv4.tcp_keepalive_time=100
Solaris specifies it in milliseconds, like *BSD:
ndd -set /dev/tcp tcp_keepalive_interval 100000
Linux sends keepalive probes only after an application turns on
the SO_KEEPALIVE option on a socket.
I suppose Solaris has the same behavior.
To turn on the SO_KEEPALIVE in Postfix, see attached patches for
Postfix 2.3, and for 2.4 and later. It takes an existing workaround
for Solaris, and turns it on for all platforms.
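For readers without the attachments at hand: the following is not the
patch itself, only a minimal sketch of what turning on SO_KEEPALIVE for
one socket looks like in C. The helper names enable_keepalive() and
keepalive_enabled() are made up for this illustration and do not appear
in Postfix; fd is assumed to be a TCP socket descriptor.

```c
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not taken from the Postfix patch): ask the kernel
 * to send keepalive probes on an otherwise idle TCP connection.
 * Returns 0 on success, -1 on error with errno set by setsockopt().
 */
int enable_keepalive(int fd)
{
    int on = 1;

    return setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, (void *) &on, sizeof(on));
}

/*
 * Hypothetical helper: report whether SO_KEEPALIVE is set on fd.
 * Returns 1 if set, 0 if not set, -1 on error.
 */
int keepalive_enabled(int fd)
{
    int       value = 0;
    socklen_t len = sizeof(value);

    if (getsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, (void *) &value, &len) < 0)
        return -1;
    return value != 0;
}
```

The socket option only switches probing on. How long the connection must
be idle before the first probe is sent still comes from the system-wide
kernel settings shown above.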
> Are you perhaps behind a NAT gateway? This may expire the connection
> from its tables too early. Such boxes tend to be optimized for
> short-lived http connections which is bad for email.
I'm not behind a NAT gateway.
> Is the remote SMTP server behind a NAT gateway?
The remote SMTP servers include hundreds of servers such as verizon.net,
yahoo, many .edus, gmail, etc.
> In either case, it may help to turn on keep-alives.
Current Solaris setting:
> ndd -get /dev/tcp tcp_keepalive_interval
7200000
> This is currently not built into Postfix.
I have not installed the Postfix patch that you provided in the separate
message. I'm just wondering how my inbound SMTP servers could have been
running for three + years without this patch or problem. How could it
be necessary all of a sudden?
> Are there large messages that DON'T fail?
Yes, many large messages have no problems. Oddly enough this seems to
happen when a message contains a .vcf or .html file attachment.
> How many EMAIL MESSAGES are you sending in parallel?
default_destination_concurrency_limit = 20
- Hargis, Mandy:
> > ndd -get /dev/tcp tcp_keepalive_interval
> 7200000
On Solaris you don't need the patch. Postfix keepalives are already
turned on to work around kernel bugs.
However 7200000 milliseconds is two hours and that won't make a
difference if the problem is caused by NAT boxes with too short
timeouts. Try 10s and see if it makes a difference.
ndd -set /dev/tcp tcp_keepalive_interval 10000
> I have not installed the Postfix patch that you provided in the separate
> message. I'm just wondering how my inbound SMTP servers could have been
> running for three + years without this patch or problem. How could it
> be necessary all of a sudden?
I suppose that if Postfix didn't change, then something else did.
Either this, or the problem already existed and you just didn't
know about it....
If you experience this problem with many sites, then it is
very likely that the problem is at your end of the world.
This is another reason why I suspect that something in your
infrastructure was changed recently.