Message-ID: <4DF0E638.2010506@uth.tmc.edu>
Date: Thu, 9 Jun 2011 10:26:48 -0500
From: Charles Bearden <Charles.F.Bearden@....tmc.edu>
To: <netdev@...r.kernel.org>
Subject: TCP keepalives ignored by kernel when they contain timestamps
I have come across a case that looks like it might be a kernel bug. It appears
that TCP keepalives sent by a remote system are ignored when they contain TCP
timestamps, but are ACKed when they don't. When they are ignored, the remote
system resets the connection after a number of retries.
I have replicated this problem on both Ubuntu 10.04 with a 2.6.32-32-server
kernel (x86_64) and CentOS 5.6 with a 2.6.18-238.12.1.el5 kernel. I'm sorry that
I haven't had a chance to try to replicate the bug with a newer kernel, though a
co-worker has looked through changelogs for more recent kernels and didn't find
anything that looked relevant.
From either of these hosts I run an application that stays connected to a remote
host for 2-3 minutes and, for most of that time, exchanges no application data
with it. After 30 seconds of no data from the Linux host, the remote host
sends a garden variety keepalive. When the remote host includes TCP timestamps
in its keepalives, the Linux host ignores them, and the remote host resets the
connection after 10 unACKed keepalives. When timestamps are absent from the
keepalives, the Linux host ACKs each one, and all is copacetic.
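For anyone who wants to generate similar keepalive traffic from a Linux peer
instead of a Windows one, here is a minimal Python sketch. It is not what the
remote host runs. The function name enable_keepalive is made up for
illustration, the 30-second idle time and 10 probes mirror the behaviour
described above, and the 1-second probe interval is only a placeholder, since I
have not stated the real interval.

```python
import socket


def enable_keepalive(sock, idle_s=30, interval_s=1, probes=10):
    """Turn on TCP keepalives for sock and return the settings applied.

    idle_s and probes mirror the 30 s / 10 probes described in the post;
    interval_s is an assumed placeholder value.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    applied = {"SO_KEEPALIVE": 1}
    # The per-socket tuning options are platform specific (present on Linux),
    # so each one is applied only if this Python build exposes it.
    for name, value in (
        ("TCP_KEEPIDLE", idle_s),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval_s),  # seconds between unanswered probes
        ("TCP_KEEPCNT", probes),        # unanswered probes before giving up
    ):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
            applied[name] = value
    return applied
```

Call enable_keepalive(sock) before or after connect(); whether the probes carry
timestamps then depends on what was negotiated in the handshake.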
Text output of a tcpdump trace of a connection that fails:
http://pastebin.com/v6CpteJ9
Text output of a tcpdump trace of a connection that succeeds:
http://pastebin.com/KVLb3Mzh
More details, in case you think they are relevant:
My application creates a JDBC connection to a remote MS SQL Server and
executes a statement that does not return a result set, and so it doesn't
need to pass application data back and forth while it executes. The
statement takes 2 or 3 minutes to complete. I connect to two different
remote hosts: a Win2003 machine, and a Win2008R2 machine. The Win2003
machine doesn't put timestamps in its keep-alives, so the application
completes successfully when connecting to that host. If tcp timestamps
are enabled on the Linux host, the Win2008 host includes them in its
keepalives, and they are unACKed, so the connection is reset; if they
are disabled on the Linux host, the Win2008 host doesn't include them in
the keepalives, and the application completes successfully. I use (as
you might expect) sysctl to disable tcp timestamps on the Linux hosts.
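To record which timestamp setting was in effect for a given trace, something
like the following Python sketch could be run on the Linux host. It reads the
procfs file behind the net.ipv4.tcp_timestamps sysctl; the helper names are
hypothetical. Changing the value still needs root, for example with
"sysctl -w net.ipv4.tcp_timestamps=0".

```python
from pathlib import Path

# procfs view of the net.ipv4.tcp_timestamps sysctl on Linux.
TCP_TIMESTAMPS_PATH = Path("/proc/sys/net/ipv4/tcp_timestamps")


def parse_tcp_timestamps(text):
    """Return True if the sysctl value means timestamps are enabled (non-zero)."""
    return int(text.strip()) != 0


def tcp_timestamps_enabled(path=TCP_TIMESTAMPS_PATH):
    """Read the sysctl file and report whether TCP timestamps are enabled."""
    return parse_tcp_timestamps(Path(path).read_text())
```

On a Linux host, print(tcp_timestamps_enabled()) reports the current state;
the path argument exists only so the function can be pointed at another file.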
I have dumps for all permutations of CentOS/Ubuntu, Win200[38], and +/-
timestamps on the Linux side, and I will post them if the developers think that
they would be useful.
Thanks,
Chuck Bearden
Programmer Analyst IV
The University of Texas Health Science Center at Houston
School of Biomedical Informatics
Email: Charles.F.Bearden@....tmc.edu