Message-ID: <20130919221850.77620129@samsung-9>
Date: Thu, 19 Sep 2013 22:18:50 -0700
From: Stephen Hemminger <stephen@...workplumber.org>
To: netdev@...r.kernel.org
Subject: Fw: [Bug 61681] New: Incoming TCP4 connections fail to start, don't
get past SYN_RECV and then quickly disappear
Begin forwarded message:
Date: Thu, 19 Sep 2013 09:42:15 -0700
From: "bugzilla-daemon@...zilla.kernel.org" <bugzilla-daemon@...zilla.kernel.org>
To: "stephen@...workplumber.org" <stephen@...workplumber.org>
Subject: [Bug 61681] New: Incoming TCP4 connections fail to start, don't get past SYN_RECV and then quickly disappear
https://bugzilla.kernel.org/show_bug.cgi?id=61681
Bug ID: 61681
Summary: Incoming TCP4 connections fail to start, don't get
past SYN_RECV and then quickly disappear
Product: Networking
Version: 2.5
Kernel Version: Linux xxxxxx 3.4.57-48.42.amzn1.x86_64 #1 SMP Mon Aug
12 21:43:36 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
Hardware: IA-64
OS: Linux
Tree: Mainline
Status: NEW
Severity: normal
Priority: P1
Component: IPV4
Assignee: shemminger@...ux-foundation.org
Reporter: dcrooke@...il.com
Regression: No
This bug appears to be very rare but is entirely real, and it dates back a long
time. I tried to debug it thoroughly, looking at both kernel and web server
settings, and then got down to looking at netstat output.
The Linux kernel can sometimes get into a state where it fails to complete
approximately 98% of incoming TCP connection attempts and correctly processes
only about 2%. These numbers may be relevant, since others have reported the
same "1 in 50" ratio on much older kernels over the years.
I did not get a chance to capture traffic with iptables / pcap / Wireshark
(this is a production box, so we gave up quickly and tried a reboot), but other
folks with the same issue indicate that Linux is sending the wrong
acknowledgment number (that is, the wrong value for the client's sequence
number) back in the SYN-ACK packet, and the client simply drops it. My
experience is that the half-formed connection is torn down almost immediately;
I was running netstat in a continuous loop to see this. Others have observed
that their clients send RST in response to the malformed SYN-ACK.
http://serverfault.com/questions/297134/server-not-sending-a-syn-ack-packet-in-response-to-a-syn-packet
http://ask.wireshark.org/questions/23885/rst-after-syn-ack
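A minimal sketch of the kind of polling described above, for anyone who wants to watch half-open connections without netstat. It is not the command the reporter ran. It assumes a Linux box where /proc/net/tcp is readable, and the function names (count_tcp_states, watch_syn_recv) are made up for illustration. The state codes are the ones the kernel prints in the "st" column.

```python
import time
from collections import Counter

# Kernel TCP state codes as printed (in hex) in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_tcp_states(proc_net_tcp_text):
    """Count sockets per TCP state from the text of /proc/net/tcp."""
    counts = Counter()
    # The first line is the column header; skip it.
    for line in proc_net_tcp_text.splitlines()[1:]:
        fields = line.split()
        if len(fields) < 4:
            continue  # blank or truncated line
        code = fields[3].upper()
        counts[TCP_STATES.get(code, code)] += 1
    return counts


def watch_syn_recv(samples=10, interval=0.5, path="/proc/net/tcp"):
    """Hypothetical helper: print the SYN_RECV count a fixed number of times."""
    for _ in range(samples):
        with open(path) as f:
            counts = count_tcp_states(f.read())
        print(counts.get("SYN_RECV", 0), "sockets in SYN_RECV")
        time.sleep(interval)
```

Calling watch_syn_recv() on an affected machine would show whether entries appear in SYN_RECV and vanish between samples. A short interval is needed, since the entries are reported to disappear quickly.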
For us, the problem went away after a reboot and has so far stayed away, so I
wonder whether it is a function of cumulative traffic. However, TCP sequence
number wraparound on the Linux end shouldn't cause this, as far as I can tell:
the SYN-ACK should simply acknowledge the sequence number that came in the SYN
packet (that number plus one).
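To make the expectation concrete, here is a small sketch of the arithmetic involved. It assumes only the standard TCP rule that a SYN consumes one sequence number, so the SYN-ACK must acknowledge the client's initial sequence number plus one, modulo 2^32. The function names are illustrative and are not kernel code.

```python
SEQ_MOD = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def expected_synack_ack(client_isn):
    """Acknowledgment number a correct SYN-ACK carries for a given client ISN."""
    return (client_isn + 1) % SEQ_MOD


def client_accepts_synack(client_isn, synack_ack):
    """A client that has sent only a SYN accepts exactly one ack value.

    Any other value is treated as unacceptable, and the client answers
    with RST instead of completing the handshake.
    """
    return synack_ack == expected_synack_ack(client_isn)
```

Wraparound is harmless under this rule: a client ISN of 0xFFFFFFFF simply expects an ack of 0. A SYN-ACK carrying any other value would draw the RST that others have observed.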
A number of people have had very similar-looking issues due to a broken
multi-path network config or a broken NAT device. That is evidently not the
case here: Amazon knows how to do IT, this box has only one interface, and in
any case the Linux kernel is still responsible for the acknowledgment number it
puts in its reply.
--
You are receiving this mail because:
You are the assignee for the bug.