Message-Id: <200806151537.11986.didier@raboud.com>
Date: Sun, 15 Jun 2008 15:37:09 +0200
From: Didier Raboud <didier@...oud.com>
To: "Ilpo Järvinen" <ilpo.jarvinen@...sinki.fi>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Netdev <netdev@...r.kernel.org>, bugme-daemon@...zilla.kernel.org
Subject: Re: [Bugme-new] [Bug 10903] New: ssh connections hang with 2.6.26-rc5
On Saturday 14 June 2008 at 22:45:41, Ilpo Järvinen wrote:
> On Fri, 13 Jun 2008, Andrew Morton wrote:
> > (switched to email. Please respond via emailed reply-to-all, not via the
> > bugzilla web interface).
OK.
> > On Fri, 13 Jun 2008 02:39:17 -0700 (PDT) bugme-daemon@...zilla.kernel.org wrote:
> > > http://bugzilla.kernel.org/show_bug.cgi?id=10903
> > >
> > > Summary: ssh connections hang with 2.6.26-rc5
> > > Product: Networking
> > > Version: 2.5
> > > KernelVersion: 2.6.26-rc5
> > > Platform: All
> > > OS/Version: Linux
> > > Tree: Mainline
> > > Status: NEW
> > > Severity: normal
> > > Priority: P1
> > > Component: Other
> > > AssignedTo: acme@...stprotocols.net
> > > ReportedBy: didier@...oud.com
> > >
> > >
> > > Latest working kernel version: 2.6.25-2
> > > Earliest failing kernel version: 2.6.26-rc5
> > > Distribution: Debian (Lenny + Sid)
> > > Hardware Environment: amd64 (Dell Latitude D630)
> > > Software Environment: KDE
> > > Problem Description:
> > >
> > > With kernel version 2.6.26-rc5, ssh connections to remote servers
> > > randomly hang (no error message). Enabling "ServerAliveInterval" on
> > > both sides brings no improvement.
>
> Thanks for reporting. Could you please clarify a couple of things:
Hi.
I will try to, as far as my time and knowledge allow.
> Does this only happen with a particular server/servers?
I have only tried with two of my home servers. One runs 2.6.22-4-686 and the
other 2.6.18-6-vserver-686.
> Any middleboxes in between (NAT, firewall, etc.)?
There is an ADSL router which "provides" internet access to the servers via
NAT. I have tried from "inside" the house (so on the same subnet) and from
outside: it hangs in both cases.
The common point is my use of "iwl3945": I have only ever tried the ssh
connections over WiFi.
> Do all ssh connections hang simultaneously?
Well... It is hard to say. As far as I have seen, no. When one connection
hangs, I can still successfully open a new connection to the same server.
> How long have you waited until concluding that TCP is "hung"?
Well. The "ServerAliveInterval" option of openssh now leads to "Received
disconnect from $IP: 2: Timeout, your session not responding." after the
hang. So the openssh server notices that my session is not responding and so
cuts the connection.
> Is TSO enabled (ethtool -k)? Have you tried without it?
It doesn't seem to be:
----
# ethtool -k wlan0
Offload parameters for wlan0:
Cannot get device rx csum settings: Operation not supported
Cannot get device tx csum settings: Operation not supported
Cannot get device scatter-gather settings: Operation not supported
Cannot get device tcp segmentation offload settings: Operation not supported
Cannot get device udp large send offload settings: Operation not supported
Cannot get device generic segmentation offload settings: Operation not supported
no offload info available
----
> It wouldn't hurt to include info about the eth hw too (e.g., lspci). It
> might turn out to be unneeded at some point, but it might also save an
> email round-trip.
lspci attached.
> TCP can appear to hang for a vast number of reasons. The only recent
> changes that are suspect are the DEFERRED_ACCEPT thing, which is already
> reverted in the very latest Linus' tree (even -rc6 is too old for that),
> and a few FRTO fixes (you can exclude FRTO by setting the
> /proc/sys/net/ipv4/tcp_frto sysctl to 0, but it seems quite unlikely to
> change anything); your problem might well come from something else, and
> the TCP hang may be just a symptom of another problem downstream.
I can't understand everything, but what I can say is that with the exact same
software, I get no hangs with 2.6.25-2 but I get some with 2.6.26-rc5.
> So please gather this information (at least for the relevant connections):
>
> $ netstat -pn
> $ cat /proc/net/tcp
Attached.
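For anyone reading the attached /proc/net/tcp dump, here is a minimal Python
sketch of how its columns can be decoded to pick out the ssh connections. The
function names and the sample row are made up for illustration (they are not
taken from the attachment), and the address decoding assumes a little-endian
machine such as the amd64 laptop in this report.

```python
import socket
import struct

# TCP state codes as they appear in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def decode_addr(hex_addr):
    """Turn 'HHHHHHHH:PPPP' into (dotted-quad IP, port).

    Assumption: the file was produced on a little-endian host, so the
    address bytes are stored in reverse order.
    """
    ip_hex, port_hex = hex_addr.split(":")
    ip = socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16)))
    return ip, int(port_hex, 16)


def parse_proc_net_tcp(text):
    """Parse the text of /proc/net/tcp (header line included)."""
    rows = []
    for line in text.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) < 7:
            continue
        tx_q, rx_q = (int(v, 16) for v in fields[4].split(":"))
        rows.append({
            "local": decode_addr(fields[1]),
            "remote": decode_addr(fields[2]),
            "state": TCP_STATES.get(fields[3], fields[3]),
            "tx_queue": tx_q,   # bytes queued for sending
            "rx_queue": rx_q,   # bytes received but not yet read
            "retransmits": int(fields[6], 16),
        })
    return rows


def ssh_connections(rows, port=22):
    """Keep only rows where either end uses the given port."""
    return [r for r in rows if port in (r["local"][1], r["remote"][1])]


# Made-up sample in the /proc/net/tcp layout, NOT data from this report.
SAMPLE = (
    "  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode\n"
    "   0: 0100007F:0277 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 1111 1\n"
    "   1: 0A01A8C0:C350 0101A8C0:0016 01 00000030:00000000 01:00000014 00000003  1000        0 2222 2\n"
)

for conn in ssh_connections(parse_proc_net_tcp(SAMPLE)):
    print(conn)
```

On a live system the same functions could be fed the contents of
/proc/net/tcp directly. A connection that stays ESTABLISHED with a non-empty
tx_queue and a growing retransmit count would at least be consistent with
data leaving the socket but never being acknowledged.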
> ...Also a tcpdump might be handy (though I don't know yet).
Well. It seems that there is another bug here: every time I ran
# tcpdump -w /tmp/tcpdump.wlan0 -i wlan0
I got a CPU lockup (or something similar; I can't tell exactly, but the
keyboard froze and nothing could be done).
> ...Depending on your privacy needs, you may want to obfuscate the IP
> addresses that are revealed by all of those logs (i.e., if you don't want
> to reveal whom you're communicating with; the ssh payload is encrypted
> anyway).
>
> (...)
>
> (I'll be away nearly a month after Tuesday, so I probably won't have much
> time to resolve this issue but I hope I've some time to take a look before
> I leave).
We'll see ;)
Regards,
OdyX
--
Didier Raboud, proud Debian user.
CH-1802 Corseaux
didier@...oud.com
View attachment "lspci" of type "text/plain" (2106 bytes)
View attachment "netstat_-pn" of type "text/plain" (31089 bytes)
View attachment "proc_net_tcp" of type "text/plain" (4351 bytes)
Download attachment "signature.asc " of type "application/pgp-signature" (198 bytes)