netdev - Re: scp stalls mysteriously

Message-ID: <alpine.DEB.2.00.0911300010340.15770@melkinpaasi.cs.helsinki.fi>
Date:	Mon, 30 Nov 2009 00:13:31 +0200 (EET)
From:	"Ilpo Järvinen" <ilpo.jarvinen@...sinki.fi>
To:	Frederic Leroy <fredo@...rox.org>
cc:	Netdev <netdev@...r.kernel.org>, Asdo <asdo@...ftmail.org>
Subject: Re: scp stalls mysteriously

On Sat, 28 Nov 2009, Ilpo Järvinen wrote:

> I restored Ccs. Please keep them.
> 
> On Sat, 28 Nov 2009, Frederic Leroy wrote:
> 
> > > On Sat, 28 Nov 2009 00:12:23 +0200 (EET),
> > > "Ilpo Järvinen" <ilpo.jarvinen@...sinki.fi> wrote:
> > 
> > > On Fri, 27 Nov 2009, Frederic Leroy wrote:
> > > 
> > > > I put traces of stall here : 
> > > > http://www.starox.org/pub/scp_stall/
> > >
> > > Your proc/net/tcp capture on houba was perhaps made too late? ...The 
> > > connection is missing already.
> > 
> > It could be ! I had a doubt while using my 2 keyboards ... 
> > 
> > > For information: for the pcaps, I filtered using "tcpdump ... ether
> > > host xx:xx:xx:xx:xx".
> > > I waited a bit after the stall and killed the scp with ctrl-c.
> > 
> > > But anyway, at least the problem is visible...
> > 
> > Great!
> > 
> > > > It seems that
> > > > 3998:4046 never gets retransmitted, not even by RTO, which seems very
> > > > very strange to me... And after this: 23:21:56.154269 IP
> > > 192.168.1.19.50028 > 192.168.1.15.22: . ack 3998 win 379 ... sack 3 
> > > {4238:4286}{4142:4190}{4046:4094}> also fast retransmit should have 
> > > already triggered. ...I'll look more into this if I can figure it out 
> > > from the current traces but it'll take a while.
> > 
> > > Would it help you if I made other traces?
> > > 
> > > I won't be available until Monday.
> 
> Perhaps having the /proc/net/tcp would at least tell what state the timer 
> is in (if I cannot reproduce right away). ...It is rather strange that two 
> independent mechanisms for loss recovery seem both to fail to get 
> triggered here, no traces of retransmission whatsoever. I think it is for 
> now enough to concentrate on what happens on 192.168.1.15 (=houba?) and 
> get tcpdump and proc/net/tcp from there, the other end/direction has very 
> little significance here (except for the fact that bidirectionality might 
> be needed to actually trigger it). You could even think of getting 
> proc/net/tcp a bit more often, right from the start:
> 
> while :; do grep ":0016" /proc/net/tcp; sleep 0.1; done | tee scp_stall-houba.x.proc_net_tcp
> 
> ...Please wait at least 2 minutes before hitting ctrl-c or otherwise 
> artificially intervening.
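
To make the timer state easier to read from such a capture, a small
helper along these lines might be useful. It is only a sketch: the name
decode_tcp_timers is made up here, and it assumes the usual IPv4
/proc/net/tcp column order (sl, local, remote, st, tx:rx queue,
tr:tm->when, retrnsmt, ...) and the usual meaning of the "tr" codes.
The "when" and "retrans" values are left in hex, as the kernel prints
them.

```shell
#!/bin/sh
# Sketch: decode the timer columns of /proc/net/tcp lines read on stdin.
# Assumed column order: sl local rem st tx:rx tr:tm->when retrnsmt ...
# Assumed "tr" codes: 00 none, 01 retransmit, 02 keepalive,
# 03 TIME_WAIT, 04 zero-window probe.
decode_tcp_timers() {
    awk '
    $1 ~ /^[0-9]+:$/ {               # skip the header line, keep socket rows
        split($6, t, ":")            # t[1] = timer code, t[2] = time left (hex)
        name = "unknown"
        if (t[1] == "00") name = "none"
        else if (t[1] == "01") name = "retransmit"
        else if (t[1] == "02") name = "keepalive"
        else if (t[1] == "03") name = "time-wait"
        else if (t[1] == "04") name = "zero-window-probe"
        printf "%s -> %s st=%s timer=%s when=0x%s retrans=0x%s\n", \
               $2, $3, $4, name, t[2], $7
    }'
}
```

It could be fed from the saved capture or live, for example:
grep ":0016" /proc/net/tcp | decode_tcp_timers
A stalled sender showing timer=none with data still outstanding would
be the interesting case here.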

So far I've had no luck reproducing exactly the same scenario as yours.
However, I'm currently working on another problem I found, related to
excess growth in the RTT estimator, which is enough to give me a
temporary but long-lasting "- stalled -" with scp. That growth happens
only with timestamps, so if I disable them the transfer goes better.
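
To illustrate why an inflated RTT estimate can look like a stall, here
is a rough sketch of the textbook estimator from RFC 6298 (alpha = 1/8,
beta = 1/4, RTO = SRTT + 4 * RTTVAR). It is not the kernel's
fixed-point code and it does not reproduce the growth problem itself.
The function name rto_trace is made up, and the clock-granularity term
and the RTO clamps are left out.

```shell
#!/bin/sh
# Sketch: read RTT samples (ms, one per line) on stdin and print the
# smoothed RTT, RTT variance and resulting RTO after each sample,
# following the RFC 6298 update rules (not the Linux implementation).
rto_trace() {
    awk '
    NR == 1 { srtt = $1; rttvar = $1 / 2 }      # first sample initialises both
    NR > 1  {
        err = srtt - $1; if (err < 0) err = -err
        rttvar = 0.75 * rttvar + 0.25 * err     # variance uses the old srtt
        srtt   = 0.875 * srtt + 0.125 * $1
    }
    { printf "rtt=%d srtt=%.1f rttvar=%.1f rto=%.1f\n", \
             $1, srtt, rttvar, srtt + 4 * rttvar }'
}
```

For example, printf '100\n100\n500\n' | rto_trace shows the RTO jumping
from 250.0 to 662.5 ms after a single large sample. If the samples keep
growing, the retransmission timer backs off far enough that a lost
segment leaves the connection idle for a long time.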

-- 
 i.