Message-ID: <alpine.DEB.2.00.0911281257270.25587@melkinpaasi.cs.helsinki.fi>
Date: Sat, 28 Nov 2009 13:31:14 +0200 (EET)
From: "Ilpo Järvinen" <ilpo.jarvinen@...sinki.fi>
To: Frederic Leroy <fredo@...rox.org>
cc: Netdev <netdev@...r.kernel.org>, Asdo <asdo@...ftmail.org>
Subject: Re: scp stalls mysteriously
I restored Ccs. Please keep them.
On Sat, 28 Nov 2009, Frederic Leroy wrote:
> On Sat, 28 Nov 2009 00:12:23 +0200 (EET),
> "Ilpo Järvinen" <ilpo.jarvinen@...sinki.fi> wrote:
>
> > On Fri, 27 Nov 2009, Frederic Leroy wrote:
> >
> > > I put traces of stall here :
> > > http://www.starox.org/pub/scp_stall/
> >
> > Your proc/net/tcp capture on houba was perhaps made too late? ...The
> > connection is missing already.
>
> It could be ! I had a doubt while using my 2 keyboards ...
>
> For information for the pcaps, I filtered and used "tcpdump ... ether
> host xx:xx:xx:xx:xx"
> I waited a bit after the stall and kill the scp with ctrl-c.
>
> > But anyway, at least the problem is visible...
>
> Great!
>
> > It seems that
> > 3998:4046 gets never retransmitted, not even by RTO which seems very
> > very strange to me... And after this: 23:21:56.154269 IP
> > 192.168.1.19.50028 > 192.168.1.15.22: . ack 3998 win 379 ... sack 3
> > {4238:4286}{4142:4190}{4046:4094}> also fast retransmit should have
> > already triggered. ...I'll look more into this if I can figure it out
> > from the current traces but it'll take a while.
>
> Can it help you, if I make other traces ?
>
> I won't be available until monday.
Perhaps having /proc/net/tcp would at least tell what state the timer is
in (if I cannot reproduce it right away). ...It is rather strange that two
independent loss recovery mechanisms both seem to fail to get triggered
here; there are no traces of retransmission whatsoever. I think it is
enough for now to concentrate on what happens on 192.168.1.15 (=houba?) and
get tcpdump and /proc/net/tcp from there; the other end/direction has very
little significance here (except for the fact that bidirectionality might
be needed to actually trigger it). You could even capture /proc/net/tcp a
bit more often, right from the start:
while :; do grep ":0016" /proc/net/tcp; sleep 0.1; done | tee scp_stall-houba.x.proc_net_tcp
...Please wait at least 2 minutes before hitting ctrl-c or otherwise
artificially intervening.
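To make the timer state easier to read off such a capture, something like
the following could be used. It is only a rough sketch: the function name
decode_tcp_timers is made up for illustration, and the column layout and
timer codes are assumed to match what the kernel's proc_net_tcp
documentation describes (0 = no timer, 1 = retransmit, 2 = another timer
such as keepalive, 3 = TIME_WAIT, 4 = zero window probe). The "when" value
is printed as the kernel reports it, in ticks.

```shell
# Hypothetical helper (not part of the original thread): decode the timer
# columns of /proc/net/tcp lines read from stdin.
decode_tcp_timers() {
    while read -r sl laddr raddr st queues timer retrans rest; do
        # Data lines start with an index such as "1:"; skip the header.
        case "$sl" in *:) ;; *) continue ;; esac
        tr=${timer%%:*}     # timer type (hex)
        when=${timer#*:}    # ticks until the timer expires (hex)
        case "$tr" in
            00) name=none ;;
            01) name=retransmit ;;
            02) name=other ;;
            03) name=timewait ;;
            04) name=zero-window-probe ;;
            *)  name=unknown ;;
        esac
        # Assumed layout: the field after the timer is the retransmit count.
        echo "$laddr $raddr st=$st timer=$name when=$((0x$when)) retrans=$((0x$retrans))"
    done
}
```

It would be fed from the same grep, e.g.
grep ":0016" /proc/net/tcp | decode_tcp_timers
A stalled connection that shows timer=none while data is still unacked
would be consistent with the RTO never having been armed.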
--
i.