Message-ID: <20080526135940.GB24870@elte.hu>
Date: Mon, 26 May 2008 15:59:40 +0200
From: Ingo Molnar <mingo@...e.hu>
To: Ilpo Järvinen <ilpo.jarvinen@...sinki.fi>
Cc: LKML <linux-kernel@...r.kernel.org>,
Netdev <netdev@...r.kernel.org>,
"David S. Miller" <davem@...emloft.net>,
"Rafael J. Wysocki" <rjw@...k.pl>,
Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [bug] stuck localhost TCP connections, v2.6.26-rc3+
* Ilpo Järvinen <ilpo.jarvinen@...sinki.fi> wrote:
> > in terms of debugging there's not much i can do i'm afraid. It's not
> > possible to get a tcpdump of this incident, given the extreme amount
> > of load these testboxes handle.
>
> ...but you can still tcpdump that particular flow once the situation
> is discovered, to see if TCP still tries to do something, no? One needs
> to tcpdump for a couple of minutes at minimum. Also please get
> /proc/net/tcp for that flow around the same time.
ok, will try those.
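
For reference, a minimal sketch of how the /proc/net/tcp entry for one flow
could be pulled out and decoded once a hang is spotted. It is not part of
the original exchange. The helper names (decode_addr, parse_proc_net_tcp,
flows_with_port) are hypothetical, and the address decoding assumes IPv4 on
a little-endian host such as x86.

```python
import socket

# TCP state numbers as they appear in the "st" column of /proc/net/tcp.
TCP_STATES = {
    0x01: "ESTABLISHED", 0x02: "SYN_SENT", 0x03: "SYN_RECV",
    0x04: "FIN_WAIT1", 0x05: "FIN_WAIT2", 0x06: "TIME_WAIT",
    0x07: "CLOSE", 0x08: "CLOSE_WAIT", 0x09: "LAST_ACK",
    0x0A: "LISTEN", 0x0B: "CLOSING",
}


def decode_addr(field):
    """Turn 'HEXIP:HEXPORT' into (dotted_ip, port).

    Assumption: IPv4 only, little-endian host, so the address bytes
    are reversed before formatting.
    """
    ip_hex, port_hex = field.split(":")
    ip = socket.inet_ntoa(bytes.fromhex(ip_hex)[::-1])
    return ip, int(port_hex, 16)


def parse_proc_net_tcp(text):
    """Parse the text of /proc/net/tcp into a list of dicts."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Socket lines start with an index such as '0:'; skip the header.
        if len(parts) < 10 or not parts[0].endswith(":"):
            continue
        tx_hex, rx_hex = parts[4].split(":")
        state = int(parts[3], 16)
        rows.append({
            "local": decode_addr(parts[1]),
            "remote": decode_addr(parts[2]),
            "state": TCP_STATES.get(state, hex(state)),
            "tx_queue": int(tx_hex, 16),   # bytes queued for sending
            "rx_queue": int(rx_hex, 16),   # bytes not yet read by the app
            "retrans": int(parts[6], 16),  # the "retrnsmt" column
        })
    return rows


def flows_with_port(rows, port):
    """Return the sockets whose local or remote port equals `port`."""
    return [r for r in rows
            if r["local"][1] == port or r["remote"][1] == port]
```

To use it, read /proc/net/tcp into a string, pass it to
parse_proc_net_tcp(), and filter with flows_with_port() on the port of the
stuck connection. A non-zero rx_queue on an ESTABLISHED socket would match
the recv-q symptom discussed here.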
> > One clue (which might or might not matter) is that distcc is one of
> > the very few applications that makes use of sendfile().
>
> Can you please try with /proc/sys/net/ipv4/tcp_frto set to zero? The
> recv-q symptom would seem weird if it were related to that, but there
> were some recent fixes to FRTO, and the retrans_stamp change could have
> some significance here.
>
> Other than that, nothing since -rc1 seems suspicious to me (though I
> hardly understand every part of networking).
ok, i will first wait for it to trigger on a box and will do the tcpdump
session (and capture the /proc/net/tcp output), then i'll continue the
tests with this done in rc.local:
echo 0 > /proc/sys/net/ipv4/tcp_frto
and will see whether the hung connections still occur. The cycle of
testing will be very slow i suspect.
Ingo