Message-ID: <Pine.LNX.4.64.0805312332270.2760@wrl-59.cs.helsinki.fi>
Date: Sun, 1 Jun 2008 00:39:05 +0300 (EEST)
From: "Ilpo Järvinen" <ilpo.jarvinen@...sinki.fi>
To: "Håkon Løvdal" <hlovdal@...il.com>
cc: LKML <linux-kernel@...r.kernel.org>,
Netdev <netdev@...r.kernel.org>, Ingo Molnar <mingo@...e.hu>,
"David S. Miller" <davem@...emloft.net>,
"Rafael J. Wysocki" <rjw@...k.pl>,
Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [bug] stuck localhost TCP connections, v2.6.26-rc3+
On Sat, 31 May 2008, Håkon Løvdal wrote:
> 2008/5/31 Ilpo Järvinen <ilpo.jarvinen@...sinki.fi>:
>
> > So you had that '-' earlier and you checked at that time but the
> > connection is now already dead?
>
> This is only from checking after the connection was dead.
Could you please rephrase the answer? I failed to understand it... :-)
...You said earlier that you had '-' owned connections like Ingo. When did
that happen (the connections no longer exist, so at what point in
time did you see those non-owned connections)?
> By the way,
> I just had to remotely reboot the new machine because the window
> manager locked up, however the old PC is still listing the defunct
> connections after this.
Ok.
> > :-(, I would so much have liked to see what they were doing.
>
> I can of course keep on copying for testing purposes, but then I would
> like to be able to dump only that single tcp connection, any tips of how
> to do that?
> I found nothing specific in the manuals of wireshark and tcpdump. Of
> course it is possible to capture everything and filter afterwards, but
> since I will be transferring lots of data the logs will get huge and I
> would not like to have even additional traffic inside...
I didn't really mean tcpdump; I was thinking more of the syscall, i.e., which
syscall the process is waiting in. Though tcpdump might also reveal
something about the behavior when nearing the problem:
tcpdump -n -i <iface> host <blahblah> and port <portno> and ...
Host & port as written above match either src or dst. I don't
remember how one could specify just one of them, but it's not usually
necessary (won't be here either).
> > These 7C/D... certainly seem strange values. Which TCP variant do you
> > have in use (cat /proc/sys/net/ipv4/tcp_congestion_control)? It seems
> > that vegas, veno and yeah at least contain 0x7fffffff there for some
> > rtt, which could perhaps somehow leak.
>
> I have not done any specific selection myself. On old_pc: bic, new_pc:
> cubic.
Ok, after some searching it also seems that it was a dead-end anyway:
- icsk_retransmit_timer is only set to icsk->icsk_timeout or
  jiffies + (HZ / 20)
- icsk_timeout is only set after the if (when > max_when) limiting (in
  unsigned quantities)
- max_when is always given as TCP_RTO_MAX by TCP...
...I'm currently out of ideas with this one then, I think I checked all
types too and nothing came up :-(.
Hmm, perhaps periodically checking /proc/net/tcp (e.g., once per 10s) for
a timeout larger than TCP_RTO_MAX might allow some script to
immediately notice when things break while reproducing it. Storing all
those once-per-10s values shouldn't be too big either; it could even be
done at both ends for a single flow (but I'll leave a script to do that for
Monday).
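Until then, a rough sketch of what such a periodic check might look like,
in Python. It rests on a few assumptions: the "tr tm->when" column of
/proc/net/tcp is reported in 1/100 s ticks (USER_HZ = 100), TCP_RTO_MAX
corresponds to 120 seconds, and only timer types 1 (retransmit) and 4
(zero-window probe) are worth flagging, since the keepalive timer (type 2)
can legitimately be much longer. The function names are made up for
illustration.

```python
import time

USER_HZ = 100          # assumption: /proc reports timers in 1/100 s ticks
TCP_RTO_MAX_S = 120    # assumption: TCP_RTO_MAX corresponds to 120 seconds
LIMIT_TICKS = TCP_RTO_MAX_S * USER_HZ
RTO_LIMITED_TIMERS = (1, 4)   # 1 = retransmit, 4 = zero-window probe


def parse_line(line):
    """Return (local, remote, state, timer, when_ticks), or None for non-socket rows."""
    fields = line.split()
    # Socket rows start with a slot number such as "0:"; the header row does not.
    if len(fields) < 6 or not fields[0].endswith(":"):
        return None
    timer, when_hex = fields[5].split(":")
    return fields[1], fields[2], fields[3], int(timer, 16), int(when_hex, 16)


def suspicious(lines, limit=LIMIT_TICKS):
    """Collect rows whose retransmit/probe timer exceeds the limit."""
    hits = []
    for line in lines:
        row = parse_line(line)
        if row is not None and row[3] in RTO_LIMITED_TIMERS and row[4] > limit:
            hits.append(row)
    return hits


def watch(path="/proc/net/tcp", interval=10):
    """Poll the table every `interval` seconds and print any suspicious rows."""
    while True:
        with open(path) as table:
            for row in suspicious(table):
                print(time.strftime("%H:%M:%S"), *row)
        time.sleep(interval)
```

Calling watch() on each end would print a timestamped line as soon as a
flow shows an over-long timer; redirecting the output to a file keeps the
log small, since healthy samples produce no output at all.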
--
i.