Message-ID: <4885E482.5020502@davidnewall.com>
Date:	Tue, 22 Jul 2008 23:15:38 +0930
From:	David Newall <davidn@...idnewall.com>
To:	Ingo Molnar <mingo@...e.hu>
CC:	Linus Torvalds <torvalds@...ux-foundation.org>,
	David Miller <davem@...emloft.net>, akpm@...ux-foundation.org,
	netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
	Stefan Richter <stefanr@...6.in-berlin.de>
Subject: Re: [TCP bug] stuck distcc connections in latest -git

Ingo Molnar wrote:
> hm, the distcc TCP hangs are back:
>   

The missing four client-side connections are more interesting than the
unsent data.

> I.e. the client side send-queue is stuck in established state, server 
> side thinks it's a proper established connection. Nobody makes any 
> progress.
>   

I might be missing something obvious, but I don't think there's anything
unusual in the three sessions displayed on the client.  They should be
"ESTABLISHED" on the client and on the server too, which is exactly
what they are.

> Also note the final 4 connections on the server side - those are not 
> present on the client box.
>   

Now this is interesting.  I would be much more interested in how the
client sides of these connections disappeared.
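
One way to spot such one-sided connections mechanically is to diff the
two connection tables.  The following is only a rough sketch: it assumes
you have saved `netstat -tn` style output from each box as text, and the
function names and column layout are my own assumptions, not anything
taken from the reports in this thread.

```python
def parse_netstat(text):
    """Return a set of (local, remote, state) tuples from `netstat -tn` style lines."""
    conns = set()
    for line in text.splitlines():
        fields = line.split()
        # Assumed columns: Proto Recv-Q Send-Q Local-Address Foreign-Address State
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            conns.add((fields[3], fields[4], fields[5]))
    return conns


def orphans(server_text, client_text):
    """Server-side connections that have no mirrored entry on the client."""
    client_pairs = {(local, remote) for local, remote, _ in parse_netstat(client_text)}
    return sorted(
        (local, remote, state)
        for local, remote, state in parse_netstat(server_text)
        # The client sees the same connection with the two addresses swapped.
        if (remote, local) not in client_pairs
    )
```

Feeding it the server and client dumps lists exactly those connections
the server still holds but the client has already forgotten.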

> The hung condition seemed permanent (i waited a couple of minutes).
>   

Not nearly long enough.  Retransmits can be sent as infrequently as once
every 180 seconds.  I think there's an argument for using one of the
various patches that reduce TCP_RTO_MAX, for example OBATA Noboru's
(http://marc.info/?l=linux-netdev&m=118422471428855), so that you don't
have to wait unreasonably long before seeing a retransmit.  Remember,
three minutes!
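
To put rough numbers on that, here is a small sketch of a retransmit
schedule.  It assumes the timeout simply doubles after every unanswered
retransmit and is capped at a maximum; the 1 second starting value and
the function name are assumptions made for illustration, and the real
stack derives its timeout from measured round-trip times.

```python
def retransmit_times(initial_rto, rto_max, retries):
    """Seconds after the first send at which each retransmit would fire,
    assuming the timeout doubles each time and is capped at rto_max."""
    times = []
    elapsed = 0.0
    rto = initial_rto
    for _ in range(retries):
        elapsed += rto
        times.append(elapsed)
        rto = min(rto * 2, rto_max)  # exponential backoff with a ceiling
    return times


if __name__ == "__main__":
    # Cap taken from the figure quoted above; a lower cap for comparison.
    print(retransmit_times(1.0, 180.0, 10))
    print(retransmit_times(1.0, 30.0, 10))
```

Under these assumptions the gaps between retransmits soon grow to
minutes with the larger cap, whereas a lower cap keeps them short
enough to observe in a brief test.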


> I retried the same build 10 times and it would not reproduce - so this 
> again is a hard to reproduce condition. (and there's no chance to get a 
> proper tcpdump either, at these traffic levels)

You really should start that capture, and on both client and server. 
You don't need to dump everything, only traffic to or from server:distcc.
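
A narrow capture could be set up along these lines.  This is only a
sketch: it assumes distcc is on its default port 3632, and the interface
name, output file and server address are placeholders to replace with
your own.

```python
import shlex

DISTCC_PORT = 3632  # distcc's default TCP port; change if your daemon differs


def capture_command(server, interface="eth0", outfile="distcc.pcap"):
    """Build a tcpdump argument list limited to distcc traffic for one server."""
    bpf = f"host {server} and tcp port {DISTCC_PORT}"
    # -s 0 keeps whole packets; -w writes raw packets to a file for later analysis.
    return ["tcpdump", "-i", interface, "-s", "0", "-w", outfile, bpf]


if __name__ == "__main__":
    # Placeholder address; run the printed command on both client and server.
    print(shlex.join(capture_command("10.0.0.1")))
```

Because the filter discards everything except the distcc connections,
the capture should stay manageable even while the build is running.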
