Date:	Fri, 6 Jun 2008 21:25:42 +0300 (EEST)
From:	"Ilpo Järvinen" <ilpo.jarvinen@...sinki.fi>
To:	Patrick McManus <mcmanus@...ksong.com>
cc:	Ingo Molnar <mingo@...e.hu>, David Miller <davem@...emloft.net>,
	peterz@...radead.org, LKML <linux-kernel@...r.kernel.org>,
	Netdev <netdev@...r.kernel.org>, rjw@...k.pl,
	Andrew Morton <akpm@...ux-foundation.org>, johnpol@....mipt.ru
Subject: Re: [fixed] [patch] Re: [bug] stuck localhost TCP connections,
 v2.6.26-rc3+

On Fri, 6 Jun 2008, Patrick McManus wrote:

> > This Ingo's testcase should anyway be quite "simple", I mean that distcc 
> > shouldn't do anything unexpected in a sense it shouldn't abort the flows 
> > by not sending data, close the listening socket or other things like that.
> 
> maybe - I've noted that I can get the distcc server to crash with just a
> little fuzz (telnet to it and close the telnet) - but it is true I
> haven't seen anything odd using the distcc client.

In addition, I think I've seen reports floating around that distcc 
occasionally does something weird even in a correct setup.

I briefly looked at how distcc behaved while running stress_accept. Distcc 
basically seems to have n processes, each accept()ing, plus a kind of 
memleak killer: each process exits after a limited number of successive 
accepts. The parent, which did the listen, only periodically (it has some 
sleep(1)s) collects the dead ones and respawns them.
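
To make that description concrete, here is a rough sketch of such a 
pre-forking accept loop. It is not distcc's code: the names (worker, 
spawn, reap_and_respawn, serve), the b"ok\n" reply and all parameter 
values are made up for illustration, and it assumes a Unix system where 
os.fork() is available.

```python
import os
import socket
import time


def worker(listener, max_accepts):
    """Child: handle at most max_accepts connections, then exit."""
    try:
        for _ in range(max_accepts):
            conn, _addr = listener.accept()
            with conn:
                conn.sendall(b"ok\n")  # placeholder for the real work
    finally:
        os._exit(0)  # never fall back into the parent's code path


def spawn(listener, max_accepts):
    """Fork one worker that shares the listening socket; return its pid."""
    pid = os.fork()
    if pid == 0:
        worker(listener, max_accepts)  # does not return
    return pid


def reap_and_respawn(listener, children, max_accepts):
    """Collect exited workers without blocking and start replacements."""
    respawned = 0
    for pid in list(children):
        done, _status = os.waitpid(pid, os.WNOHANG)
        if done:
            children.discard(pid)
            children.add(spawn(listener, max_accepts))
            respawned += 1
    return respawned


def serve(port, nworkers, max_accepts):
    """Parent: listen once, then only reap and respawn periodically."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", port))
    listener.listen(64)
    children = {spawn(listener, max_accepts) for _ in range(nworkers)}
    while True:
        time.sleep(1)  # the parent is idle between sweeps
        reap_and_respawn(listener, children, max_accepts)
```

Between two sweeps of the parent, fewer than n processes may be sitting 
in accept(), so new connections can wait in the listen queue until a 
remaining worker picks them up or a replacement is forked.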

> Anyhow, my news is that using rc5 I have managed to reproduce it on
> localhost - so it isn't just ingo anymore ! ;)

Peter Z also reported it earlier; it was distcc+localhost for him as 
well.

> and has intentionally broken dependencies so it just keeps recompiling 
> stuff.

...Trying to invent a perpetual motion machine? :-/

> The input files are
> approximately 135k, 98k, and 16k after running gcc -E on them (which is
> what I assume distcc does before putting them down the socket).
>
> On rc5 I could get the lockup in under 20 minutes.. usually 10. I think
> I did it 4 times. My compile test is probably a better trigger than the
> kernel compile because the distcc connects are never staggered like they
> would be in a large directory of files. (3 files, -j4).

It could be even easier to trigger if you make the next gcc in the path 
run under nice; trying a number of different nice values might reveal a 
scenario that reproduces really fast.
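
One way to try this is a small wrapper installed as "gcc" in a directory 
that comes first in PATH; it changes its own niceness and then execs the 
next gcc found in PATH. This is only a sketch: the helper names 
(find_next_in_path, run_niced) and the way the nice value is passed in 
are my assumptions, nothing distcc provides.

```python
import os


def find_next_in_path(name, skip_dir, path):
    """Return the first executable `name` on `path` outside skip_dir."""
    skip = os.path.realpath(skip_dir)
    for d in path.split(os.pathsep):
        # Empty entries (meaning the current directory) are skipped here.
        if not d or os.path.realpath(d) == skip:
            continue
        cand = os.path.join(d, name)
        if os.path.isfile(cand) and os.access(cand, os.X_OK):
            return cand
    return None


def run_niced(argv, niceness, wrapper_dir, path):
    """Raise our nice value by `niceness`, then exec the real compiler."""
    name = os.path.basename(argv[0])
    real = find_next_in_path(name, wrapper_dir, path)
    if real is None:
        raise FileNotFoundError("no other %s found in PATH" % name)
    os.nice(niceness)  # increment; positive values need no privileges
    os.execv(real, [real] + list(argv[1:]))  # replaces this process
```

A wrapper script would call something like run_niced(sys.argv, 10, 
os.path.dirname(os.path.abspath(__file__)), os.environ["PATH"]), with 
the value 10 swapped for whatever is being tried in a given run.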

> When I apply the locking patch you (Ilpo) wrote, I cannot reproduce the
> error at all in the first 90 minutes of testing. I'll let the test run
> and update the list.

At least it helps some :-), like it should.

> I'm holding out hope that Ingo's report did not have the locking patch
> on the distcc server end - because it certainly makes a difference for 
> me.

...He had some issue with different versions being deployed at least in 
the past, and I failed to follow his latest answer :-).


-- 
 i.