Date:	Mon, 24 Nov 2008 09:46:43 +0100
From:	Petr Tesarik <ptesarik@...e.cz>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Frank Mayhar <fmayhar@...gle.com>,
	Christoph Lameter <cl@...ux-foundation.org>,
	Doug Chapman <doug.chapman@...com>, mingo@...e.hu,
	roland@...hat.com, adobriyan@...il.com, akpm@...ux-foundation.org,
	linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: regression introduced by - timers: fix itimer/many thread hang

Peter Zijlstra wrote on Sun, 23 Nov 2008 at 15:24 +0100:
> On Fri, 2008-11-21 at 19:42 +0100, Petr Tesarik wrote:
> 
> > > > In any event, while this particular implementation may not be optimal,
> > > > at least it's _right_.  Whatever happened to "make it right, then make
> > > > it fast?"
> > >
> > > Well, I'm not thinking you did it right ;-)
> > >
> > > While I agree that the linear loop is sub-optimal, it only really
> > > becomes a problem when you have hundreds or thousands of threads in your
> > > application, which I'll argue to be insane anyway.
> > 
> > This is just not true. I've seen a very real example of a lockup with a very 
> > sane number of threads (one per CPU), but on a very large machine (1024 CPUs 
> > IIRC). The application set per-process CPU profiling with an interval of 1 
> > tick, which translates to 1024 timers firing off with each tick...
> > 
> > Well, yes, that was broken, too, but that's the way one quite popular FORTRAN 
> > compiler works...
> 
> I'm not sure what side you're arguing...

In this particular case I'm arguing against both, it seems. The old
behaviour is broken and the new one is not better. :(

> The current (per-cpu) code is utterly broken on large machines too, I've
> asked SGI to run some tests on real numa machines (something multi-brick
> altix) and even moderately small machines with 256 cpus in them grind to
> a halt (or make progress at a snail's pace) when the itimer stuff is
> enabled.
> 
> Furthermore, I really dislike the per-process-per-cpu memory cost, it
> bloats applications and makes the new per-cpu alloc work rather more
> difficult than it already is.
> 
> I basically think the whole process wide itimer stuff is broken by
> design, there is no way to make it work on reasonably large machines,
> the whole problem space just doesn't scale. You simply cannot maintain a
> global count without bouncing cachelines like mad, so you might as well
> accept it and do the process wide counter and bounce only a single line,
> instead of bouncing a line per-cpu.

Very true. Unfortunately, per-process itimers are prescribed by the
Single Unix Specification, so we have to cope with them in some way,
without letting a non-privileged process mount a DoS attack. This is
going to be hard, and we'll probably have to twist the specification a
bit while still conforming to its wording. :((
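
The trade-off in the quoted paragraph can be sketched in userspace C11.
This is only an illustration under stated assumptions: the type and
function names, NR_CPUS_SKETCH and the 64-byte cache line size are all
hypothetical, and none of it is the kernel's actual implementation.

```c
#include <stdatomic.h>

#define NR_CPUS_SKETCH 4   /* assumed CPU count, for illustration only */
#define CACHELINE      64  /* assumed cache line size in bytes */

/* Design A: one process-wide counter. Every update from any CPU
 * touches the same cache line, so exactly one line bounces. */
struct shared_counter {
    atomic_ullong total;
};

/* Design B: one slot per CPU, each on its own cache line. Updates are
 * local, but memory grows with the CPU count and a reader has to pull
 * in every slot. */
struct percpu_slot {
    _Alignas(CACHELINE) atomic_ullong count;
};

struct percpu_counter {
    struct percpu_slot slot[NR_CPUS_SKETCH];
};

static void shared_add(struct shared_counter *c, unsigned long long delta)
{
    atomic_fetch_add_explicit(&c->total, delta, memory_order_relaxed);
}

static unsigned long long shared_read(struct shared_counter *c)
{
    return atomic_load_explicit(&c->total, memory_order_relaxed);
}

static void percpu_add(struct percpu_counter *c, int cpu,
                       unsigned long long delta)
{
    atomic_fetch_add_explicit(&c->slot[cpu].count, delta,
                              memory_order_relaxed);
}

/* Summing is linear in the number of CPUs and reads one line per CPU. */
static unsigned long long percpu_read(struct percpu_counter *c)
{
    unsigned long long sum = 0;

    for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
        sum += atomic_load_explicit(&c->slot[cpu].count,
                                    memory_order_relaxed);
    return sum;
}
```

Both designs yield the same total. What differs is where the cost
lands: on every update for the shared counter, or on every read (plus
the per-process-per-CPU memory) for the per-CPU variant. If the total
has to be checked on each tick, the per-CPU read cost is paid just as
often.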

I really don't think it's a good idea to set a per-process ITIMER_PROF
to one timer tick on a large machine, but the kernel does allow any
process to do it, and then it can even cause a hard freeze on some
hardware. This is _not_ acceptable.

What is worse, we can't just limit the granularity of itimers, because
threads can come into being _after_ the itimer was set.


> Furthermore, I stand by my claims that anything that runs more than a
> handful of threads per physical core is utterly braindead and deserves
> all the pain it can get. (Yes, I'm a firm believer in state machines and
> don't think just throwing threads at a problem is a sane solution).

Yes, anything with many threads per core is badly designed. My point is
that it's not the only broken case.

Petr Tesarik

