Message-Id: <1227519208.7685.21951.camel@twins>
Date: Mon, 24 Nov 2008 10:33:28 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Petr Tesarik <ptesarik@...e.cz>
Cc: Frank Mayhar <fmayhar@...gle.com>,
Christoph Lameter <cl@...ux-foundation.org>,
Doug Chapman <doug.chapman@...com>, mingo@...e.hu,
roland@...hat.com, adobriyan@...il.com, akpm@...ux-foundation.org,
linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: regression introduced by - timers: fix itimer/many thread hang

On Mon, 2008-11-24 at 09:46 +0100, Petr Tesarik wrote:
> Peter Zijlstra wrote on Sun, 23 Nov 2008 at 15:24 +0100:
> > On Fri, 2008-11-21 at 19:42 +0100, Petr Tesarik wrote:
> >
> > > > > In any event, while this particular implementation may not be optimal,
> > > > > at least it's _right_. Whatever happened to "make it right, then make
> > > > > it fast?"
> > > >
> > > > Well, I'm not thinking you did it right ;-)
> > > >
> > > > While I agree that the linear loop is sub-optimal, it only really
> > > > becomes a problem when you have hundreds or thousands of threads in your
> > > > application, which I'll argue to be insane anyway.
> > >
> > > This is just not true. I've seen a very real example of a lockup with a very
> > > sane number of threads (one per CPU), but on a very large machine (1024 CPUs
> > > IIRC). The application set per-process CPU profiling with an interval of 1
> > > tick, which translates to 1024 timers firing off with each tick...
> > >
> > > Well, yes, that was broken, too, but that's the way one quite popular FORTRAN
> > > compiler works...
> >
> > I'm not sure what side you're arguing...
>
> In this particular case I'm arguing against both, it seems. The old
> behaviour is broken and the new one is not better. :(

OK, then we agree ;-)

> > The current (per-cpu) code is utterly broken on large machines too, I've
> > asked SGI to run some tests on real numa machines (something multi-brick
> > altix) and even moderately small machines with 256 cpus in them grind to
> > a halt (or make progress at a snail's pace) when the itimer stuff is
> > enabled.
> >
> > Furthermore, I really dislike the per-process-per-cpu memory cost, it
> > bloats applications and makes the new per-cpu alloc work rather more
> > difficult than it already is.
> >
> > I basically think the whole process wide itimer stuff is broken by
> > design, there is no way to make it work on reasonably large machines,
> > the whole problem space just doesn't scale. You simply cannot maintain a
> > global count without bouncing cachelines like mad, so you might as well
> > accept it and do the process wide counter and bounce only a single line,
> > instead of bouncing a line per-cpu.
>
> Very true. Unfortunately per-process itimers are prescribed by the
> Single Unix Specification, so we have to cope with them in some way,
> while not permitting a non-privileged process a DoS attack. This is
> going to be hard, and we'll probably have to twist the specification a
> bit to still conform to its wording. :((

Feel like reading the actual spec and trying to come up with a creative
interpretation? :-)

> I really don't think it's a good idea to set a per-process ITIMER_PROF
> to one timer tick on a large machine, but the kernel does allow any
> process to do it, and then it can even cause a hard freeze on some
> hardware. This is _not_ acceptable.
>
> What is worse, we can't just limit the granularity of itimers, because
> threads can come into being _after_ the itimer was set.

Currently it has jiffy granularity, right? And jiffies are different
depending on some compile time constant (HZ), so can't we, for the sake
of per-process itimers, pretend to have a 1 minute jiffy?

That should be as compliant as we are now, and utterly useless for
everybody, thereby discouraging its use, hmm? :-)
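
To put rough numbers on that idea, here is a minimal sketch in plain C. It
is not kernel code: round_up_to_tick() and expirations_per_sec() are
hypothetical helpers, and the model behind the second one (every thread is
CPU-bound, so the process accrues nthreads seconds of CPU time per
wall-clock second) is an assumption made only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: round a requested timer interval up to a whole
 * number of ticks, where one tick lasts tick_usec microseconds.
 * An interval of 0 is passed through, since 0 disables an itimer.
 */
static uint64_t round_up_to_tick(uint64_t interval_usec, uint64_t tick_usec)
{
    assert(tick_usec > 0);
    if (interval_usec == 0)
        return 0;
    return ((interval_usec + tick_usec - 1) / tick_usec) * tick_usec;
}

/*
 * Hypothetical estimate of process-wide timer expirations per wall-clock
 * second. Assumption: all nthreads threads are CPU-bound, so the process
 * consumes nthreads * 1000000 microseconds of CPU time per second, and the
 * timer fires once per effective_usec of consumed CPU time.
 */
static uint64_t expirations_per_sec(uint64_t nthreads, uint64_t effective_usec)
{
    assert(effective_usec > 0);
    return (nthreads * UINT64_C(1000000)) / effective_usec;
}
```

Under those assumptions, with a 1000 usec tick (HZ=1000, picked only as an
example) a 1 usec request rounds up to 1000 usec, and 1024 busy threads
give expirations_per_sec(1024, 1000) == 1024000. With a pretend 60 second
tick the same request rounds up to 60000000 usec and the estimate drops to
17 per second.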