[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <1209748226.13467.6.camel@bobble.smo.corp.google.com>
Date: Fri, 02 May 2008 10:10:26 -0700
From: Frank Mayhar <fmayhar@...gle.com>
To: Petr Tesarik <ptesarik@...e.cz>
Cc: Björn Steinbrink <B.Steinbrink@....de>,
Thomas Gleixner <tglx@...utronix.de>,
Oleg Nesterov <oleg@...sign.ru>, linux-kernel@...r.kernel.org,
Roland McGrath <roland@...hat.com>
Subject: Re: CPU POSIX timers livelock
On Fri, 2008-05-02 at 18:13 +0200, Petr Tesarik wrote:
> On Fri, 2008-05-02 at 17:21 +0200, Björn Steinbrink wrote:
> > [Added Roland McGrath and Frank Mayhar to Cc:, as this sounds similar
> > enough to what has been discussed here http://lkml.org/lkml/2008/2/6/505]
>
> Yes, I've just now found the thread too, read it, and I think this is
> just another case where the current implementation does not scale.
>
> Was there any followup to the patch posted on the 7th of March? The
> interesting discussion seems to be interrupted there. :(
Roland and I have continued the conversation but we took it off the LKML
since it was really getting into the nitty-gritty details of the
implementation and wasn't that interesting to someone not actually
directly involved.
The upshot is that I have a proposed patch that I have handed to Roland
to review. He's pretty busy, though, so he may not have gotten to it
yet. Perhaps this thread will give him further incentive. :-)
Petr's analysis pretty much matches mine, except that he went into a bit
more detail in actually computing numbers and whatnot whereas I just
reasoned that with a sufficiently large number of threads pretty much
any process that uses POSIX timers can cause the system to livelock,
simply because repeatedly running the thread group list in
run_posix_cpu_timers() will at some point take as long as the timer tick
itself.
My proposed patch that Roland is reviewing corrects the implementation
of run_posix_cpu_timers() to make it run in constant time for a
particular machine by defining a couple of new structures and keeping
the thread group timers in one of these. It's way more complex than
this and I have a README that goes into detail if anyone is interested.
I've tested the patch with as many as 200,000 threads (all of which are
running a prime number sieve and are therefore CPU bound) and it appears
to work fine. Before I post it again, though, I want Roland to sign off
on it.
--
Frank Mayhar <fmayhar@...gle.com>
Google, Inc.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists