Message-Id: <1209748226.13467.6.camel@bobble.smo.corp.google.com>
Date:	Fri, 02 May 2008 10:10:26 -0700
From:	Frank Mayhar <fmayhar@...gle.com>
To:	Petr Tesarik <ptesarik@...e.cz>
Cc:	Björn Steinbrink <B.Steinbrink@....de>,
	Thomas Gleixner <tglx@...utronix.de>,
	Oleg Nesterov <oleg@...sign.ru>, linux-kernel@...r.kernel.org,
	Roland McGrath <roland@...hat.com>
Subject: Re: CPU POSIX timers livelock

On Fri, 2008-05-02 at 18:13 +0200, Petr Tesarik wrote: 
> On Fri, 2008-05-02 at 17:21 +0200, Björn Steinbrink wrote:
> > [Added Roland McGrath and Frank Mayhar to Cc:, as this sounds similar
> > enough to what has been discussed here http://lkml.org/lkml/2008/2/6/505]
> 
> Yes, I've just now found the thread too, read it, and I think this is
> just another case where the current implementation does not scale.
> 
> Was there any followup to the patch posted on the 7th of March? The
> interesting discussion seems to be interrupted there. :(

Roland and I have continued the conversation but we took it off the LKML
since it was really getting into the nitty-gritty details of the
implementation and wasn't that interesting to someone not actually
directly involved.

The upshot is that I have a proposed patch that I have handed to Roland
to review.  He's pretty busy, though, so he may not have gotten to it
yet.  Perhaps this thread will give him further incentive. :-)

Petr's analysis pretty much matches mine, except that he went into more
detail and actually computed numbers.  I just reasoned that with a
sufficiently large number of threads, pretty much any process that uses
POSIX CPU timers can cause the system to livelock, simply because
walking the thread group list in run_posix_cpu_timers() on every tick
will at some point take as long as the timer tick itself.
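
As a rough illustration of that reasoning (not part of the original
message), the sketch below models the cost of one pass over the thread
list as growing linearly with the thread count and computes the count at
which a pass takes a full tick.  The function names, the 1 ms tick and
the per-thread cost are all assumptions chosen for illustration; none of
them are measurements from the thread.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model, not kernel code: time in nanoseconds for one
 * pass over a thread group, assuming a fixed cost per thread visited.
 */
static uint64_t walk_cost_ns(uint64_t nthreads, uint64_t per_thread_ns)
{
	return nthreads * per_thread_ns;
}

/*
 * Smallest thread count at which one pass takes at least a whole
 * tick, i.e. the point where the tick handler can no longer finish
 * before the next tick arrives.
 */
static uint64_t livelock_threshold(uint64_t tick_ns, uint64_t per_thread_ns)
{
	assert(per_thread_ns > 0);
	/* Ceiling division: round up to the next whole thread. */
	return (tick_ns + per_thread_ns - 1) / per_thread_ns;
}
```

With an assumed 1,000,000 ns tick and an assumed 50 ns per thread,
livelock_threshold() gives 20,000 threads.  The real per-thread cost
depends on the machine and on cache behaviour, so the threshold on any
given system will differ.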

My proposed patch that Roland is reviewing changes
run_posix_cpu_timers() so that it runs in constant time on a given
machine.  It does this by defining a couple of new structures and
keeping the thread group timers in one of them.  The details are more
complex than this summary; I have a README that goes into them if
anyone is interested.
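
To make the general idea concrete, here is a minimal sketch of one way
an expiry check can avoid walking every thread: keep a running total for
the group and update it whenever CPU time is charged to a thread.  This
is not the patch described above, whose structures are not shown in this
message; every struct and function name below is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-thread accounting record. */
struct thread {
	uint64_t cpu_ns;		/* CPU time charged to this thread */
};

/* Hypothetical thread group with a cached running total. */
struct thread_group {
	struct thread *threads;
	size_t nthreads;
	uint64_t total_cpu_ns;		/* sum of all threads' cpu_ns */
};

/* O(n) approach: sum every thread's time on each check. */
static uint64_t group_cpu_walk(const struct thread_group *g)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < g->nthreads; i++)
		sum += g->threads[i].cpu_ns;
	return sum;
}

/* Charge time to one thread and to the group total in one step. */
static void account_tick(struct thread_group *g, size_t idx,
			 uint64_t delta_ns)
{
	assert(idx < g->nthreads);
	g->threads[idx].cpu_ns += delta_ns;
	g->total_cpu_ns += delta_ns;
}

/* O(1) check: compare the cached total against the expiry time. */
static int timer_expired(const struct thread_group *g, uint64_t expires_ns)
{
	return g->total_cpu_ns >= expires_ns;
}
```

The cost of the check no longer depends on the number of threads.  The
sketch ignores locking and multiple CPUs, both of which a real
implementation would have to handle.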

I've tested the patch with as many as 200,000 threads (all of which are
running a prime number sieve and are therefore CPU bound) and it appears
to work fine.  Before I post it again, though, I want Roland to sign off
on it.
-- 
Frank Mayhar <fmayhar@...gle.com>
Google, Inc.