linux-kernel - Re: regression introduced by - timers: fix itimer/many thread hang

Message-Id: <1227531589.4259.117.camel@twins>
Date:	Mon, 24 Nov 2008 13:59:49 +0100
From:	Peter Zijlstra <peterz@...radead.org>
To:	Petr Tesarik <ptesarik@...e.cz>
Cc:	Frank Mayhar <fmayhar@...gle.com>,
	Christoph Lameter <cl@...ux-foundation.org>,
	Doug Chapman <doug.chapman@...com>, mingo@...e.hu,
	roland@...hat.com, adobriyan@...il.com, akpm@...ux-foundation.org,
	linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: regression introduced by - timers: fix itimer/many thread hang

On Mon, 2008-11-24 at 13:32 +0100, Petr Tesarik wrote:

> > Feel like reading the actual spec and trying to come up with a creative
> > interpretation? :-)
> 
> Yes, I've just spent a few hours doing that... And I feel very
> depressed, as expected.

Thanks for doing that though!

> > > I really don't think it's a good idea to set a per-process ITIMER_PROF
> > > to one timer tick on a large machine, but the kernel does allow any
> > > process to do it, and then it can even cause hard freeze on some
> > > hardware. This is _not_ acceptable.
> > > 
> > > What is worse, we can't just limit the granularity of itimers, because
> > > threads can come into being _after_ the itimer was set.
> > 
> > Currently it has jiffy granularity, right? And jiffies are different
> > depending on some compile time constant (HZ), so can't we, for the sake
> > of per-process itimers, pretend to have a 1 minute jiffy?
> > 
> > That should be as compliant as we are now, and utterly useless for
> > everybody, thereby discouraging its use, hmm? :-)
> 
> I've got a copy of IEEE Std 1003.1-2004 here, and it suggests that this
> should be generally possible. In particular, the description for
> setitimer() says:
> 
> Implementations may place limitations on the granularity of timer values. For
> each interval timer, if the requested timer value requires a finer granularity
> than the implementation supports, the actual timer value shall be rounded up
> to the next supported value.
> 
> However, it seems to be vaguely linked to CLOCK_PROCESS_CPUTIME_ID,
> which is defined as:
> 
> The identifier of the CPU-time clock associated with the process making a
> clock() or timer*() function call.
> 
> POSIX does not specify whether this clock is identical to the one used
> for setitimer et al., or not, but it seems logical that it should. Then,
> the kernel should probably return the coarse granularity in
> clock_getres(), too.
> 
> I tried to find out how this is currently implemented in Linux, and it's
> broken. How else. :-/
> 
> 1. clock_getres() always returns a resolution of 1ns
> 
> This is actually good news, because it means that nobody really cares
> whether the actual granularity is greater, so I guess we can safely
> return any bogus number in clock_getres().
> 
> What about using an actual granularity of NR_CPUS*HZ, which should be
> safe for any (at least remotely) sane usage?

nr_cpu_ids * 1/HZ should do I guess, although a cubic function would buy
us even more slack.
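
A rough sketch of the linear variant, only to make the arithmetic concrete. The helper names are hypothetical and this is plain user-space C, not kernel code; it assumes the granularity is nr_cpu_ids ticks of 1/HZ seconds each and that requests are rounded up to the next multiple, as the quoted POSIX text requires.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Hypothetical helper: per-process itimer granularity in nanoseconds,
 * scaled linearly with the number of possible CPUs (nr_cpu_ids * 1/HZ).
 */
static uint64_t itimer_granularity_ns(unsigned int nr_cpu_ids, unsigned int hz)
{
	assert(nr_cpu_ids > 0 && hz > 0);
	return (uint64_t)nr_cpu_ids * (NSEC_PER_SEC / hz);
}

/*
 * Hypothetical helper: round a requested timer value up to the next
 * supported value. Values already on a boundary are left unchanged.
 */
static uint64_t itimer_round_up_ns(uint64_t requested_ns, uint64_t gran_ns)
{
	assert(gran_ns > 0);
	uint64_t rem = requested_ns % gran_ns;

	return rem ? requested_ns + (gran_ns - rem) : requested_ns;
}
```

With 4 CPUs and HZ=250 this gives a 16 ms granularity, so a one-tick (4 ms) request would be rounded up to 16 ms.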

> 2. clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &ts) returns -EINVAL
> 
> Should not happen. Looking further into it, I think this line in
> cpu_clock_sample_group():
> 
> 	switch (which_clock) {
> 
> should look like a similar line in cpu_clock_sample(), ie:
> 
> 	switch (CPUCLOCK_WHICH(which_clock)) {
> 
> Shall I send a patch?
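
The difference between the two switch statements can be sketched in user space. The macros below are modeled on the kernel's CPU-clock id encoding as far as I recall it (clock type in the low bits, inverted pid above them), so treat the exact definitions as assumptions; the unsigned casts are added here to keep the sketch free of undefined behaviour, and the two function names are hypothetical.

```c
#include <assert.h>
#include <errno.h>

/* Assumed encoding, modeled on the kernel's CPU-clock ids. */
#define CPUCLOCK_PROF		0
#define CPUCLOCK_VIRT		1
#define CPUCLOCK_SCHED		2
#define CPUCLOCK_CLOCK_MASK	3u
#define CPUCLOCK_WHICH(clock)	((unsigned int)(clock) & CPUCLOCK_CLOCK_MASK)
#define MAKE_PROCESS_CPUCLOCK(pid, clock) \
	((int)((~(unsigned int)(pid) << 3) | (unsigned int)(clock)))

/* Switches on the raw id, so an encoded id never matches a case label. */
static int cpu_clock_kind_buggy(int which_clock)
{
	switch (which_clock) {
	case CPUCLOCK_PROF:
	case CPUCLOCK_VIRT:
	case CPUCLOCK_SCHED:
		return which_clock;
	default:
		return -EINVAL;
	}
}

/* Extracts the clock type first, as cpu_clock_sample() is said to do. */
static int cpu_clock_kind_fixed(int which_clock)
{
	switch (CPUCLOCK_WHICH(which_clock)) {
	case CPUCLOCK_PROF:
		return CPUCLOCK_PROF;
	case CPUCLOCK_VIRT:
		return CPUCLOCK_VIRT;
	case CPUCLOCK_SCHED:
		return CPUCLOCK_SCHED;
	default:
		return -EINVAL;
	}
}
```

Under these assumptions the id for "current process, sched clock" is -6, which falls through to -EINVAL in the first function and resolves to CPUCLOCK_SCHED in the second.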

Feel free - it's not an area I'm intimately familiar with. I'll look into
whipping up a patch removing all the per-cpu crap from there.
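
For anyone who wants to check both observations from user space, something along these lines should do. It is a minimal sketch: the helper name is hypothetical, it only uses the standard clock_getres() and clock_gettime() calls, and it reports a failure as a negative errno value.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

/*
 * Hypothetical helper: query the resolution and the current value of
 * the calling process's CPU-time clock. Returns 0 on success or a
 * negative errno value (e.g. -EINVAL) if either call fails.
 */
static int probe_process_cputime(struct timespec *res, struct timespec *now)
{
	assert(res && now);

	if (clock_getres(CLOCK_PROCESS_CPUTIME_ID, res) != 0)
		return -errno;
	if (clock_gettime(CLOCK_PROCESS_CPUTIME_ID, now) != 0)
		return -errno;
	return 0;
}
```

On an affected kernel the second call is the one expected to fail; the reported resolution in res is what point 1 above is about.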

