Message-ID: <4F6B5ED1.1080300@tilera.com>
Date:	Thu, 22 Mar 2012 13:18:09 -0400
From:	Chris Metcalf <cmetcalf@...era.com>
To:	Gilad Ben-Yossef <gilad@...yossef.com>
CC:	Christoph Lameter <cl@...ux.com>,
	Frederic Weisbecker <fweisbec@...il.com>,
	LKML <linux-kernel@...r.kernel.org>,
	<linaro-sched-sig@...ts.linaro.org>,
	Alessio Igor Bogani <abogani@...nel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Avi Kivity <avi@...hat.com>,
	Daniel Lezcano <daniel.lezcano@...aro.org>,
	Geoff Levand <geoff@...radead.org>,
	Ingo Molnar <mingo@...nel.org>,
	Max Krasnyansky <maxk@...lcomm.com>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Stephen Hemminger <shemminger@...tta.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Sven-Thorsten Dietrich <thebigcorporation@...il.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Zen Lin <zen@...nhuawei.org>
Subject: Re: [PATCH 11/32] nohz/cpuset: Don't turn off the tick if rcu needs it

On 3/22/2012 3:38 AM, Gilad Ben-Yossef wrote:
> On Wed, Mar 21, 2012 at 4:54 PM, Christoph Lameter <cl@...ux.com> wrote:
>> On Wed, 21 Mar 2012, Frederic Weisbecker wrote:
>>
>>> If RCU is waiting for the current CPU to complete a grace
>>> period, don't turn off the tick. Unlike dyntick-idle, we
>>> are not necessarily going to enter into rcu extended quiescent
>>> state, so we may need to keep the tick to note current CPU's
>>> quiescent states.
>> Is there any way for userspace to know that the tick is not off yet due to
>> this? It would make sense for us to have busy loop in user space that
>> waits until the OS has completed all processing if that avoids future
>> latencies for the application.
>>
> I previously suggested having the user register to receive a signal
> when the tick is turned off. Since the user task is, by design, the
> current task whenever the tick is turned off, *I think* you can simply
> mark the signal pending when you turn the tick off.
>
> The user would register a signal handler to set a flag when it is
> called and then busy loop waiting for that flag to be set.

This sounds plausible, but the kernel would have to know not only that the
tick was currently stopped, but also that it would still be stopped when the
signal handler's sigreturn syscall was performed.  The problem we've seen is
that, once you let kernel code start to run, it's hard to predict when the
kernel might decide it needs some more ticking.  For example, for RCU ops
the kernel can choose to ignore the nohz cpuset cores while they're running
userspace code only, but as soon as they get back into the kernel for any
reason, you may need to schedule a grace period, and so just returning from
the "you have no more ticks!" signal handler ends up causing ticks to be
scheduled.
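
To make the userspace half of that signal scheme concrete, here is a rough
sketch.  It is only an illustration: no such notification exists in the
posted series, SIGUSR1 merely stands in for a hypothetical "tick stopped"
signal, and the function names are made up.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Written by the handler, polled by the main loop. */
static volatile sig_atomic_t tick_stopped;

static void on_tick_stopped(int sig)
{
	(void)sig;
	tick_stopped = 1;
}

/*
 * Assumption: SIGUSR1 stands in for a "tick is off" notification
 * that the kernel does not actually provide.
 */
int register_tick_notify(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_tick_stopped;
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGUSR1, &sa, NULL);
}

/* Busy-loop until the handler has set the flag. */
void wait_for_tick_stop(void)
{
	while (!tick_stopped)
		;
}
```

A task would call register_tick_notify() once and then wait_for_tick_stop()
before entering its latency-sensitive loop.  Note that the handler's return
path is itself a sigreturn syscall, which is exactly the re-entry into the
kernel described above.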

The approach we took for the Tilera dataplane mode was to have a syscall
that would hold the task in the kernel until any ticks were done, and only
then return to userspace.  (This is the same set_dataplane() syscall that
also offers some flags to control and debug the dataplane stuff in general;
in fact the "hold in kernel" support is a mode we set for all syscalls, to
keep things deterministic.)  This way the "busy loop" is done in the
kernel, but in fact we explicitly go into idle until the next tick, so it's
lower-power.

An alternative approach, not so good for power but at least avoiding the
"use the kernel to avoid the kernel" aspect of signals, would be to
register a location in userspace that the kernel would write to when it
disabled the tick, and userspace could then just spin reading memory.
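
A minimal sketch of that spin-on-memory idea, assuming a single flag word
that the kernel would set to 1 when it stops the tick.  How the location
would be registered is left open, and fake_kernel_tick_disabled() is just a
stand-in for the kernel-side write, not an existing interface.

```c
#include <stdatomic.h>

/* Hypothetical word the kernel would set to 1 when it stops the tick. */
static atomic_int tick_off_flag;

/* Stand-in for the kernel-side write; purely illustrative. */
void fake_kernel_tick_disabled(void)
{
	atomic_store_explicit(&tick_off_flag, 1, memory_order_release);
}

/*
 * Userspace side: spin reading memory without entering the kernel.
 * Returns the number of iterations spent waiting.
 */
unsigned long spin_until_tick_off(void)
{
	unsigned long spins = 0;

	while (!atomic_load_explicit(&tick_off_flag, memory_order_acquire))
		spins++;
	return spins;
}
```

The acquire load pairs with the release store so that anything written
before the flag is visible once the spin exits.  The spin burns power the
whole time, which is the trade-off against the hold-in-kernel approach.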

-- 
Chris Metcalf, Tilera Corp.
http://www.tilera.com
