Message-ID: <20110817023000.GD32132@somewhere.redhat.com>
Date:	Wed, 17 Aug 2011 04:30:02 +0200
From:	Frederic Weisbecker <fweisbec@...il.com>
To:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc:	LKML <linux-kernel@...r.kernel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Anton Blanchard <anton@....ibm.com>,
	Avi Kivity <avi@...hat.com>, Ingo Molnar <mingo@...e.hu>,
	Lai Jiangshan <laijs@...fujitsu.com>,
	Paul Menage <menage@...gle.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Stephen Hemminger <shemminger@...tta.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Tim Pepper <lnxninja@...ux.vnet.ibm.com>
Subject: Re: [PATCH 24/32] nohz/cpuset: Handle kernel entry/exit to account
 cputime

On Tue, Aug 16, 2011 at 01:38:20PM -0700, Paul E. McKenney wrote:
> On Mon, Aug 15, 2011 at 05:52:21PM +0200, Frederic Weisbecker wrote:
> > Provide a few APIs that archs can call to tell they are entering
> > or exiting the kernel so that when we are in nohz adaptive mode
> > we know precisely where we need to account the cputime.
> > 
> > The new APIs are:
> > 
> > - tick_nohz_enter_kernel() (called when we enter a syscall)
> > - tick_nohz_exit_kernel() (called when we exit a syscall)
> > - tick_nohz_enter_exception() (called when we enter any
> >   exception, trap, faults...but not irqs)
> > - tick_nohz_exit_exception() (called when we exit any exception)
> > 
> > Hooks into syscalls are typically driven by the TIF_NOHZ thread
> > flag.
> > 
> > In addition, we use the value returned by user_mode(regs) from
> > the timer interrupt to know where we are.
> > Nonetheless, we can rely on user_mode(regs) != 0 to know
> > we are in userspace, but we can't rely on user_mode(regs) == 0
> > to know we are in the system.
> > 
> > Consider the following scenario: we stop the tick after syscall
> > return, so we set TIF_NOHZ but the syscall exit hook is behind us.
> > If we haven't yet returned to userspace, then we have
> > user_mode(regs) == 0. If on top of that we consider we are in
> > system mode, and later we issue a syscall but restart the tick
> > right before reaching the syscall entry hook, then we have no clue
> > that the whole elapsed cputime was not in the system but in the
> > userspace.
> > 
> > The only way to fix this is to only start entering nohz mode once
> > we know we are in userspace a first time, like when we reach the
> > kernel exit hook or when a timer tick with user_mode(regs) == 1
> > fires. Kernel threads don't have this worry.
> > 
> > This sucks but for now I have no better solution. Let's hope we
> > can find better.
> > 
> > TODO: wrap operation on jiffies?
> 
> Hmmm...  Does the RCU dyntick-idle code need to know about exception
> entry and exit?
> 
> 							Thanx, Paul

At that point in the series it doesn't, because we don't yet call
rcu_enter_nohz() when switching to userspace. Instead we shut down the
tick and restart it when needed, i.e. when a remote CPU sends us an IPI
to complete a grace period.
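
To illustrate the behaviour described above, here is a toy userspace
model, not kernel code. Every name in it (toy_cpu, toy_stop_tick,
toy_rcu_ipi, toy_tick) is hypothetical and made up for this sketch. It
assumes that a tick interrupting userspace is what lets the CPU report
its quiescent state, and that the IPI does nothing but restart the tick.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model, NOT kernel code. All names are hypothetical. */
struct toy_cpu {
	bool tick_stopped;	/* periodic tick is currently shut down */
	bool qs_reported;	/* CPU has reported a quiescent state */
};

/* The CPU runs in userspace with nothing else to do: stop the tick. */
void toy_stop_tick(struct toy_cpu *cpu)
{
	assert(cpu != NULL);
	cpu->tick_stopped = true;
}

/* A remote CPU waiting on a grace period sends an IPI: restart the tick. */
void toy_rcu_ipi(struct toy_cpu *cpu)
{
	assert(cpu != NULL);
	cpu->tick_stopped = false;
}

/*
 * One timer period elapses. Returns true if the tick actually fired.
 * Assumption of this sketch: a tick that fires reports the quiescent state.
 */
bool toy_tick(struct toy_cpu *cpu)
{
	assert(cpu != NULL);
	if (cpu->tick_stopped)
		return false;
	cpu->qs_reported = true;
	return true;
}
```

With the tick stopped, toy_tick() never reports anything, so the grace
period would stall until toy_rcu_ipi() is called. That is the cost that
the later switch to extended quiescent states avoids.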

The patch that switches to extended quiescent states is 31/32, and it
handles syscalls and exceptions as well.

I wanted support for RCU extended quiescent states to come late in the
patchset so that it's considered an incremental feature and not a core
piece of the adaptive nohz (i.e. it's not mandatory, just an
optimization). This way we can use cpuset nohz without the RCU extended
quiescent state feature, which keeps that small part bisectable.

Patch 30 activates support for cpuset nohz (support from x86).
Patch 31 activates the rcu extended quiescent state support in
userspace as a bonus.
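
For reference, the accounting rule from the quoted changelog (only enter
nohz mode once we have seen userspace a first time) can be sketched as
the small state machine below. This is a simplified userspace model, not
the code from the patch. The struct and function names are hypothetical;
only the two conditions for "we know we are in userspace" come from the
changelog.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified model, NOT the patch itself. All names are hypothetical. */
struct nohz_model {
	bool seen_user;		/* userspace has been observed at least once */
	bool tick_stopped;	/* nohz mode entered */
};

/* Timer tick: only user_mode != 0 is conclusive; 0 tells us nothing. */
void model_timer_tick(struct nohz_model *m, bool user_mode)
{
	assert(m != NULL);
	if (user_mode)
		m->seen_user = true;
}

/* Kernel exit hook reached: we are about to run in userspace. */
void model_exit_kernel(struct nohz_model *m)
{
	assert(m != NULL);
	m->seen_user = true;
}

/* Refuse to stop the tick until userspace has been observed once. */
bool model_try_stop_tick(struct nohz_model *m)
{
	assert(m != NULL);
	if (!m->seen_user)
		return false;
	m->tick_stopped = true;
	return true;
}
```

A tick with user_mode == 0 leaves the model undecided, so
model_try_stop_tick() keeps failing until either the exit hook runs or a
tick lands in userspace.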