Message-ID: <20140613160002.GL6635@localhost.localdomain>
Date:	Fri, 13 Jun 2014 18:00:04 +0200
From:	Frederic Weisbecker <fweisbec@...il.com>
To:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc:	LKML <linux-kernel@...r.kernel.org>,
	Josh Triplett <josh@...htriplett.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Subject: Re: [PATCH] rcu: Only pin GP kthread when full dynticks is actually
 used

On Fri, Jun 13, 2014 at 08:52:33AM -0700, Paul E. McKenney wrote:
> On Fri, Jun 13, 2014 at 02:47:16PM +0200, Frederic Weisbecker wrote:
> > On Thu, Jun 12, 2014 at 06:35:15PM -0700, Paul E. McKenney wrote:
> > > On Thu, Jun 12, 2014 at 06:24:32PM -0700, Paul E. McKenney wrote:
> > > > On Fri, Jun 13, 2014 at 02:16:59AM +0200, Frederic Weisbecker wrote:
> > > > > CONFIG_NO_HZ_FULL may be enabled widely on distros nowadays but actual
> > > > > users should be a tiny minority, if any.
> > > > > 
> > > > > Also there is a risk that affining the GP kthread to a single CPU could
> > > > > end up noticeably reducing RCU performance and increasing energy
> > > > > consumption.
> > > > > 
> > > > > So let's affine the GP kthread only when nohz full is actually used
> > > > > (i.e. when the nohz_full= parameter is set or CONFIG_NO_HZ_FULL_ALL=y)
> > > 
> > > Which reminds me...  Kernel-heavy workloads running NO_HZ_FULL_ALL=y
> > > can see long RCU grace periods, as in about two seconds each.  It is
> > > not hard for me to detect this situation.
> > 
> > Ah yeah sounds quite long.
> > 
> > > Is there some way I can
> > > call for a given CPU's scheduling-clock interrupt to be turned on?
> > 
> > Yeah, once the nohz kick patchset (https://lwn.net/Articles/601214/) is merged,
> > a simple call to tick_nohz_full_kick_cpu() should do the trick. Although the
> > right condition must be there on the IPI side. Maybe with rcu_needs_cpu() or such.
> 
> I could record the offending GP, and make rcu_needs_cpu() return true
> if the current GP matches the offending one.
> 
> > But it would be interesting to identify the sources of these extended grace periods.
> > If we only restart the tick, we may ignore some deeper outstanding issue.
> 
> Some of them have been fixable by other means, but they will probably
> come back as system sizes grow.  And I really have put preemption points
> into kernel code in response to RCU CPU stall warnings, and the current
> state of NO_HZ_FULL effectively ignores these preemption points.  :-/

I'm not sure I really understand the issue, though. So you have RCU CPU stalls due
to very extended grace periods, right?

I'm not sure how preemption points would solve that. Or maybe you're
trying to trigger quiescent state reports through these preemption points?

Is it because we have dynticks CPUs staying too long in the kernel without
passing through any quiescent state? Are we perhaps missing some rcu_user_enter()
calls or the like?
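
For reference, the idea quoted above (record the offending GP and have
rcu_needs_cpu() return true while the current GP still matches it) could be
modelled roughly as below. This is only a user-space sketch under stated
assumptions, not kernel code: struct gp_state, gp_mark_offending(),
gp_advance() and gp_needs_tick() are hypothetical names made up for
illustration, and the real rcu_needs_cpu() has a different signature and
does more work.

```c
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct gp_state {
	unsigned long cur_gp;       /* number of the grace period in progress */
	unsigned long offending_gp; /* grace period flagged as too long, 0 = none */
};

/* Record the grace period currently in progress as the offending one. */
static void gp_mark_offending(struct gp_state *s)
{
	s->offending_gp = s->cur_gp;
}

/* Start a new grace period; an older offending mark no longer matches. */
static void gp_advance(struct gp_state *s)
{
	s->cur_gp++;
}

/*
 * Model of the "needs CPU" check: ask for the tick to be kept only
 * while the flagged grace period is still the current one.
 */
static bool gp_needs_tick(const struct gp_state *s)
{
	return s->offending_gp != 0 && s->offending_gp == s->cur_gp;
}
```

Since the mark is compared against the current GP number, the request for the
tick lapses by itself once that grace period ends, so nothing has to clear it
explicitly.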