lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20140708220556.GB4603@linux.vnet.ibm.com>
Date:	Tue, 8 Jul 2014 15:05:56 -0700
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	Frederic Weisbecker <fweisbec@...il.com>
Cc:	linux-kernel@...r.kernel.org, mingo@...nel.org,
	laijs@...fujitsu.com, dipankar@...ibm.com,
	akpm@...ux-foundation.org, mathieu.desnoyers@...icios.com,
	josh@...htriplett.org, niv@...ibm.com, tglx@...utronix.de,
	peterz@...radead.org, rostedt@...dmis.org, dhowells@...hat.com,
	edumazet@...gle.com, dvhart@...ux.intel.com, oleg@...hat.com,
	sbw@....edu
Subject: Re: [PATCH tip/core/rcu 11/17] rcu: Bind grace-period kthreads to
 non-NO_HZ_FULL CPUs

On Tue, Jul 08, 2014 at 10:40:11PM +0200, Frederic Weisbecker wrote:
> On Tue, Jul 08, 2014 at 12:58:37PM -0700, Paul E. McKenney wrote:
> > On Tue, Jul 08, 2014 at 08:38:47PM +0200, Frederic Weisbecker wrote:
> > > On Tue, Jul 08, 2014 at 08:47:23AM -0700, Paul E. McKenney wrote:
> > > > On Tue, Jul 08, 2014 at 05:24:00PM +0200, Frederic Weisbecker wrote:
> > > > > On Mon, Jul 07, 2014 at 03:38:15PM -0700, Paul E. McKenney wrote:
> > > > > > From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
> > > > > > 
> > > > > > Binding the grace-period kthreads to the timekeeping CPU resulted in
> > > > > > significant performance decreases for some workloads.  For more detail,
> > > > > > see:
> > > > > > 
> > > > > > https://lkml.org/lkml/2014/6/3/395 for benchmark numbers
> > > > > > 
> > > > > > https://lkml.org/lkml/2014/6/4/218 for CPU statistics
> > > > > > 
> > > > > > It turns out that it is necessary to bind the grace-period kthreads
> > > > > > to the timekeeping CPU only when all but CPU 0 is a nohz_full CPU
> > > > > > on the one hand or if CONFIG_NO_HZ_FULL_SYSIDLE=y on the other.
> > > > > > In other cases, it suffices to bind the grace-period kthreads to the
> > > > > > set of non-nohz_full CPUs.
> > > > > > 
> > > > > > This commit therefore creates a tick_nohz_not_full_mask that is the
> > > > > > complement of tick_nohz_full_mask, and then binds the grace-period
> > > > > > kthread to the set of CPUs indicated by this new mask, which covers
> > > > > > the CONFIG_NO_HZ_FULL_SYSIDLE=n case.  The CONFIG_NO_HZ_FULL_SYSIDLE=y
> > > > > > case still binds the grace-period kthreads to the timekeeping CPU.
> > > > > > This commit also includes the tick_nohz_full_enabled() check suggested
> > > > > > by Frederic Weisbecker.
> > > > > > 
> > > > > > Reported-by: Jet Chen <jet.chen@...el.com>
> > > > > > Signed-off-by: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
> > > > > > [ paulmck: Created housekeeping_affine() per fweisbec feedback. ]
> > > > > > ---
> > > > > >  include/linux/tick.h     | 19 +++++++++++++++++++
> > > > > >  kernel/rcu/tree_plugin.h | 14 +++++++++-----
> > > > > >  kernel/time/tick-sched.c | 10 ++++++++++
> > > > > >  3 files changed, 38 insertions(+), 5 deletions(-)
> > > > > > 
> > > > > > diff --git a/include/linux/tick.h b/include/linux/tick.h
> > > > > > index b84773cb9f4c..c39af3261351 100644
> > > > > > --- a/include/linux/tick.h
> > > > > > +++ b/include/linux/tick.h
> > > > > > @@ -12,6 +12,7 @@
> > > > > >  #include <linux/hrtimer.h>
> > > > > >  #include <linux/context_tracking_state.h>
> > > > > >  #include <linux/cpumask.h>
> > > > > > +#include <linux/sched.h>
> > > > > >  
> > > > > >  #ifdef CONFIG_GENERIC_CLOCKEVENTS
> > > > > >  
> > > > > > @@ -162,6 +163,7 @@ static inline u64 get_cpu_iowait_time_us(int cpu, u64 *unused) { return -1; }
> > > > > >  #ifdef CONFIG_NO_HZ_FULL
> > > > > >  extern bool tick_nohz_full_running;
> > > > > >  extern cpumask_var_t tick_nohz_full_mask;
> > > > > > +extern cpumask_var_t tick_nohz_not_full_mask;
> > > > > 
> > > > > So I'm still puzzled by this mask.
> > > > > 
> > > > > How about creating a temporary cpumask on top of tick_nohz_full_mask
> > > > > from housekeeping_affine().
> > > > > 
> > > > > If you wonder about performance, this can be called once for good from
> > > > > rcu_spawn_gp_kthread() (that would be much better than checking that all
> > > > > the time from the kthread itself anyway).
> > > > 
> > > > I was figuring that a fair number of the kthreads might eventually
> > > > be using this, not just for the grace-period kthreads.
> > > 
> > > Ok makes sense. But can we just rename the cpumask to housekeeping_mask?
> > 
> > Good point!  After all, it someday might be something other than the
> > complement of tick_nohz_full_mask.
> > 
> > > > In addition,
> > > > my concern about once-for-good affinity is for the NO_HZ_FULL_SYSIDLE=y
> > > > case, where moving the grace-period kthreads prevents ever entering
> > > > full-system idle state.
> > > > 
> > > > Or am I missing some use case?
> > > 
> > > No that's what I had in mind. But rcu_spawn_gp_kthread() still looks like
> > > a better place for that. Or am I missing something else?
> > 
> > My fear was that sysadmins would move it in the NO_HZ_FULL_SYSIDLE=y
> > case, in which case it needs to move back.
> 
> If you use kthread_bind(), this can't be overriden by the user. See PF_NO_SETAFFINITY.

Fair point.  This would be a kthread_bind_housekeeping(), then.

But I need to create an kthread_bind_mask() or some such that acts like
kthread_bind(), but which takes a cpumask rather than a cpu.  Easy
enough to do, but sounds like 3.18 material rather than 3.17 material.
So this is on my 3.18 list.

							Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ