Date:	Sat, 28 Apr 2012 10:21:27 -0700
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	Mike Galbraith <efault@....de>
Cc:	linux-kernel@...r.kernel.org, mingo@...e.hu, laijs@...fujitsu.com,
	dipankar@...ibm.com, akpm@...ux-foundation.org,
	mathieu.desnoyers@...ymtl.ca, josh@...htriplett.org,
	niv@...ibm.com, tglx@...utronix.de, peterz@...radead.org,
	rostedt@...dmis.org, Valdis.Kletnieks@...edu, dhowells@...hat.com,
	eric.dumazet@...il.com, darren@...art.com, fweisbec@...il.com,
	patches@...aro.org
Subject: Re: [PATCH RFC tip/core/rcu 6/6] rcu: Reduce cache-miss
 initialization latencies for large systems

On Sat, Apr 28, 2012 at 06:42:22AM +0200, Mike Galbraith wrote:
> On Fri, 2012-04-27 at 08:15 -0700, Paul E. McKenney wrote: 
> > On Fri, Apr 27, 2012 at 06:36:11AM +0200, Mike Galbraith wrote:
> > > On Mon, 2012-04-23 at 09:42 -0700, Paul E. McKenney wrote: 
> > > > From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
> > > > 
> > > > Commit #0209f649 (rcu: limit rcu_node leaf-level fanout) set an upper
> > > > limit of 16 on the leaf-level fanout for the rcu_node tree.  This was
> > > > needed to reduce lock contention that was induced by the synchronization
> > > > of scheduling-clock interrupts, which was in turn needed to improve
> > > > energy efficiency for moderate-sized lightly loaded servers.
> > > > 
> > > > However, reducing the leaf-level fanout means that there are more
> > > > leaf-level rcu_node structures in the tree, which in turn means that
> > > > RCU's grace-period initialization incurs more cache misses.  This is
> > > > not a problem on moderate-sized servers with only a few tens of CPUs,
> > > 
> > > With a distro config (4096 CPUs) interrupt latency is bad even on a
> > > quad.  Traversing empty nodes taking locks and cache misses hurts.
> > 
> > Agreed -- and I will be working on an additional patch that makes RCU
> > avoid initializing its data structures for CPUs that don't exist.
> 
> That's still on my todo list too, your initial patch (and my butchery
> thereof to skip taking lock) showed this helps a heap.

Indeed, I am a bit worried about the safety of skipping the lock in
that case.  Either way, if you would rather keep working on this, I
have no problem pushing it further down my todo list.

> > That said, increasing the leaf-level fanout from 16 to 64 should reduce
> > the latency pain by a factor of four.  In addition, I would expect that
> > real-time builds of the kernel would set NR_CPUS to some value much
> > smaller than 4096.  ;-)
> 
> Yup, else you would have heard whimpering months ago ;-)

I am not sure that it would have been exactly whimpering, but yes.  ;-)
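
For reference, the factor-of-four estimate above can be checked with a little arithmetic.  The following is only a rough sketch: it counts leaf-level nodes alone, ignores the upper levels of the tree, and does not reproduce how the kernel actually computes the rcu_node geometry.  The helper name leaf_nodes() is made up for illustration.

```python
def leaf_nodes(nr_cpus: int, leaf_fanout: int) -> int:
    """Leaf-level nodes needed to cover nr_cpus CPUs (hypothetical helper).

    Uses integer ceiling division, since a partially filled leaf still
    has to exist and still has to be visited during initialization.
    """
    return -(-nr_cpus // leaf_fanout)


NR_CPUS = 4096  # the distro-config value mentioned in this thread

for fanout in (16, 64):
    print(f"leaf fanout {fanout:2d}: {leaf_nodes(NR_CPUS, fanout)} leaf nodes")
```

With NR_CPUS=4096 this gives 256 leaf nodes at a fanout of 16 and 64 leaf nodes at a fanout of 64, so grace-period initialization has a quarter as many leaves to touch.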

							Thanx, Paul
