Date:	Sat, 28 Apr 2012 06:42:22 +0200
From:	Mike Galbraith <efault@....de>
To:	paulmck@...ux.vnet.ibm.com
Cc:	linux-kernel@...r.kernel.org, mingo@...e.hu, laijs@...fujitsu.com,
	dipankar@...ibm.com, akpm@...ux-foundation.org,
	mathieu.desnoyers@...ymtl.ca, josh@...htriplett.org,
	niv@...ibm.com, tglx@...utronix.de, peterz@...radead.org,
	rostedt@...dmis.org, Valdis.Kletnieks@...edu, dhowells@...hat.com,
	eric.dumazet@...il.com, darren@...art.com, fweisbec@...il.com,
	patches@...aro.org
Subject: Re: [PATCH RFC tip/core/rcu 6/6] rcu: Reduce cache-miss
 initialization latencies for large systems

On Fri, 2012-04-27 at 08:15 -0700, Paul E. McKenney wrote: 
> On Fri, Apr 27, 2012 at 06:36:11AM +0200, Mike Galbraith wrote:
> > On Mon, 2012-04-23 at 09:42 -0700, Paul E. McKenney wrote: 
> > > From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
> > > 
> > > Commit #0209f649 (rcu: limit rcu_node leaf-level fanout) set an upper
> > > limit of 16 on the leaf-level fanout for the rcu_node tree.  This was
> > > needed to reduce lock contention that was induced by the synchronization
> > > of scheduling-clock interrupts, which was in turn needed to improve
> > > energy efficiency for moderate-sized lightly loaded servers.
> > > 
> > > However, reducing the leaf-level fanout means that there are more
> > > leaf-level rcu_node structures in the tree, which in turn means that
> > > RCU's grace-period initialization incurs more cache misses.  This is
> > > not a problem on moderate-sized servers with only a few tens of CPUs,
> > 
> > With a distro config (4096 CPUs) interrupt latency is bad even on a
> > quad.  Traversing empty nodes taking locks and cache misses hurts.
> 
> Agreed -- and I will be working on an additional patch that makes RCU
> avoid initializing its data structures for CPUs that don't exist.

That's still on my todo list too.  Your initial patch (and my butchery
thereof to skip taking the lock) showed this helps a heap.

> That said, increasing the leaf-level fanout from 16 to 64 should reduce
> the latency pain by a factor of four.  In addition, I would expect that
> real-time builds of the kernel would set NR_CPUS to some value much
> smaller than 4096.  ;-)

Yup, else you would have heard whimpering months ago ;-)
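
For a rough feel of where the factor of four comes from, here is a
back-of-the-envelope sketch.  It assumes the cost of grace-period
initialization scales with the number of leaf-level rcu_node structures
visited, and it ignores the interior levels of the tree.  The function
name is made up for illustration and is not a kernel interface.

```python
def leaf_node_count(nr_cpus: int, leaf_fanout: int) -> int:
    """Leaf-level nodes needed to cover nr_cpus CPUs (ceiling division).

    Hypothetical helper for illustration only, not a kernel API.
    """
    return -(-nr_cpus // leaf_fanout)


NR_CPUS = 4096  # the distro config value mentioned above

for fanout in (16, 64):
    # Assumption: each leaf node costs one lock acquisition plus a
    # likely cache miss during grace-period initialization.
    print(f"leaf fanout {fanout}: {leaf_node_count(NR_CPUS, fanout)} leaf nodes")
```

With NR_CPUS=4096 that is 256 leaf nodes at a fanout of 16 versus 64 at
a fanout of 64, so four times fewer leaves to walk, whether or not the
CPUs behind them exist.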

-Mike
