linux-kernel - Re: [PATCH tip/core/rcu 6/7] rcu: Drive quiescent-state-forcing delay from HZ

Message-ID: <20130516094519.GJ19669@dyad.programming.kicks-ass.net>
Date:	Thu, 16 May 2013 11:45:19 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc:	Josh Triplett <josh@...htriplett.org>,
	linux-kernel@...r.kernel.org, mingo@...e.hu, laijs@...fujitsu.com,
	dipankar@...ibm.com, akpm@...ux-foundation.org,
	mathieu.desnoyers@...ymtl.ca, niv@...ibm.com, tglx@...utronix.de,
	rostedt@...dmis.org, Valdis.Kletnieks@...edu, dhowells@...hat.com,
	edumazet@...gle.com, darren@...art.com, fweisbec@...il.com,
	sbw@....edu
Subject: Re: [PATCH tip/core/rcu 6/7] rcu: Drive quiescent-state-forcing
 delay from HZ

On Wed, May 15, 2013 at 10:31:42AM -0700, Paul E. McKenney wrote:
> On Wed, May 15, 2013 at 11:02:34AM +0200, Peter Zijlstra wrote:

> > Earlier you said that improving EQS behaviour was expensive in that it
> > would require taking (global) locks or somesuch.
> > 
> > Would it not be possible to have the cpu performing a FQS finish this
> > work; that way the first FQS would be a little slow, but after that no
> > FQS would be needed anymore, right? Since we'd no longer require the
> > other CPUs to end a grace period.
> 
> It is not just the first FQS that would be slow, it would also be slow
> the next time that this CPU transitioned from idle to non-idle, which
> is when this work would need to be undone.

Hurm, yes I suppose that is true. If you've saved more on FQS cost it might be
worth it for the throughput people though.

But somehow I imagined making a CPU part of the GP would be easier than taking
it out. After all, taking it out is dangerous and careful work, one is not to
accidentally execute a callback or otherwise end a GP before time.

When entering the GP cycle there is no such concern, the CPU state is clean
after all.

> Furthermore, in this approach, RCU would still need to scan all the CPUs
> to see if any did the first part of the transition to idle.  And if we
> have to scan either way, why not keep the idle-nonidle transitions cheap
> and continue to rely on the scan?  Here are the rationales I can think
> of and what I am thinking in terms of doing instead:
> 
> 1.	The scan could become a scalability bottleneck.  There is one
> 	way to handle this today, and one possible future change.  The way
> 	to handle this today is to increase rcutree.jiffies_till_first_fqs,
> 	for example, the SGI guys set it to 20 or thereabouts.  If this
> 	becomes problematic, I could easily create multiple kthreads to
> 	carry out the FQS scan in parallel for large systems.

*groan* whoever thought all this SMP nonsense was worth it again? :-)

> 2.	Someone could demonstrate that RCU's grace periods were significantly
> 	delaying boot.  There are several ways of dealing with this:

Surely there's also non-boot cases where most of the machine is 'idle' and
we're running into FQS? Esp. now with that userspace NO_HZ stuff from Frederic.
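
The trade-off discussed above (cheap idle/non-idle transitions, paid for by
a scan over all CPUs) can be modelled in user space. What follows is a
minimal sketch under stated assumptions, not the kernel's implementation:
every identifier (model_cpu, model_fqs_first, model_fqs_rescan,
MODEL_NR_CPUS) is hypothetical and invented for illustration. It assumes a
per-CPU counter that is even while the CPU is idle and odd otherwise, so a
transition costs one atomic increment, and the scanner treats a CPU as
quiescent if the counter is even or has advanced by at least two since the
snapshot. The scan functions take a [lo, hi) range, so the range could in
principle be split among several scanning threads.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_CPUS 8	/* hypothetical size, for illustration only */

/* Hypothetical per-CPU state: counter is even while idle, odd while busy. */
struct model_cpu {
	atomic_uint dynticks;	/* bumped on every idle<->non-idle transition */
	unsigned int snap;	/* counter value recorded by the first scan */
	bool qs_seen;		/* scanner has observed a quiescent state */
};

static struct model_cpu model_cpus[MODEL_NR_CPUS];

/* Start every modelled CPU in the non-idle (odd) state. */
static void model_init(void)
{
	for (size_t i = 0; i < MODEL_NR_CPUS; i++) {
		atomic_store(&model_cpus[i].dynticks, 1u);
		model_cpus[i].snap = 0;
		model_cpus[i].qs_seen = false;
	}
}

/* The cheap part: a transition is one atomic increment, no locks taken. */
static void model_idle_enter(struct model_cpu *c)
{
	atomic_fetch_add(&c->dynticks, 1u);	/* odd -> even */
}

static void model_idle_exit(struct model_cpu *c)
{
	atomic_fetch_add(&c->dynticks, 1u);	/* even -> odd */
}

/* First scan of [lo, hi): snapshot counters; idle CPUs count as quiescent.
 * Returns the number of CPUs in the range still holding out. */
static size_t model_fqs_first(size_t lo, size_t hi)
{
	size_t pending = 0;

	assert(lo <= hi && hi <= MODEL_NR_CPUS);
	for (size_t i = lo; i < hi; i++) {
		struct model_cpu *c = &model_cpus[i];

		c->snap = atomic_load(&c->dynticks);
		c->qs_seen = (c->snap & 1u) == 0;
		if (!c->qs_seen)
			pending++;
	}
	return pending;
}

/* Later scan of [lo, hi): a CPU is quiescent if it is idle now, or if it
 * passed through idle since the snapshot (counter advanced by >= 2). */
static size_t model_fqs_rescan(size_t lo, size_t hi)
{
	size_t pending = 0;

	assert(lo <= hi && hi <= MODEL_NR_CPUS);
	for (size_t i = lo; i < hi; i++) {
		struct model_cpu *c = &model_cpus[i];
		unsigned int cur = atomic_load(&c->dynticks);

		if (!c->qs_seen && ((cur & 1u) == 0 || cur - c->snap >= 2u))
			c->qs_seen = true;
		if (!c->qs_seen)
			pending++;
	}
	return pending;
}
```

In this model the scan cost grows with the number of CPUs while the
transition cost stays constant, which is the shape of the trade-off being
argued about; it says nothing about the real costs on any given machine.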