linux-kernel - Re: [PATCH tip/core/rcu 23/23] rcu: Simplify quiescent-state detection

Message-ID: <20120906211858.GA26173@Krystal>
Date:	Thu, 6 Sep 2012 17:18:59 -0400
From:	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
To:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc:	Peter Zijlstra <peterz@...radead.org>,
	linux-kernel@...r.kernel.org, mingo@...e.hu, laijs@...fujitsu.com,
	dipankar@...ibm.com, akpm@...ux-foundation.org,
	josh@...htriplett.org, niv@...ibm.com, tglx@...utronix.de,
	rostedt@...dmis.org, Valdis.Kletnieks@...edu, dhowells@...hat.com,
	eric.dumazet@...il.com, darren@...art.com, fweisbec@...il.com,
	sbw@....edu, patches@...aro.org,
	"Paul E. McKenney" <paul.mckenney@...aro.org>
Subject: Re: [PATCH tip/core/rcu 23/23] rcu: Simplify quiescent-state
 detection

* Paul E. McKenney (paulmck@...ux.vnet.ibm.com) wrote:
> On Thu, Sep 06, 2012 at 04:36:02PM +0200, Peter Zijlstra wrote:
> > On Thu, 2012-08-30 at 11:18 -0700, Paul E. McKenney wrote:
> > > From: "Paul E. McKenney" <paul.mckenney@...aro.org>
> > > 
> > > The current quiescent-state detection algorithm is needlessly
> > > complex.
> > 
> > Heh! Be careful, we might be led into believing all this RCU is actually
> > really rather simple and this complexity is a bug on your end ;-)
> 
> Actually, the smallest "toy" implementation of RCU is only about 20
> lines of code -- and on a mythical sequentially consistent system, it
> would be smaller still.  Of course, the Linux kernel implementation is
> somewhat larger.  Something about wanting scalability above a few tens of
> CPUs, real-time response (also on huge numbers of CPUs), energy-efficient
> handling of dyntick-idle mode, detection of stalled CPUs, and so on.  ;-)
> 
> > >   It records the grace-period number corresponding to
> > > the quiescent state at the time of the quiescent state, which
> > > works, but it seems better to simply erase any record of previous
> > > quiescent states at the time that the CPU notices the new grace
> > > period.  This has the further advantage of removing another piece
> > > of RCU for which lockless reasoning is required. 
> > 
> > So why didn't you do that from the start? :-)
> 
> Because I was slow and stupid!  ;-)
> 
> > That is, I'm curious to know some history, why was it so and what led
> > you to this insight?
> 
> I had historically (as in for almost 20 years now) used a counter
> to track grace periods.  Now these are in theory subject to integer
> overflow, but DYNIX/ptx was non-preemptible, so the general line of
> reasoning was that anything that might stall long enough for even a 32-bit
> grace-period counter to overflow would necessarily stall grace periods,
> thus preventing overflow.
> 
> Of course, the advent of CONFIG_PREEMPT in the Linux kernel invalidated
> this assumption, but for most uses, if the grace-period counter overflows,
> you have waited way more than a grace period, so who cares?
> 
> Then the combination of TREE_RCU and dyntick-idle came along, and it became
> necessary to more carefully associate quiescent states with the corresponding
> grace period.  Now here overflow is dangerous, because it can result in
> associating an ancient quiescent state with the current grace period.
> But my attitude was that if you have a task preempted for more than one
> year, getting soft-lockup warnings every two minutes during that time,
> well, you got what you deserved.  And even then at very low probability.
> 
> However, formal validation software (such as Promela) does not take kindly
> to free-running counters.  The usual trick is to use a much narrower
> counter.  But that would mean that any attempted mechanical validation
> would give a big fat false positive on the counter used to associate
> quiescent states with grace periods.  Because I have a long-term goal
> of formally validating RCU as it sits in the Linux kernel, that counter
> had to go.
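
To make the two bookkeeping schemes discussed above concrete, here is a
minimal sketch in C. It is a toy model, not the kernel's code: the type
gp_t, the struct cpu_qs and all function names are hypothetical, the
grace-period counter is deliberately only 8 bits wide (the "much narrower
counter" trick) so that wraparound is easy to reach, and "noticing a new
grace period" is modeled as an explicit function call.

```c
#include <stdbool.h>
#include <stdint.h>

/* Deliberately narrow grace-period counter (assumption: 8 bits, for illustration). */
typedef uint8_t gp_t;

/* Hypothetical per-CPU bookkeeping; field names are illustrative only. */
struct cpu_qs {
	gp_t recorded_gp;	/* counter scheme: grace period seen at the quiescent state */
	bool recorded;		/* counter scheme: has a quiescent state been recorded? */
	bool passed;		/* flag scheme: quiescent state seen since the last new grace period? */
};

/* Counter scheme: tag the quiescent state with the current grace-period number. */
static inline void counter_note_qs(struct cpu_qs *c, gp_t cur_gp)
{
	c->recorded_gp = cur_gp;
	c->recorded = true;
}

/* Counter scheme: the record counts only if its number matches the current one. */
static inline bool counter_qs_counts(const struct cpu_qs *c, gp_t cur_gp)
{
	return c->recorded && c->recorded_gp == cur_gp;
}

/* Flag scheme: noticing a new grace period erases any earlier record. */
static inline void flag_note_new_gp(struct cpu_qs *c)
{
	c->passed = false;
}

/* Flag scheme: a quiescent state simply sets the flag. */
static inline void flag_note_qs(struct cpu_qs *c)
{
	c->passed = true;
}

/* Flag scheme: no counter comparison is needed. */
static inline bool flag_qs_counts(const struct cpu_qs *c)
{
	return c->passed;
}

/* Advance the grace-period counter n times; the result wraps modulo 256. */
static inline gp_t gp_advance(gp_t gp, unsigned int n)
{
	return (gp_t)(gp + n);
}
```

In this model, a quiescent state recorded under grace period 5 matches
again once the counter has advanced 256 times and wrapped back to 5, which
is the "ancient quiescent state" hazard described above. The flag scheme
has no number to compare, so a stale record cannot be mistaken for a
current one once the new grace period has been noticed.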

I believe this approach brings the kernel RCU implementation slightly
closer to the userspace RCU implementation we use for 32-bit QSBR and
the 32/64-bit "urcu mb" variant for libraries, of which we've indeed
been able to make a complete formal model in Promela. Simplifying the
algorithm (mainly its state-space) in order to let formal verifiers cope
with it entirely has a lot of value I think: it lets us mechanically
verify safety and progress. A nice way to lessen the number of headaches
caused by RCU! ;-)

Thanks!

Mathieu

> 
> And I do believe that the result is easier for humans to understand, so
> it is all to the good.
> 
> This time, at least.  ;-)
> 
> 							Thanx, Paul
> 

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com