linux-kernel - Re: [PATCH RFC tip/core/rcu] rcu: direct algorithmic SRCU implementation


Message-ID: <20120216105005.GA11674@Krystal>
Date:	Thu, 16 Feb 2012 05:50:06 -0500
From:	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
To:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc:	Peter Zijlstra <peterz@...radead.org>,
	linux-kernel@...r.kernel.org, mingo@...e.hu, laijs@...fujitsu.com,
	dipankar@...ibm.com, akpm@...ux-foundation.org,
	josh@...htriplett.org, niv@...ibm.com, tglx@...utronix.de,
	rostedt@...dmis.org, Valdis.Kletnieks@...edu, dhowells@...hat.com,
	eric.dumazet@...il.com, darren@...art.com, fweisbec@...il.com,
	patches@...aro.org, Avi Kiviti <avi@...hat.com>,
	Chris Mason <chris.mason@...cle.com>,
	Eric Paris <eparis@...hat.com>
Subject: Re: [PATCH RFC tip/core/rcu] rcu: direct algorithmic SRCU
	implementation

* Paul E. McKenney (paulmck@...ux.vnet.ibm.com) wrote:
> On Wed, Feb 15, 2012 at 01:59:23PM +0100, Peter Zijlstra wrote:
> > On Sun, 2012-02-12 at 18:09 -0800, Paul E. McKenney wrote:
> > > The current implementation of synchronize_srcu_expedited() can cause
> > > severe OS jitter due to its use of synchronize_sched(), which in turn
> > > invokes try_stop_cpus(), which causes each CPU to be sent an IPI.
> > > This can result in severe performance degradation for real-time workloads
> > > > and especially for short-iteration-length HPC workloads.  Furthermore,
> > > because only one instance of try_stop_cpus() can be making forward progress
> > > at a given time, only one instance of synchronize_srcu_expedited() can
> > > make forward progress at a time, even if they are all operating on
> > > distinct srcu_struct structures.
> > > 
> > > This commit, inspired by an earlier implementation by Peter Zijlstra
> > > (https://lkml.org/lkml/2012/1/31/211) and by further offline discussions,
> > > takes a strictly algorithmic bits-in-memory approach.  This has the
> > > disadvantage of requiring one explicit memory-barrier instruction in
> > > each of srcu_read_lock() and srcu_read_unlock(), but on the other hand
> > > completely dispenses with OS jitter and furthermore allows SRCU to be
> > > used freely by CPUs that RCU believes to be idle or offline.
> > > 
> > > The update-side implementation handles the single read-side memory
> > > barrier by rechecking the per-CPU counters after summing them and
> > > by running through the update-side state machine twice.
> > 
> > Yeah, getting rid of that second memory barrier in srcu_read_lock() is
> > pure magic :-)
> > 
> > > This implementation has passed moderate rcutorture testing on both 32-bit
> > > x86 and 64-bit Power.  A call_srcu() function will be present in a later
> > > version of this patch.
> > 
> > Goodness ;-)
> 
> Glad you like the magic and the prospect of call_srcu().  ;-)
> 
> > > @@ -131,10 +214,11 @@ int __srcu_read_lock(struct srcu_struct *sp)
> > >  	int idx;
> > >  
> > >  	preempt_disable();
> > > -	idx = sp->completed & 0x1;
> > > -	barrier();  /* ensure compiler looks -once- at sp->completed. */
> > > -	per_cpu_ptr(sp->per_cpu_ref, smp_processor_id())->c[idx]++;
> > > -	srcu_barrier();  /* ensure compiler won't misorder critical section. */
> > > +	idx = rcu_dereference_index_check(sp->completed,
> > > +					  rcu_read_lock_sched_held()) & 0x1;
> > > +	ACCESS_ONCE(per_cpu_ptr(sp->per_cpu_ref, smp_processor_id())->c[idx]) +=
> > > +		SRCU_USAGE_COUNT + 1;
> > > +	smp_mb(); /* B */  /* Avoid leaking the critical section. */
> > >  	preempt_enable();
> > >  	return idx;
> > >  }
> > 
> > You could use __this_cpu_* muck to shorten some of that.
> 
> Ah, so something like this?
> 
> 	ACCESS_ONCE(this_cpu_ptr(sp->per_cpu_ref)->c[idx]) += 
> 		SRCU_USAGE_COUNT + 1;
> 
> Now that you mention it, this does look nicer, applied here and to
> srcu_read_unlock().

I think Peter refers to __this_cpu_add().
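
A rough sketch of the difference, as a userspace model rather than kernel
code. In the kernel, the shorter form would presumably look something like
__this_cpu_add(sp->per_cpu_ref->c[idx], SRCU_USAGE_COUNT + 1), but that exact
line is an assumption and is not taken from the patch. Every model_* name
below is hypothetical, as is the value chosen for the usage-count constant;
the model only shows that the pointer-then-update form and the add-style form
touch the same per-CPU slot.

```c
#include <limits.h>

/* Hypothetical stand-ins for kernel definitions; values are assumptions. */
#define MODEL_NR_CPUS		4
#define MODEL_USAGE_BITS	2
#define MODEL_REF_MASK		(ULONG_MAX >> MODEL_USAGE_BITS)
#define MODEL_USAGE_COUNT	(MODEL_REF_MASK + 1)

/* Models the per-CPU counter pair indexed by idx in the quoted hunk. */
struct model_srcu_array {
	unsigned long c[2];
};

static struct model_srcu_array model_per_cpu_ref[MODEL_NR_CPUS];
static int model_cpu;	/* stand-in for the current CPU number */

/* Long form: fetch the current CPU's pointer, then update through it. */
static void model_lock_via_ptr(int idx)
{
	struct model_srcu_array *p = &model_per_cpu_ref[model_cpu];

	/* The volatile cast plays the role of ACCESS_ONCE() in this model. */
	*(volatile unsigned long *)&p->c[idx] += MODEL_USAGE_COUNT + 1;
}

/* Short form: one macro that names the field and the increment. */
#define model_this_cpu_add(field, val) \
	(model_per_cpu_ref[model_cpu].field += (val))

static void model_lock_via_add(int idx)
{
	model_this_cpu_add(c[idx], MODEL_USAGE_COUNT + 1);
}
```

Both helpers add the same amount to the same slot, so the choice is about
brevity and about what the architecture can do with a single per-CPU add. The
model does not capture preemption or memory ordering.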

Thanks,

Mathieu

> 
> > Acked-by: Peter Zijlstra <a.p.zijlstra@...llo.nl>
> 
> 
> 							Thanx, Paul
> 

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com