linux-kernel - Re: [PATCH RFC tip/core/rcu 05/14] rcu: Abstract sequence counting from synchronize_sched

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20150701221804.GW3717@linux.vnet.ibm.com>
Date:	Wed, 1 Jul 2015 15:18:04 -0700
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	linux-kernel@...r.kernel.org, mingo@...nel.org,
	laijs@...fujitsu.com, dipankar@...ibm.com,
	akpm@...ux-foundation.org, mathieu.desnoyers@...icios.com,
	josh@...htriplett.org, tglx@...utronix.de, rostedt@...dmis.org,
	dhowells@...hat.com, edumazet@...gle.com, dvhart@...ux.intel.com,
	fweisbec@...il.com, oleg@...hat.com, bobby.prani@...il.com
Subject: Re: [PATCH RFC tip/core/rcu 05/14] rcu: Abstract sequence counting
 from synchronize_sched_expedited()

On Wed, Jul 01, 2015 at 12:27:17PM +0200, Peter Zijlstra wrote:
> On Tue, Jun 30, 2015 at 03:25:45PM -0700, Paul E. McKenney wrote:
> > From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
> > 
> > This commit creates rcu_exp_gp_seq_start() and rcu_exp_gp_seq_end() to
> > bracket an expedited grace period, rcu_exp_gp_seq_snap() to snapshot the
> > sequence counter, and rcu_exp_gp_seq_done() to check to see if a full
> > expedited grace period has elapsed since the snapshot.  These will be
> > applied to synchronize_rcu_expedited().  These are defined in terms of
> > underlying rcu_seq_start(), rcu_seq_end(), rcu_seq_snap(), rcu_seq_done(),
> > which will be applied to _rcu_barrier().
> 
> It would be good to explain why you cannot use seqcount primitives.
> They're >.< close.

They are indeed!  I gave it some thought, but it would inflict an
unnecessary smp_mb() on seqlocks, as you note below.

> > Signed-off-by: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
> > ---
> >  kernel/rcu/tree.c | 68 +++++++++++++++++++++++++++++++++++++++++++++++--------
> >  1 file changed, 58 insertions(+), 10 deletions(-)
> > 
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index c58fd27b4a22..f96500e462fd 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -3307,6 +3307,60 @@ void cond_synchronize_sched(unsigned long oldstate)
> >  }
> >  EXPORT_SYMBOL_GPL(cond_synchronize_sched);
> >  
> > +/* Adjust sequence number for start of update-side operation. */
> > +static void rcu_seq_start(unsigned long *sp)
> > +{
> > +	WRITE_ONCE(*sp, *sp + 1);
> > +	smp_mb(); /* Ensure update-side operation after counter increment. */
> > +	WARN_ON_ONCE(!(*sp & 0x1));
> > +}
> 
> That wants to be an ACQUIRE, right?

I cannot put the acquire in the WARN_ON_ONCE() because there
are configurations where WARN_ON_ONCE() is compiled out.  I could
conditionally compile, but given that this is nothing like a fastpath,
I cannot really justify doing that.

We could define an smp_store_acquire(), but that would require a full
barrier against subsequent loads.  The C++ committee hit this one when
trying to implement seqeunce locking using the C/C++11 atomics.  ;-)

> > +
> > +/* Adjust sequence number for end of update-side operation. */
> > +static void rcu_seq_end(unsigned long *sp)
> > +{
> > +	smp_mb(); /* Ensure update-side operation before counter increment. */
> 
> And that wants to be a RELEASE, right?
> 
> > +	WRITE_ONCE(*sp, *sp + 1);
> 
> 	smp_store_release();
> 
> even if balanced against a full barrier, might be better here?

I -think- it -might- be, and if it was in a fastpath, I might be
more motivated to worry about it.  I am not so sure that pairing an
smp_store_release() with a full memory barrier is in any way an aid to
readability, though.

> > +	WARN_ON_ONCE(*sp & 0x1);
> > +}
> 
> And the only difference between these and
> raw_write_seqcount_{begin,end}() is the smp_wmb() vs your smp_mb().
> 
> Since seqcounts have a distinct read vs writer side, we really only care
> about limiting the stores. I suspect you really do care about reads
> between these 'sequence points'. A few words to that effect could
> explain the existence of these primitives.

Excellent point!  I have updated the commit log accordingly.

> > +/* Take a snapshot of the update side's sequence number. */
> > +static unsigned long rcu_seq_snap(unsigned long *sp)
> > +{
> > +	unsigned long s;
> > +
> > +	smp_mb(); /* Caller's modifications seen first by other CPUs. */
> > +	s = (READ_ONCE(*sp) + 3) & ~0x1;
> > +	smp_mb(); /* Above access must not bleed into critical section. */
> 
> 	smp_load_acquire() then?

I have transitivity concerns.  Which might well be baseless, but again,
this is nowhere near a fastpath.

> > +	return s;
> > +}
> > +
> > +/*
> > + * Given a snapshot from rcu_seq_snap(), determine whether or not a
> > + * full update-side operation has occurred.
> > + */
> > +static bool rcu_seq_done(unsigned long *sp, unsigned long s)
> > +{
> > +	return ULONG_CMP_GE(READ_ONCE(*sp), s);
> 
> I'm always amused you're not wanting to rely on 2s complement for
> integer overflow. I _know_ its undefined behaviour in the C rule book,
> but the entire rest of the kernel hard assumes it.

I take it you have never seen the demonic glow in the eyes of a compiler
implementer when thinking of all the code that can be broken^W^W^W^W^W
optimizations that are enabled by relying on undefined behavior for
signed integer overflow?  ;-)

> > +}
> > +
> > +/* Wrapper functions for expedited grace periods.  */
> > +static void rcu_exp_gp_seq_start(struct rcu_state *rsp)
> > +{
> > +	rcu_seq_start(&rsp->expedited_sequence);
> > +}
> > +static void rcu_exp_gp_seq_end(struct rcu_state *rsp)
> > +{
> > +	rcu_seq_end(&rsp->expedited_sequence);
> > +}
> > +static unsigned long rcu_exp_gp_seq_snap(struct rcu_state *rsp)
> > +{
> > +	return rcu_seq_snap(&rsp->expedited_sequence);
> > +}
> > +static bool rcu_exp_gp_seq_done(struct rcu_state *rsp, unsigned long s)
> > +{
> > +	return rcu_seq_done(&rsp->expedited_sequence, s);
> > +}
> 
> This is wrappers for wrappers sake? Why?

For _rcu_barrier(), as noted in the commit log.

							Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/