Message-ID: <20170804014326.7ea32fdb@roar.ozlabs.ibm.com>
Date: Fri, 4 Aug 2017 01:43:26 +1000
From: Nicholas Piggin <npiggin@...il.com>
To: Will Deacon <will.deacon@....com>
Cc: Peter Zijlstra <peterz@...radead.org>,
torvalds@...ux-foundation.org, oleg@...hat.com,
paulmck@...ux.vnet.ibm.com, benh@...nel.crashing.org,
mpe@...erman.id.au, linux-kernel@...r.kernel.org, mingo@...nel.org,
stern@...land.harvard.edu
Subject: Re: [PATCH -v2 3/4] locking: Introduce smp_mb__after_spinlock().

On Thu, 3 Aug 2017 16:28:20 +0100
Will Deacon <will.deacon@....com> wrote:
> On Wed, Aug 02, 2017 at 01:38:40PM +0200, Peter Zijlstra wrote:
> > Since its inception, our understanding of ACQUIRE, esp. as applied to
> > spinlocks, has changed somewhat. Also, I wonder if, with a simple
> > change, we cannot make it provide more.
> >
> > The problem with the comment is that the STORE done by spin_lock isn't
> > itself ordered by the ACQUIRE, and therefore a later LOAD can pass over
> > it and cross with any prior STORE, rendering the default WMB
> > insufficient (pointed out by Alan).
> >
> > Now, this is only really a problem on PowerPC and ARM64, both of
> > which already defined smp_mb__before_spinlock() as a smp_mb().
> >
> > At the same time, we can get a much stronger construct if we place
> > that same barrier _inside_ the spin_lock(). In that case we upgrade
> > the RCpc spinlock to an RCsc. That would make all schedule() calls
> > fully transitive against one another.
> >
> > Cc: Alan Stern <stern@...land.harvard.edu>
> > Cc: Nicholas Piggin <npiggin@...il.com>
> > Cc: Ingo Molnar <mingo@...nel.org>
> > Cc: Will Deacon <will.deacon@....com>
> > Cc: Linus Torvalds <torvalds@...ux-foundation.org>
> > Cc: Michael Ellerman <mpe@...erman.id.au>
> > Cc: Oleg Nesterov <oleg@...hat.com>
> > Cc: Benjamin Herrenschmidt <benh@...nel.crashing.org>
> > Cc: Paul McKenney <paulmck@...ux.vnet.ibm.com>
> > Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
> > ---
> > arch/arm64/include/asm/spinlock.h | 2 ++
> > arch/powerpc/include/asm/spinlock.h | 3 +++
> > include/linux/atomic.h | 3 +++
> > include/linux/spinlock.h | 36 ++++++++++++++++++++++++++++++++++++
> > kernel/sched/core.c | 4 ++--
> > 5 files changed, 46 insertions(+), 2 deletions(-)
> >
> > --- a/arch/arm64/include/asm/spinlock.h
> > +++ b/arch/arm64/include/asm/spinlock.h
> > @@ -367,5 +367,7 @@ static inline int arch_read_trylock(arch
> > * smp_mb__before_spinlock() can restore the required ordering.
> > */
> > #define smp_mb__before_spinlock() smp_mb()
> > +/* See include/linux/spinlock.h */
> > +#define smp_mb__after_spinlock() smp_mb()
> >
> > #endif /* __ASM_SPINLOCK_H */
>
> Acked-by: Will Deacon <will.deacon@....com>

Yeah this looks good to me. I don't think there would ever be a reason
to use smp_mb__before_spinlock() rather than smp_mb__after_spinlock().
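
For readers following along, here is a rough user-space sketch of the pattern under discussion: a store before taking the lock, a full barrier placed right after the acquire, then a load inside the critical section. It is not the kernel implementation. Every demo_* name is hypothetical, the lock is a toy built on C11 atomics, and a C11 seq_cst fence only approximates what smp_mb() guarantees in the kernel memory model.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a kernel spinlock (assumption, not kernel code). */
struct demo_spinlock {
	atomic_bool locked;
};

static inline void demo_spin_lock(struct demo_spinlock *l)
{
	/*
	 * ACQUIRE only: accesses after the lock cannot move before the
	 * load half of the exchange, but the store that marks the lock
	 * as taken is not itself ordered against later loads.
	 */
	while (atomic_exchange_explicit(&l->locked, true, memory_order_acquire))
		;
}

static inline void demo_spin_unlock(struct demo_spinlock *l)
{
	/* Debug check: unlocking a lock that is not held is a bug. */
	assert(atomic_load_explicit(&l->locked, memory_order_relaxed));
	atomic_store_explicit(&l->locked, false, memory_order_release);
}

/* Hypothetical analogue of smp_mb__after_spinlock(): a full fence. */
static inline void demo_mb_after_spinlock(void)
{
	atomic_thread_fence(memory_order_seq_cst);
}

static struct demo_spinlock demo_lock;	/* zero-initialized: unlocked */
static atomic_int demo_condition;	/* written before the lock is taken */
static atomic_int demo_state;		/* read inside the critical section */

/*
 * Store, lock, full barrier, load. The fence keeps the load of
 * demo_state from being satisfied ahead of the earlier store to
 * demo_condition, which the ACQUIRE alone would not prevent.
 */
static int demo_wake_check(void)
{
	int state;

	atomic_store_explicit(&demo_condition, 1, memory_order_relaxed);
	demo_spin_lock(&demo_lock);
	demo_mb_after_spinlock();
	state = atomic_load_explicit(&demo_state, memory_order_relaxed);
	demo_spin_unlock(&demo_lock);
	return state;
}
```

Putting the fence after the acquire rather than before it means it also orders the lock's own store, which is the extra strength (RCpc to RCsc) that the changelog above is after.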