Message-ID: <20151011102520.GB27351@fixme-laptop.cn.ibm.com>
Date: Sun, 11 Oct 2015 18:25:20 +0800
From: Boqun Feng <boqun.feng@...il.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: linux-kernel@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org,
Ingo Molnar <mingo@...nel.org>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Paul Mackerras <paulus@...ba.org>,
Michael Ellerman <mpe@...erman.id.au>,
Thomas Gleixner <tglx@...utronix.de>,
Will Deacon <will.deacon@....com>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Waiman Long <waiman.long@...com>
Subject: Re: [RFC v2 5/7] powerpc: atomic: Implement cmpxchg{,64}_* and
atomic{,64}_cmpxchg_* variants
On Sat, Oct 10, 2015 at 09:58:05AM +0800, Boqun Feng wrote:
> Hi Peter,
>
> Sorry for replying late.
>
> On Thu, Oct 01, 2015 at 02:27:16PM +0200, Peter Zijlstra wrote:
> > On Wed, Sep 16, 2015 at 11:49:33PM +0800, Boqun Feng wrote:
> > > Unlike other atomic operation variants, cmpxchg{,64}_acquire and
> > > atomic{,64}_cmpxchg_acquire don't have acquire semantics if the cmp part
> > > fails, so we need to implement these using assembly.
> >
> > I think that is actually expected and documented. That is, a cmpxchg
> > only implies barriers on success. See:
> >
> > ed2de9f74ecb ("locking/Documentation: Clarify failed cmpxchg() memory ordering semantics")
>
> I probably didn't make myself clear here. My point is that if we use
> __atomic_op_acquire() to build *_cmpxchg_acquire() (for ARM and PowerPC),
> the barrier will be implied _unconditionally_, meaning that whether the
> cmp fails or not, there will be a barrier after the cmpxchg operation.
> Therefore we have to use assembly to implement the operations right now.
>

Let me try explaining this another way. Unlike the xchg family, where
we only need to implement the _relaxed version and can *remove* the
fully ordered version, the cmpxchg family needs to *keep* the fully
ordered version and also implement the _acquire version in assembly.
If we build them with __atomic_op_*(), the barriers in the cmpxchg
family will be implied *unconditionally*. For example, cmpxchg() on PPC
would become (built by my version of __atomic_op_fence()):

	smp_lwsync();
	cmpxchg_relaxed(...);
	smp_mb__after_atomic();	// a full barrier regardless of success
				// or failure.

In order to have a conditional barrier, we need a way to jump out of the
ll/sc loop, which can only(?) be done in assembly code.
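
The difference in semantics can be illustrated in user-space C11. This is
only a sketch under stated assumptions: it is not the powerpc
implementation and does not show the ll/sc loop discussed above. The
names cmpxchg_relaxed_sketch(), cmpxchg_acquire_unconditional(),
cmpxchg_acquire_conditional(), demo() and the acquire_fences counter are
hypothetical and exist only to make the two behaviours observable.

```c
#include <assert.h>
#include <stdatomic.h>

/* Counts acquire fences issued; exists only to make the difference visible. */
static int acquire_fences;

/* Hypothetical stand-in for cmpxchg_relaxed(): returns the value found in *p. */
static int cmpxchg_relaxed_sketch(atomic_int *p, int old, int newval)
{
	int expected = old;

	/* On failure, 'expected' is overwritten with the current value of *p. */
	atomic_compare_exchange_strong_explicit(p, &expected, newval,
						memory_order_relaxed,
						memory_order_relaxed);
	return expected;
}

/* Wrapper style: the fence runs whether or not the compare succeeded. */
static int cmpxchg_acquire_unconditional(atomic_int *p, int old, int newval)
{
	int prev = cmpxchg_relaxed_sketch(p, old, newval);

	atomic_thread_fence(memory_order_acquire);
	acquire_fences++;
	return prev;
}

/* Conditional style: the fence is skipped when the compare fails. */
static int cmpxchg_acquire_conditional(atomic_int *p, int old, int newval)
{
	int prev = cmpxchg_relaxed_sketch(p, old, newval);

	if (prev == old) {
		atomic_thread_fence(memory_order_acquire);
		acquire_fences++;
	}
	return prev;
}

/* Small self-check: one failing call of each style, then one success. */
static void demo(void)
{
	atomic_int v = 1;

	acquire_fences = 0;

	/* Compare fails (v is 1, not 0), yet a fence is still issued. */
	assert(cmpxchg_acquire_unconditional(&v, 0, 2) == 1);
	assert(acquire_fences == 1);

	/* Compare fails again, and this time no fence is issued. */
	assert(cmpxchg_acquire_conditional(&v, 0, 2) == 1);
	assert(acquire_fences == 1);

	/* Compare succeeds, so the fence is issued and v becomes 2. */
	assert(cmpxchg_acquire_conditional(&v, 1, 2) == 1);
	assert(acquire_fences == 2);
	assert(atomic_load(&v) == 2);
}
```

Calling demo() exercises both styles. Whether a C-level check like the
one in cmpxchg_acquire_conditional() is acceptable for a given
architecture, compared with branching out of the ll/sc loop in assembly,
is left open here.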

My commit log clearly failed to explain this; I will fix that in the
next series. In the meantime, I'm looking forward to suggestions on the
implementation of the cmpxchg family ;-)

BTW, Will, could you please check whether the barriers in cmpxchg family
are unconditional or not in the current implementation of ARM? IIUC,
they are currently unconditional, right?

Regards,
Boqun