Message-ID: <20151012064621.GL3604@twins.programming.kicks-ass.net>
Date: Mon, 12 Oct 2015 08:46:21 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Boqun Feng <boqun.feng@...il.com>
Cc: linux-kernel@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org,
Ingo Molnar <mingo@...nel.org>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Paul Mackerras <paulus@...ba.org>,
Michael Ellerman <mpe@...erman.id.au>,
Thomas Gleixner <tglx@...utronix.de>,
Will Deacon <will.deacon@....com>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Waiman Long <waiman.long@...com>
Subject: Re: [RFC v2 5/7] powerpc: atomic: Implement cmpxchg{,64}_* and
atomic{,64}_cmpxchg_* variants

On Sun, Oct 11, 2015 at 06:25:20PM +0800, Boqun Feng wrote:
> On Sat, Oct 10, 2015 at 09:58:05AM +0800, Boqun Feng wrote:
> > Hi Peter,
> >
> > Sorry for replying late.
> >
> > On Thu, Oct 01, 2015 at 02:27:16PM +0200, Peter Zijlstra wrote:
> > > On Wed, Sep 16, 2015 at 11:49:33PM +0800, Boqun Feng wrote:
> > > > Unlike other atomic operation variants, cmpxchg{,64}_acquire and
> > > > atomic{,64}_cmpxchg_acquire don't have acquire semantics if the cmp part
> > > > fails, so we need to implement these using assembly.
> > >
> > > I think that is actually expected and documented. That is, a cmpxchg
> > > only implies barriers on success. See:
> > >
> > > ed2de9f74ecb ("locking/Documentation: Clarify failed cmpxchg() memory ordering semantics")
> >
> > I probably didn't make myself clear here, my point is that if we use
> > __atomic_op_acquire() to build *_cmpxchg_acquire() (for ARM and PowerPC),
> > the barrier will be implied _unconditionally_, meaning no matter cmp
> > fails or not, there will be a barrier after the cmpxchg operation.
> > Therefore we have to use assembly to implement the operations right now.

See later, but no, you don't _have_ to.

> Or let me try another way to explain this. What I wanted to say here is
> that unlike the implementation of xchg family, which needs only to
> implement _relaxed version and *remove* the fully ordered version, the
> implementation of cmpxchg family needs to *remain* the fully ordered
> version and implement the _acquire version in assembly. Because if we
> use __atomic_op_*(), the barriers in the cmpxchg family will be implied
> *unconditionally*, for example:

So the point that confused me, and which is still valid for the above,
is your use of 'need'.

You don't need to omit the barrier at all. It's perfectly valid to issue
too many barriers (pointless and a waste of time, yes; incorrect, no).

So what you want to say is: "Optimize cmpxchg_acquire() to avoid
superfluous barrier".
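
To illustrate the distinction, here is a minimal userspace sketch using
C11 <stdatomic.h>, not the kernel's own primitives and not the powerpc
assembly from the patch. The function names are hypothetical, and unlike
the kernel's cmpxchg() these return a success flag rather than the old
value. Both variants give acquire ordering on success; the first also
pays for a fence when the compare fails, which is wasteful but not
incorrect.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Variant A (hypothetical name): relaxed compare-and-swap followed by an
 * unconditional acquire fence. This mirrors what a generic "relaxed op
 * plus trailing barrier" wrapper would produce: the fence is issued
 * whether or not the compare succeeded.
 */
static bool cas_acquire_always_fence(atomic_int *v, int *expected, int desired)
{
	bool ok = atomic_compare_exchange_strong_explicit(
		v, expected, desired,
		memory_order_relaxed, memory_order_relaxed);

	atomic_thread_fence(memory_order_acquire);
	return ok;
}

/*
 * Variant B (hypothetical name): the same relaxed compare-and-swap, but
 * the acquire fence is only issued on success. A failed compare implies
 * no ordering, so the fence on that path is skipped.
 */
static bool cas_acquire_fence_on_success(atomic_int *v, int *expected, int desired)
{
	bool ok = atomic_compare_exchange_strong_explicit(
		v, expected, desired,
		memory_order_relaxed, memory_order_relaxed);

	if (ok)
		atomic_thread_fence(memory_order_acquire);
	return ok;
}
```

A caller cannot tell the two apart by their results; the only difference
is the cost of the failure path, which is why the second variant is an
optimization and not a correctness requirement.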