Message-ID: <20151019011718.GB924@fixme-laptop.cn.ibm.com>
Date: Mon, 19 Oct 2015 09:17:18 +0800
From: Boqun Feng <boqun.feng@...il.com>
To: Will Deacon <will.deacon@....com>
Cc: Peter Zijlstra <peterz@...radead.org>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Michael Ellerman <mpe@...erman.id.au>,
linux-arch@...r.kernel.org, linux-kernel@...r.kernel.org,
Anton Blanchard <anton@...ba.org>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Paul Mackerras <paulus@...ba.org>,
linuxppc-dev@...ts.ozlabs.org
Subject: Re: [PATCH v2] barriers: introduce smp_mb__release_acquire and
update documentation
On Fri, Oct 09, 2015 at 10:40:39AM +0100, Will Deacon wrote:
> On Fri, Oct 09, 2015 at 10:31:38AM +0200, Peter Zijlstra wrote:
[snip]
> >
> > So lots of little confusions added up to complete fail :-{
> >
> > Mostly I think it was the UNLOCK x + LOCK x are fully ordered (where I
> > forgot: but not against uninvolved CPUs) and RELEASE/ACQUIRE are
> > transitive (where I forgot: RELEASE/ACQUIRE _chains_ are transitive, but
> > again not against uninvolved CPUs).
> >
> > Which leads me to think I would like to suggest alternative rules for
> > RELEASE/ACQUIRE (to replace those Will suggested; as I think those are
> > partly responsible for my confusion).
>
> Yeah, sorry. I originally used the phrase "fully ordered" but changed it
> to "full barrier", which has stronger transitivity (newly understood
> definition) requirements that I didn't intend.
>
> RELEASE -> ACQUIRE should be used for message passing between two CPUs
> and not have ordering effects on other observers unless they're part of
> the RELEASE -> ACQUIRE chain.
>
> > - RELEASE -> ACQUIRE is fully ordered (but not a full barrier) when
> > they operate on the same variable and the ACQUIRE reads from the
> > RELEASE. Notable, RELEASE/ACQUIRE are RCpc and lack transitivity.
>
> Are we explicit about the difference between "fully ordered" and "full
> barrier" somewhere else, because this looks like it will confuse people.
>
This is confusing me right now. ;-)
Let's use a simple example with only one primitive. As I understand it,
if we say a primitive A is "fully ordered", we actually mean:
1. The memory operations preceding (in program order) A can't be
reordered after the memory operations following (in PO) A.
and
2. The memory operation(s) in A can't be reordered before the
memory operations preceding (in PO) A or after the memory
operations following (in PO) A.
If we say A is a "full barrier", we actually mean:
1. The memory operations preceding (in program order) A can't be
reordered after the memory operations following (in PO) A.
and
2. The memory ordering guarantee in #1 is visible globally.
Is that correct? Or is "full barrier" stronger than I understand,
i.e. is there a third property of "full barrier":
3. The memory operation(s) in A can't be reordered before the
memory operations preceding (in PO) A or after the memory
operations following (in PO) A.
IOW, is "full barrier" a stronger version of "fully ordered" or not?
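To make the two-CPU case concrete, here is a minimal userspace sketch in
C11. It is only an illustration under assumptions: memory_order_release
and memory_order_acquire stand in for the kernel's smp_store_release()
and smp_load_acquire(), pthreads stand in for CPUs, and the names
payload, flag and run_message_passing() are hypothetical, not from the
patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical names, for illustration only. */
static int payload;      /* plain data published through the flag */
static atomic_int flag;  /* the one variable RELEASE and ACQUIRE share */

static void *producer(void *arg)
{
	(void)arg;
	payload = 42;  /* precedes the RELEASE in program order */
	atomic_store_explicit(&flag, 1, memory_order_release);
	return NULL;
}

static void *consumer(void *arg)
{
	int *seen = arg;

	/* Spin until the ACQUIRE reads the value stored by the RELEASE. */
	while (atomic_load_explicit(&flag, memory_order_acquire) != 1)
		;
	*seen = payload;  /* follows the ACQUIRE in program order */
	return NULL;
}

/* Run one producer/consumer pair; return what the consumer observed. */
int run_message_passing(void)
{
	pthread_t p, c;
	int seen = -1;
	int rc;

	payload = 0;
	atomic_store(&flag, 0);

	rc = pthread_create(&c, NULL, consumer, &seen);
	assert(rc == 0);
	rc = pthread_create(&p, NULL, producer, NULL);
	assert(rc == 0);

	pthread_join(p, NULL);
	pthread_join(c, NULL);
	return seen;
}
```

Because the ACQUIRE reads from the RELEASE, the consumer must observe
payload == 42. Note that this only exercises the two threads in the
RELEASE -> ACQUIRE chain; it says nothing about what a third, uninvolved
observer may see, which is the part my question is about.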
Regards,
Boqun
> > - RELEASE -> ACQUIRE can be upgraded to a full barrier (including
> > transitivity) using smp_mb__release_acquire(), either before RELEASE
> > or after ACQUIRE (but consistently [*]).
>
> Hmm, but we don't actually need this for RELEASE -> ACQUIRE, afaict. This
> is just needed for UNLOCK -> LOCK, and is exactly what RCU is currently
> using (for PPC only).
>
> Stepping back a second, I believe that there are three cases:
>
>
> RELEASE X -> ACQUIRE Y (same CPU)
> * Needs a barrier on TSO architectures for full ordering
>
> UNLOCK X -> LOCK Y (same CPU)
> * Needs a barrier on PPC for full ordering
>
> RELEASE X -> ACQUIRE X (different CPUs)
> UNLOCK X -> ACQUIRE X (different CPUs)
> * Fully ordered everywhere...
> * ... but needs a barrier on PPC to become a full barrier
>
>