linux-kernel - Re: [PATCH] barriers: introduce smp_mb__release

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20150917025012.GB4000@fixme-laptop.cn.ibm.com>
Date:	Thu, 17 Sep 2015 10:50:12 +0800
From:	Boqun Feng <boqun.feng@...il.com>
To:	Will Deacon <will.deacon@....com>
Cc:	Peter Zijlstra <peterz@...radead.org>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	"linux-arch@...r.kernel.org" <linux-arch@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] barriers: introduce smp_mb__release_acquire and update
 documentation

On Wed, Sep 16, 2015 at 12:07:06PM +0100, Will Deacon wrote:
> On Wed, Sep 16, 2015 at 11:43:14AM +0100, Peter Zijlstra wrote:
> > On Wed, Sep 16, 2015 at 11:29:08AM +0100, Will Deacon wrote:
> > > > Indeed, that is a hole in the definition, that I think we should close.
> > 
> > > I'm struggling to understand the hole, but here's my intuition. If an
> > > ACQUIRE on CPUx reads from a RELEASE by CPUy, then I'd expect CPUx to
> > > observe all memory accessed performed by CPUy prior to the RELEASE
> > > before it observes the RELEASE itself, regardless of this new barrier.
> > > I think this matches what we currently have in memory-barriers.txt (i.e.
> > > acquire/release are neither transitive or multi-copy atomic).
> > 
> > Ah agreed. I seem to have gotten my brain in a tangle.
> > 
> > Basically where a program order release+acquire relies on an address
> > dependency, a cross cpu release+acquire relies on causality. If we
> > observe the release, we must also observe everything prior to it etc.
> 
> Yes, and crucially, the "everything prior to it" only encompasses accesses
> made by the releasing CPU itself (in the absence of other barriers and
> synchronisation).
> 

Just want to make sure I understand you correctly, do you mean that in
the following case:

CPU 1			CPU 2				CPU 3
==============		============================	===============
{ A = 0, B = 0 }
WRITE_ONCE(A,1);	r1 = READ_ONCE(A);		r2 = smp_load_acquire(&B);
			smp_store_release(&B, 1);	r3 = READ_ONCE(A);

r1 == 1 && r2 == 1 && r3 == 0 is not prohibitted?

However, according to the discussion of Paul and Peter:

https://lkml.org/lkml/2015/9/15/707

I think that's prohibitted on architectures except s390 for sure. And
for s390, we are waiting for the maintainers to verify this. If s390
also prohibits this, then a release-acquire pair(on different CPUs) to
the same variable does guarantee transitivity.

Did I misunderstand you or miss something here?

> Given that we managed to get confused, it doesn't hurt to call this out
> explicitly in the doc, so I can add the following extra text.
> 
> Will
> 
> --->8
> 
> diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
> index 46a85abb77c6..794d102d06df 100644
> --- a/Documentation/memory-barriers.txt
> +++ b/Documentation/memory-barriers.txt
> @@ -1902,8 +1902,8 @@ the RELEASE would simply complete, thereby avoiding the deadlock.
>  	a sleep-unlock race, but the locking primitive needs to resolve
>  	such races properly in any case.
>  
> -If necessary, ordering can be enforced by use of an
> -smp_mb__release_acquire() barrier:
> +Where the RELEASE and ACQUIRE operations are performed by the same CPU,
> +ordering can be enforced by use of an smp_mb__release_acquire() barrier:
>  
>  	*A = a;
>  	RELEASE M
> @@ -1916,6 +1916,10 @@ in which case, the only permitted sequences are:
>  	STORE *A, RELEASE M, ACQUIRE N, STORE *B
>  	STORE *A, ACQUIRE N, RELEASE M, STORE *B
>  
> +Note that smp_mb__release_acquire() has no effect on ACQUIRE or RELEASE
> +operations performed by other CPUs, even if they are to the same variable.
> +In cases where transitivity is required, smp_mb() should be used explicitly.
> +

Then, IIRC, the memory order effect of RELEASE+ACQUIRE should be:

If an ACQUIRE loads the value of stored by a RELEASE, then on the CPU
executing the ACQUIRE operation, all the memory operations after the
ACQUIRE operation will perceive all the memory operations before the
RELEASE operation on the CPU executing the RELEASE operation.

This could cover both the "on the same CPU" and "on different CPUs"
cases.

Of course, this may has nothing to do with smp_mb__release_acquire(),
but I think we can take this chance to document the memory order effect
of RELEASE+ACQUIRE well.


Regards,
Boqun

>  Locks and semaphores may not provide any guarantee of ordering on UP compiled
>  systems, and so cannot be counted on in such a situation to actually achieve
>  anything at all - especially with respect to I/O accesses - unless combined
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

Download attachment "signature.asc" of type "application/pgp-signature" (474 bytes)