linux-kernel - Re: [PATCH 1/4] spinlock: Document memory barrier rules

Message-ID: <80de24e3-fa01-a6d6-99e9-afd1e831e07b@colorfullife.com>
Date:   Wed, 31 Aug 2016 20:32:18 +0200
From:   Manfred Spraul <manfred@...orfullife.com>
To:     Will Deacon <will.deacon@....com>,
        Peter Zijlstra <peterz@...radead.org>
Cc:     benh@...nel.crashing.org, paulmck@...ux.vnet.ibm.com,
        Ingo Molnar <mingo@...e.hu>, Boqun Feng <boqun.feng@...il.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        LKML <linux-kernel@...r.kernel.org>, 1vier1@....de,
        Davidlohr Bueso <dave@...olabs.net>
Subject: Re: [PATCH 1/4] spinlock: Document memory barrier rules

On 08/31/2016 06:40 PM, Will Deacon wrote:
>
> I'm struggling with this example. We have these locks:
>
>    &sem->lock
>    &sma->sem_base[0...sma->sem_nsems].lock
>    &sma->sem_perm.lock
>
> a condition variable:
>
>    sma->complex_mode
>
> and a new barrier:
>
>    smp_mb__after_spin_lock()
>
> For simplicity, we can make sma->sem_nsems == 1, and have &sma->sem_base[0]
> be &sem->lock in the example above.
Correct.
>   &sma->sem_perm.lock seems to be
> irrelevant.
Correct.
> The litmus test then looks a bit like:
>
> CPUm:
>
> LOCK(x)
> smp_mb();
> RyAcq=0
>
>
> CPUn:
>
> Wy=1
> smp_mb();
> UNLOCK_WAIT(x)
Correct.
>
> which I think can be simplified to:
>
>
> LOCK(x)
I thought that a barrier is required here, because the Ry=0 read could be
performed before the store that takes the lock.
> Ry=0
RyAcq instead of Ry would be required because of the unlock at the end of
the critical section:
CpuN: <...>
           WyRelease=0
That is irrelevant for the litmus test, though.
> Wy=1
> smp_mb(); // Note that this is implied by spin_unlock_wait on PPC and arm64
> LOCK(x)   // spin_unlock_wait behaves like lock; unlock
> UNLOCK(x)

> [I've removed a bunch of barriers here, that I don't think are necessary
>   for the guarantees you're after]
>
> and the question is "Can both CPUs proceed?".
>
> Looking at the above, then I don't think that they can. Whilst CPUm can
> indeed speculate the Ry=0 before successfully taking the lock, if CPUn
> observes CPUm's read, then it must also observe the lock being held wrt
> the spin_lock API. That is because a successful LOCK operation by CPUn
> would force CPUm to replay its LL/SC loop and therefore discard its
> speculation of y.
>
> What am I missing? The code snippet seems to have too many barriers to me!
spin_unlock_wait() is not necessarily lock()+unlock().
It can be a simple Rx, or now RxAcq.

So I had assumed:

CPUm:

LOCK(x)
smp_mb(); /* at least for PPC, therefore with arch override */
RyAcq=0


CPUn:

Wy=1
smp_mb(); /* at least for archs where UNLOCK_WAIT is RxAcq */
UNLOCK_WAIT(x)
smp_rmb(); /* not required anymore, was required when UNLOCK_WAIT was Rx */
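
To make the assumption above concrete, here is a toy enumeration of the
litmus test. It is only a sketch and not kernel code. It assumes that LOCK(x)
can be modeled as a plain store to x, that UNLOCK_WAIT(x) is a plain read of
x (the "simple Rx" case), and that without smp_mb() a CPU may perform its
load before its own earlier store. The names both_proceed() and
count_both_proceed() are made up for this illustration.

```c
#include <stdbool.h>

/* Toy model (assumption): CPUm = 0 does "x = 1; r = y",
 * CPUn = 1 does "y = 1; r = x". x is the lock word, y is complex_mode. */
enum { ST = 0, LD = 1 };

/* Run one interleaving. order[i] is the CPU executing step i;
 * swap[c] means CPU c performs its load before its store. */
static bool both_proceed(const int order[4], const bool swap[2])
{
    int x = 0, y = 0;
    int seen[2] = {1, 1};   /* value observed by each CPU's load */
    int done[2] = {0, 0};   /* number of ops already executed per CPU */

    for (int i = 0; i < 4; i++) {
        int c = order[i];
        int op = done[c]++;         /* position in program order */
        if (swap[c])
            op = 1 - op;            /* reordered: load first, store second */
        if (op == ST) {
            if (c == 0) x = 1; else y = 1;
        } else {
            seen[c] = (c == 0) ? y : x;
        }
    }
    /* "Both proceed": CPUm saw y == 0 and CPUn saw x == 0 (lock free). */
    return seen[0] == 0 && seen[1] == 0;
}

/* Count the executions in which both CPUs proceed. mb_m / mb_n say whether
 * CPUm / CPUn has a full barrier between its store and its load, which in
 * this model simply forbids the reordering on that CPU. */
int count_both_proceed(bool mb_m, bool mb_n)
{
    /* All 6 ways to interleave two ops from each CPU. */
    static const int orders[6][4] = {
        {0, 0, 1, 1}, {0, 1, 0, 1}, {0, 1, 1, 0},
        {1, 0, 0, 1}, {1, 0, 1, 0}, {1, 1, 0, 0},
    };
    int hits = 0;

    for (int s = 0; s < 4; s++) {
        bool swap[2] = { (s & 1) != 0, (s & 2) != 0 };
        if ((mb_m && swap[0]) || (mb_n && swap[1]))
            continue;               /* barrier forbids this reordering */
        for (int o = 0; o < 6; o++)
            hits += both_proceed(orders[o], swap);
    }
    return hits;
}
```

In this model count_both_proceed(true, true) finds no execution in which both
CPUs proceed, while dropping the barrier on either side leaves at least one.
Real hardware and the LL/SC behaviour Will describes are of course not
captured by such a simple model.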


--
     Manfred