linux-kernel - Re: [PATCH 1/4] spinlock: Document memory barrier rules


Message-ID: <20160831164020.GG29505@arm.com>
Date:   Wed, 31 Aug 2016 17:40:21 +0100
From:   Will Deacon <will.deacon@....com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Manfred Spraul <manfred@...orfullife.com>,
        benh@...nel.crashing.org, paulmck@...ux.vnet.ibm.com,
        Ingo Molnar <mingo@...e.hu>, Boqun Feng <boqun.feng@...il.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        LKML <linux-kernel@...r.kernel.org>, 1vier1@....de,
        Davidlohr Bueso <dave@...olabs.net>
Subject: Re: [PATCH 1/4] spinlock: Document memory barrier rules

On Wed, Aug 31, 2016 at 05:40:49PM +0200, Peter Zijlstra wrote:
> On Wed, Aug 31, 2016 at 06:59:07AM +0200, Manfred Spraul wrote:
> 
> > The barrier must ensure that taking the spinlock (as observed by another cpu
> > with spin_unlock_wait()) and a following read are ordered.
> > 
> > start condition: sma->complex_mode = false;
> > 
> > CPU 1:
> >     spin_lock(&sem->lock); /* sem_nsems instances */
> >     smp_mb__after_spin_lock();
> >     if (!smp_load_acquire(&sma->complex_mode)) {
> >         /* fast path successful! */
> >         return sops->sem_num;
> >     }
> >      /* slow path, not relevant */
> > 
> > CPU 2: (holding sma->sem_perm.lock)
> > 
> >         smp_store_mb(sma->complex_mode, true);
> > 
> >         for (i = 0; i < sma->sem_nsems; i++) {
> >                 spin_unlock_wait(&sma->sem_base[i].lock);
> >         }

I'm struggling with this example. We have these locks:

  &sem->lock
  &sma->sem_base[0...sma->sem_nsems-1].lock
  &sma->sem_perm.lock

a condition variable:

  sma->complex_mode

and a new barrier:

  smp_mb__after_spin_lock()

For simplicity, we can make sma->sem_nsems == 1, and have &sma->sem_base[0]
be &sem->lock in the example above. &sma->sem_perm.lock seems to be
irrelevant.

The litmus test then looks a bit like:

CPUm:

LOCK(x)
smp_mb();
RyAcq=0


CPUn:

Wy=1
smp_mb();
UNLOCK_WAIT(x)


which I think can be simplified to:


CPUm:

LOCK(x)
Ry=0


CPUn:

Wy=1
smp_mb(); // Note that this is implied by spin_unlock_wait on PPC and arm64
LOCK(x)   // spin_unlock_wait behaves like lock; unlock
UNLOCK(x)


[I've removed a bunch of barriers here that I don't think are necessary
 for the guarantees you're after]

and the question is "Can both CPUs proceed?".

Looking at the above, I don't think that they can. Whilst CPUm can
indeed speculate the Ry=0 before successfully taking the lock, if CPUn
observes CPUm's read, then it must also observe the lock being held wrt
the spin_lock API. That is because a successful LOCK operation by CPUn
would force CPUm to replay its LL/SC loop and therefore discard its
speculation of y.
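
For anyone who wants to poke at the shape of the test outside the kernel,
here is a rough userspace sketch of the fully-fenced litmus test (the first
one, with smp_mb() on both sides) using C11 atomics and pthreads. It is only
a model under stated assumptions: the names cpu_m, cpu_n and run_litmus are
made up for illustration, seq_cst operations stand in for LOCK + smp_mb(),
and a single load of the lock word stands in for spin_unlock_wait(). It says
nothing about LL/SC behaviour on real arm64 or PPC hardware, nor about the
simplified variant with the barriers removed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static atomic_int x;      /* lock word: 0 = free, 1 = held */
static atomic_int y;      /* stands in for sma->complex_mode */
static int m_read_zero;   /* CPUm: did RyAcq return 0? */
static int n_saw_free;    /* CPUn: did it observe x unlocked? */

static void *cpu_m(void *unused)
{
    (void)unused;
    /* LOCK(x); smp_mb(): a seq_cst exchange covers both in this model. */
    while (atomic_exchange(&x, 1))
        ;
    m_read_zero = (atomic_load(&y) == 0);       /* RyAcq=0 ? */
    return NULL;
}

static void *cpu_n(void *unused)
{
    (void)unused;
    atomic_store(&y, 1);                        /* Wy=1 */
    atomic_thread_fence(memory_order_seq_cst);  /* smp_mb() */
    /* One look at the lock word instead of spinning in UNLOCK_WAIT(x). */
    n_saw_free = (atomic_load(&x) == 0);
    return NULL;
}

/* Runs the test repeatedly; returns how often both CPUs "proceeded". */
int run_litmus(int iterations)
{
    int both = 0;

    for (int i = 0; i < iterations; i++) {
        pthread_t m, n;
        int rc;

        atomic_store(&x, 0);
        atomic_store(&y, 0);
        rc = pthread_create(&m, NULL, cpu_m, NULL);
        assert(rc == 0);
        rc = pthread_create(&n, NULL, cpu_n, NULL);
        assert(rc == 0);
        pthread_join(m, NULL);
        pthread_join(n, NULL);
        both += (m_read_zero && n_saw_free);
    }
    return both;
}
```

Calling run_litmus() from a small main() and building with -pthread should
report zero hits: with every access seq_cst, CPUm reading y == 0 puts its
lock acquisition before CPUn's store in the single total order, so CPUn's
later load of x must see the lock held. Weakening the orderings is where it
gets interesting, and where the C11 model and the kernel's guarantees may
well diverge.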

What am I missing? The code snippet seems to have too many barriers to me!

Will