linux-kernel - Re: [PATCH v2] tools/memory-model: Add extra ordering for locks and remove it for ordinary release/acquire

Message-ID: <20180712074040.GA4920@worktop.programming.kicks-ass.net>
Date:   Thu, 12 Jul 2018 09:40:40 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Andrea Parri <andrea.parri@...rulasolutions.com>
Cc:     Will Deacon <will.deacon@....com>,
        Alan Stern <stern@...land.harvard.edu>,
        "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
        LKMM Maintainers -- Akira Yokosawa <akiyks@...il.com>,
        Boqun Feng <boqun.feng@...il.com>,
        Daniel Lustig <dlustig@...dia.com>,
        David Howells <dhowells@...hat.com>,
        Jade Alglave <j.alglave@....ac.uk>,
        Luc Maranget <luc.maranget@...ia.fr>,
        Nicholas Piggin <npiggin@...il.com>,
        Kernel development list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2] tools/memory-model: Add extra ordering for locks and
 remove it for ordinary release/acquire

On Wed, Jul 11, 2018 at 02:34:21PM +0200, Andrea Parri wrote:
> Simplicity is the eye of the beholder.  From my POV (LKMM maintainer), the
> simplest solution would be to get rid of rfi-rel-acq and unlock-rf-lock-po
> (or its analogous in v3) all together:

<snip>

> Among other things, this would immediately:
> 
>   1) Enable RISC-V to use their .aq/.rl annotations _without_ having to
>      "worry" about tso or release/acquire fences; IOW, this will permit
>      a partial revert of:
> 
>        0123f4d76ca6 ("riscv/spinlock: Strengthen implementations with fences")
>        5ce6c1f3535f ("riscv/atomic: Strengthen implementations with fences")

But I feel this goes in the wrong direction. This weakens the effective
memory model, where I feel we should strengthen it.

Currently PowerPC is the weakest here, and the above RISC-V changes
(reverts) would make RISC-V weaker still.

And any effective weakening makes me very uncomfortable -- who knows
what will come apart this time. This memory ordering stuff causes
horrible subtle bugs at best.

>   2) Resolve the above mentioned controversy (the inconsistency between
>      - locking operations and atomic RMWs on one side, and their actual
>      implementation in generic code on the other), thus enabling the use
>      of LKMM _and_ its tools for the analysis/reviewing of the latter.

This is a good point; so let's see if there is something we can do to
strengthen the model so it all works again.

And I think if we raise atomic*_acquire() to require TSO (but ideally
raise it to RCsc) we're there.

The TSO archs have RCpc load-acquire and store-release, but fully
ordered atomics. Most of the other archs have smp_mb() everything, with
the exception of PPC, ARM64 and now RISC-V.

PPC has the RCpc TSO fence LWSYNC, ARM64 has the RCsc
load-acquire/store-release. And RISC-V has a gazillion of options IIRC.

So ideally atomic*_acquire() + smp_store_release() will be RCsc, and it
already is, with the notable exception of PPC. Ideally RISC-V would be
RCsc here too, but at the very least it should not be weaker than PPC.

By increasing atomic*_acquire() to TSO we also immediately get the
proposed:

  P0()
  {
	  WRITE_ONCE(X, 1);
	  spin_unlock(&s);
	  spin_lock(&s);
	  WRITE_ONCE(Y, 1);
  }

  P1()
  {
	  r1 = READ_ONCE(Y);
	  smp_rmb();
	  r2 = READ_ONCE(X);
  }

behaviour under discussion; because the spin_lock() will imply the TSO
ordering.
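
To make the outcome in question concrete, here is a toy sketch in plain C.
It is not LKMM and not herd7; it only enumerates interleavings of the two
writes in P0 and the two reads in P1. The assumptions are mine: the
unlock+lock pair is modelled as either ordering the two writes or not, the
smp_rmb() in P1 is modelled as keeping the reads in program order, and the
names outcome_reachable() and run_hits_outcome() are made up for this
illustration. The outcome of interest is r1 == 1 && r2 == 0.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: P0's writes, in the order they become visible. */
enum { W_X, W_Y };

/* Count the set bits in m. */
static int bits_set(unsigned m)
{
	int n = 0;

	while (m) {
		n += m & 1u;
		m >>= 1;
	}
	return n;
}

/*
 * Run one interleaving of four steps. Bit i of mask set means step i
 * belongs to P0, clear means it belongs to P1. P1 always reads Y first
 * and X second (the smp_rmb() assumption).
 */
static bool run_hits_outcome(const int p0[2], unsigned mask)
{
	int x = 0, y = 0, r1 = -1, r2 = -1;
	int i0 = 0, i1 = 0;

	assert(bits_set(mask) == 2);	/* two steps per CPU */

	for (int step = 0; step < 4; step++) {
		if (mask & (1u << step)) {
			if (p0[i0++] == W_X)
				x = 1;
			else
				y = 1;
		} else {
			if (i1++ == 0)
				r1 = y;
			else
				r2 = x;
		}
	}
	return r1 == 1 && r2 == 0;
}

/*
 * Is r1 == 1 && r2 == 0 reachable? With writes_ordered the write to X
 * always becomes visible before the write to Y; without it, either
 * order is allowed.
 */
bool outcome_reachable(bool writes_ordered)
{
	static const int in_order[2] = { W_X, W_Y };
	static const int swapped[2] = { W_Y, W_X };

	for (unsigned mask = 0; mask < 16; mask++) {
		if (bits_set(mask) != 2)
			continue;
		if (run_hits_outcome(in_order, mask))
			return true;
		if (!writes_ordered && run_hits_outcome(swapped, mask))
			return true;
	}
	return false;
}
```

In this toy model outcome_reachable(true) is false and
outcome_reachable(false) is true, which is the difference the extra
ordering on spin_lock() is meant to buy. A real answer for any given
architecture has to come from the memory model tools, not from this
enumeration.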

And note that this retains regular RCpc ACQUIRE for smp_load_acquire()
and associated primitives -- as they have had since their introduction
not too long ago.