Message-ID: <20180712115249.GA6201@andrea>
Date:   Thu, 12 Jul 2018 13:52:49 +0200
From:   Andrea Parri <andrea.parri@...rulasolutions.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Will Deacon <will.deacon@....com>,
        Alan Stern <stern@...land.harvard.edu>,
        "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
        LKMM Maintainers -- Akira Yokosawa <akiyks@...il.com>,
        Boqun Feng <boqun.feng@...il.com>,
        Daniel Lustig <dlustig@...dia.com>,
        David Howells <dhowells@...hat.com>,
        Jade Alglave <j.alglave@....ac.uk>,
        Luc Maranget <luc.maranget@...ia.fr>,
        Nicholas Piggin <npiggin@...il.com>,
        Kernel development list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2] tools/memory-model: Add extra ordering for locks and
 remove it for ordinary release/acquire

On Thu, Jul 12, 2018 at 09:40:40AM +0200, Peter Zijlstra wrote:
> On Wed, Jul 11, 2018 at 02:34:21PM +0200, Andrea Parri wrote:
> > Simplicity is in the eye of the beholder.  From my POV (LKMM maintainer),
> > the simplest solution would be to get rid of rfi-rel-acq and
> > unlock-rf-lock-po (or its analogue in v3) altogether:
> 
> <snip>
> 
> > Among other things, this would immediately:
> > 
> >   1) Enable RISC-V to use their .aq/.rl annotations _without_ having to
> >      "worry" about tso or release/acquire fences; IOW, this will permit
> >      a partial revert of:
> > 
> >        0123f4d76ca6 ("riscv/spinlock: Strengthen implementations with fences")
> >        5ce6c1f3535f ("riscv/atomic: Strengthen implementations with fences")
> 
> But I feel this goes in the wrong direction. This weakens the effective
> memory model, whereas I feel we should strengthen it.
> 
> Currently PowerPC is the weakest here, and the above RISC-V changes
> (reverts) would make RISC-V weaker still.
> 
> And any effective weakening makes me very uncomfortable -- who knows
> what will come apart this time. This memory ordering stuff causes
> horrible subtle bugs at best.

Indeed, what I was suggesting above is a weakening of the current model
(and I agree: I wouldn't say that bugs in this context are nice  ;-).

These changes would affect a specific area: (IMO) the examples we've
been considering here aren't for the faint-hearted  ;-) and, as Daniel
already suggested, everything would again be "nice and neat" if this
were all about locking, i.e., if every thread used lock synchronization
(see the litmus sketch below).
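
In litmus form (LKMM syntax; the test name and the exact shape of the
test are mine, just to illustrate the point), the "every thread uses
the lock" version would be something like:

  C MP+locks

  {}

  P0(int *x, int *y, spinlock_t *s)
  {
	spin_lock(s);
	WRITE_ONCE(*x, 1);
	WRITE_ONCE(*y, 1);
	spin_unlock(s);
  }

  P1(int *x, int *y, spinlock_t *s)
  {
	int r1;
	int r2;

	spin_lock(s);
	r1 = READ_ONCE(*y);
	r2 = READ_ONCE(*x);
	spin_unlock(s);
  }

  exists (1:r1=1 /\ 1:r2=0)

where the "exists" clause is forbidden on every architecture (the two
critical sections cannot overlap), no matter how weak the underlying
release/acquire happens to be.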


> 
> >   2) Resolve the above-mentioned controversy (the inconsistency between
> >      locking operations and atomic RMWs on one side, and their actual
> >      implementations in generic code on the other), thus enabling the use
> >      of LKMM _and_ its tools for the analysis/review of the latter.
> 
> This is a good point; so let's see if there is something we can do to
> strengthen the model so it all works again.
> 
> And I think if we raise atomic*_acquire() to require TSO (but ideally
> raise it to RCsc) we're there.
> 
> The TSO archs have RCpc load-acquire and store-release, but fully
> ordered atomics. Most of the other archs have smp_mb() everything, with
> the exception of PPC, ARM64 and now RISC-V.
> 
> PPC has the RCpc TSO fence LWSYNC, ARM64 has RCsc
> load-acquire/store-release. And RISC-V has a gazillion options IIRC.
> 
> 
> So ideally atomic*_acquire() + smp_store_release() would be RCsc, and
> it is, with the notable exception of PPC; ideally RISC-V would be RCsc
> here too. But at the very least it should not be weaker than PPC.
> 
> By increasing atomic*_acquire() to TSO we also immediately get the
> proposed:
> 
>   P0()
>   {
> 	  WRITE_ONCE(X, 1);
> 	  spin_unlock(&s);
> 	  spin_lock(&s);
> 	  WRITE_ONCE(Y, 1);
>   }
> 
>   P1()
>   {
> 	  r1 = READ_ONCE(Y);
> 	  smp_rmb();
> 	  r2 = READ_ONCE(X);
>   }
> 
> behaviour under discussion, because the spin_lock() will imply the TSO
> ordering.

You mean: "when paired with a po-earlier release to the same memory
location", right?  I am afraid that neither arm64 nor riscv current
implementations would ensure "(r1 == 1 && r2 == 0) forbidden" if we
removed the po-earlier spin_unlock()...
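
To be clear, the weakened variant I have in mind is (test name mine):

  C MP+lock-norelease+rmb

  {}

  P0(int *x, int *y, spinlock_t *s)
  {
	WRITE_ONCE(*x, 1);
	spin_lock(s);
	WRITE_ONCE(*y, 1);
	spin_unlock(s);
  }

  P1(int *x, int *y)
  {
	int r1;
	int r2;

	r1 = READ_ONCE(*y);
	smp_rmb();
	r2 = READ_ONCE(*x);
  }

  exists (1:r1=1 /\ 1:r2=0)

here the store to x precedes the acquire with no release in between,
so nothing in those implementations need order it against the later
store to y.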

AFAICT, the current implementations do work with that release in
place: as you remarked above, arm64 release->acquire is RCsc; riscv
has an rw,w fence in its spin_unlock() (hence a w,w fence between the
stores), or it could have a .tso fence ...  (sketched below)
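
For reference, my understanding of the riscv mapping for the
unlock+lock sequence in P0 (from memory, so modulo details) is:

	WRITE_ONCE(X, 1);	/* plain store                        */
	spin_unlock(&s);	/* fence rw,w ; store  (the release)  */
	spin_lock(&s);		/* AMO ; fence r,rw  (the acquire)    */
	WRITE_ONCE(Y, 1);	/* plain store                        */

the "fence rw,w" in the unlock is what gives the w,w ordering between
the two stores; a "fence.tso" there would additionally order earlier
loads before later loads (i.e., everything except w->r: TSO).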

But again, these are subtle patterns, and my guess is that most kernel
developers really won't care about such guarantees (and those who do
will have the tools to figure out what they can actually rely on ...)

OTOH (as I pointed out earlier) the strengthening we're discussing
will prevent some architectures (riscv being just today's example!)
from going "full RCsc", and it will inevitably "complicate" both the
LKMM and the reviewing process for related changes (atomics, locking,
...; cf. this debate), apparently just because you  ;-) want to "care"
about these guarantees.

Not yet convinced ...  :/

  Andrea


> 
> And note that this retains regular RCpc ACQUIRE for smp_load_acquire()
> and associated primitives -- as they have had since their introduction
> not too long ago.
