Date:   Thu, 30 Nov 2017 10:46:22 -0500 (EST)
From:   Alan Stern <stern@...land.harvard.edu>
To:     Boqun Feng <boqun.feng@...il.com>
cc:     Daniel Lustig <dlustig@...dia.com>,
        "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
        Andrea Parri <parri.andrea@...il.com>,
        Luc Maranget <luc.maranget@...ia.fr>,
        Jade Alglave <j.alglave@....ac.uk>,
        Nicholas Piggin <npiggin@...il.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Will Deacon <will.deacon@....com>,
        David Howells <dhowells@...hat.com>,
        Palmer Dabbelt <palmer@...belt.com>,
        Kernel development list <linux-kernel@...r.kernel.org>
Subject: Re: Unlock-lock questions and the Linux Kernel Memory Model

On Thu, 30 Nov 2017, Boqun Feng wrote:

> On Wed, Nov 29, 2017 at 02:44:37PM -0500, Alan Stern wrote:
> > On Wed, 29 Nov 2017, Daniel Lustig wrote:
> > 
> > > While we're here, let me ask about another test which isn't directly
> > > about unlock/lock but which is still somewhat related to this
> > > discussion:
> > > 
> > > "MP+wmb+xchg-acq" (or some such)
> > > 
> > > {}
> > > 
> > > P0(int *x, int *y)
> > > {
> > >         WRITE_ONCE(*x, 1);
> > >         smp_wmb();
> > >         WRITE_ONCE(*y, 1);
> > > }
> > > 
> > > P1(int *x, int *y)
> > > {
> > >         r1 = atomic_xchg_relaxed(y, 2);
> > >         r2 = smp_load_acquire(y);
> > >         r3 = READ_ONCE(*x);
> > > }
> > > 
> > > exists (1:r1=1 /\ 1:r2=2 /\ 1:r3=0)
> > > 
> > > C/C++ would call the atomic_xchg_relaxed part of a release sequence
> > > and hence would forbid this outcome.
> > > 
> > > x86 and Power would forbid this.  ARM forbids this via a special-case
> > > rule in the memory model, ordering atomics with later load-acquires.
> > > 
> > > RISC-V, however, wouldn't forbid this by default, whether using
> > > RCpc or RCsc atomics for smp_load_acquire().  It's an "fri; rfi"
> > > type of pattern, because xchg doesn't have an inherent internal
> > > data dependency.
> > > 
> > > If the Linux memory model is going to forbid this outcome, then
> > > RISC-V would either need to use fences instead, or maybe we'd need
> > > to add a similar special rule to our memory model.  This is one
> > > detail where RISC-V is still actively deciding what to do.
> > > 
> > > Have you all thought about this test before?  Any idea which way you
> > > are leaning regarding the outcome above?
> > 
> > Good questions.  Currently the LKMM allows this, and I think it should
> > because xchg doesn't have a dependency from its read to its write.
> > 
> > On the other hand, herd isn't careful enough in the way it implements 
> > internal dependencies for RMW operations.  If we change 
> > atomic_xchg_relaxed(y, 2) to atomic_inc(y) and remove r1 from the test:
> > 
> > C MP+wmb+inc-acq
> > 
> > {}
> > 
> > P0(int *x, int *y)
> > {
> >         WRITE_ONCE(*x, 1);
> >         smp_wmb();
> >         WRITE_ONCE(*y, 1);
> > }
> > 
> > P1(int *x, int *y)
> > {
> >         atomic_inc(y);
> >         r2 = smp_load_acquire(y);
> >         r3 = READ_ONCE(*x);
> > }
> > 
> > exists (1:r2=2 /\ 1:r3=0)
> > 
> > then the test _should_ be forbidden, but it isn't -- herd doesn't
> > realize that all atomic RMW operations other than xchg must have a
> > dependency (either data or control) between their internal read and
> > write.
> > 
> > (Although the smp_load_acquire is allowed to execute before the write 
> > part of the atomic_inc, it cannot execute before the read part.  I 
> > think a similar argument applies even on ARM.)
> > 
> 
> But in the case of AMOs, which send the addition request directly to
> the memory controller, there wouldn't be any read part, or even a
> write part, of the atomic_inc() executed by the CPU.  Would this be
> allowed then?

Firstly, sending the addition request to the memory controller _is_ a
write operation.

Secondly, even though the CPU hardware might not execute a read 
operation during an AMO, the LKMM and herd nevertheless represent the 
atomic update as a specially-annotated read event followed by a write 
event.

In an other-multicopy-atomic system, P0's write to y must become
visible to P1 before P1 executes the smp_load_acquire, because the
write was visible to the memory controller when the controller carried
out the AMO, and the write becomes visible to the memory controller and
to P1 at the same time (by other-multicopy-atomicity).  That's why I
said the test would be forbidden on ARM.

But even on a non-other-multicopy-atomic system, there has to be some 
synchronization between the memory controller and P1's CPU.  Otherwise, 
how could the system guarantee that P1's smp_load_acquire would see the 
post-increment value of y?  It seems reasonable to assume that this 
synchronization would also cause P1 to see x=1.

Alan Stern
