Date:   Wed, 15 Nov 2017 18:08:05 -0800 (PST)
From:   Palmer Dabbelt <palmer@...belt.com>
To:     Daniel Lustig <dlustig@...dia.com>
CC:     Will Deacon <will.deacon@....com>, Arnd Bergmann <arnd@...db.de>,
        Olof Johansson <olof@...om.net>, linux-kernel@...r.kernel.org,
        patches@...ups.riscv.org, peterz@...radead.org,
        boqun.feng@...il.com
Subject:     RE: [patches] Re: [PATCH v9 05/12] RISC-V: Atomic and Locking Code

On Wed, 15 Nov 2017 15:59:44 PST (-0800), Daniel Lustig wrote:
>> On Wed, 15 Nov 2017 10:06:01 PST (-0800), will.deacon@....com wrote:
>>> On Tue, Nov 14, 2017 at 12:30:59PM -0800, Palmer Dabbelt wrote:
>>> > On Tue, 24 Oct 2017 07:10:33 PDT (-0700), will.deacon@....com wrote:
>>> >>On Tue, Sep 26, 2017 at 06:56:31PM -0700, Palmer Dabbelt wrote:
>> >
>> > Hi Palmer,
>> >
>> >> >>+ATOMIC_OPS(add, add, +,  i,      , _relaxed)
>> >> >>+ATOMIC_OPS(add, add, +,  i, .aq  , _acquire)
>> >> >>+ATOMIC_OPS(add, add, +,  i, .rl  , _release)
>> >> >>+ATOMIC_OPS(add, add, +,  i, .aqrl,         )
>> >> >
>> >> >Have you checked that .aqrl is equivalent to "ordered", since there
>> >> >are interpretations where that isn't the case. Specifically:
>> >> >
>> >> >// all variables zero at start of time
>> >> >P0:
>> >> >WRITE_ONCE(x, 1);
>> >> >atomic_add_return(1, &y);
>> >> >WRITE_ONCE(z, 1);
>> >> >
>> >> >P1:
>> >> >READ_ONCE(z); // reads 1
>> >> >smp_rmb();
>> >> >READ_ONCE(x); // must not read 0
>> >>
>> >> I haven't.  We don't quite have a formal memory model specification yet.
>> >> I've added Daniel Lustig, who is creating that model.  He should have
>> >> a better idea.
>> >
>> > Thanks. You really do need to ensure that, as it's heavily relied upon.
>> 
>> I know it's the case for our current processors, and I'm pretty sure it's the
>> case for what's formally specified, but we'll have to wait for the spec in order
>> to prove it.
>
> I think Will is right.  In the current spec, using .aqrl converts an RCpc load
> or store into an RCsc load or store, but the acquire(-RCsc) annotation still
> only applies to the load part of the atomic, and the release(-RCsc) annotation
> applies only to the store part of the atomic.
>
> Why is that?  Picture a machine that implements AMOs using something that
> looks more like an LR/SC under the covers, or one that uses cache line
> locking, or anything else along those lines.  On some such machines, there
> could be a window between lock/reserve and unlock/store-conditional into
> which other, later stores could squeeze, and that would break Will's
> example, among others.

I'm not sure I understand this.  My brand new mental model here is that

  amoadd.w.aqrl Rout, Rinc, (Raddr)

is exactly the same as

  0:
    lr.w.aq Rout, (Raddr)
    add Rout, Rout, Rinc
    sc.w.rl Rtmp, Rout, (Raddr)
    bnez Rtmp, 0b

but I don't see how that allows the WRITE_ONCE(z) to appear before the
WRITE_ONCE(x).  I can see it appearing before the SC, but not before the LR.
Am I misunderstanding acquire/release here?  My model is that .aq doesn't
let accesses after it (in program order) be observed before it, while .rl
doesn't let accesses before it move after it.

Either way, I think that's still broken: given the above sequence, we could
observe

  READ_ONCE(y); // -> 0, the add hasn't happened
  smp_rmb();
  READ_ONCE(z); // -> 1, the write has happened

which is bad.
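
Concretely, I could picture the interleaving going something like this on an
LR/SC-based machine (a hypothetical expansion; Rz and Rone are placeholder
registers holding the address of z and the value 1):

  0:
    lr.w.aq Rout, (Raddr)   # load half of the AMO (reads y)
    sw      Rone, 0(Rz)     # WRITE_ONCE(z, 1): .aq only pins this after the LR
                            # <-- P1 runs here: it sees z == 1, but the AMO's
                            #     store to y isn't visible yet
    add     Rout, Rout, Rinc
    sc.w.rl Rtmp, Rout, (Raddr)
    bnez    Rtmp, 0b

Nothing has moved past the LR or the SC, and we still get the bad outcome.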

> It's likely the same reasoning that causes ARM to use a trailing dmb here,
> rather than just using ldaxr/stlxr.  Is that right, Will?  I know that's
> LL/SC and this particular case uses AMOADD, but it's the same principle.
> Well, at least according to how we have it in the current memory model
> draft.
>
> Also, RISC-V currently prefers leading fence mappings, so I think the result
> here, for atomic_add_return() for example, should be this:
>
> fence rw,rw
> amoadd.aq ...
>
> Note that at this point, I think you could even elide the .rl.  If I'm
> reading it right, it looks like the ARM mapping does this too (well, the
> reverse: ARM elides the "a" in ldaxr, since the trailing dmb makes it
> redundant).
>
> Does that seem reasonable to you all?

Well, it's kind of a bummer on my end -- we're going to have to change a lot of 
stuff to add an extra fence, and it appears that .aqrl is pretty useless.  I'm 
also not convinced this is even valid; it'd map to

    fence rw,rw
  0:
    lr.w.aq Rout, (Raddr)
    add Rout, Rout, Rinc
    sc.w Rtmp, Rout, (Raddr)
    bnez Rtmp, 0b

which would still allow a subsequent store to be made visible before the 
'sc.w', with or without a .rl.  Doesn't this mean we need

  fence rw,rw
  amo.w ...
  fence rw,rw

which seems pretty heavy-handed?  FWIW, this is exactly how our current 
hardware interprets amo.w.aqrl -- not really an argument for anything, we're 
just super conservative because there's no formal spec.
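
For completeness, I think the LR/SC expansion of the double-fenced version
would be (my sketch again, with placeholder registers):

    fence rw,rw              # earlier accesses are visible before the LR
  0:
    lr.w  Rout, (Raddr)
    add   Rout, Rout, Rinc
    sc.w  Rtmp, Rout, (Raddr)
    bnez  Rtmp, 0b
    fence rw,rw              # later accesses can't be visible until after the SC

where the trailing fence is what keeps later stores out of the LR/SC window.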

It looks like we're still safe for the _acquire and _release versions.  From 
Documentation/atomic_t.txt:

  {}_relaxed: unordered
  {}_acquire: the R of the RMW (or atomic_read) is an ACQUIRE
  {}_release: the W of the RMW (or atomic_set)  is a  RELEASE
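
which I read as saying the existing mappings for those can stay as they are
(my reading, at least; register names are placeholders):

  amoadd.w.aq Rout, Rinc, (Raddr)   # _acquire: the R of the RMW is the ACQUIRE
  amoadd.w.rl Rout, Rinc, (Raddr)   # _release: the W of the RMW is the RELEASE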

So it's just the fully ordered ones that are broken.  I'm assuming this means
our cmpxchg is broken as well:

  0:
    lr.w.aqrl Rold, (Raddr)
    bne Rcomp, Rold, 1f
    sc.w.aqrl Rtmp, Rnew, (Raddr)
    bnez Rtmp, 0b
  1:

because sc.w.aqrl would be the same as sc.w.rl, so it's just the same as the 
amoadd case above.
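
If we go the double-fence route, I assume cmpxchg gets the same treatment
(another sketch; I'm guessing the trailing fence only needs to sit on the
success path, but that's an assumption on my part):

    fence rw,rw
  0:
    lr.w  Rold, (Raddr)
    bne   Rcomp, Rold, 1f
    sc.w  Rtmp, Rnew, (Raddr)
    bnez  Rtmp, 0b
    fence rw,rw
  1: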

Hopefully I'm just crazy here; I should probably get some sleep at some point
:)...  I just scanned over some more mail that went by; let's just
double-fence for now.
