Message-ID: <mhng-82f403f2-fa94-4fdd-8770-3ee6f79a6752@palmer-si-x1c4>
Date:   Thu, 01 Mar 2018 13:54:29 -0800 (PST)
From:   Palmer Dabbelt <palmer@...ive.com>
To:     parri.andrea@...il.com
CC:     Daniel Lustig <dlustig@...dia.com>, peterz@...radead.org,
        paulmck@...ux.vnet.ibm.com, linux-kernel@...r.kernel.org,
        albert@...ive.com, stern@...land.harvard.edu,
        Will Deacon <will.deacon@....com>, boqun.feng@...il.com,
        npiggin@...il.com, dhowells@...hat.com, j.alglave@....ac.uk,
        luc.maranget@...ia.fr, akiyks@...il.com, mingo@...nel.org,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        linux-riscv@...ts.infradead.org
Subject:     Re: [RFC PATCH] riscv/locking: Strengthen spin_lock() and spin_unlock()

On Thu, 01 Mar 2018 07:11:41 PST (-0800), parri.andrea@...il.com wrote:
> Hi Daniel,
>
> On Thu, Feb 22, 2018 at 11:47:57AM -0800, Daniel Lustig wrote:
>> On 2/22/2018 10:27 AM, Peter Zijlstra wrote:
>> > On Thu, Feb 22, 2018 at 10:13:17AM -0800, Paul E. McKenney wrote:
>> >> So we have something that is not all that rare in the Linux kernel
>> >> community, namely two conflicting more-or-less concurrent changes.
>> >> This clearly needs to be resolved, either by us not strengthening the
>> >> Linux-kernel memory model in the way we were planning to or by you
>> >> strengthening RISC-V to be no weaker than PowerPC for these sorts of
>> >> externally viewed release-acquire situations.
>> >>
>> >> Other thoughts?
>> >
>> > Like said in the other email, I would _much_ prefer to not go weaker
>> > than PPC, I find that PPC is already painfully weak at times.
>>
>> Sure, and RISC-V could make this work too by using RCsc instructions
>> and/or by using lightweight fences instead.  It just wasn't clear at
>> first whether smp_load_acquire() and smp_store_release() were RCpc,
>> RCsc, or something else, and hence whether RISC-V would actually need
>> to use something stronger than pure RCpc there.  Likewise for
>> spin_unlock()/spin_lock() and everywhere else this comes up.
>
> while digging into riscv's locks and atomics to fix the issues discussed
> earlier in this thread, I became aware of another issue:
>
> Considering here the CMPXCHG primitives, for example, I noticed that the
> implementation currently provides the four variants
>
> 	ATOMIC_OPS(        , .aqrl)
> 	ATOMIC_OPS(_acquire,   .aq)
> 	ATOMIC_OPS(_release,   .rl)
> 	ATOMIC_OPS(_relaxed,      )
>
> (corresponding, resp., to
>
> 	atomic_cmpxchg()
> 	atomic_cmpxchg_acquire()
> 	atomic_cmpxchg_release()
> 	atomic_cmpxchg_relaxed()  )
>
> so that the first variant above ends up doing
>
> 	0:	lr.w.aqrl  %0, %addr
> 		bne        %0, %old, 1f
> 		sc.w.aqrl  %1, %new, %addr
> 		bnez       %1, 0b
>         1:
>
> From Sect. 2.3.5. ("Acquire/Release Ordering") of the Spec.,
>
>  "AMOs with both .aq and .rl set are fully-ordered operations.  Treating
>   the load part and the store part as independent RCsc operations is not
>   in and of itself sufficient to enforce full fencing behavior, but this
>   subtle weak behavior is counterintuitive and not much of an advantage
>   architecturally, especially with lr and sc also available [...]."
>
> I understand that
>
>         { x = y = u = v = 0 }
>
> 	P0()
>
> 	WRITE_ONCE(x, 1);		("relaxed" store, sw)
> 	atomic_cmpxchg(&u, 0, 1);
> 	r0 = READ_ONCE(y);		("relaxed" load, lw)
>
> 	P1()
>
> 	WRITE_ONCE(y, 1);
> 	atomic_cmpxchg(&v, 0, 1);
> 	r1 = READ_ONCE(x);
>
> could result in (u = v = 1 and r0 = r1 = 0) at the end; can you confirm?

cmpxchg isn't an AMO, it's an LR/SC sequence, so that blurb doesn't apply.  I
think "lr.w.aqrl" and "sc.w.aqrl" are not sufficient to perform a fully ordered
operation (i.e., it's an incorrect implementation of atomic_cmpxchg()), but I was
hoping to get some time to actually internalize this part of the RISC-V memory
model at some point to be sure.