Date:   Thu, 9 Sep 2021 13:03:18 -0400
From:   Dan Lustig <dlustig@...dia.com>
To:     Will Deacon <will@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>
CC:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Alan Stern <stern@...land.harvard.edu>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Peter Anvin <hpa@...or.com>,
        "Andrea Parri" <parri.andrea@...il.com>,
        Ingo Molnar <mingo@...nel.org>,
        "Paul E. McKenney" <paulmck@...nel.org>,
        Vince Weaver <vincent.weaver@...ne.edu>,
        Thomas Gleixner <tglx@...utronix.de>,
        Jiri Olsa <jolsa@...hat.com>,
        "Arnaldo Carvalho de Melo" <acme@...hat.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Stephane Eranian <eranian@...gle.com>,
        <linux-tip-commits@...r.kernel.org>, <palmer@...belt.com>,
        <paul.walmsley@...ive.com>, <mpe@...erman.id.au>
Subject: Re: [tip:locking/core] tools/memory-model: Add extra ordering for
 locks and remove it for ordinary release/acquire

On 9/9/2021 9:35 AM, Will Deacon wrote:
> [+Palmer, PaulW, Daniel and Michael]
> 
> On Thu, Sep 09, 2021 at 09:25:30AM +0200, Peter Zijlstra wrote:
>> On Wed, Sep 08, 2021 at 09:08:33AM -0700, Linus Torvalds wrote:
>>
>>> So if this is purely a RISC-V thing,
>>
>> Just to clarify, I think the current RISC-V thing is stronger than
>> PowerPC, but maybe not as strong as say ARM64, but RISC-V memory
>> ordering is still somewhat hazy to me.
>>
>> Specifically, the sequence:
>>
>> 	/* critical section s */
>> 	WRITE_ONCE(x, 1);
>> 	FENCE RW, W
>> 	WRITE_ONCE(s.lock, 0);		/* store S */
>> 	AMOSWAP %0, 1, r.lock		/* store R */
>> 	FENCE R, RW
>> 	WRITE_ONCE(y, 1);
>> 	/* critical section r */
>>
>> fully separates section s from section r, as in RW->RW ordering
>> (possibly not as strong as smp_mb() though), while on PowerPC it would
>> only impose TSO ordering between sections.
>>
>> The AMOSWAP is a RmW and as such matches the W from the RW->W fence,
>> similarly it matches the R from the R->RW fence, yielding an:
>>
>> 	RW->  W
>> 	    RmW
>> 	    R  ->RW
>>
>> ordering. It's the stores S and R that can be re-ordered, but not the
>> sections themselves (same on PowerPC and many others).
>>
>> Clarification from a RISC-V enabled person would be appreciated.

To first order, RISC-V's memory model is very similar to ARMv8's.  It
is "other-multi-copy-atomic", unlike Power, and respects dependencies.
It also has AMOs and LR/SC with optional RCsc acquire or release
semantics.  There's no need to worry about RISC-V somehow pushing the
boundaries of weak memory ordering in new ways.

The tricky part is that unlike ARMv8, RISC-V doesn't have load-acquire
or store-release opcodes at all.  Only AMOs and LR/SC have acquire or
release options.  That means that while certain operations like swap
can be implemented with native RCsc semantics, others like store-release
have to fall back on fences and plain writes.
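
As a rough illustration of that fallback, here is a minimal sketch in
portable C11 atomics rather than kernel primitives.  The names payload,
flag, publish() and try_consume() are hypothetical and exist only for
this example.  The comments give the mapping recommended in the RISC-V
ISA manual, assuming the compiler follows it; C11 release/acquire is
also not the same thing as the LKMM semantics under discussion.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical message-passing pair, for illustration only. */
static int payload;      /* plain data, published through flag */
static atomic_int flag;  /* 0 = nothing published, 1 = payload is valid */

/*
 * Producer: plain write followed by a store-release.
 * RISC-V has no store-release opcode, so the recommended mapping is
 * "fence rw,w" followed by a plain store.
 */
void publish(int value)
{
	payload = value;
	atomic_store_explicit(&flag, 1, memory_order_release);
}

/*
 * Consumer: load-acquire followed by a plain read.
 * RISC-V has no load-acquire opcode either, so the recommended mapping
 * is a plain load followed by "fence r,rw".
 */
bool try_consume(int *out)
{
	if (!atomic_load_explicit(&flag, memory_order_acquire))
		return false;
	*out = payload;
	return true;
}
```

A swap, by contrast, can carry its ordering annotation on the AMO itself
and needs no separate fence instruction.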

That's where the complexity came up last time this was discussed, at
least as it relates to RISC-V: how to make sure the combination of RCsc
atomics and plain operations+fences gives the semantics everyone is
asking for here.  And to be clear there, I'm not asking for LKMM to
weaken anything about critical section ordering just for RISC-V's sake.
TSO/RCsc ordering between critical sections is a perfectly reasonable
model in my opinion.  I just want to make sure RISC-V gets it right
given whatever the decision is.

>>> then I think it's entirely reasonable to
>>>
>>>         spin_unlock(&r);
>>>         spin_lock(&s);
>>>
>>> cannot be reordered.
>>
>> I'm obviously completely in favour of that :-)
> 
> I don't think we should require the accesses to the actual lockwords to
> be ordered here, as it becomes pretty onerous for relaxed LL/SC
> architectures where you'd end up with an extra barrier either after the
> unlock() or before the lock() operation. However, I remain absolutely in
> favour of strengthening the ordering of the _critical sections_ guarded by
> the locks to be RCsc.

I agree with Will here.  If the AMOSWAP above is actually implemented with
a RISC-V AMO, then the two critical sections will be ordered RW->RW,
as Peter described.  If instead it's implemented using LR/SC, then RISC-V
gives only TSO (R->R, R->W, W->W), because the two pieces of the AMO are
split, and that breaks the chain.  Getting full RW->RW between the critical
sections would therefore require an extra fence.  Also, the accesses to the
lockwords themselves would not be ordered without an extra fence.
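
To make the shape of that sequence concrete, here is a minimal sketch of
Peter's unlock(s); lock(r) handoff written with C11 atomics.  The type
struct tas_lock and the functions tas_acquire(), tas_release() and
handoff() are hypothetical names for this example and are not the
kernel's spinlock implementation.  C11 by itself does not promise the
RW->RW ordering between the two sections; the sketch only shows where
store S and store R sit, and whether the exchange becomes a single AMO
or an LR/SC loop is left to the compiler and target.

```c
#include <stdatomic.h>

/* Hypothetical test-and-set lock: 0 = free, 1 = held. */
struct tas_lock {
	atomic_int locked;
};

/* Acquire: swap in 1 until the old value was 0 (this is "store R"). */
void tas_acquire(struct tas_lock *l)
{
	while (atomic_exchange_explicit(&l->locked, 1, memory_order_acquire))
		;	/* spin until the previous holder releases */
}

/* Release: store-release of 0 to the lockword (this is "store S"). */
void tas_release(struct tas_lock *l)
{
	atomic_store_explicit(&l->locked, 0, memory_order_release);
}

int x, y;
struct tas_lock s, r;

/* Caller must hold s on entry; on return s is free and r is held. */
void handoff(void)
{
	/* critical section s */
	x = 1;
	tas_release(&s);	/* store S */
	tas_acquire(&r);	/* store R, the RmW in the middle of the chain */
	y = 1;
	/* critical section r */
}
```

Calling tas_acquire(&s) and then handoff() from a single thread walks
through exactly the accesses in the quoted sequence, in program order.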

> Last time this came up, I think the RISC-V folks were generally happy to
> implement whatever was necessary for Linux [1]. The thing that was stopping
> us was Power (see CONFIG_ARCH_WEAK_RELEASE_ACQUIRE), wasn't it? I think
> Michael saw quite a bit of variety in the impact on benchmarks [2] across
> different machines. So the question is whether newer Power machines are less
> affected to the degree that we could consider making this change again.

Yes, as I said above, RISC-V will implement what is needed to make this work.

Dan

> Will
> 
> [1] https://lore.kernel.org/lkml/11b27d32-4a8a-3f84-0f25-723095ef1076@nvidia.com/
> [2] https://lore.kernel.org/lkml/87tvp3xonl.fsf@concordia.ellerman.id.au/
