Date: Tue, 7 May 2024 11:11:19 +0200
From: Jonas Oberhauser <jonas.oberhauser@...weicloud.com>
To: paulmck@...nel.org, luc.maranget@...ia.fr
Cc: linux-kernel@...r.kernel.org, linux-arch@...r.kernel.org,
 kernel-team@...a.com, mingo@...nel.org, stern@...land.harvard.edu,
 parri.andrea@...il.com, will@...nel.org, peterz@...radead.org,
 boqun.feng@...il.com, npiggin@...il.com, dhowells@...hat.com,
 j.alglave@....ac.uk, akiyks@...il.com,
 Frederic Weisbecker <frederic@...nel.org>, Daniel Lustig
 <dlustig@...dia.com>, Joel Fernandes <joel@...lfernandes.org>,
 Mark Rutland <mark.rutland@....com>, Jonathan Corbet <corbet@....net>,
 linux-doc@...r.kernel.org
Subject: Re: [PATCH memory-model 2/4] Documentation/litmus-tests: Demonstrate
 unordered failing cmpxchg



On 5/6/2024 at 8:00 PM, Paul E. McKenney wrote:
> On Mon, May 06, 2024 at 06:30:45PM +0200, Jonas Oberhauser wrote:
>> On 5/6/2024 at 12:05 PM, Jonas Oberhauser wrote:
>>> On 5/2/2024 at 1:21 AM, Paul E. McKenney wrote:
>>>> This commit adds four litmus tests showing that a failing cmpxchg()
>>>> operation is unordered unless followed by an smp_mb__after_atomic()
>>>> operation.
>>>
>>> So far, my understanding was that all RMW operations without suffix
>>> (xchg(), cmpxchg(), ...) will be interpreted as F[Mb];...;F[Mb].
>>>
>>> I guess this shows again how important it is to model these full
>>> barriers explicitly inside the cat model, instead of relying on implicit
>>> conversions internal to herd.
>>>
>>> I'd like to propose a patch to this effect.
>>>
>>> What is the intended behavior of a failed cmpxchg()? Is it the same as a
>>> relaxed one?
> 
> Yes, and unless I am too confused, LKMM currently does implement this.
> Please let me know if I am missing something.

The herd and Dat3M implementations, at least, seem to be doing that,
according to this thread that Hernan sent me:

https://github.com/herd/herdtools7/issues/384#issue-1243049709
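
To make the failing case concrete, here is a minimal Python sketch of the
RMW_MB definition quoted below. It is a toy model, not herd: the event
names and the `rmw_mb` helper are hypothetical, and it assumes that a
failing cmpxchg() yields only a read event (tagged Mb here) with no rmw
edge, because no write is performed.

```python
def rmw_mb(tags, rmw):
    """Toy version of: let RMW_MB = Mb & (dom(rmw) | range(rmw))."""
    rmw_events = {r for r, _ in rmw} | {w for _, w in rmw}
    return {e for e, t in tags.items() if "Mb" in t and e in rmw_events}


# Hypothetical events of one thread: a successful xchg() followed by a
# failing cmpxchg(). Assumption: the failing cmpxchg() contributes a read only.
tags = {
    "xchg_read": {"R", "Mb"},
    "xchg_write": {"W", "Mb"},
    "failed_cmpxchg_read": {"R", "Mb"},
}
rmw = {("xchg_read", "xchg_write")}  # only the successful RMW has an rmw edge

if __name__ == "__main__":
    print(sorted(rmw_mb(tags, rmw)))
```

Under these assumptions the read of the failing cmpxchg() is outside
RMW_MB even though it carries the Mb tag, so it gets no ordering from
these clauses and behaves like a relaxed access.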


> 
>>> My suggestion would be in the direction of marking read and write events
>>> of these operations as Mb, and then defining
>>>
>>> (* full barrier events that appear in non-failing RMW *)
>>> let RMW_MB = Mb & (dom(rmw) | range(rmw))
>>>
>>>
>>> let mb =
>>>       [M] ; fencerel(Mb) ; [M]
>>>     | [M] ; (po \ rmw) ; [RMW_MB] ; po^? ; [M]
>>>     | [M] ; po^? ; [RMW_MB] ; (po \ rmw) ; [M]
>>>     | ...
>>>
>>> The po \ rmw is there because no ordering is provided inside the rmw itself
>>
>> (removed the unnecessary si since LKMM is still non-mixed-accesses)
> 
> Addition of mixed-access support would be quite welcome!

:P


>> This could also be written with a single rule:
>>
>>       | [M] ; (po \ rmw) & (po^?; [RMW_MB] ; po^?) ; [M]
>>
>>> I suspect that after we added [rmw] sequences it could perhaps be
>>> simplified [...]
>>
>> No, my suspicion is wrong - this would incorrectly let full-barrier RMWs
>> act like strong fences when they appear in an rmw sequence.
>>
>>   if (z==1)  ||  x = 2;     ||  xchg(&y,2)  || if (y==2)
>>     x = 1;   ||  y =_rel 1; ||              ||    z=1;
>>
>>
>> right now, we allow x=2 overwriting x=1 (in case the last thread does not
>> propagate x=2 along with z=1) because on power, the xchg might be
>> implemented with a sync that doesn't get executed until the very end
>> of the program run.
>>
>>
>> Instead of its negative form (everything other than inside the rmw),
>> it could also be rewritten positively. Here's a somewhat short form:
>>
>> let mb =
>>       [M] ; fencerel(Mb) ; [M]
>>     (* everything across a full barrier RMW is ordered. This includes up to
>>        one event inside the RMW. *)
>>     | [M] ; po ; [RMW_MB] ; po ; [M]
>>     (* full barrier RMW writes are ordered with everything behind the RMW *)
>>     | [W & RMW_MB] ; po ; [M]
>>     (* full barrier RMW reads are ordered with everything before the RMW *)
>>     | [M] ; po ; [R & RMW_MB]
>>     | ...
> 
> Does this produce the results expected by the litmus tests in the Linux
> kernel source tree and also those at https://github.com/paulmckrcu/litmus?

I suspect that it won't work out of the box, because some of the
implicit magic herd does could get in the way. I'd need some help from
Luc to actually turn this into a patch that can be tested (or at least
confirmation that changing a few things in the .def & .bell files is
enough to sidestep the implicit behaviors).

But at least in my proofs it seems to be equivalent.
(There may still be differences of opinion on what some herd constructs
mean, so what Viktor and I have formalized as the semantics of the herd
model may not be exactly the behavior of LKMM in herd. Hence testing is
necessary too, as a sanity check.)
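
Independently of herd, the three ways of writing the RMW_MB clauses
quoted above (the two-rule form, the single-rule form, and the positive
form) can at least be cross-checked against each other by brute force.
The following is a minimal Python sketch of such a check, not a
replacement for running the litmus tests. It assumes a single thread,
that every event is in M, that the write of an RMW directly follows its
read in po, and that both events of a full-barrier RMW carry Mb. The
layout labels and helper names are made up for this sketch.

```python
from itertools import product

ITEMS = ("plain", "rmw_mb", "rmw_relaxed")  # hypothetical labels


def seq(r1, r2):
    """Relational composition r1 ; r2."""
    return {(a, c) for (a, b) in r1 for (b2, c) in r2 if b == b2}


def ident(events):
    """Identity relation [S] on a set of events."""
    return {(e, e) for e in events}


def build(layout):
    """Turn a layout into (n, reads, writes, rmw, rmw_mb) for one thread."""
    reads, writes, rmw, mb = set(), set(), set(), set()
    n = 0
    for item in layout:
        if item == "plain":
            reads.add(n)  # read vs. write makes no difference to the clauses
            n += 1
        else:
            r, w = n, n + 1  # assumption: write directly follows read in po
            reads.add(r)
            writes.add(w)
            rmw.add((r, w))
            if item == "rmw_mb":
                mb |= {r, w}  # both events of a full-barrier RMW carry Mb
            n += 2
    rmw_events = {r for r, _ in rmw} | {w for _, w in rmw}
    return n, reads, writes, rmw, mb & rmw_events


def mb_variants(layout):
    """Return the RMW_MB part of mb in its three quoted formulations."""
    n, reads, writes, rmw, rmw_mb = build(layout)
    po = {(a, b) for a in range(n) for b in range(a + 1, n)}
    po_opt = po | ident(range(n))  # po^?
    at = ident(rmw_mb)             # [RMW_MB]
    two_rules = seq(seq(po - rmw, at), po_opt) | seq(seq(po_opt, at), po - rmw)
    one_rule = (po - rmw) & seq(seq(po_opt, at), po_opt)
    positive = (seq(seq(po, at), po)
                | seq(ident(writes & rmw_mb), po)
                | seq(po, ident(reads & rmw_mb)))
    return two_rules, one_rule, positive


def check_all(max_items=4):
    """Compare the three forms on every layout of up to max_items items."""
    checked = 0
    for length in range(1, max_items + 1):
        for layout in product(ITEMS, repeat=length):
            two_rules, one_rule, positive = mb_variants(layout)
            assert two_rules == one_rule == positive, layout
            checked += 1
    return checked


if __name__ == "__main__":
    print(check_all(), "layouts checked")
```

This only compares the formulations with each other on tiny
single-thread executions. It says nothing about whether they match what
herd currently does implicitly, which is the part that needs testing.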

best wishes,
   jonas

