linux-kernel - Re: LKMM: Making RMW barriers explicit

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <73e08b17-cb2a-4e14-a94f-7144c5738684@huaweicloud.com>
Date: Thu, 23 May 2024 16:27:29 +0200
From: Jonas Oberhauser <jonas.oberhauser@...weicloud.com>
To: Andrea Parri <parri.andrea@...il.com>,
 Alan Stern <stern@...land.harvard.edu>
Cc: Hernan Ponce de Leon <hernan.poncedeleon@...weicloud.com>,
 "Paul E. McKenney" <paulmck@...nel.org>, linux-kernel@...r.kernel.org,
 linux-arch@...r.kernel.org, kernel-team@...a.com, boqun.feng@...il.com,
 j.alglave@....ac.uk, luc.maranget@...ia.fr,
 Joel Fernandes <joel@...lfernandes.org>
Subject: Re: LKMM: Making RMW barriers explicit



Am 5/22/2024 um 6:54 PM schrieb Andrea Parri:
> Alan, all,
> 
> ("randomly" picking a recent post in the thread, after having observed
> this discussion for a while...)
> 
>> It would be better if there was a way to tell herd7 not to add the 'mb
>> tag to failed instructions in the first place.  This approach is
>> brittle; see below.
> 
> AFAIU, changing the herd representation to generate mb-accesses in place
> of certain mb-fences...
> 
>> If you do want to use this approach, it should be simplified.  All you
>> need is:
>>
>> 	[M] ; po ; [RMW_MB]
>>
>> 	[RMW_MB] ; po ; [M]
>>
>> This is because events tagged with RMW_MB always are memory accesses,
>> and accesses that aren't part of the RMW are already covered by the
>> fencerel(Mb) thing above.
> 
> .. and updating the .cat file to the effects of something like
> 
>    -let mb = ([M] ; fencerel(Mb) ; [M]) |
>    +let mb = (([M] ; po? ; [Mb] ; po? ; [M]) \ id) |
> 
> .. can hardly be called "making RMW barriers explicit".  (So much so
> that the first commit in PR #865 was titled "Remove explicit barriers
> from RMWs".  :-))
> 
> Overall, this discussion rather seems to confirm the close link between
> tools/memory-model/ and herdtools7.  (After all, to what extent could
> any putative RMW_MB be considered "explicit" without _knowing the under-
> lying representation of the RMW operations...)  


My view is a bit different. There's more or less standard theory for 
weak memory models, including how operations at the source code level 
map to events in the graph.

Part of that standard theory are relations like rmw, and the 
circumstances under which they appear in the graph.

You'll see these relations in generic papers about weak memory models, 
like GenMC, Hahn, etc. as well as in pretty much every specific memory 
model like TSO, C11, PowerPC, VMM, etc., totally independently of herd7 
(even though these notations have historically developed together with 
herd).

Naively I would expect that a tool like herd7 would be a generic WMM 
exploration tool, which follows these standards with a very obvious 1:1 
mapping of what we see in the code and the events we see in the graph, 
plus perhaps a thin and explicit configurable herd7-specific mapping 
layer like what we see in linux_kernel.cat to configure the syntax for a 
specific model.

In that mapping layer, we currently see that xchg() is an exchange 
operation with an Mb tag. Just like an smp_load_acquire is a read with 
an acquire tag.
Or so it seems.

Instead, we find that contrary to what's written in that file, and 
contrary to the conventions, an xchg() is an F[mb] ; R[once] ; W[once] 
; F[Mb]. And in fact a cmp_xchg could even be a R[once].

That's because the herd7 tool isn't quite as generic as one might think, 
but rather specifically "detects" that it's running the LKMM and then 
the mapping isn't what you'd naively think, but something hidden in the 
OCaml implementation of herd7.

That would be comparable to a popular tool for matrix calculations using 
the syntax
[ 10  5  4
   3   4  2 ]

to define a 2x3 matrix, unless one of the values is -3, in which case it 
becomes a vector of 6 elements, because that's what a really important 
user of the tool wanted, and then didn't see enough of a need to change.

My point is that to anyone who has seen standard matrix notation, this 
syntax in a matrix-computation-tool looks like it would be a matrix, 
right? And it usually is a matrix; then it should always be a matrix.

I usually say I don't look much at natural language documentation and 
only read code, because code doesn't lie. In this case, the natural 
language documentation is saying the correct thing thanks to the hard 
work of a lot of people, but the code (in .cat etc.) doesn't do what it 
seems to be doing.

And I think that should be changed, both to reduce the anyways high 
mental load of reading the code without having to do mental 
transformations in your head, and also to make herd more Lkmm-agnostic.
In the ideal world, herd doesn't know anything about Lkmm except what we 
tell it through tools/memory-model, and generic things like C syntax + 
semantics.

So when using syntax like dom(rmw), I don't see it as confirming the 
close link to herd more than before, but rather depending more on 
standard notations and conventions, and relying on herd's close link to 
those standard notations (such as using rmw for the rmw relation and 
dom(r) for the domain of r) for dom(rmw) to mean what anyone who has 
deeply read a couple of modern WMM papers would expect.



> My understanding is that
> this discussion was at least in part motivated by a desire to experiment
> and familiarize with the current herd representation

I would phrase it more extreme: I want to get rid of the unnecessary 
non-standard parts of the herd representation.

Those parts have led me astray several times. Ok, who cares about me, 
but even Luc once forgot about the non-standard parts and thought it is 
a bug:

https://github.com/herd/herdtools7/issues/384#issuecomment-1132859904

I also know that Viktor stumbled over it before and also suggested we 
change it.

I think there's 0 benefit to them, they're just wasting people's time 
and energy and lead to misunderstanding.

Ok, this e-mail became long. Hope it at least helps clarify my 
motivation :))

jonas