linux-kernel - Re: LKMM: Making RMW barriers explicit

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <58042cf3-e515-4e5f-ab48-1d0d6123c9e9@huaweicloud.com>
Date: Wed, 22 May 2024 11:20:47 +0200
From: Jonas Oberhauser <jonas.oberhauser@...weicloud.com>
To: Alan Stern <stern@...land.harvard.edu>
Cc: Hernan Ponce de Leon <hernan.poncedeleon@...weicloud.com>,
 "Paul E. McKenney" <paulmck@...nel.org>, linux-kernel@...r.kernel.org,
 linux-arch@...r.kernel.org, kernel-team@...a.com, parri.andrea@...il.com,
 boqun.feng@...il.com, j.alglave@....ac.uk, luc.maranget@...ia.fr,
 Joel Fernandes <joel@...lfernandes.org>
Subject: Re: LKMM: Making RMW barriers explicit



Am 5/21/2024 um 5:36 PM schrieb Alan Stern:
> On Tue, May 21, 2024 at 11:57:29AM +0200, Jonas Oberhauser wrote:
>>
>>
>> Am 5/18/2024 um 2:31 AM schrieb Alan Stern:
>>> On Thu, May 16, 2024 at 10:44:05AM +0200, Hernan Ponce de Leon wrote:
>>>> On 5/16/2024 10:31 AM, Jonas Oberhauser wrote:
>>>>>
>>>>>
>>>>> Am 5/16/2024 um 3:43 AM schrieb Alan Stern:
>>>>>> Hernan and Jonas:
>>>>>>
>>>>>> Can you explain more fully the changes you want to make to herd7 and/or
>>>>>> the LKMM?  The goal is to make the memory barriers currently implicit in
>>>>>> RMW operations explicit, but I couldn't understand how you propose to do
>>>>>> this.
>>>>>>
>>>>>> Are you going to change herd7 somehow, and if so, how?  It seems like
>>>>>> you should want to provide sufficient information so that the .bell
>>>>>> and .cat files can implement the appropriate memory barriers associated
>>>>>> with each RMW operation.  What additional information is needed?  And
>>>>>> how (explained in English, not by quoting source code) will the .bell
>>>>>> and .cat files make use of this information?
>>>>>>
>>>>>> Alan
>>>>>
>>>>>
>>>>> I don't know whether herd7 needs to be changed. Probably, herd7 does the
>>>>> following:
>>>>> - if a tag called Mb appears on an rmw instruction (by instruction I
>>>>> mean things like xchg(), atomic_inc_return_relaxed()), replace it with
>>>>> one of those things:
>>>>>      * full mb ; once (the rmw) ; full mb, if a value returning
>>>>> (successful) rmw
>>>>>      * once (the rmw)   otherwise
>>>>> - everything else gets translated 1:1 into some internal representation
>>>>
>>>> This is my understanding from reading the source code of CSem.ml in herd7's
>>>> repo.
>>>>
>>>> Also, this is exactly what dartagnan is currently doing.
>>>>
>>>>>
>>>>> What I'm proposing is:
>>>>> 1. remove this transpilation step,
>>>>> 2. and instead allow the Mb tag to actually appear on RMW instructions
>>>>> 3. change the cat file to explicitly define the behavior of the Mb tag
>>>>> on RMW instructions
>>>>
>>>> These are the exact 3 things I changed in dartagnan for testing what Jonas
>>>> proposed.
>>>>
>>>> I am not sure if further changes are needed for herd7.
> 
> What about failed RMW instructions?  IIRC, herd7 generates just an R for
> these, not both R and W, but won't it still be annotated with an mb tag?
> And wasn't this matter of failed RMWs one of the issues that the two of
> you specifically wanted to make explicit in the memory model, rather
> than implicit in the operation of herd7?

That's why we use the RMW_MB tag. I should have copied that definition too:


(* full barrier events that appear in non-failing RMW *)
let RMW_MB = Mb & (dom(rmw) | range(rmw))


This ensures that the only successful rmw instructions have an RMW_MB tag.


> 
> And wasn't another one of these issues the difference between
> value-returning and non-value-returning RMWs?  As far as I can, nothing
> in the .def file specifically mentions this.  There's the noreturn tag
> in the .bell file, but nothing in the .def file says which instructions
> it applies to.  Or are we supposed to know that it automatically applies
> to all __atomic_op() instances?

Ah, now you're already forestalling my next suggestion :))

I would say let's fix one issue at a time (learned this from Andrea).

For the other issue, do noreturn rmws always have the same ordering as once?

I suspect we need to extend herd slightly more to support the second 
one, I don't know if there's syntax for passing tags to __atomic_op.


> 
>>> Okay, good.  This answers the first part of what I asked.  What about
>>> the second part?  That is, how will the changes to the .def, .bell, and
>>> .cat files achieve your goals?
>>>
>>> Alan
>>
>>
>> Firstly, we'd allow 'mb as a barrier mode in events, e.g.
> 
> You mean as a mode in RMW events.  'mb already is a mode for F events.

Thanks, you're right.

> 
>> instructions RMW[{'once,'acquire,'release,'mb}]
>>
>> then the Mb tags would appear in the graph. And then I'd define the ordering
>> explicitly. One way is to say that an Mb tag orders all memory accesses
>> before(or at) the tag with all memory accesses after(or at) the tag, except
>> the accesses of the rmw with each other.
> 
> I don't think you need to add very much.  The .cat file already defines
> the mb relation as including the term:
> 
> 	([M] ; fencerel(Mb) ; [M])
> 
> All that's needed is to replace the fencerel(Mb) with something more
> general...
> 
>> This is the same as the full fence before the read, which orders all memory
>> accesses before the read with every access after(or at) the read,
>> plus the full fence after the write, which orders all memory accesses
>> before(or at) the write with all accesses after the write.
>>
>> That would be done by adding
>>
>>       [M] ; (po \ rmw) & (po^?; [RMW_MB] ; po^?) ; [M]
> 
> .. like this.
> 
>> to ppo.
> 
> You need to update the definition of mb, not ppo.

Yes, I meant to update the definition of mb, but I momentarily thought 
the effect of that is just that

ppo_new = ppo_old |  [M] ; (po \ rmw) & (po^?; [RMW_MB] ; po^?) ; [M]

I forgot that extending mb has another effect, in pb.


>  And the RMW_MB above
> looks wrong; shouldn't it be just Mb?

As discussed above, no, since the Mb will also be on the failed RMW.

> 
> Also, is the "\ rmw" part really necessary?  I don't think it makes any
> difference; the memory model already knows that the read part of an RMW
> has to happen before the write part.

It unfortunately does make a difference, because mb is a strong fence.
This means that an Mb in an rmw sequence would create additional pb edges.

   prop;(rfe ; [Mb];rmw;[Mb])

I believe this is might give wrong results on powerpc, but I'd need to 
double check.


> 
>> One could also split it into two rules to keep with the "two full fences"
>> analogy. Maybe a good way would be like this:
>>
>>       [M] ; po; [RMW_MB & R] ; po^? ; [M]
>>
>>       [M] ; po^?; [RMW_MB & W] ; po ; [M]
> 
> My preference is for the first approach.

That was also my early preference, but to be honest I expected that 
you'd prefer the second approach.
Actually, I also started warming up to it.


Have fun,
  jonas