linux-kernel - Re: [RFC] tools/memory-model: Rule out OOTA

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <d48e277b-8e5b-471f-98b4-245cb0dede86@huaweicloud.com>
Date: Thu, 16 Jan 2025 21:21:38 +0100
From: Jonas Oberhauser <jonas.oberhauser@...weicloud.com>
To: paulmck@...nel.org
Cc: Alan Stern <stern@...land.harvard.edu>, parri.andrea@...il.com,
 will@...nel.org, peterz@...radead.org, boqun.feng@...il.com,
 npiggin@...il.com, dhowells@...hat.com, j.alglave@....ac.uk,
 luc.maranget@...ia.fr, akiyks@...il.com, dlustig@...dia.com,
 joel@...lfernandes.org, urezki@...il.com, quic_neeraju@...cinc.com,
 frederic@...nel.org, linux-kernel@...r.kernel.org, lkmm@...ts.linux.dev,
 hernan.poncedeleon@...weicloud.com
Subject: Re: [RFC] tools/memory-model: Rule out OOTA



Am 1/16/2025 um 8:31 PM schrieb Paul E. McKenney:
> On Thu, Jan 16, 2025 at 08:13:28PM +0100, Jonas Oberhauser wrote:
>> Am 1/16/2025 um 7:40 PM schrieb Paul E. McKenney:
>>> On Mon, Jan 13, 2025 at 02:04:58PM -0800, Paul E. McKenney wrote:
>>>> On Fri, Jan 10, 2025 at 05:21:59PM +0100, Jonas Oberhauser wrote:
>>>>> Am 1/10/2025 um 3:54 PM schrieb Paul E. McKenney:
>>>>>> On Thu, Jan 09, 2025 at 07:35:19PM +0100, Jonas Oberhauser wrote:
>>>>>>> Am 1/9/2025 um 6:54 PM schrieb Paul E. McKenney:
>>>>>>>> On Wed, Jan 08, 2025 at 08:17:51PM +0100, Jonas Oberhauser wrote:
>>>
>>> [ . . . ]
>>>
>>>>>>> I currently can not come up with an example where there would be a
>>>>>>> (semantic) control dependency from a load to a store that is not in the arm
>>>>>>> of an if statement (or a loop / switch of some form with the branch
>>>>>>> depending on the load).
>>>>>>>
>>>>>>> I think the control dependency is just a red herring. It is only there to
>>>>>>> avoid the data race.
>>>>>>
>>>>>> Well, that red herring needs to have a companion fish to swim with in
>>>>>> order to enforce ordering, and I am not seeing that companion.
>>>>>>
>>>>>> Or am I (yet again!) missing something subtle here?
>>>>>
>>>>> It makes more sense to think about how people do message passing (or
>>>>> seqlock), which might look something like this:
>>>>>
>>>>>     [READ_ONCE]
>>>>>     rmb
>>>>>     [plain read]
>>>>>
>>>>> and
>>>>>
>>>>>     [plain write]
>>>>>     wmb
>>>>>     [WRITE_ONCE]
>>>>>
>>>>>
>>>>> Clearly LKMM says that there is some sort of order (not quite happens-before
>>>>> order though) between the READ_ONCE and the plain read, and between the
>>>>> plain write and the WRITE_ONCE. This order is clearly defined in the data
>>>>> race definition, in r-pre-bounded and w-post-bounded.
>>>>>
>>>>> Now consider
>>>>>
>>>>>     [READ_ONCE]
>>>>>     rmb
>>>>>     [plain read]
>>>>>     // some code that creates order between the plain accesses
>>>>>     [plain write]
>>>>>     wmb
>>>>>     [WRITE_ONCE]
>>>>>
>>>>> where for some specific reason we can discern that the compiler can not
>>>>> fully eliminate/move across the barrier either this specific plain read, nor
>>>>> the plain write, nor the ordering between the two.
>>>>>
>>>>> In this case, is there order between the READ_ONCE and the WRITE_ONCE, or
>>>>> not? Of course, we know current LKMM says no. I would say that in those very
>>>>> specific cases, we do have ordering.
>>>>
>>>> Agreed, for LKMM to deal with seqlock, the read-side critical section
>>>> would need to use READ_ONCE(), which is a bit unnatural.  The C++
>>>> standards committee has been discussing this for some time, as that
>>>> memory model also gives data race in that case.
>>>>
>>>> But it might be better to directly model seqlock than to try to make
>>>> LKMM deal with the underlying atomic operations.
>>>
>>> Maybe I should give an example approach, perhaps inspiring a better
>>> approach.
>>>
>>> o	Model reader-writer locks in LKMM, including a relaxed
>>> 	write-lock-held primitive.
>>>
>>> o	Model sequence locks in terms of reader-writer locks:
>>>
>>> 	o	The seqlock writer maps to write lock.
>>>
>>> 	o	The seqlock reader maps to read lock, but with a
>>> 		write-lock-held check.	If the write lock is held at
>>> 		that point, the seqlock tells the caller to retry.
>>>
>>> 		Please note that the point is simply to exercise
>>> 		the failure path.
>>>
>>> 	o	If a value is read in the seqlock reader and used
>>> 		across a "you need to retry" indication, that
>>> 		flags a seqlock data race.
>>>
>>> But is there a better way?
>>>
>>
>> You need to be careful with those hb edges. The reader critical section does
>> not have to happen-before the writer critical section, as would with an
>> actual read-write lock.
> 
> True, the reader critical section only needs each of its loads to have
> an fr link to all of the stores in the writer critical section.

Yes. And importantly, a properly-synchronized fr link!
Meaning that it does not constitute a data race.

Actually, you only need the proper synchronization, then the fr link 
follows from the race-coherence axioms.

> On the other hand, the Linux-kernel implementation of seqcount does use
> smp_rmb() and smp_wmb(), which does provide the message-passing pattern,
> and thus hb, correct?

Only in one direction, from the last writer CS to the reader CS. But not 
from the reader CS to the next writer CS :(

The message is passed, but from a data race POV, you are reading 
something while new data is coming in (from the next write). This 
fr-race could only be prevented by

    rw-xbstar = fence | *(r-post-bounded ; xbstar ; w-pre-bounded)*

but there is no xbstar :(

Well, one thought is that one could declare xbstar from the read_exit to 
the write_enter, but just not add any release semantics to read_exit() 
(like a rw_read_unlock would have). But that sounds really scary because 
that xbstar definitely does not exist in the implementation, so my 9pm 
brain is doubtful that this is correct.


>> I think the solution would have to be along changing the definition of
>> r-post-bounded.
>> The read_enter() function reads from a write_exit() and establishes hb.
>> The read_exit() function also reads from a write_exit(), if the same as the
>> matching read_enter(), then it returns success, otherwise failure.
>>
>> Reads po-before a successful read_exit() are bounded with regards to
>> subsequent write_enter() on the same lock.
> 
> Or are you trying to model the effects of the data races?

No, I had even overlooked the "real data race" in the failure case xP

I was modelling the properly-synchronized part of the fr, without 
introducing hb/xbstar from the reader to the next writer - but only for 
the success case. The failure case also needs some way to avoid the data 
race (unless it abuses the value). Perhaps by relaxing the definition of 
data race as we did in VMM, so that the readside CS doesn't become UB 
until you do something forbidden with the value.

My suggestion for avoiding the data race in success was to extend the 
notion of rw-xbstar above in some way that provides the necessary 
protection, e.g. by drawing an "r-post-bounded"-like edge to the 
write_enter(), which would then "w-pre-bound" (or something similar) the 
write-side CS.

Have fun,
   jonas