linux-kernel - Re: Allow data races on some read/write operations


Message-ID: <25e7e425-ae72-4370-ae95-958882a07df9@ralfj.de>
Date: Wed, 5 Mar 2025 14:10:22 +0100
From: Ralf Jung <post@...fj.de>
To: Boqun Feng <boqun.feng@...il.com>, comex <comexk@...il.com>
Cc: Andreas Hindborg <a.hindborg@...nel.org>,
 Alice Ryhl <aliceryhl@...gle.com>,
 Daniel Almeida <daniel.almeida@...labora.com>,
 Benno Lossin <benno.lossin@...ton.me>,
 Abdiel Janulgue <abdiel.janulgue@...il.com>, dakr@...nel.org,
 robin.murphy@....com, rust-for-linux@...r.kernel.org,
 Miguel Ojeda <ojeda@...nel.org>, Alex Gaynor <alex.gaynor@...il.com>,
 Gary Guo <gary@...yguo.net>, Björn Roy Baron
 <bjorn3_gh@...tonmail.com>, Trevor Gross <tmgross@...ch.edu>,
 Valentin Obst <kernel@...entinobst.de>, linux-kernel@...r.kernel.org,
 Christoph Hellwig <hch@....de>, Marek Szyprowski <m.szyprowski@...sung.com>,
 airlied@...hat.com, iommu@...ts.linux.dev, lkmm@...ts.linux.dev
Subject: Re: Allow data races on some read/write operations

Hi,

On 05.03.25 04:24, Boqun Feng wrote:
> On Tue, Mar 04, 2025 at 12:18:28PM -0800, comex wrote:
>>
>>> On Mar 4, 2025, at 11:03 AM, Ralf Jung <post@...fj.de> wrote:
>>>
>>> Those already exist in Rust, albeit only unstably:
>>> <https://doc.rust-lang.org/nightly/std/intrinsics/fn.volatile_copy_memory.html>.
>>> However, I am not sure how you'd even generate such a call in C? The
>>> standard memcpy function is not doing volatile accesses, to my
>>> knowledge.
>>
>> The actual memcpy symbol that exists at runtime is written in
>> assembly, and should be valid to treat as performing volatile
>> accesses.

memcpy is often written in C... and AFAIK compilers understand what that 
function does and will, for instance, happily eliminate the call if they can 
prove that the destination memory is not being read from again. So, it doesn't 
behave like a volatile access at all.
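
To make the distinction concrete in Rust terms, here is a minimal sketch. The function names are made up for illustration and `volatile_bytewise_copy` is not an existing API. It contrasts a plain copy, which the compiler may rewrite or drop, with a loop of volatile byte stores, which it may not elide. Neither of the two is atomic.

```rust
use std::ptr;

/// Plain copy: the compiler understands this operation and may split it,
/// merge it with neighbouring accesses, or remove it entirely if it can
/// prove that `dst` is never read again.
pub fn plain_copy(dst: &mut [u8], src: &[u8]) {
    assert_eq!(dst.len(), src.len());
    // SAFETY: both slices are valid for `src.len()` bytes, and they cannot
    // overlap because `dst` is a unique borrow.
    unsafe { ptr::copy_nonoverlapping(src.as_ptr(), dst.as_mut_ptr(), src.len()) };
}

/// Hypothetical helper: each byte is stored with a volatile write, which the
/// compiler must not elide or reorder relative to other volatile operations.
/// This is still not an atomic operation.
pub fn volatile_bytewise_copy(dst: &mut [u8], src: &[u8]) {
    assert_eq!(dst.len(), src.len());
    for (d, s) in dst.iter_mut().zip(src) {
        // SAFETY: `d` is a valid, aligned, unique reference to a `u8`.
        unsafe { ptr::write_volatile(d, *s) };
    }
}
```

Both functions leave the same bytes in `dst`; the difference is only in what the optimizer is allowed to assume about the stores.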

>> But both GCC and Clang special-case the memcpy function.  For example,
>> if you call memcpy with a small constant as the size, the optimizer
>> will transform the call into one or more regular loads/stores, which
>> can then be optimized mostly like any other loads/stores (except for
>> opting out of alignment and type-based aliasing assumptions).  Even if
>> the call isn’t transformed, the optimizer will still make assumptions.
>> LLVM will automatically mark memcpy `nosync`, which makes it undefined
>> behavior if the function “communicate[s] (synchronize[s]) with another
>> thread”, including through “volatile accesses”. [1]

The question is rather: what do Clang and GCC document / guarantee in a stable
way regarding memcpy? I have not seen any indication so far that a memcpy call
would ever be considered volatile, so we have to treat it like a non-volatile
non-atomic operation.

>> However, these optimizations should rarely trigger misbehavior in
>> practice, so I wouldn’t be surprised if Linux had some code that
>> expected memcpy to act volatile…
>>
> 
> Also in this particular case we are discussing [1], it's a memcpy (from
> or to) a DMA buffer, which means the device can also read or write the
> memory, therefore the content of the memory may be altered outside the
> program (the kernel), so we cannot use copy_nonoverlapping() I believe.
> 
> [1]: https://lore.kernel.org/rust-for-linux/87bjuil15w.fsf@kernel.org/

Is there actually a potential for races (with reads by hardware, not other 
threads) on the memcpy'd memory? Or is this the pattern where you copy some data 
somewhere and then set a flag in an MMIO register to indicate that the data is 
ready and the device can start reading it? In the latter case, the actual data 
copy does not race with anything, so it can be a regular non-atomic non-volatile 
memcpy. The flag write *should* be a release write, and release volatile writes
do not exist, so that is a problem, but it's a separate problem from volatile
memcpy. One can use a release fence followed by a relaxed write instead.
Volatile writes are not currently guaranteed to act like relaxed atomic writes,
but you need that anyway for WRITE_ONCE to make sense, so it seems fine to rely
on that here as well.
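
A rough sketch of that "copy, then ring the doorbell" pattern in Rust follows. `FakeDevice`, `doorbell` and `publish` are hypothetical names, and the "device" is ordinary memory so that the sketch runs in user space. It relies on the volatile write behaving like a relaxed write, as discussed above, and it does not model any architecture-specific barriers that real hardware may additionally require.

```rust
use std::ptr;
use std::sync::atomic::{fence, Ordering};

/// Stand-in for a DMA buffer plus a device "doorbell" register.
/// In a real driver both would be device-visible memory; here they are
/// plain fields (an assumption made purely for illustration).
pub struct FakeDevice {
    pub buf: [u8; 8],
    pub doorbell: u32,
}

/// Hypothetical helper: fill the buffer, then tell the device it is ready.
pub fn publish(dev: &mut FakeDevice, data: &[u8; 8]) {
    // 1. Regular non-atomic, non-volatile copy. Nothing else accesses `buf`
    //    before the doorbell is written, so this copy does not race.
    dev.buf.copy_from_slice(data);

    // 2. Release fence: orders the copy above before the flag write below.
    fence(Ordering::Release);

    // 3. Volatile flag write, assumed here to act like a relaxed write.
    // SAFETY: `&mut dev.doorbell` is a valid, aligned pointer to a `u32`.
    unsafe { ptr::write_volatile(&mut dev.doorbell, 1) };
}
```

The data copy itself stays a plain memcpy-like operation; only the flag write is treated specially.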

Rust should have atomic volatile accesses, and various ideas have been proposed 
over the years, but sadly nobody has shown up to try and push this through.

If the memcpy itself can indeed race, you need an atomic volatile memcpy --
which neither C nor Rust has, though there are proposals for atomic memcpy.
(Arguably, there should be a way to interact with a device using non-volatile
atomics... but anyway in the LKMM, atomics are modeled with volatile, so things
are even more entangled than usual ;)
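
For illustration only, one way to approximate an atomic copy in today's Rust is a per-byte loop of relaxed atomic accesses. The helpers below are hypothetical and are not the API of any of the atomic memcpy proposals. They are atomic per byte but not volatile, and the copied data as a whole can still be torn if the other side writes concurrently.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

/// Hypothetical helper: copy bytes out of memory that may be written
/// concurrently. Each byte is read with a relaxed atomic load, so the racing
/// access is not a data race, but the result may mix old and new bytes.
pub fn racy_read(src: &[AtomicU8], dst: &mut [u8]) {
    assert_eq!(src.len(), dst.len());
    for (d, s) in dst.iter_mut().zip(src) {
        *d = s.load(Ordering::Relaxed);
    }
}

/// Hypothetical helper: copy bytes into memory that may be read concurrently,
/// using one relaxed atomic store per byte.
pub fn racy_write(dst: &[AtomicU8], src: &[u8]) {
    assert_eq!(dst.len(), src.len());
    for (d, s) in dst.iter().zip(src) {
        d.store(*s, Ordering::Relaxed);
    }
}
```

Any ordering with respect to a flag or doorbell would still have to come from fences or stronger orderings around these calls.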

Kind regards,
Ralf