linux-kernel - Re: [PATCH v6 8/9] rust: sync: Add memory barriers

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <371882d2-3c31-4c5f-a12f-22945027ee33@ralfj.de>
Date: Tue, 15 Jul 2025 17:35:47 +0200
From: Ralf Jung <post@...fj.de>
To: Boqun Feng <boqun.feng@...il.com>
Cc: Benno Lossin <lossin@...nel.org>, linux-kernel@...r.kernel.org,
 rust-for-linux@...r.kernel.org, lkmm@...ts.linux.dev,
 linux-arch@...r.kernel.org, Miguel Ojeda <ojeda@...nel.org>,
 Alex Gaynor <alex.gaynor@...il.com>, Gary Guo <gary@...yguo.net>,
 Björn Roy Baron <bjorn3_gh@...tonmail.com>,
 Andreas Hindborg <a.hindborg@...nel.org>, Alice Ryhl <aliceryhl@...gle.com>,
 Trevor Gross <tmgross@...ch.edu>, Danilo Krummrich <dakr@...nel.org>,
 Will Deacon <will@...nel.org>, Peter Zijlstra <peterz@...radead.org>,
 Mark Rutland <mark.rutland@....com>,
 Wedson Almeida Filho <wedsonaf@...il.com>,
 Viresh Kumar <viresh.kumar@...aro.org>, Lyude Paul <lyude@...hat.com>,
 Ingo Molnar <mingo@...nel.org>, Mitchell Levy <levymitchell0@...il.com>,
 "Paul E. McKenney" <paulmck@...nel.org>,
 Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
 Linus Torvalds <torvalds@...ux-foundation.org>,
 Thomas Gleixner <tglx@...utronix.de>, Alan Stern <stern@...land.harvard.edu>
Subject: Re: [PATCH v6 8/9] rust: sync: Add memory barriers

Hi all,

On 15.07.25 17:21, Boqun Feng wrote:
> On Mon, Jul 14, 2025 at 05:42:39PM +0200, Ralf Jung wrote:
>> Hi all,
>>
>> On 11.07.25 20:20, Boqun Feng wrote:
>>> On Fri, Jul 11, 2025 at 10:57:48AM +0200, Benno Lossin wrote:
>>>> On Thu Jul 10, 2025 at 8:00 AM CEST, Boqun Feng wrote:
>>>>> diff --git a/rust/kernel/sync/barrier.rs b/rust/kernel/sync/barrier.rs
>>>>> new file mode 100644
>>>>> index 000000000000..df4015221503
>>>>> --- /dev/null
>>>>> +++ b/rust/kernel/sync/barrier.rs
>>>>> @@ -0,0 +1,65 @@
>>>>> +// SPDX-License-Identifier: GPL-2.0
>>>>> +
>>>>> +//! Memory barriers.
>>>>> +//!
>>>>> +//! These primitives have the same semantics as their C counterparts: and the precise definitions
>>>>> +//! of semantics can be found at [`LKMM`].
>>>>> +//!
>>>>> +//! [`LKMM`]: srctree/tools/memory-model/
>>>>> +
>>>>> +/// A compiler barrier.
>>>>> +///
>>>>> +/// A barrier that prevents compiler from reordering memory accesses across the barrier.
>>>>> +pub(crate) fn barrier() {
>>>>> +    // By default, Rust inline asms are treated as being able to access any memory or flags, hence
>>>>> +    // it suffices as a compiler barrier.
>>>>
>>>> I don't know about this, but it also isn't my area of expertise... I
>>>> think I heard Ralf talk about this at Rust Week, but I don't remember...
>>>>
>>>
>>> Easy, let's Cc Ralf ;-)
>>>
>>> Ralf, I believe the question here is:
>>>
>>> In kernel C, we define a compiler barrier (barrier()), which is
>>> implemented as:
>>>
>>> # define barrier() __asm__ __volatile__("": : :"memory")
>>>
>>> Now we want to have a Rust version, and I think an empty `asm!()` should
>>> be enough as an equivalent as a barrier() in C, because an empty
>>> `asm!()` in Rust implies "memory" as the clobber:
>>>
>>> 	https://godbolt.org/z/3z3fnWYjs
>>>
>>> ?
>>>
>>> I know you have some opinions on C++ compiler_fence() [1]. But in LKMM,
>>> barrier() and other barriers work for all memory accesses not just
>>> atomics, so the problem "So, if your program contains no atomic
>>> accesses, but some atomic fences, those fences do nothing." doesn't
>>> exist for us. And our barrier() is strictly weaker than other barriers.
>>>
>>> And based on my understanding of the consensus on Rust vs LKMM, "do
>>> whatever kernel C does and rely on whatever kernel C relies" is the
>>> general suggestion, so I think an empty `asm!()` works here. Of course
>>> if in practice, we find an issue, I'm happy to look for solutions ;-)
>>>
>>> Thoughts?
>>>
>>> [1]: https://github.com/rust-lang/unsafe-code-guidelines/issues/347
>>
>> If I understood correctly, this is about using "compiler barriers" to order
>> volatile accesses that the LKMM uses in lieu of atomic accesses?
>> I can't give a principled answer here, unfortunately -- as you know, the
>> mapping of LKMM through the compiler isn't really in a state where we can
>> make principled formal statements. And making principled formal statements
>> is my main expertise so I am a bit out of my depth here. ;)
>>
> 
> Understood ;-)
> 
>> So I agree with your 2nd paragraph: I would say just like the fact that you
>> are using volatile accesses in the first place, this falls under "do
>> whatever the C code does, it shouldn't be any more broken in Rust than it is
>> in C".
>>
>> However, saying that it in general "prevents reordering all memory accesses"
>> is unlikely to be fully correct -- if the compiler can prove that the inline
>> asm block could not possibly have access to a local variable (e.g. because
>> it never had its address taken), its accesses can still be reordered. This
>> applies both to C compilers and Rust compilers. Extra annotations such as
>> `noalias` (or `restrict` in C) can also give rise to reorderings around
>> arbitrary code, including such barriers. This is not a problem for
>> concurrent code since it would anyway be wrong to claim that some pointer
>> doesn't have aliases when it is accessed by multiple threads, but it shows
> 
> Right, it shouldn't be a problem for most of the concurrent code, and
> thank you for bringing this up. I believe we can rely on the barrier
> behavior if the memory accesses on both sides are done via aliased
> references/pointers, which should be the same as C code relies on.
> 
> One thing though is we don't use much of `restrict` in kernel C, so I
> wonder the compiler's behavior in the following code:
> 
>      let mut x = KBox::new_uninit(GFP_KERNEL)?;
>      // ^ KBox is our own Box implementation based on kmalloc(), and it
>      // accepts a flag in new*() functions for different allocation
>      // behavior (can sleep or not, etc), of course we want it to behave
>      // like an std Box in term of aliasing.
> 
>      let x = KBox::write(x, foo); // A
> 
>      smp_mb():
>        // using Rust asm!() for explanation, it's really implemented in
>        // C.
>        asm!("mfence");
> 
>      let a: &Atomic<*mut Foo> = ...; // `a` was null initially.
> 
>      a.store(KBox::into_raw(x), Relaxed); // B
> 
> Now we obviously want A and B to be ordered, because smp_mb() is
> supposed to be stronger than Release ordering. So if another thread does
> an Acquire read or uses address dependency:
> 
>      let a: &Atomic<*mut Foo> = ...;
>      let foo_ptr = a.load(Acquire); // or load(Relaxed);
> 
>      if !foo_ptr.is_null() {
>          let y: KBox<Foo> = unsafe { KBox::from_raw(foo_ptr) };
> 	// ^ this should be safe.
>      }
> 
> Is it something Rust AM could guarantee?

If we pretend these are normal Rust atomics, and we look at the acquire read, 
then yeah that should work -- the asm block can act like a release fence. With 
the LKMM, it's not a "guarantee" in the same sense any more since it lacks the 
formal foundations, but "it shouldn't be worse than in C".

The Rust/C/C++ memory models do not allow that last example with a relaxed load 
and an address dependency. In C/C++ this requires "consume", which Rust doesn't 
have (and which clang treats as "acquire" -- and GCC does the same, IIRC), and 
which nobody figured out how to properly integrate into any of these languages. 
I will refrain from making any definite statements for the LKMM here. ;)

Kind regards,
Ralf

> I think it makes no difference
> than 1) allocating some normal memory for DMA; 2) writing to the normal
> memory; 3) issuing some io barrier instructions to make sure the device
> will see the writes in step 2; 4) doing some MMIO to notify the
> device for a DMA read. Therefore reordering of A and B by compiler will
> be problematic.
> 
> Regards,
> Boqun
> 
>> that the framing of barriers in terms of preventing reordering of accesses
>> is too imprecise. That's why the C++ memory model uses a very different
>> framing, and that's why I can't give a definite answer here. :)
>>
>> Kind regards,
>> Ralf
>>