linux-kernel - Re: [PATCH] rust: page: add volatile memory copy methods

Message-ID: <87ldh8ps22.fsf@t14s.mail-host-address-is-not-set>
Date: Wed, 04 Feb 2026 14:16:37 +0100
From: Andreas Hindborg <a.hindborg@...nel.org>
To: Boqun Feng <boqun@...nel.org>
Cc: Gary Guo <gary@...yguo.net>, Alice Ryhl <aliceryhl@...gle.com>, Lorenzo
 Stoakes <lorenzo.stoakes@...cle.com>, "Liam R. Howlett"
 <Liam.Howlett@...cle.com>, Miguel Ojeda <ojeda@...nel.org>, Boqun Feng
 <boqun.feng@...il.com>, Björn Roy Baron
 <bjorn3_gh@...tonmail.com>, Benno
 Lossin <lossin@...nel.org>, Trevor Gross <tmgross@...ch.edu>, Danilo
 Krummrich <dakr@...nel.org>, linux-mm@...ck.org,
 rust-for-linux@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] rust: page: add volatile memory copy methods

Boqun Feng <boqun@...nel.org> writes:

> On Sat, Jan 31, 2026 at 10:31:13PM +0100, Andreas Hindborg wrote:
> [...]
>> >>>>
>> >>>> For __user memory, because kernel is only given a userspace address, and
>> >>>> userspace can lie or unmap the address while kernel accessing it,
>> >>>> copy_{from,to}_user() is needed to handle page faults.
>> >>>
>> >>> Just to clarify, for my use case, the page is already mapped to kernel
>> >>> space, and it is guaranteed to be mapped for the duration of the call
>> >>> where I do the copy. Also, it _may_ be a user page, but it might not
>> >>> always be the case.
>> >>
>> >> In that case you should also assume there might be other kernel-space users.
>> >> Byte-wise atomic memcpy would be best tool.
>> >
>> > Other concurrent kernel readers/writers would be a kernel bug in my use
>> > case. We could add this to the safety requirements.
>> >
>> 
>> Actually, one case just crossed my mind. I think nothing will prevent a
>> user space process from concurrently submitting multiple reads to the
>> same user page. It would not make sense, but it can be done.
>> 
>> If the reads are issued to different null block devices, the null block
>> driver might concurrently write the user page when servicing each IO
>> request concurrently.
>> 
>> The same situation would happen in real block device drivers, except the
>> writes would be done by dma engines rather than kernel threads.
>> 
>
> Then we better use byte-wise atomic memcpy, and I think for all the
> architectures that Linux kernel support, memcpy() is in fact byte-wise
> atomic if it's volatile. Because down the actual instructions, either a
> byte-size read/write is used, or a larger-size read/write is used but
> they are guaranteed to be byte-wise atomic even for unaligned read or
> write. So "volatile memcpy" and "volatile byte-wise atomic memcpy" have
> the same implementation.
>
> (The C++ paper [1] also says: "In fact, we expect that existing assembly
> memcpy implementations will suffice when suffixed with the required
> fence.")
>
> So to make thing move forward, do you mind to introduce a
> `atomic_per_byte_memcpy()` in rust::sync::atomic based on
> bindings::memcpy(), and cc linux-arch and all the archs that support
> Rust for some confirmation? Thanks!

There are a few things I do not fully understand:

 - Does the operation need to be both atomic and volatile, or is atomic enough on its
   own (why)?
 - The article you reference has separate `atomic_load_per_byte_memcpy`
   and `atomic_store_per_byte_memcpy` functions that take a memory
   order, so that an acquire fence can follow the load and a release
   fence can precede the store. Do we not need that?
 - It is unclear to me how to formulate the safety requirements for
   `atomic_per_byte_memcpy`. In this series, one end of the operation is
   the potential racy area. For `atomic_per_byte_memcpy` it could be
   either end (or both?). Do we even mention an area being "outside the
   Rust AM"?
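
To make the second question concrete, here is a minimal user-space
sketch of the load/store split. It uses the std atomics, not anything
from the kernel crate. The function names are borrowed from the paper;
the (src, dst, len) argument order and the handling of the `order`
parameter are my assumptions, and the copy is not volatile.

```rust
use std::sync::atomic::{fence, AtomicU8, Ordering};

/// Hypothetical sketch: copy `len` bytes out of a possibly racy `src`.
///
/// # Safety
///
/// `src` must be valid for reads and `dst` valid for writes of `len` bytes,
/// `dst` must not be accessed concurrently, and any concurrent access to
/// `src` must be atomic.
pub unsafe fn atomic_load_per_byte_memcpy(
    src: *const u8,
    dst: *mut u8,
    len: usize,
    order: Ordering,
) {
    for i in 0..len {
        // SAFETY: `AtomicU8` has the same size and alignment as `u8`, and the
        // caller guarantees that `src.add(i)` is valid for reads.
        let byte = unsafe { &*src.add(i).cast::<AtomicU8>() }.load(Ordering::Relaxed);
        // SAFETY: The caller guarantees that `dst.add(i)` is valid for writes.
        unsafe { dst.add(i).write(byte) };
    }
    // The fence goes after the relaxed loads. `fence` panics on `Relaxed`.
    if order != Ordering::Relaxed {
        fence(order);
    }
}

/// Hypothetical sketch: copy `len` bytes into a possibly racy `dst`.
///
/// # Safety
///
/// `src` must be valid for reads and `dst` valid for writes of `len` bytes,
/// `src` must not be written concurrently, and any concurrent access to
/// `dst` must be atomic.
pub unsafe fn atomic_store_per_byte_memcpy(
    src: *const u8,
    dst: *mut u8,
    len: usize,
    order: Ordering,
) {
    // The fence goes before the relaxed stores. `fence` panics on `Relaxed`.
    if order != Ordering::Relaxed {
        fence(order);
    }
    for i in 0..len {
        // SAFETY: The caller guarantees that `src.add(i)` is valid for reads.
        let byte = unsafe { src.add(i).read() };
        // SAFETY: `AtomicU8` has the same size and alignment as `u8`, and the
        // caller guarantees that `dst.add(i)` is valid for writes.
        unsafe { &*dst.add(i).cast::<AtomicU8>() }.store(byte, Ordering::Relaxed);
    }
}
```

With `Ordering::Relaxed` both functions reduce to the plain per-byte
copy, which as far as I can tell is what a single
`atomic_per_byte_memcpy` would give us.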

First attempt below. I am quite uncertain about this. I feel like we
have two things going on: potential races with other kernel threads,
which we solve by saying all accesses are byte-wise atomic, and races
with user space processes, which we solve with volatile semantics?

Should the function name be `volatile_atomic_per_byte_memcpy`?

/// Copy `len` bytes from `src` to `dst` using byte-wise atomic operations.
///
/// This copy operation is volatile.
///
/// # Safety
///
/// Callers must ensure that:
///
/// * The source memory region is readable and reading from the region will not trap.
/// * The destination memory region is writable and writing to the region will not trap.
/// * No references exist to the source or destination regions.
/// * If the source or destination region is within the Rust AM, any concurrent reads or writes to
///   source or destination memory regions by the Rust AM must use byte-wise atomic operations.
pub unsafe fn atomic_per_byte_memcpy(src: *const u8, dst: *mut u8, len: usize) {
    // SAFETY: By the safety requirements of this function, the following operation will not:
    //  - Trap.
    //  - Invalidate any reference invariants.
    //  - Race with any operation by the Rust AM, as `bindings::memcpy` is a byte-wise atomic
    //    operation and all operations by the Rust AM use byte-wise atomic semantics.
    //
    //  Further, as `bindings::memcpy` is a volatile operation, the operation will not race with any
    //  read or write operation to the source or destination area if the area can be considered to
    //  be outside the Rust AM.
    unsafe {
        bindings::memcpy(
            dst.cast::<kernel::ffi::c_void>(),
            src.cast::<kernel::ffi::c_void>(),
            len,
        )
    };
}
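
For comparison, the volatile half of the question can be sketched in
plain Rust with `read_volatile` and `write_volatile`. This is only a
user-space illustration with a hypothetical name, not a proposal for
the kernel crate. Volatile accesses are not atomic as far as the Rust
AM is concerned, so on its own this would still be a data race if
another Rust thread touched the same bytes non-atomically.

```rust
use std::ptr;

/// Hypothetical sketch: copy `len` bytes one at a time with volatile accesses.
///
/// # Safety
///
/// `src` must be valid for reads and `dst` valid for writes of `len` bytes,
/// and no other access within the Rust AM may race with this copy.
pub unsafe fn volatile_per_byte_memcpy(src: *const u8, dst: *mut u8, len: usize) {
    for i in 0..len {
        // SAFETY: The caller guarantees that `src.add(i)` is valid for reads.
        let byte = unsafe { ptr::read_volatile(src.add(i)) };
        // SAFETY: The caller guarantees that `dst.add(i)` is valid for writes.
        unsafe { ptr::write_volatile(dst.add(i), byte) };
    }
}
```

The compiler may not elide or merge these accesses, but nothing here
gives them atomic semantics, which is why I am unsure whether the name
should mention both properties.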


Best regards,
Andreas Hindborg