Message-ID: <87ikccpkkm.fsf@t14s.mail-host-address-is-not-set>
Date: Wed, 04 Feb 2026 16:58:17 +0100
From: Andreas Hindborg <a.hindborg@...nel.org>
To: Alice Ryhl <aliceryhl@...gle.com>
Cc: Boqun Feng <boqun@...nel.org>, Gary Guo <gary@...yguo.net>,
    Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
    "Liam R. Howlett" <Liam.Howlett@...cle.com>,
    Miguel Ojeda <ojeda@...nel.org>, Boqun Feng <boqun.feng@...il.com>,
    Björn Roy Baron <bjorn3_gh@...tonmail.com>,
    Benno Lossin <lossin@...nel.org>, Trevor Gross <tmgross@...ch.edu>,
    Danilo Krummrich <dakr@...nel.org>, linux-mm@...ck.org,
    rust-for-linux@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] rust: page: add volatile memory copy methods
"Alice Ryhl" <aliceryhl@...gle.com> writes:
> On Wed, Feb 04, 2026 at 02:16:37PM +0100, Andreas Hindborg wrote:
>> Boqun Feng <boqun@...nel.org> writes:
>>
>> > On Sat, Jan 31, 2026 at 10:31:13PM +0100, Andreas Hindborg wrote:
>> > [...]
>> >> >>>>
>> >> >>>> For __user memory, because kernel is only given a userspace address, and
>> >> >>>> userspace can lie or unmap the address while kernel accessing it,
>> >> >>>> copy_{from,to}_user() is needed to handle page faults.
>> >> >>>
>> >> >>> Just to clarify, for my use case, the page is already mapped to kernel
>> >> >>> space, and it is guaranteed to be mapped for the duration of the call
>> >> >>> where I do the copy. Also, it _may_ be a user page, but that might not
>> >> >>> always be the case.
>> >> >>
>> >> >> In that case you should also assume there might be other kernel-space users.
>> >> >> Byte-wise atomic memcpy would be the best tool.
>> >> >
>> >> > Other concurrent kernel readers/writers would be a kernel bug in my use
>> >> > case. We could add this to the safety requirements.
>> >> >
>> >>
>> >> Actually, one case just crossed my mind. I think nothing will prevent a
>> >> user space process from concurrently submitting multiple reads to the
>> >> same user page. It would not make sense, but it can be done.
>> >>
>> >> If the reads are issued to different null block devices, the null block
>> >> driver might write to the user page concurrently while servicing the IO
>> >> requests.
>> >>
>> >> The same situation would happen with real block device drivers, except
>> >> the writes would be done by DMA engines rather than kernel threads.
>> >>
>> >
>> > Then we had better use byte-wise atomic memcpy, and I think for all the
>> > architectures that the Linux kernel supports, memcpy() is in fact
>> > byte-wise atomic if it's volatile. Because at the level of the actual
>> > instructions, either a byte-size read/write is used, or a larger-size
>> > read/write is used but it is guaranteed to be byte-wise atomic even for
>> > unaligned reads or writes. So "volatile memcpy" and "volatile byte-wise
>> > atomic memcpy" have the same implementation.
>> >
>> > (The C++ paper [1] also says: "In fact, we expect that existing assembly
>> > memcpy implementations will suffice when suffixed with the required
>> > fence.")
>> >
>> > So to make things move forward, do you mind introducing an
>> > `atomic_per_byte_memcpy()` in rust::sync::atomic based on
>> > bindings::memcpy(), and cc'ing linux-arch and all the archs that support
>> > Rust for some confirmation? Thanks!
>>
>> There are a few things I do not fully understand:
>>
>> - Does the operation need to be both atomic and volatile, or is atomic enough on its
>> own (why)?
>> - The article you reference has separate `atomic_load_per_byte_memcpy`
>>   and `atomic_store_per_byte_memcpy`, which allow inserting an acquire
>>   fence after the load and a release fence before the store. Do we not
>>   need that?
>
> We can just make both src and dst into per-byte atomics. We don't really
> lose anything from it. Technically we're performing unnecessary atomic
> ops on one side, but who cares?
OK.
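
Just to make sure we mean the same thing, here is a minimal model of
the semantics I have in mind. The name and the per-byte loop are
placeholders, not a proposed implementation; the real thing would
presumably forward to bindings::memcpy(), relying on the guarantee you
mention that byte-sized accesses are atomic on all architectures the
kernel supports:

/// Copy `len` bytes from `src` to `dst` one byte at a time, using
/// volatile reads and writes.
pub unsafe fn volatile_per_byte_memcpy(dst: *mut u8, src: *const u8, len: usize) {
    for i in 0..len {
        // SAFETY: The caller guarantees that the source region is
        // valid for `len` bytes of volatile reads.
        let byte = unsafe { core::ptr::read_volatile(src.add(i)) };
        // SAFETY: The caller guarantees that the destination region
        // is valid for `len` bytes of volatile writes.
        unsafe { core::ptr::write_volatile(dst.add(i), byte) };
    }
}
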
>
>> - It is unclear to me how to formulate the safety requirements for
>> `atomic_per_byte_memcpy`. In this series, one end of the operation is
>> the potentially racy area. For `atomic_per_byte_memcpy` it could be
>> either end (or both?). Do we even mention an area being "outside the
>> Rust AM"?
>>
>> First attempt below. I am quite uncertain about this. I feel like we
>> have two things going on: potential races with other kernel threads,
>> which we solve by saying all accesses are byte-wise atomic, and races
>> with user space processes, which we solve with volatile semantics?
>>
>> Should the function name be `volatile_atomic_per_byte_memcpy`?
>>
>> /// Copy `len` bytes from `src` to `dst` using byte-wise atomic operations.
>> ///
>> /// This copy operation is volatile.
>> ///
>> /// # Safety
>> ///
>> /// Callers must ensure that:
>> ///
>> /// * The source memory region is readable and reading from the region will not trap.
>> /// * The destination memory region is writable and writing to the region will not trap.
>
> Ok.
>
>> /// * No references exist to the source or destination regions.
>
> You can omit this requirement. Creating references has safety
> requirements, and if such references exist, you're already violating
> the safety requirements of creating a reference, so you do not need to
> repeat it here.
Cool.
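
Right. For the record, something like the following would already be UB
on its own if user space or another thread writes the region
concurrently, independent of anything this helper does (`src` here is a
hypothetical pointer into a shared page):

    // UB: the shared reference asserts the byte is not mutated for
    // the reference's lifetime.
    let byte: &u8 = unsafe { &*src };
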
>
>> /// * If the source or destination region is within the Rust AM, any concurrent reads or writes to
>> ///   that region by the Rust AM must use byte-wise atomic operations.
>
> Unless you need to support memory outside the Rust AM, we can drop this.
I need to support pages that are concurrently mapped into user
processes. I think we decided these pages are outside the Rust AM, as
long as we only do non-reference volatile IO operations on them from
kernel space.

They are different from pages that are never mapped to user space, in
the sense that they can incur concurrent reads/writes from the user
space process, and we cannot require any kind of atomicity for those
reads/writes.
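
Putting both of your points together, I would then keep the
documentation roughly like this (still with a placeholder name), with
the reference bullet dropped and the Rust AM bullet kept:

/// Copy `len` bytes from `src` to `dst` using byte-wise atomic
/// volatile operations.
///
/// # Safety
///
/// Callers must ensure that:
///
/// * The source memory region is readable and reading from the region will not trap.
/// * The destination memory region is writable and writing to the region will not trap.
/// * If the source or destination region is within the Rust AM, any concurrent reads or writes to
///   that region by the Rust AM must use byte-wise atomic operations.
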
Best regards,
Andreas Hindborg