Message-ID: <aYNOLt_kj6buuhH9@google.com>
Date: Wed, 4 Feb 2026 13:48:30 +0000
From: Alice Ryhl <aliceryhl@...gle.com>
To: Andreas Hindborg <a.hindborg@...nel.org>
Cc: Boqun Feng <boqun@...nel.org>, Gary Guo <gary@...yguo.net>,
Lorenzo Stoakes <lorenzo.stoakes@...cle.com>, "Liam R. Howlett" <Liam.Howlett@...cle.com>,
Miguel Ojeda <ojeda@...nel.org>, Boqun Feng <boqun.feng@...il.com>,
"Björn Roy Baron" <bjorn3_gh@...tonmail.com>, Benno Lossin <lossin@...nel.org>,
Trevor Gross <tmgross@...ch.edu>, Danilo Krummrich <dakr@...nel.org>, linux-mm@...ck.org,
rust-for-linux@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] rust: page: add volatile memory copy methods
On Wed, Feb 04, 2026 at 02:16:37PM +0100, Andreas Hindborg wrote:
> Boqun Feng <boqun@...nel.org> writes:
>
> > On Sat, Jan 31, 2026 at 10:31:13PM +0100, Andreas Hindborg wrote:
> > [...]
> >> >>>>
> >> >>>> For __user memory, because the kernel is only given a userspace address,
> >> >>>> and userspace can lie or unmap the address while the kernel is accessing
> >> >>>> it, copy_{from,to}_user() is needed to handle page faults.
> >> >>>
> >> >>> Just to clarify, for my use case, the page is already mapped to kernel
> >> >>> space, and it is guaranteed to be mapped for the duration of the call
> >> >>> where I do the copy. Also, it _may_ be a user page, but that might not
> >> >>> always be the case.
> >> >>
> >> >> In that case you should also assume there might be other kernel-space users.
> >> >> A byte-wise atomic memcpy would be the best tool.
> >> >
> >> > Other concurrent kernel readers/writers would be a kernel bug in my use
> >> > case. We could add this to the safety requirements.
> >> >
> >>
> >> Actually, one case just crossed my mind. I think nothing will prevent a
> >> user space process from concurrently submitting multiple reads to the
> >> same user page. It would not make sense, but it can be done.
> >>
> >> If the reads are issued to different null block devices, the null block
> >> driver might write the user page concurrently when servicing each IO
> >> request.
> >>
> >> The same situation would happen in real block device drivers, except the
> >> writes would be done by DMA engines rather than kernel threads.
> >>
> >
> > Then we'd better use byte-wise atomic memcpy, and I think for all the
> > architectures that the Linux kernel supports, memcpy() is in fact byte-wise
> > atomic if it's volatile. Because down at the actual instructions, either a
> > byte-sized read/write is used, or a larger-sized read/write is used, but
> > either way it is guaranteed to be byte-wise atomic even for unaligned reads
> > or writes. So "volatile memcpy" and "volatile byte-wise atomic memcpy" have
> > the same implementation.
> >
> > (The C++ paper [1] also says: "In fact, we expect that existing assembly
> > memcpy implementations will suffice when suffixed with the required
> > fence.")
> >
> > So to make things move forward, do you mind introducing an
> > `atomic_per_byte_memcpy()` in rust::sync::atomic based on
> > bindings::memcpy(), and cc'ing linux-arch and all the archs that support
> > Rust for some confirmation? Thanks!
>
> There are a few things I do not fully understand:
>
> - Does the operation need to be both atomic and volatile, or is atomic enough on its
> own (why)?
> - The article you reference has separate `atomic_load_per_byte_memcpy`
>   and `atomic_store_per_byte_memcpy`, which allow inserting an acquire
>   fence before the load and a release fence after the store. Do we not
> need that?
We can just make both src and dst into per-byte atomics. We don't really
lose anything from it. Technically we're performing unnecessary atomic
ops on one side, but who cares?
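
For the implementation, Boqun's suggestion of wrapping bindings::memcpy()
should be enough on the architectures we support. Just to illustrate the
semantics we would be documenting, something like this (a sketch in plain
core Rust with a made-up name, not the kernel atomic API, and not what the
real implementation would look like):

use core::sync::atomic::{AtomicU8, Ordering};

/// Illustration only: copy `len` bytes from `src` to `dst`, treating
/// every byte on both sides as a relaxed atomic.
///
/// # Safety
///
/// `src` must be valid for `len` byte reads and `dst` must be valid for
/// `len` byte writes. Any concurrent access to either region must also
/// be byte-wise atomic.
unsafe fn per_byte_atomic_memcpy(dst: *mut u8, src: *const u8, len: usize) {
    for i in 0..len {
        // Both the source and the destination byte are accessed
        // atomically, so a concurrent byte-wise atomic access on either
        // side is not UB; the result is just some byte-level mix of old
        // and new data.
        let s = unsafe { &*src.add(i).cast::<AtomicU8>() };
        let d = unsafe { &*dst.add(i).cast::<AtomicU8>() };
        d.store(s.load(Ordering::Relaxed), Ordering::Relaxed);
    }
}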
> - It is unclear to me how to formulate the safety requirements for
>   `atomic_per_byte_memcpy`. In this series, one end of the operation is
>   the potentially racy area. For `atomic_per_byte_memcpy` it could be
>   either end (or both?). Do we even mention an area being "outside the
>   Rust AM"?
>
> First attempt below. I am quite uncertain about this. I feel like we
> have two things going on: Potential races with other kernel threads,
> which we solve by saying all accesses are byte-wise atomic, and races
> with user space processes, which we solve with volatile semantics?
>
> Should the function name be `volatile_atomic_per_byte_memcpy`?
>
> /// Copy `len` bytes from `src` to `dst` using byte-wise atomic operations.
> ///
> /// This copy operation is volatile.
> ///
> /// # Safety
> ///
> /// Callers must ensure that:
> ///
> /// * The source memory region is readable and reading from the region will not trap.
> /// * The destination memory region is writable and writing to the region will not trap.
Ok.
> /// * No references exist to the source or destination regions.
You can omit this requirement. Creating references has its own safety
requirements, and if such references exist, whoever created them is
already violating those requirements, so you do not need to repeat them
here.
> /// * If the source or destination region is within the Rust AM, any concurrent reads or writes to
> /// source or destination memory regions by the Rust AM must use byte-wise atomic operations.
Unless you need to support memory outside the Rust AM, we can drop this.
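
For illustration, after dropping those two bullets the whole thing could
end up as just the following (the name is a placeholder pending your
naming question, and the body follows Boqun's bindings::memcpy()
suggestion):

/// Copy `len` bytes from `src` to `dst` using byte-wise atomic operations.
///
/// # Safety
///
/// Callers must ensure that:
///
/// * The source memory region is readable and reading from the region will not trap.
/// * The destination memory region is writable and writing to the region will not trap.
pub unsafe fn atomic_per_byte_memcpy(dst: *mut u8, src: *const u8, len: usize) {
    // SAFETY: Per the requirements above, both regions are valid for
    // `len` bytes, and on the architectures the kernel supports memcpy()
    // does not tear below byte granularity.
    unsafe { bindings::memcpy(dst.cast(), src.cast(), len as _) };
}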
Alice