linux-kernel - Re: [PATCH] rust: page: add volatile memory copy methods

Message-Id: <DG32JI45HFKS.29745T7AZGFTV@garyguo.net>
Date: Sat, 31 Jan 2026 20:48:55 +0000
From: "Gary Guo" <gary@...yguo.net>
To: "Andreas Hindborg" <a.hindborg@...nel.org>, "Gary Guo"
 <gary@...yguo.net>, "Boqun Feng" <boqun@...nel.org>
Cc: "Alice Ryhl" <aliceryhl@...gle.com>, "Lorenzo Stoakes"
 <lorenzo.stoakes@...cle.com>, "Liam R. Howlett" <Liam.Howlett@...cle.com>,
 "Miguel Ojeda" <ojeda@...nel.org>, "Boqun Feng" <boqun.feng@...il.com>,
 Björn Roy Baron <bjorn3_gh@...tonmail.com>, "Benno Lossin"
 <lossin@...nel.org>, "Trevor Gross" <tmgross@...ch.edu>, "Danilo Krummrich"
 <dakr@...nel.org>, <linux-mm@...ck.org>, <rust-for-linux@...r.kernel.org>,
 <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] rust: page: add volatile memory copy methods

On Sat Jan 31, 2026 at 8:30 PM GMT, Andreas Hindborg wrote:
> "Gary Guo" <gary@...yguo.net> writes:
>
>> On Sat Jan 31, 2026 at 1:34 PM GMT, Andreas Hindborg wrote:
>>> "Boqun Feng" <boqun@...nel.org> writes:
>>>
>>>> On Fri, Jan 30, 2026 at 01:41:05PM -0800, Boqun Feng wrote:
>>>>> On Fri, Jan 30, 2026 at 05:20:11PM +0100, Andreas Hindborg wrote:
>>>>> [...]
>>>>> > >> In the last discussions we had on this, the conclusion was to use
>>>>> > >> `volatile_copy_memory` whenever that is available, or write a volatile
>>>>> > >> copy function in assembly.
>>>>> > >>
>>>>> > >> Using memcpy_{from,to}io is the latter solution. These functions are
>>>>> > >> simply volatile memcpy implemented in assembly.
>>>>> > >>
>>>>> > >> There is nothing special about MMIO. These functions are named as they
>>>>> > >> are because they are useful for MMIO.
>>>>> > >
>>>>> > > No. MMIO is really special. A few architectures require it to be accessed
>>>>> > > completely differently compared to normal memory. We also have things like
>>>>> > > INDIRECT_IOMEM. memcpy_{from,to}io are special as they use MMIO accessors such as
>>>>> > > readb to perform access on the __iomem pointer. They should not be mixed with
>>>>> > > normal memory. They must be treated as if they're from a completely separate
>>>>> > > address space.
>>>>> > >
>>>>> > > Normal memory vs DMA vs MMIO are all distinct, and this is demonstrated by the
>>>>> > > different types of barriers needed to order things correctly for each type of
>>>>> > > memory region.
>>>>> > >
>>>>> > > Userspace-mapped memory (that is also mapped in the kernel space, not __user) is
>>>>> > > the least special one out of these. They could practically share all atomic infra
>>>>> > > available for the kernel, hence the suggestion of using byte-wise atomic memcpy.
>>>>> >
>>>>> > I see. I did not consider this.
>>>>> >
>>>>> > At any rate, I still don't understand why I need an atomic copy function, or why I
>>>>> > need a byte-wise copy function. A volatile copy function should be fine, no?
>>>>> >
>>>>>
>>>>> but memcpy_{from,to}io() are not just volatile copy functions, they have
>>>>> additional side effects for MMIO ;-)
>>>>>
>>>>
>>>> For example, powerpc's memcpy_fromio() has eieio() in it, which we don't
>>>> need for normal (user -> kernel) memory copy.
>>>
>>> Ok, I see. Thanks for explaining. I was only looking at the x86
>>> implementation, which is of course not enough.
>>>
>>>>
>>>>> > And what is the exact problem in using memcpy_{from,to}io. Looking at
>>>>
>>>> I think the main problem of using memcpy_{from,to}io here is not that
>>>> they are not volatile memcpy (they might be), but it's because we
>>>> wouldn't use them for the same thing in C, because they are designed for
>>>> memory copying between MMIO and kernel memory (RAM).
>>>>
>>>> For MMIO, as Gary mentioned, because they are different than the normal
>>>> memory, special instructions or extra barriers are needed.
>>>
>>> I see, I was not aware.
>>>
>>>>
>>>> DMA memory can almost be treated as external normal memory;
>>>> however, different architectures/systems/platforms may have different
>>>> requirements regarding cache coherence between CPU and devices, so special
>>>> mappings or special instructions may be needed.
>>>
>>> Cache flushing and barriers, got it.
>>>
>>>>
>>>> For __user memory, because the kernel is only given a userspace address, and
>>>> userspace can lie or unmap the address while the kernel is accessing it,
>>>> copy_{from,to}_user() is needed to handle page faults.
>>>
>>> Just to clarify, for my use case, the page is already mapped to kernel
>>> space, and it is guaranteed to be mapped for the duration of the call
>>> where I do the copy. Also, it _may_ be a user page, but it might not
>>> always be the case.
>>
>> In that case you should also assume there might be other kernel-space users.
>> Byte-wise atomic memcpy would be the best tool.
>
> Other concurrent kernel readers/writers would be a kernel bug in my use
> case. We could add this to the safety requirements.
>
>>
>>>
>>>>
>>>> Your use case (copying between userspace-mapped memory and kernel
>>>> memory) is, as Gary said, the least special here. So using
>>>> memcpy_{from,to}io() would be overkill and probably misleading.
>>>
>>> Ok, I understand.
>>>
>>>> I
>>>> suggest we use `{read,write}_volatile()` (unless I'm missing something
>>>> subtle of course), however `{read,write}_volatile()` only works on Sized
>>>> types,
>>>
>>> We can copy as u8? Or would it be more efficient to copy as a larger size?
>>
>> Byte-wise atomic means that the atomicity is restricted to byte level (hence
>> it's okay if you read a u32 with it and do not observe an atomic
>> update). It does not mean that the access needs to be byte-wise, so it's
>> perfectly fine to do a 32-bit load and it'll still be byte-wise atomic.
>
> Ah.
>
>>
>>>
>>> You suggested atomic in the other email, did you abandon that idea?
>>
>> The semantics we want are byte-wise atomic, although as an impl detail, using
>> volatile for now is all that we need.
>>
>>>
>>>> so we may have to use `bindings::memcpy()` or
>>>> core::intrinsics::volatile_copy_memory() [1]
>>>
>>> I was looking at this one, but it is unstable behind `core_intrinsics`.
>>> I was uncertain about pulling in additional unstable features. This is
>>> why I was looking for something in the C kernel to use.
>>>
>>> I think `bindings::memcpy` is not guaranteed to be implemented as inline
>>> assembly, so it may not have volatile semantics?
>>
>> In the absence of cross-language LTO, which is the situation today, it'll be
>> fine (in practice). Unlike C, if you reference a symbol called "memcpy", it
>> won't be treated as special and get turned into non-volatile memcpy.
>>
>> Once the volatile memcpy intrinsic is stable, we can switch to using that.
>
> Got it, this aligns with what Boqun is writing. Let's go for that.
>
> It also looks like memcpy is implemented in assembly for arm, arm32,
> x86_64. Which would exempt it from LTO. Not sure about 32bit x86 though.
> It defers to `__memcpy`. I could not figure out what that resolves to.
> Is it from the compiler?

I think it's the one in arch/x86/include/asm/string_32.h? That is also inline
assembly.

There's no need to worry about whether things can be optimized wrongly. I haven't
looked at the current defence against LTO when the code is implemented in C, but
as Boqun pointed out, the `memcpy` and `memmove` symbols are assumed to have
volatile semantics anyway. So the issue is not unique to Rust (also, we're
immune at the moment as there's no linker-plugin LTO support for Rust).

Ultimately, `volatile_copy_nonoverlapping_memory` is translated to `memcpy`
(similarly, `volatile_copy_memory` is `memmove`). The benefit of the intrinsics
is that if the size is fixed, the copy can be optimized into a single volatile
load/store by LLVM.
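
To illustrate the difference between a variable-length and a fixed-size volatile
copy, here is a minimal userspace sketch on stable Rust. It is not the code from
the patch and not a kernel API: the function names `volatile_copy_bytes` and
`volatile_copy_u32` are hypothetical, and the variable-length version is a plain
per-byte loop rather than a call to `memcpy` or the unstable intrinsics.

```rust
use core::ptr;

/// Copy `len` bytes from `src` to `dst` with one volatile load and one
/// volatile store per byte (hypothetical helper, for illustration only).
///
/// # Safety
/// `src` must be valid for reads and `dst` valid for writes of `len` bytes,
/// and the two ranges must not overlap.
pub unsafe fn volatile_copy_bytes(dst: *mut u8, src: *const u8, len: usize) {
    for i in 0..len {
        // SAFETY: the caller guarantees both ranges cover `len` bytes.
        unsafe {
            let byte = ptr::read_volatile(src.add(i));
            ptr::write_volatile(dst.add(i), byte);
        }
    }
}

/// Fixed-size case: the whole copy is one 32-bit volatile load followed by
/// one 32-bit volatile store (hypothetical helper, for illustration only).
///
/// # Safety
/// Both pointers must be valid and properly aligned for `u32`.
pub unsafe fn volatile_copy_u32(dst: *mut u32, src: *const u32) {
    // SAFETY: the caller guarantees validity and alignment.
    unsafe { ptr::write_volatile(dst, ptr::read_volatile(src)) }
}
```

The per-byte loop works for any length because `read_volatile` and
`write_volatile` only accept Sized types, while the `u32` version shows the
kind of single load/store that a known size makes possible.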

Best,
Gary

>
>
> Best regards,
> Andreas Hindborg