[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <d2d1f25c-780d-4103-aeb1-461adc4940c3@lucifer.local>
Date: Thu, 21 Nov 2024 12:50:23 +0000
From: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
To: Alice Ryhl <aliceryhl@...gle.com>
Cc: Miguel Ojeda <ojeda@...nel.org>, Matthew Wilcox <willy@...radead.org>,
Vlastimil Babka <vbabka@...e.cz>, John Hubbard <jhubbard@...dia.com>,
"Liam R. Howlett" <Liam.Howlett@...cle.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Arnd Bergmann <arnd@...db.de>, Christian Brauner <brauner@...nel.org>,
Alex Gaynor <alex.gaynor@...il.com>, Boqun Feng <boqun.feng@...il.com>,
Gary Guo <gary@...yguo.net>,
Björn Roy Baron <bjorn3_gh@...tonmail.com>,
Benno Lossin <benno.lossin@...ton.me>, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, rust-for-linux@...r.kernel.org,
Andreas Hindborg <a.hindborg@...nel.org>
Subject: Re: [PATCH v8 2/7] mm: rust: add vm_area_struct methods that require
read access
On Thu, Nov 21, 2024 at 11:23:39AM +0100, Alice Ryhl wrote:
> On Wed, Nov 20, 2024 at 8:07 PM Lorenzo Stoakes
> <lorenzo.stoakes@...cle.com> wrote:
> >
> > On Wed, Nov 20, 2024 at 02:49:56PM +0000, Alice Ryhl wrote:
> > > This adds a type called VmAreaRef which is used when referencing a vma
> > > that you have read access to. Here, read access means that you hold
> > > either the mmap read lock or the vma read lock (or stronger).
> >
> > This is pure nit and not important but...
> >
> > I know we screwed up naming in mm, with the god awful vm_area_struct (we
> > abbreviate, for 2 letters, get bored, stop abbreviating, but throw in a
> > struct for a laugh) or 'VMA', but I wonder if this would be better as
> > VmaRef? Then again that's kinda terrible too.
> >
> > Sorry about that. I mean this isn't hugely important + ultimately _our
> > fault_.
> >
> > VirtualMemoryAreaRef is worse... VirtMemAreaRef? OK. Maybe VMAreaRef is the
> > least bad...
>
> Happy to use whichever name you prefer. :)
Let's leave it as-is, this is kinda our fault* in mm :)
* pun intended
>
> > > Additionally, a vma_lookup method is added to the mmap read guard, which
> > > enables you to obtain a &VmAreaRef in safe Rust code.
> > >
> > > Signed-off-by: Alice Ryhl <aliceryhl@...gle.com>
Other than trivial doc/comment tweaks this looks fine to me from mm
perspective so:
Acked-by: Lorenzo Stoakes <lorenzo.stoakes@...cle.com> (for mm bits)
> > > ---
> > > rust/helpers/mm.c | 6 ++
> > > rust/kernel/mm.rs | 21 ++++++
> > > rust/kernel/mm/virt.rs | 171 +++++++++++++++++++++++++++++++++++++++++++++++++
> > > 3 files changed, 198 insertions(+)
> > >
> > > diff --git a/rust/helpers/mm.c b/rust/helpers/mm.c
> > > index 7201747a5d31..7b72eb065a3e 100644
> > > --- a/rust/helpers/mm.c
> > > +++ b/rust/helpers/mm.c
> > > @@ -37,3 +37,9 @@ void rust_helper_mmap_read_unlock(struct mm_struct *mm)
> > > {
> > > mmap_read_unlock(mm);
> > > }
> > > +
> > > +struct vm_area_struct *rust_helper_vma_lookup(struct mm_struct *mm,
> > > + unsigned long addr)
> > > +{
> > > + return vma_lookup(mm, addr);
> > > +}
> >
> > I wonder whether we want to abstract some of the VMA iterator stuff,
> > because it's inefficient to look up VMAs by doing this each time if you are
> > looking at, for instance, adjacent VMAs.
> >
> > But perhaps reasonable to defer doing that to later work?
>
> Yeah, that should probably be deferred. Binder only has one vma per
> mm, so iteration is not useful.
Sure, no problem!
>
> > > diff --git a/rust/kernel/mm.rs b/rust/kernel/mm.rs
> > > index 84cba581edaa..ace8e7d57afe 100644
> > > --- a/rust/kernel/mm.rs
> > > +++ b/rust/kernel/mm.rs
> > > @@ -12,6 +12,8 @@
> > > };
> > > use core::{ops::Deref, ptr::NonNull};
> > >
> > > +pub mod virt;
> > > +
> > > /// A wrapper for the kernel's `struct mm_struct`.
> > > ///
> > > /// Since `mm_users` may be zero, the associated address space may not exist anymore. You can use
> > > @@ -210,6 +212,25 @@ pub struct MmapReadGuard<'a> {
> > > _nts: NotThreadSafe,
> > > }
> > >
> > > +impl<'a> MmapReadGuard<'a> {
> > > + /// Look up a vma at the given address.
> > > + #[inline]
> > > + pub fn vma_lookup(&self, vma_addr: usize) -> Option<&virt::VmAreaRef> {
> >
> > Kind of lovely to have functions under the read guard and then knowing that
> > hey - we must hold the read lock to be able to do this.
> >
> > Imagine, a programming language with actual types... :P
>
> :)
>
> > > + // SAFETY: We hold a reference to the mm, so the pointer must be valid. Any value is okay
> > > + // for `vma_addr`.
> > > + let vma = unsafe { bindings::vma_lookup(self.mm.as_raw(), vma_addr as _) };
> >
> > Why do we say 'as _' here?
>
> This is an integer cast where the target type of the cast is inferred
> by the compiler. It will go away once this lands:
> https://lore.kernel.org/rust-for-linux/20240913213041.395655-1-gary@garyguo.net/
OK cool.
>
> > > +
> > > + if vma.is_null() {
> > > + None
> > > + } else {
> > > + // SAFETY: We just checked that a vma was found, so the pointer is valid. Furthermore,
> > > + // the returned area will borrow from this read lock guard, so it can only be used
> > > + // while the mmap read lock is still held.
> >
> > Lovely!
> >
> > > + unsafe { Some(virt::VmAreaRef::from_raw(vma)) }
> > > + }
> > > + }
> > > +}
> > > +
> > > impl Drop for MmapReadGuard<'_> {
> > > #[inline]
> > > fn drop(&mut self) {
> > > diff --git a/rust/kernel/mm/virt.rs b/rust/kernel/mm/virt.rs
> > > new file mode 100644
> > > index 000000000000..1e755dca46dd
> > > --- /dev/null
> > > +++ b/rust/kernel/mm/virt.rs
> > > @@ -0,0 +1,171 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +
> > > +// Copyright (C) 2024 Google LLC.
> > > +
> > > +//! Virtual memory.
> >
> > Trivial rust q again but does this result in a heading in generated docs or
> > something?
>
> Yes, this becomes module documentation. Check out:
> https://rust.docs.kernel.org/kernel/
> or try out `make LLVM=1 rustdoc`
Cool thanks, thought this might be the case!
>
> > > +use crate::{bindings, types::Opaque};
> > > +
> > > +/// A wrapper for the kernel's `struct vm_area_struct` with read access.
> > > +///
> > > +/// It represents an area of virtual memory.
> > > +///
> > > +/// # Invariants
> > > +///
> > > +/// The caller must hold the mmap read lock or the vma read lock.
> >
> > Might be worth mentioning here that you have yet to abstract the vma lock?
>
> I do that in the next patch.
Yeah saw, one of those 'comment in patch X, addressed in patch X+1' scenarios :)
>
> > > +#[repr(transparent)]
> > > +pub struct VmAreaRef {
> > > + vma: Opaque<bindings::vm_area_struct>,
> > > +}
> > > +
> > > +// Methods you can call when holding the mmap or vma read lock (or strong). They must be usable no
> > > +// matter what the vma flags are.
> >
> > typo: 'or strong' -> 'or stronger'.
> >
> > Nitty, but perhaps say 'Methods must be usable' rather than 'they' to be
> > totally clear.
>
> Will reword.
Thanks!
>
> > > +impl VmAreaRef {
> > > + /// Access a virtual memory area given a raw pointer.
> > > + ///
> > > + /// # Safety
> > > + ///
> > > + /// Callers must ensure that `vma` is valid for the duration of 'a, and that the mmap read lock
> > > + /// (or stronger) is held for at least the duration of 'a.
> >
> > Well, VMA locks make this more complicated, in that we can just hold a VMA
> > lock. But again, perhaps worth doing this incrementally and just talk about
> > mmap locks to start with.
> >
> > However since we reference VMA locks elsewhere, we should reference them
> > here (and probably worth mentioning that they are not yet abstracted).
>
> Oh I forgot to update this, it should say "mmap or vma read lock".
Cool, one to tweak on respin also then.
>
> > > + #[inline]
> > > + pub unsafe fn from_raw<'a>(vma: *const bindings::vm_area_struct) -> &'a Self {
> > > + // SAFETY: The caller ensures that the invariants are satisfied for the duration of 'a.
> > > + unsafe { &*vma.cast() }
> > > + }
> > > +
> > > + /// Returns a raw pointer to this area.
> > > + #[inline]
> > > + pub fn as_ptr(&self) -> *mut bindings::vm_area_struct {
> > > + self.vma.get()
> > > + }
> > > +
> > > + /// Returns the flags associated with the virtual memory area.
> > > + ///
> > > + /// The possible flags are a combination of the constants in [`flags`].
> > > + #[inline]
> > > + pub fn flags(&self) -> vm_flags_t {
> > > + // SAFETY: By the type invariants, the caller holds at least the mmap read lock, so this
> > > + // access is not a data race.
> > > + unsafe { (*self.as_ptr()).__bindgen_anon_2.vm_flags as _ }
> >
> > Hm what's up with this __bindgen_anon_N stuff?
>
> Whenever you have a `struct { ... }` or `union { ... }` in the middle
> of a struct, bindgen generates additional types for them because Rust
> doesn't have a direct equivalent.
OK I see.
>
> > > + }
> > > +
> > > + /// Returns the start address of the virtual memory area.
> > > + #[inline]
> > > + pub fn start(&self) -> usize {
> >
> > Is usize guranteed to be equivalent to unsigned long?
>
> They are guaranteed to have the same size, yes.
>
> But again, right now `unsigned long` is being translated to whichever
> one of u32 or u64 has the same size as usize, instead of just being
> translated to usize. Thus the annoying casts. We can get rid of them
> as soon as
> https://lore.kernel.org/rust-for-linux/20240913213041.395655-1-gary@garyguo.net/
> lands.
Ack cool.
>
> > > + // SAFETY: By the type invariants, the caller holds at least the mmap read lock, so this
> > > + // access is not a data race.
> > > + unsafe { (*self.as_ptr()).__bindgen_anon_1.__bindgen_anon_1.vm_start as _ }
> > > + }
> > > +
> > > + /// Returns the end address of the virtual memory area.
> >
> > Worth mentioning that it's an _exclusive_ end.
>
> Sure, I'll add that.
Thanks
>
> > > + #[inline]
> > > + pub fn end(&self) -> usize {
> > > + // SAFETY: By the type invariants, the caller holds at least the mmap read lock, so this
> > > + // access is not a data race.
> > > + unsafe { (*self.as_ptr()).__bindgen_anon_1.__bindgen_anon_1.vm_end as _ }
> > > + }
> > > +
> > > + /// Unmap pages in the given page range.
> >
> > This needs some more description, as 'unmapping' pages is unfortunately an
> > overloaded term in the kernel and this very much might confuse people as
> > opposed to e.g. munmap()'ing a range.
> >
> > I'd say something like 'clear page table mappings for the range at the leaf
> > level, leaving all other page tables intact, freeing any memory referenced
> > by the VMA in this range (anonymous memory is completely freed, file-backed
> > memory has its reference count on page cache folio's dropped, any dirty
> > data will still be written back to disk as usual)'.
>
> Sure, will add that to the docs.
Thanks, I assume you mean this comment, which will form part of the docs? As
here we should at least replace the 'unmap' with 'zap' to avoid confusion
vs. munmap().
>
> > > + #[inline]
> > > + pub fn zap_page_range_single(&self, address: usize, size: usize) {
> > > + // SAFETY: By the type invariants, the caller has read access to this VMA, which is
> > > + // sufficient for this method call. This method has no requirements on the vma flags. Any
> > > + // value of `address` and `size` is allowed.
> > > + unsafe {
> > > + bindings::zap_page_range_single(
> >
> > Hm weirdly I see this in rust/bindings/bindings_generated.rs but not in
> > rust/helpers/mm.c is this intended?
> >
> > Is this meant to be generated _from_ somewhere? Is something missing for
> > that?
>
> The bindings_generated.rs file is generated at built-time from C
> headers. The zap_page_range_single is a real function, not a fake
> static inline header-only function, so bindgen is able to generate it
> without anything in rust/helpers.
>
> > > + self.as_ptr(),
> > > + address as _,
> > > + size as _,
> > > + core::ptr::null_mut(),
> > > + )
> > > + };
> > > + }
> > > +}
> > > +
> > > +/// The integer type used for vma flags.
> > > +#[doc(inline)]
> > > +pub use bindings::vm_flags_t;
> >
> > Where do you declare this type?
>
> It's declared in include/linux/mm_types.h
I meant from a rust perspective, but I guess bindgen handles this?
>
> > > +
> > > +/// All possible flags for [`VmAreaRef`].
> >
> > Well you've not specified all as they're some speciailist ones and ones
> > that depend on config flags, so maybe worth just saying 'core' flags?
>
> Sure.
Thanks.
>
> > > +pub mod flags {
> >
> > Pure rust noobie question (again...) but what is a 'mod'?
>
> It's a module.
Ack.
>
> > > + use super::vm_flags_t;
> > > + use crate::bindings;
> > > +
> > > + /// No flags are set.
> > > + pub const NONE: vm_flags_t = bindings::VM_NONE as _;
> >
> > This is strictly not a flag, as is the 0 value (and is used 'cleverly' in
> > kernel code when we have flags that are defined under some circumstances
> > but not others, where we can then assign these values to VM_NONE if not
> > available, ensuring all |= ... operations are no-ops and all &
> > ... expressions evaluate to false) but I guess it doesn't matter all too
> > much right?
>
> Ultimately it's just a bunch of integer constants. We can include or
> exclude whichever ones we prefer.
Yeah not a big deal.
>
> > > + /// Mapping allows reads.
> > > + pub const READ: vm_flags_t = bindings::VM_READ as _;
> > > +
> > > + /// Mapping allows writes.
> > > + pub const WRITE: vm_flags_t = bindings::VM_WRITE as _;
> > > +
> > > + /// Mapping allows execution.
> > > + pub const EXEC: vm_flags_t = bindings::VM_EXEC as _;
> > > +
> > > + /// Mapping is shared.
> > > + pub const SHARED: vm_flags_t = bindings::VM_SHARED as _;
> > > +
> > > + /// Mapping may be updated to allow reads.
> > > + pub const MAYREAD: vm_flags_t = bindings::VM_MAYREAD as _;
> > > +
> > > + /// Mapping may be updated to allow writes.
> > > + pub const MAYWRITE: vm_flags_t = bindings::VM_MAYWRITE as _;
> > > +
> > > + /// Mapping may be updated to allow execution.
> > > + pub const MAYEXEC: vm_flags_t = bindings::VM_MAYEXEC as _;
> > > +
> > > + /// Mapping may be updated to be shared.
> > > + pub const MAYSHARE: vm_flags_t = bindings::VM_MAYSHARE as _;
> > > +
> > > + /// Page-ranges managed without `struct page`, just pure PFN.
> > > + pub const PFNMAP: vm_flags_t = bindings::VM_PFNMAP as _;
> > > +
> > > + /// Memory mapped I/O or similar.
> > > + pub const IO: vm_flags_t = bindings::VM_IO as _;
> > > +
> > > + /// Do not copy this vma on fork.
> > > + pub const DONTCOPY: vm_flags_t = bindings::VM_DONTCOPY as _;
> > > +
> > > + /// Cannot expand with mremap().
> > > + pub const DONTEXPAND: vm_flags_t = bindings::VM_DONTEXPAND as _;
> > > +
> > > + /// Lock the pages covered when they are faulted in.
> > > + pub const LOCKONFAULT: vm_flags_t = bindings::VM_LOCKONFAULT as _;
> > > +
> > > + /// Is a VM accounted object.
> > > + pub const ACCOUNT: vm_flags_t = bindings::VM_ACCOUNT as _;
> > > +
> > > + /// should the VM suppress accounting.
> > > + pub const NORESERVE: vm_flags_t = bindings::VM_NORESERVE as _;
> > > +
> > > + /// Huge TLB Page VM.
> > > + pub const HUGETLB: vm_flags_t = bindings::VM_HUGETLB as _;
> > > +
> > > + /// Synchronous page faults.
> >
> > Might be worth mentioning that this is DAX-specific only.
>
> Will add.
Ack.
>
> > > + pub const SYNC: vm_flags_t = bindings::VM_SYNC as _;
> > > +
> > > + /// Architecture-specific flag.
> > > + pub const ARCH_1: vm_flags_t = bindings::VM_ARCH_1 as _;
> > > +
> > > + /// Wipe VMA contents in child..
> >
> > Typo '..' - also probably worth saying 'wipe VMA contents in child on
> > fork'.
>
> Will add.
Ack, thanks.
>
> > > + pub const WIPEONFORK: vm_flags_t = bindings::VM_WIPEONFORK as _;
> > > +
> > > + /// Do not include in the core dump.
> > > + pub const DONTDUMP: vm_flags_t = bindings::VM_DONTDUMP as _;
> > > +
> > > + /// Not soft dirty clean area.
> > > + pub const SOFTDIRTY: vm_flags_t = bindings::VM_SOFTDIRTY as _;
> > > +
> > > + /// Can contain `struct page` and pure PFN pages.
> > > + pub const MIXEDMAP: vm_flags_t = bindings::VM_MIXEDMAP as _;
> > > +
> > > + /// MADV_HUGEPAGE marked this vma.
> > > + pub const HUGEPAGE: vm_flags_t = bindings::VM_HUGEPAGE as _;
> > > +
> > > + /// MADV_NOHUGEPAGE marked this vma.
> > > + pub const NOHUGEPAGE: vm_flags_t = bindings::VM_NOHUGEPAGE as _;
> > > +
> > > + /// KSM may merge identical pages.
> > > + pub const MERGEABLE: vm_flags_t = bindings::VM_MERGEABLE as _;
> > > +}
> > >
> >
> > I'd say these comments are a bit sparse and need more detail, but they are
> > literally comments from include/linux/mm.h and therefore, again, our fault
> > so this is fine :)
>
> Happy to add more if you tell me what you'd like to see. ;)
Sure, this is fine for now.
>
> Alice
Powered by blists - more mailing lists