Message-ID: <aSmVQDMXqwc8ctDf@google.com>
Date: Fri, 28 Nov 2025 12:27:44 +0000
From: Alice Ryhl <aliceryhl@...gle.com>
To: Robin Murphy <robin.murphy@....com>
Cc: Miguel Ojeda <ojeda@...nel.org>, Will Deacon <will@...nel.org>, 
	Daniel Almeida <daniel.almeida@...labora.com>, 
	Boris Brezillon <boris.brezillon@...labora.com>, Boqun Feng <boqun.feng@...il.com>, 
	Gary Guo <gary@...yguo.net>, 
	"Björn Roy Baron" <bjorn3_gh@...tonmail.com>, Benno Lossin <lossin@...nel.org>, 
	Andreas Hindborg <a.hindborg@...nel.org>, Trevor Gross <tmgross@...ch.edu>, 
	Danilo Krummrich <dakr@...nel.org>, Joerg Roedel <joro@...tes.org>, 
	Lorenzo Stoakes <lorenzo.stoakes@...cle.com>, "Liam R. Howlett" <Liam.Howlett@...cle.com>, 
	Asahi Lina <lina+kernel@...hilina.net>, linux-kernel@...r.kernel.org, 
	rust-for-linux@...r.kernel.org, iommu@...ts.linux.dev, linux-mm@...ck.org
Subject: Re: [PATCH v3] io: add io_pgtable abstraction

On Fri, Nov 28, 2025 at 11:56:17AM +0000, Robin Murphy wrote:
> On 2025-11-12 10:15 am, Alice Ryhl wrote:
> > From: Asahi Lina <lina+kernel@...hilina.net>
> > 
> > This will be used by the Tyr driver to create and modify the page table
> > of each address space on the GPU. Each time a mapping gets created or
> > removed by userspace, Tyr will call into GPUVM, which will figure out
> > which calls to map_pages and unmap_pages are required to map the data in
> > question in the page table so that the GPU may access those pages when
> > using that address space.
> > 
> > The Rust type wraps the struct using a raw pointer rather than the usual
> > Opaque+ARef approach because Opaque+ARef requires the target type to be
> > refcounted.
> > 
> > Signed-off-by: Asahi Lina <lina+kernel@...hilina.net>
> > Co-Developed-by: Alice Ryhl <aliceryhl@...gle.com>
> > Signed-off-by: Alice Ryhl <aliceryhl@...gle.com>

> > +/// Protection flags used with IOMMU mappings.
> > +pub mod prot {
> > +    /// Read access.
> > +    pub const READ: u32 = bindings::IOMMU_READ;
> > +    /// Write access.
> > +    pub const WRITE: u32 = bindings::IOMMU_WRITE;
> > +    /// Request cache coherency.
> > +    pub const CACHE: u32 = bindings::IOMMU_CACHE;
> > +    /// Request no-execute permission.
> > +    pub const NOEXEC: u32 = bindings::IOMMU_NOEXEC;
> > +    /// MMIO peripheral mapping.
> > +    pub const MMIO: u32 = bindings::IOMMU_MMIO;
> > +    /// Privileged mapping.
> > +    pub const PRIV: u32 = bindings::IOMMU_PRIV;
> 
> Nit: probably best to call this PRIVILEGED from day 1 for clarity - some day
> we may eventually get round to renaming the C symbol too, especially if we
> revisit the notion of "private" mappings (that's still on my ideas list...)

Sure, will rename.

> > +    /// Map a physically contiguous range of pages of the same size.
> > +    ///
> > +    /// # Safety
> > +    ///
> > +    /// * This page table must not contain any mapping that overlaps with the mapping created by
> > +    ///   this call.
> 
> As mentioned this isn't necessarily true of io-pgtable itself, but since
> you've not included QUIRK_NO_WARN in the abstraction then it's fair if this
> layer wants to be a little stricter toward Rust users.

Assuming that we don't allow QUIRK_NO_WARN, would you say that it's
precise as-is?

> > +    /// * If this page table is live, then the caller must ensure that it's okay to access the
> > +    ///   physical address being mapped for the duration in which it is mapped.
> > +    #[inline]
> > +    pub unsafe fn map_pages(
> > +        &self,
> > +        iova: usize,
> > +        paddr: PhysAddr,
> > +        pgsize: usize,
> > +        pgcount: usize,
> > +        prot: u32,
> > +        flags: alloc::Flags,
> > +    ) -> Result<usize> {
> > +        let mut mapped: usize = 0;
> > +
> > +        // SAFETY: The `map_pages` function in `io_pgtable_ops` is never null.
> > +        let map_pages = unsafe { (*self.raw_ops()).map_pages.unwrap_unchecked() };
> > +
> > +        // SAFETY: The safety requirements of this method are sufficient to call `map_pages`.
> > +        to_result(unsafe {
> > +            (map_pages)(
> > +                self.raw_ops(),
> > +                iova,
> > +                paddr,
> > +                pgsize,
> > +                pgcount,
> > +                prot as i32,
> > +                flags.as_raw(),
> > +                &mut mapped,
> > +            )
> > +        })?;
> > +
> > +        Ok(mapped)
> 
> Just to double-check since I'm a bit unclear on the Rust semantics, this can
> correctly reflect all 4 outcomes back to the caller, right? I.e.:
> 
> - no error, mapped == pgcount * pgsize (success)
> - no error, mapped < pgcount * pgsize (call again with the remainder)
> - error, mapped > 0 (probably unmap that bit, unless clever trickery where
> an error was expected)
> - error, mapped == 0 (nothing was done, straightforward failure)
> 
> (the only case not permitted is "no error, mapped == 0" - failure to make
> any progress must always be an error)
> 
> Alternatively you might want to consider encapsulating the partial-mapping
> handling in this layer as well - in the C code that's done at the level of
> the IOMMU API calls that io-pgtable-using IOMMU drivers are merely passing
> through, hence why panfrost/panthor have to open-code their own equivalents,
> but there's no particular reason to follow the *exact* same pattern here.

Ah, no, this signature does not reflect all of those cases. The return
type is Result<usize>, which corresponds to:

struct my_return_type {
    bool success;
    union {
        size_t ok;  // valid only if success
        int err;    // an errno, valid only if !success
    };
};

We need a different signature if it's possible to have mapped != 0 when
returning an error.
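
As a rough sketch of what such a signature could return, here is one
possible shape that carries the mapped byte count alongside the status.
The names `Errno`, `MapOutcome` and `outcome_from_raw` are hypothetical
and not part of the patch; `Errno` merely stands in for the kernel's
error type so that the snippet compiles on its own.

```rust
/// Hypothetical stand-in for the kernel's errno-based error type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Errno(pub i32);

/// Hypothetical return value: unlike `Result<usize>`, it can report
/// partial progress together with an error.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct MapOutcome {
    /// Number of bytes mapped, which may be non-zero even on error.
    pub mapped: usize,
    /// `Ok(())` if the callback returned 0, otherwise the negative errno.
    pub status: Result<(), Errno>,
}

/// Combine the raw return code and the `mapped` out-parameter of a
/// C-style `map_pages` call into a single value.
pub fn outcome_from_raw(ret: i32, mapped: usize) -> MapOutcome {
    let status = if ret < 0 { Err(Errno(ret)) } else { Ok(()) };
    MapOutcome { mapped, status }
}
```

With a type along these lines, "error, mapped > 0" and "error,
mapped == 0" become distinguishable to the caller, which a plain
Result<usize> cannot express.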

> > +    }
> > +
> > +    /// Unmap a range of virtually contiguous pages of the same size.
> > +    ///
> > +    /// # Safety
> > +    ///
> > +    /// This page table must contain a mapping at `iova` that consists of exactly `pgcount` pages
> > +    /// of size `pgsize`.
> 
> Again, the underlying requirement here is only that pgsize * pgcount
> represents the IOVA range of one or more consecutive ranges previously
> mapped, i.e.:
> 
> map(0, 4KB * 256);
> map(1MB, 4KB * 256);
> unmap(0, 2MB * 1);
> 
> is legal, since it's generally impractical for callers to know and keep
> track of the *exact* structure of a given pagetable. In this case there
> isn't really any good reason to try to be stricter.

How about this wording?

This page table must contain one or more consecutive mappings starting
at `iova` whose total size is `pgcount * pgsize`.
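
To illustrate what that wording would and would not permit, here is a
small standalone model of the precondition. It is only a sketch: the
`Mapping` type and `unmap_range_is_valid` are hypothetical, the list of
mappings is assumed to be sorted by IOVA and non-overlapping, and none
of this reflects how io-pgtable tracks mappings internally.

```rust
/// Hypothetical model of one existing mapping: start IOVA and length in bytes.
#[derive(Debug, Clone, Copy)]
pub struct Mapping {
    pub iova: usize,
    pub len: usize,
}

/// Returns true if `mappings` (assumed sorted by IOVA and non-overlapping)
/// contains one or more consecutive mappings starting at `iova` whose
/// total size is `pgcount * pgsize`.
pub fn unmap_range_is_valid(
    mappings: &[Mapping],
    iova: usize,
    pgsize: usize,
    pgcount: usize,
) -> bool {
    let Some(total) = pgsize.checked_mul(pgcount) else { return false };
    let Some(end) = iova.checked_add(total) else { return false };
    if total == 0 {
        return false;
    }

    // Walk the mappings from `iova`, requiring each one to begin exactly
    // where the previous one ended and to lie entirely inside the range.
    let mut cursor = iova;
    for m in mappings.iter().skip_while(|m| m.iova < iova) {
        if cursor == end {
            break;
        }
        if m.iova != cursor {
            return false; // gap, or the range starts in the middle of a mapping
        }
        match m.iova.checked_add(m.len) {
            Some(e) if e <= end => cursor = e,
            _ => return false, // the range would split this mapping
        }
    }
    cursor == end
}
```

Under this model, Robin's example (two 1MB mappings at 0 and 1MB,
unmapped with a single 2MB call) satisfies the precondition, while
unmapping only the first 512KB of a 1MB mapping does not.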

Alice
