[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <DBK1M000P87N.2HJHDJN1LG5CA@nvidia.com>
Date: Thu, 24 Jul 2025 14:40:37 +0900
From: "Alexandre Courbot" <acourbot@...dia.com>
To: "Abdiel Janulgue" <abdiel.janulgue@...il.com>, <dakr@...nel.org>,
<jgg@...pe.ca>, <lyude@...hat.com>
Cc: "Miguel Ojeda" <ojeda@...nel.org>, "Alex Gaynor"
<alex.gaynor@...il.com>, "Boqun Feng" <boqun.feng@...il.com>, "Gary Guo"
<gary@...yguo.net>, Björn Roy Baron
<bjorn3_gh@...tonmail.com>, "Benno Lossin" <lossin@...nel.org>, "Andreas
Hindborg" <a.hindborg@...nel.org>, "Alice Ryhl" <aliceryhl@...gle.com>,
"Trevor Gross" <tmgross@...ch.edu>, "Tamir Duberstein" <tamird@...il.com>,
"FUJITA Tomonori" <fujita.tomonori@...il.com>, "open list"
<linux-kernel@...r.kernel.org>, "Andrew Morton"
<akpm@...ux-foundation.org>, "Randy Dunlap" <rdunlap@...radead.org>,
"Herbert Xu" <herbert@...dor.apana.org.au>, "Caleb Sander Mateos"
<csander@...estorage.com>, "Petr Tesarik" <petr@...arici.cz>, "Sui
Jingfeng" <sui.jingfeng@...ux.dev>, "Marek Szyprowski"
<m.szyprowski@...sung.com>, "Robin Murphy" <robin.murphy@....com>,
<airlied@...hat.com>, "open list:DMA MAPPING HELPERS"
<iommu@...ts.linux.dev>, <rust-for-linux@...r.kernel.org>
Subject: Re: [PATCH v3 1/2] rust: add initial scatterlist abstraction
Hi Abdiel,
Thanks for this new revision! I think it is starting to take shape. I
have been able to use it (with a few changes) to replace my hacky DMA
mapping code in nova-core, and it seems to be working fine!
Some comments below.
On Fri Jul 18, 2025 at 7:33 PM JST, Abdiel Janulgue wrote:
<snip>
> +impl SGEntry {
> + /// Convert a raw `struct scatterlist *` to a `&'a SGEntry`.
> + ///
> + /// This is meant as a helper for other kernel subsystems and not to be used by device drivers directly.
> + ///
> + /// # Safety
> + ///
> + /// Callers must ensure that the `struct scatterlist` pointed to by `ptr` is valid for the lifetime
> + /// of the returned reference.
> + pub(crate) unsafe fn as_ref<'a>(ptr: *mut bindings::scatterlist) -> &'a Self {
IIUC this method can be kept private.
<snip>
> +/// A scatter-gather table of DMA address spans.
> +///
> +/// This structure represents the Rust abstraction for a C `struct sg_table`. This implementation
> +/// abstracts the usage of an already existing C `struct sg_table` within Rust code that we get
> +/// passed from the C side.
> +pub struct SGTable<T: Borrow<bindings::sg_table>, M: MappingState> {
> + /// Mapping state of the underlying `struct sg_table`.
> + ///
> + /// This defines which methods of `SGTable` are available.
> + ///
> + /// Declared first so it is dropped before `table`, so we remove the mapping before freeing the
> + /// SG table if the latter is owned.
> + _mapping: M,
> +
> + /// Something that can borrow the underlying `struct sg_table`.
> + table: T,
> +}
We will probably want an `impl AsRef<bindings::sg_table> for SGTable`, or an `as_ptr()` method.
<snip>
> +/// Provides a list of pages that can be used to build a `SGTable`.
> +pub trait SGTablePages {
> + /// Returns an iterator to the pages providing the backing memory of `self`.
> + ///
> + /// Implementers should return an iterator which provides information regarding each page entry to
> + /// build the `SGTable`. The first element in the tuple is a reference to the Page, the second element
> + /// as the offset into the page, and the third as the length of data. The fields correspond to the
> + /// first three fields of the C `struct scatterlist`.
> + fn iter<'a>(&'a self) -> impl Iterator<Item = (&'a Page, usize, usize)>;
I see a few issues with the `Item` type here.
The first one is that `Page` can only be created by allocating a new
page from scratch using `Page::alloc_page`. This doesn't cover the cases
where we want to map memory that is now allocated through this
mechanism, e.g. when mapping a `VVec`. So I think we have no choice but
return `*mut bindings::page`s.
The second one is that this `Item` type provides an offset and a length
for all pages, but in practice only the first page can have an offset,
and the length is only relevant for the last one. Providing these for
all pages is unneeded and risky.
So here I'd suggest to provide the following three methods:
pub trait SGTablePages {
fn iter(&self) -> impl Iterator<Item = *mut bindings::page>;
/// Returns the offset from the start of the first page to the start of the buffer.
fn offset(&self) -> usize;
/// Returns the number of valid bytes in the buffer, after [`SGTablePages::offset`].
fn size(&self) -> usize;
/// Returns the number of pages in the list.
fn num_pages(&self) -> usize {
(self.offset() + self.size()).div_ceil(PAGE_SIZE)
}
}
I have also renamed `entries` into `num_pages` and implemented it as a default
method as it can be deduced from `offset` and `size`.
> +
> + /// Returns the number of pages in the list.
> + fn entries(&self) -> usize;
> +}
> +
> +/// An iterator through `SGTable` entries.
IIUC this is an iterator over mapped `SGTable` entries. We definitely don't
want to provide this for unmapped states, as the `dma_address` method of
`SGEntry` would be callable in a state where it returns an invalid value.
> +pub struct SGTableIter<'a> {
> + pos: Option<&'a SGEntry>,
> +}
> +
> +impl<'a> Iterator for SGTableIter<'a> {
> + type Item = &'a SGEntry;
> +
> + fn next(&mut self) -> Option<Self::Item> {
> + let entry = self.pos;
> + // SAFETY: `sg` is an immutable reference and is equivalent to `scatterlist` via its type
> + // invariants, so its safe to use with sg_next.
> + let next = unsafe { bindings::sg_next(self.pos?.as_raw()) };
> +
> + // SAFETY: `sg_next` returns either a valid pointer to a `scatterlist`, or null if we
> + // are at the end of the scatterlist.
> + self.pos = (!next.is_null()).then(|| unsafe { SGEntry::as_ref(next) });
> + entry
> + }
As per `sg_dma_address`'s documentation, we should also stop iterating on the
first `sg_dma_len` that is zero. Integrating this condition, and reworking the
code a bit to be more idiomatic I came with this:
fn next(&mut self) -> Option<Self::Item> {
// SAFETY: `sg` is an immutable reference and is equivalent to `scatterlist` via its type
// invariants, so its safe to use with sg_next.
let next = unsafe { bindings::sg_next(self.pos?.as_raw()) };
// SAFETY: `sg_next` returns either a valid pointer to a `scatterlist`, or null if we
// are at the end of the scatterlist.
core::mem::replace(
&mut self.pos,
(!next.is_null())
.then(|| unsafe { SGEntry::as_ref(next) })
// As per `sg_dma_address` documentation, stop iterating on the first `sg_dma_len`
// which is `0`.
.filter(|&e| e.dma_len() != 0),
)
}
> +}
> +
> +impl<'a, T, M> IntoIterator for &'a SGTable<T, M>
> +where
> + T: Borrow<bindings::sg_table>,
> + M: MappedState,
> +{
> + type Item = &'a SGEntry;
> + type IntoIter = SGTableIter<'a>;
> +
> + fn into_iter(self) -> Self::IntoIter {
> + self.iter()
> + }
> +}
I don't think we need this `IntoIterator` implementation?
<snip>
> +impl<P: SGTablePages> SGTable<OwnedSgt<P>, Unmapped> {
> + /// Allocate and build a new `SGTable` from an existing list of `pages`. This method moves the
> + /// ownership of `pages` to the table.
> + ///
> + /// To build a scatter-gather table, provide the `pages` object which must implement the
> + /// `SGTablePages` trait.
> + ///
> + ///# Examples
> + ///
> + /// ```
> + /// use kernel::{device::Device, scatterlist::*, page::*, prelude::*};
> + ///
> + /// struct PagesArray(KVec<Page>);
> + /// impl SGTablePages for PagesArray {
> + /// fn iter<'a>(&'a self) -> impl Iterator<Item = (&'a Page, usize, usize)> {
> + /// self.0.iter().map(|page| (page, kernel::page::PAGE_SIZE, 0))
> + /// }
> + ///
> + /// fn entries(&self) -> usize {
> + /// self.0.len()
> + /// }
> + /// }
> + ///
> + /// let mut pages = KVec::new();
> + /// let _ = pages.push(Page::alloc_page(GFP_KERNEL)?, GFP_KERNEL);
> + /// let _ = pages.push(Page::alloc_page(GFP_KERNEL)?, GFP_KERNEL);
> + /// let sgt = SGTable::new_owned(PagesArray(pages), GFP_KERNEL)?;
> + /// # Ok::<(), Error>(())
> + /// ```
> + pub fn new_owned(pages: P, flags: kernel::alloc::Flags) -> Result<Self> {
> + // SAFETY: `sgt` is not a reference.
> + let mut sgt: bindings::sg_table = unsafe { core::mem::zeroed() };
> +
> + // SAFETY: The sgt pointer is from the Opaque-wrapped `sg_table` object hence is valid.
> + let ret =
> + unsafe { bindings::sg_alloc_table(&mut sgt, pages.entries() as u32, flags.as_raw()) };
> + if ret != 0 {
> + return Err(Error::from_errno(ret));
> + }
> + // SAFETY: We just successfully allocated `sgt`, hence the pointer is valid and have sole access to
> + // it at this point.
> + let sgentries = unsafe { core::slice::from_raw_parts_mut(sgt.sgl, pages.entries()) };
> + for (entry, page) in sgentries
> + .iter_mut()
> + .map(|e|
> + // SAFETY: `SGEntry::as_mut` is called on the pointer only once, which is valid and non-NULL
> + // while inside the closure.
> + unsafe { SGEntry::as_mut(e) })
> + .zip(pages.iter())
> + {
> + entry.set_page(page.0, page.1 as u32, page.2 as u32)
> + }
Since we now have clear `offset` and `size` parameters provided by
SGTablePages, how about just using `sg_table_alloc_from_page_segments`?
It can merge nearby neighboring pages into a single entry, making it
more efficient than calling `set_page` repeatedly. And if we switch to
it we can remove `set_page` altogether.
The only drawback being that we would need to temporarily collect the
pages into a linear buffer in order to call it, but I think that's an
acceptable compromise. The code would look something like this:
pub fn new_owned(pages: P, flags: kernel::alloc::Flags) -> Result<Self> {
// Collect of all the memory pages.
let mut page_vec: KVec<*mut bindings::page> =
KVec::with_capacity(pages.num_pages(), flags)?;
for page in pages.iter() {
page_vec.push(page, flags)?;
}
let mut sgt: bindings::sg_table = Default::default();
error::to_result(unsafe {
bindings::sg_alloc_table_from_pages_segment(
&mut sgt,
page_vec.as_mut_ptr(),
page_vec.len() as u32,
pages.offset() as u32,
pages.size(),
u32::MAX,
bindings::GFP_KERNEL,
)
})?;
Ok(Self {
table: OwnedSgt { sgt, pages },
_mapping: Unmapped,
})
}
I know `set_page` was frowned at a bit, so maybe that's a good opportunity to
just work around the problem by using a higher-level SG API (and we could also
remove `SGEntry::as_mut` as a bonus).
> +
> + Ok(Self {
> + // INVARIANT: We just successfully allocated and built the table from the page entries.
> + table: OwnedSgt { sgt, _pages: pages },
> + _mapping: Unmapped,
> + })
> + }
> +}
There are also cases when the caller might want to access the inner object of
an `OwnedSgt` - I've had to add this for nova-core's needs:
impl<P, M> SGTable<OwnedSgt<P>, M>
where
P: SGTablePages,
M: MappingState,
{
/// Returns a reference to the object backing the memory for this SG table.
pub fn borrow(&self) -> &P {
&self.table.pages
}
}
Powered by blists - more mailing lists