lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <DA9G1V69KTZQ.1SCVVKL43692A@nvidia.com>
Date: Fri, 30 May 2025 20:04:17 +0900
From: "Alexandre Courbot" <acourbot@...dia.com>
To: "Abdiel Janulgue" <abdiel.janulgue@...il.com>, <jgg@...pe.ca>,
 <dakr@...nel.org>, <lyude@...hat.com>
Cc: "Miguel Ojeda" <ojeda@...nel.org>, "Alex Gaynor"
 <alex.gaynor@...il.com>, "Boqun Feng" <boqun.feng@...il.com>, "Gary Guo"
 <gary@...yguo.net>, Björn Roy Baron
 <bjorn3_gh@...tonmail.com>, "Benno Lossin" <benno.lossin@...ton.me>,
 "Andreas Hindborg" <a.hindborg@...nel.org>, "Alice Ryhl"
 <aliceryhl@...gle.com>, "Trevor Gross" <tmgross@...ch.edu>, "Valentin Obst"
 <kernel@...entinobst.de>, "open list" <linux-kernel@...r.kernel.org>,
 "Marek Szyprowski" <m.szyprowski@...sung.com>, "Robin Murphy"
 <robin.murphy@....com>, <airlied@...hat.com>,
 <rust-for-linux@...r.kernel.org>, "open list:DMA MAPPING HELPERS"
 <iommu@...ts.linux.dev>, "Petr Tesarik" <petr@...arici.cz>, "Andrew Morton"
 <akpm@...ux-foundation.org>, "Herbert Xu" <herbert@...dor.apana.org.au>,
 "Sui Jingfeng" <sui.jingfeng@...ux.dev>, "Randy Dunlap"
 <rdunlap@...radead.org>, "Michael Kelley" <mhklinux@...look.com>
Subject: Re: [PATCH 1/2] rust: add initial scatterlist bindings

Hi Abdiel,

In my review of the RFC I mentioned there are two main issues this
design should address: safely transitioning between the mapping states
and iterate over DMA addresses (the easy one), and building the SG table
properly (the hard one)

Let me keep this thread focused on the easy part, then I'll piggyback on
Jason's reply to discuss the elephant in the room that is the lifetime
of backing memory.

On Thu May 29, 2025 at 7:14 AM JST, Abdiel Janulgue wrote:
> Add the rust abstraction for scatterlist. This allows use of the C
> scatterlist within Rust code which the caller can allocate themselves
> or to wrap existing kernel sg_table objects.
>
> Signed-off-by: Abdiel Janulgue <abdiel.janulgue@...il.com>
> ---
>  rust/bindings/bindings_helper.h |   1 +
>  rust/helpers/helpers.c          |   1 +
>  rust/helpers/scatterlist.c      |  25 +++
>  rust/kernel/dma.rs              |  17 ++
>  rust/kernel/lib.rs              |   1 +
>  rust/kernel/scatterlist.rs      | 369 ++++++++++++++++++++++++++++++++
>  6 files changed, 414 insertions(+)
>  create mode 100644 rust/helpers/scatterlist.c
>  create mode 100644 rust/kernel/scatterlist.rs
>
> diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
> index ab37e1d35c70..a7d3b97cd4e0 100644
> --- a/rust/bindings/bindings_helper.h
> +++ b/rust/bindings/bindings_helper.h
> @@ -14,6 +14,7 @@
>  #include <linux/cred.h>
>  #include <linux/device/faux.h>
>  #include <linux/dma-mapping.h>
> +#include <linux/dma-direction.h>
>  #include <linux/errname.h>
>  #include <linux/ethtool.h>
>  #include <linux/file.h>
> diff --git a/rust/helpers/helpers.c b/rust/helpers/helpers.c
> index 1e7c84df7252..f45a15f88e25 100644
> --- a/rust/helpers/helpers.c
> +++ b/rust/helpers/helpers.c
> @@ -28,6 +28,7 @@
>  #include "rbtree.c"
>  #include "rcu.c"
>  #include "refcount.c"
> +#include "scatterlist.c"
>  #include "security.c"
>  #include "signal.c"
>  #include "slab.c"
> diff --git a/rust/helpers/scatterlist.c b/rust/helpers/scatterlist.c
> new file mode 100644
> index 000000000000..3c8015b07da1
> --- /dev/null
> +++ b/rust/helpers/scatterlist.c
> @@ -0,0 +1,25 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#include <linux/dma-direction.h>
> +
> +void rust_helper_sg_set_page(struct scatterlist *sg, struct page *page,
> +			     unsigned int len, unsigned int offset)
> +{
> +	return sg_set_page(sg, page, len, offset);
> +}
> +
> +dma_addr_t rust_helper_sg_dma_address(struct scatterlist *sg)
> +{
> +	return sg_dma_address(sg);
> +}
> +
> +unsigned int rust_helper_sg_dma_len(struct scatterlist *sg)
> +{
> +	return sg_dma_len(sg);
> +}
> +
> +void rust_helper_dma_unmap_sgtable(struct device *dev, struct sg_table *sgt,
> +				   enum dma_data_direction dir, unsigned long attrs)
> +{
> +	return dma_unmap_sgtable(dev, sgt, dir, attrs);
> +}
> diff --git a/rust/kernel/dma.rs b/rust/kernel/dma.rs
> index 605e01e35715..2866345d22fc 100644
> --- a/rust/kernel/dma.rs
> +++ b/rust/kernel/dma.rs
> @@ -102,6 +102,23 @@ pub mod attrs {
>      pub const DMA_ATTR_PRIVILEGED: Attrs = Attrs(bindings::DMA_ATTR_PRIVILEGED);
>  }
>  
> +/// DMA mapping direction.
> +///
> +/// Corresponds to the kernel's [`enum dma_data_direction`].
> +///
> +/// [`enum dma_data_direction`]: srctree/include/linux/dma-direction.h
> +#[derive(Copy, Clone, PartialEq, Eq)]
> +pub enum DmaDataDirection {
> +    /// Direction isn't known.

Isn't it rather that data can flow both ways?

> +    DmaBidirectional = bindings::dma_data_direction_DMA_BIDIRECTIONAL as _,
> +    /// Data is going from the memory to the device.
> +    DmaToDevice = bindings::dma_data_direction_DMA_TO_DEVICE as _,
> +    /// Data is coming from the device to the memory.
> +    DmaFromDevice = bindings::dma_data_direction_DMA_FROM_DEVICE as _,
> +    /// No direction (used for debugging).
> +    DmaNone = bindings::dma_data_direction_DMA_NONE as _,
> +}

You probably don't need to prefix everything with "Dma", e.g. `ToDevice`
looks fine. Maybe the type itself could also be just `DataDirection`
(the same way `Attrs` is not `DmaAttrs`) as it is already in the `dma`
module.

The fact that this type is (correctly, IMHO) defined in `dma` but used
in `scatterlist` makes me wonder whether `scatterlist` should not be a
sub-module of `dma`? Are we using scatterlists for anything else than
DMA?

> +
>  /// An abstraction of the `dma_alloc_coherent` API.
>  ///
>  /// This is an abstraction around the `dma_alloc_coherent` API which is used to allocate and map
> diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
> index de07aadd1ff5..0a4f270e9a0d 100644
> --- a/rust/kernel/lib.rs
> +++ b/rust/kernel/lib.rs
> @@ -73,6 +73,7 @@
>  pub mod print;
>  pub mod rbtree;
>  pub mod revocable;
> +pub mod scatterlist;
>  pub mod security;
>  pub mod seq_file;
>  pub mod sizes;
> diff --git a/rust/kernel/scatterlist.rs b/rust/kernel/scatterlist.rs
> new file mode 100644
> index 000000000000..46cc04a87b2f
> --- /dev/null
> +++ b/rust/kernel/scatterlist.rs
> @@ -0,0 +1,369 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +//! Scatterlist
> +//!
> +//! C header: [`include/linux/scatterlist.h`](srctree/include/linux/scatterlist.h)
> +
> +use crate::{
> +    bindings,
> +    device::{Bound, Device},
> +    dma::DmaDataDirection,
> +    error::{Error, Result},
> +    page::Page,
> +    types::{ARef, Opaque},
> +};
> +use core::marker::PhantomData;
> +use core::ops::{Deref, DerefMut};
> +
> +/// Marker trait for the mapping state of the `SGTable`
> +pub trait MapState: private::Sealed {}
> +
> +/// The [`Unmapped`] state of the `SGTable` is the table's initial state. While in this state, the pages of
> +/// the `SGTable` can be built by the CPU.
> +pub struct Unmapped;
> +
> +/// The [`Initialized`] state of the `SGTable` means that the table's span of pages has already been built.
> +pub struct Initialized;
> +
> +/// The [`Mapped`] state of the `SGTable` means that it is now mapped via DMA. While in this state
> +/// modification of the pages by the CPU is disallowed. This state will expose an interface to query
> +/// the DMA address of the entries.
> +pub struct Mapped;
> +
> +mod private {
> +    pub trait Sealed {}
> +
> +    impl Sealed for super::Mapped {}
> +    impl Sealed for super::Initialized {}
> +    impl Sealed for super::Unmapped {}
> +}
> +
> +impl MapState for Unmapped {}
> +impl MapState for Initialized {}
> +impl MapState for Mapped {}
> +
> +/// A single scatter-gather entry, representing a span of pages in the device's DMA address space.
> +///
> +/// # Invariants
> +///
> +/// The `scatterlist` pointer is valid for the lifetime of an SGEntry instance.
> +#[repr(transparent)]
> +pub struct SGEntry<T: MapState = Unmapped>(Opaque<bindings::scatterlist>, PhantomData<T>);
> +
> +impl<T: MapState> SGEntry<T> {
> +    /// Convert a raw `struct scatterlist *` to a `&'a SGEntry`.
> +    ///
> +    /// # Safety
> +    ///
> +    /// Callers must ensure that the `struct scatterlist` pointed to by `ptr` is valid for the lifetime
> +    /// of the returned reference.

Something like "Callers must ensure that the `struct scatterlist` is in
a state compatible with `MapState`" seems also needed.

> +    pub(crate) unsafe fn as_ref<'a>(ptr: *mut bindings::scatterlist) -> &'a Self {

I suspect this method can be kept private?

> +        // SAFETY: The pointer is valid and guaranteed by the safety requirements of the function.
> +        unsafe { &*ptr.cast() }
> +    }
> +
> +    /// Convert a raw `struct scatterlist *` to a `&'a mut SGEntry`.
> +    ///
> +    /// # Safety
> +    ///
> +    /// See safety requirements of [`SGEntry::as_ref`]. In addition, callers must ensure that only
> +    /// a single mutable reference can be taken from the same raw pointer, i.e. for the lifetime of the
> +    /// returned reference, no other call to this function on the same `struct scatterlist *` should
> +    /// be permitted.
> +    pub(crate) unsafe fn as_mut<'a>(ptr: *mut bindings::scatterlist) -> &'a mut Self {
> +        // SAFETY: The pointer is valid and guaranteed by the safety requirements of the function.
> +        unsafe { &mut *ptr.cast() }
> +    }
> +
> +    /// Obtain the raw `struct scatterlist *`.
> +    pub(crate) fn as_raw(&self) -> *mut bindings::scatterlist {
> +        self.0.get()
> +    }
> +}
> +
> +impl SGEntry<Mapped> {
> +    /// Returns the DMA address of this SG entry.
> +    pub fn dma_address(&self) -> bindings::dma_addr_t {
> +        // SAFETY: By the type invariant of `SGEntry`, ptr is valid.
> +        unsafe { bindings::sg_dma_address(self.0.get()) }
> +    }
> +
> +    /// Returns the length of this SG entry.
> +    pub fn dma_len(&self) -> u32 {
> +        // SAFETY: By the type invariant of `SGEntry`, ptr is valid.
> +        unsafe { bindings::sg_dma_len(self.0.get()) }
> +    }
> +}
> +
> +impl SGEntry<Unmapped> {
> +    /// Set this entry to point at a given page.
> +    pub fn set_page(&mut self, page: &Page, length: u32, offset: u32) {

Random idea: use Range<u32> to replace `length` and `offset`? But maybe
we can drop this method altogether, see below.

> +        let c: *mut bindings::scatterlist = self.0.get();
> +        // SAFETY: according to the `SGEntry` invariant, the scatterlist pointer is valid.
> +        // `Page` invariant also ensures the pointer is valid.
> +        unsafe { bindings::sg_set_page(c, page.as_ptr(), length, offset) };
> +    }
> +}
> +
> +/// A scatter-gather table of DMA address spans.
> +///
> +/// This structure represents the Rust abstraction for a C `struct sg_table`. This implementation
> +/// is able to abstract the usage of an already existing C `struct sg_table`. A new table can be
> +/// allocated by calling [`SGTable::alloc_table`].
> +///
> +/// # Invariants
> +///
> +/// The `sg_table` pointer is valid for the lifetime of an SGTable instance.
> +#[repr(transparent)]
> +pub struct SGTable<T: MapState = Unmapped>(Opaque<bindings::sg_table>, PhantomData<T>);
> +
> +impl<T: MapState> SGTable<T> {
> +    /// Convert a raw `struct sg_table *` to a `&'a SGTable`.
> +    ///
> +    /// # Safety
> +    ///
> +    /// Callers must ensure that the `struct sg_table` pointed to by `ptr` is valid for the lifetime
> +    /// of the returned reference.

Here also the `sg_table` must be in a state compatible with `MapState`.

> +    #[allow(unused)]
> +    pub(crate) unsafe fn as_ref<'a>(ptr: *mut bindings::sg_table) -> &'a Self {
> +        // SAFETY: Guaranteed by the safety requirements of the function.
> +        unsafe { &*ptr.cast() }
> +    }
> +
> +    /// Convert a raw `struct sg_table *` to a `&'a mut SGTable`.
> +    ///
> +    /// # Safety
> +    ///
> +    /// See safety requirements of [`SGTable::as_ref`]. In addition, callers must ensure that only
> +    /// a single mutable reference can be taken from the same raw pointer, i.e. for the lifetime of the
> +    /// returned reference, no other call to this function on the same `struct sg_table *` should
> +    /// be permitted.
> +    #[allow(unused)]
> +    pub(crate) unsafe fn as_mut<'a>(ptr: *mut bindings::sg_table) -> &'a mut Self {
> +        // SAFETY: Guaranteed by the safety requirements of the function.
> +        unsafe { &mut *ptr.cast() }
> +    }
> +
> +    /// Obtain the raw `struct sg_table *`.
> +    pub(crate) fn as_raw(&self) -> *mut bindings::sg_table {
> +        self.0.get()
> +    }
> +
> +    fn take_sgt(&mut self) -> Opaque<bindings::sg_table> {

Even though this is private, a bit of documentation would be nice.

> +        let sgt: bindings::sg_table = Default::default();
> +        let sgt: Opaque<bindings::sg_table> = Opaque::new(sgt);

let sgt = Opaque::new(bindings::sg_table::default()); ?

> +        core::mem::replace(&mut self.0, sgt)
> +    }
> +}
> +
> +impl SGTable<Unmapped> {
> +    /// Allocate and construct a new scatter-gather table.
> +    pub fn alloc_table(nents: usize, flags: kernel::alloc::Flags) -> Result<Self> {

Is there a need to separate the table allocation from its
initialization? If not you could turn `init` into e.g. `new` and make it
allocate and initialize the list by itself.

The problem I see with this step is that you can very well call `init`
with a closure that does nothing, or initialize the entries incorrectly,
and end up with a table that won't map (or worse, that will bug).

> +        let sgt: Opaque<bindings::sg_table> = Opaque::uninit();
> +
> +        // SAFETY: The sgt pointer is from the Opaque-wrapped `sg_table` object hence is valid.
> +        let ret = unsafe { bindings::sg_alloc_table(sgt.get(), nents as _, flags.as_raw()) };
> +        if ret != 0 {
> +            return Err(Error::from_errno(ret));
> +        }
> +        Ok(Self(sgt, PhantomData))
> +    }
> +
> +    /// The scatter-gather table page initializer.
> +    ///
> +    /// Runs a piece of code that initializes the pages of the scatter-gather table. This call transitions
> +    /// to and returns a `SGTable<Initialized>` object which can then be later mapped via DMA.
> +    ///
> +    /// # Examples
> +    ///
> +    /// ```
> +    /// use kernel::{device::Device, scatterlist::*, page::*};
> +    ///
> +    /// let sgt = SGTable::alloc_table(4, GFP_KERNEL)?;
> +    /// let sgt = sgt.init(|iter| {
> +    ///     for sg in iter {
> +    ///         sg.set_page(&Page::alloc_page(GFP_KERNEL)?, PAGE_SIZE as u32, 0);

Technically all the SGTs must be initialized, so instead of leaving the
burden of iterating to the user, we could make the iteration part of
`init`/`new` and just require the closure to initialize a single SG.

And since the only thing this closure can do is call `set_page` anyway,
and we really want it to not omit that step, why not make `init` take an
iterator of (&Page, Range<u32>) and use that to call `set_page` for us?
That way no SGEntry can be left uninitialized even if the user tries. We
could also allocate the sg_table, if the iterator's `size_hint` can be
trusted (or maybe we can collect the pages temporarily into a vector to
get the exact size).

The usage of `Page` looks also very wrong, but let's discuss that
alongside the backing memory lifetime issue. :)

> +    ///     }
> +    ///     Ok(())
> +    /// })?;
> +    /// # Ok::<(), Error>(())
> +    /// ```
> +    pub fn init(
> +        mut self,
> +        f: impl FnOnce(SGTableIterMut<'_>) -> Result,
> +    ) -> Result<SGTable<Initialized>> {
> +        f(self.iter())?;
> +        let sgt = self.take_sgt();
> +        core::mem::forget(self);
> +        Ok(SGTable(sgt, PhantomData))
> +    }
> +
> +    fn iter(&mut self) -> SGTableIterMut<'_> {

Shouldn't this be named `iter_mut`?

> +        SGTableIterMut {
> +            // SAFETY: dereferenced pointer is valid due to the type invariants on `SGTable`. This call
> +            // is in a private function which is allowed to be called only within the state transition
> +            // function [`SGTable<Unmapped>::init`] ensuring that the mutable reference can only be
> +            // obtained once for this object.
> +            pos: Some(unsafe { SGEntry::<Unmapped>::as_mut((*self.0.get()).sgl) }),
> +        }
> +    }
> +}
> +
> +impl SGTable<Initialized> {
> +    /// Map this scatter-gather table describing a buffer for DMA by the `Device`.
> +    ///
> +    /// This call transitions to and returns a `DeviceSGTable` object.
> +    ///
> +    /// # Examples
> +    ///
> +    /// ```
> +    /// use kernel::{device::{Bound, Device}, scatterlist::*};
> +    ///
> +    /// # fn test(dev: &Device<Bound>, sgt: SGTable<Initialized>) -> Result {
> +    /// let sgt = sgt.dma_map(dev, kernel::dma::DmaDataDirection::DmaToDevice)?;
> +    /// for sg in sgt.iter() {
> +    ///     let _addr = sg.dma_address();
> +    ///     let _len = sg.dma_len();
> +    /// }
> +    /// # Ok::<(), Error>(()) }
> +    /// ```
> +    pub fn dma_map(mut self, dev: &Device<Bound>, dir: DmaDataDirection) -> Result<DeviceSGTable> {
> +        // SAFETY: Invariants on `Device` and `SGTable` ensures that the pointers are valid.
> +        let ret = unsafe {
> +            bindings::dma_map_sgtable(
> +                dev.as_raw(),
> +                self.as_raw(),
> +                dir as _,
> +                bindings::DMA_ATTR_NO_WARN as _,
> +            )
> +        };
> +        if ret != 0 {
> +            return Err(Error::from_errno(ret));
> +        }
> +        let sgt = self.take_sgt();
> +        core::mem::forget(self);
> +        Ok(DeviceSGTable {
> +            sg: SGTable(sgt, PhantomData),
> +            dir,
> +            dev: dev.into(),
> +        })
> +    }
> +}
> +
> +impl SGTable<Mapped> {
> +    /// Returns an immutable iterator over the scather-gather table that is mapped for DMA.
> +    pub fn iter(&self) -> SGTableIter<'_> {
> +        SGTableIter {
> +            // SAFETY: dereferenced pointer is valid due to the type invariants on `SGTable`.
> +            pos: Some(unsafe { SGEntry::<Mapped>::as_ref((*self.0.get()).sgl) }),
> +        }
> +    }
> +}
> +
> +/// A scatter-gather table that is mapped for DMA operation.
> +pub struct DeviceSGTable {
> +    sg: SGTable<Mapped>,
> +    dir: DmaDataDirection,
> +    dev: ARef<Device>,
> +}

Mmm, so the transition graph is `SGTable<Unmapped>` ->
`SGTable<Initialized>` -> `DeviceSGTable.` One would expect
`SGTable<Mapped>.` :)

I think you can achieve this if you move `dir` and `dev` into the
`Mapped` typestate - that way `SGTable<Mapped>` contains everything it
needs, and you can do without the `Deref` and `DerefMut` implementations
below.

> +
> +impl Drop for DeviceSGTable {
> +    fn drop(&mut self) {
> +        // SAFETY: Invariants on `Device<Bound>` and `SGTable` ensures that the `self.dev` and `self.sg`
> +        // pointers are valid.
> +        unsafe {
> +            bindings::dma_unmap_sgtable(self.dev.as_raw(), self.sg.as_raw(), self.dir as _, 0)
> +        };
> +    }
> +}
> +
> +// TODO: Implement these as macros for objects that want to derive from `SGTable`.
> +impl Deref for DeviceSGTable {
> +    type Target = SGTable<Mapped>;
> +
> +    fn deref(&self) -> &Self::Target {
> +        &self.sg
> +    }
> +}
> +
> +impl DerefMut for DeviceSGTable {
> +    fn deref_mut(&mut self) -> &mut Self::Target {
> +        &mut self.sg
> +    }
> +}
> +
> +/// SAFETY: A `SGTable<Mapped>` is an immutable interface and should be safe to `Send` across threads.
> +unsafe impl Send for SGTable<Mapped> {}
> +
> +/// A mutable iterator through `SGTable` entries.
> +pub struct SGTableIterMut<'a> {
> +    pos: Option<&'a mut SGEntry<Unmapped>>,
> +}
> +
> +impl<'a> IntoIterator for &'a mut SGTable<Unmapped> {
> +    type Item = &'a mut SGEntry<Unmapped>;
> +    type IntoIter = SGTableIterMut<'a>;
> +
> +    fn into_iter(self) -> Self::IntoIter {
> +        self.iter()
> +    }
> +}

I wonder if it wouldn't be less confusing to have an `entry_iter` method
in `SGTable<Unmapped>` directly. But maybe we also don't want this at
all if we agree to remove the `Unmapped` state. Is there a need to
iterate over unmapped SG entries like that?

Overall I think this is starting to take shape, the typestate seems to
work here. But there is still the big issue of backing memory lifetime -
let me collect my thoughts some more and throw some test code, and I'll
come back to this in the other thread.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ