linux-kernel - Re: [PATCH v2 1/4] rust: iov: add iov_iter abstractions for ITER

Message-ID: <878qkyoi6d.fsf@kernel.org>
Date: Tue, 08 Jul 2025 16:45:14 +0200
From: Andreas Hindborg <a.hindborg@...nel.org>
To: "Alice Ryhl" <aliceryhl@...gle.com>
Cc: "Greg Kroah-Hartman" <gregkh@...uxfoundation.org>,  "Alexander Viro"
 <viro@...iv.linux.org.uk>,  "Arnd Bergmann" <arnd@...db.de>,  "Miguel
 Ojeda" <ojeda@...nel.org>,  "Boqun Feng" <boqun.feng@...il.com>,  "Gary
 Guo" <gary@...yguo.net>,  Björn Roy Baron
 <bjorn3_gh@...tonmail.com>,
  "Trevor Gross" <tmgross@...ch.edu>,  "Danilo Krummrich"
 <dakr@...nel.org>,  "Matthew Maurer" <mmaurer@...gle.com>,  "Lee Jones"
 <lee@...nel.org>,  <linux-kernel@...r.kernel.org>,
  <rust-for-linux@...r.kernel.org>,  "Benno Lossin" <lossin@...nel.org>
Subject: Re: [PATCH v2 1/4] rust: iov: add iov_iter abstractions for
 ITER_SOURCE

"Alice Ryhl" <aliceryhl@...gle.com> writes:

> This adds abstractions for the iov_iter type in the case where
> data_source is ITER_SOURCE. This will make Rust implementations of
> fops->write_iter possible.
>
> This series only has support for using existing IO vectors created by C
> code. Additional abstractions will be needed to support the creation of
> IO vectors in Rust code.
>
> These abstractions make the assumption that `struct iov_iter` does not
> have internal self-references, which implies that it is valid to move it
> between different local variables.
>
> Signed-off-by: Alice Ryhl <aliceryhl@...gle.com>
> ---
>  rust/kernel/iov.rs | 152 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>  rust/kernel/lib.rs |   1 +
>  2 files changed, 153 insertions(+)
>
> diff --git a/rust/kernel/iov.rs b/rust/kernel/iov.rs
> new file mode 100644
> index 0000000000000000000000000000000000000000..b4d7ec14c57a561a01cd65b6bdf0f94b1b373b84
> --- /dev/null
> +++ b/rust/kernel/iov.rs
> @@ -0,0 +1,152 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +// Copyright (C) 2025 Google LLC.
> +
> +//! IO vectors.
> +//!
> +//! C headers: [`include/linux/iov_iter.h`](srctree/include/linux/iov_iter.h),
> +//! [`include/linux/uio.h`](srctree/include/linux/uio.h)
> +
> +use crate::{
> +    alloc::{Allocator, Flags},
> +    bindings,
> +    prelude::*,
> +    types::Opaque,
> +};
> +use core::{marker::PhantomData, mem::MaybeUninit, slice};
> +
> +const ITER_SOURCE: bool = bindings::ITER_SOURCE != 0;
> +
> +/// An IO vector that acts as a source of data.
> +///
> +/// The data may come from many different sources. This includes both things in kernel-space and
> +/// reading from userspace. It's not necessarily the case that the data source is immutable, so
> +/// rewinding the IO vector to read the same data twice is not guaranteed to result in the same
> +/// bytes. It's also possible that the data source is mapped in a thread-local manner using e.g.
> +/// `kmap_local_page()`, so this type is not `Send` to ensure that the mapping is read from the
> +/// right context in that scenario.
> +///
> +/// # Invariants
> +///
> +/// Must hold a valid `struct iov_iter` with `data_source` set to `ITER_SOURCE`. For the duration
> +/// of `'data`, it must be safe to read the data in this IO vector.

In my opinion, the phrasing you had in v1 was better:

  The buffers referenced by the IO vector must be valid for reading for
  the duration of `'data`.

That is, I would prefer "must be valid for reading" over "it must be
safe to read ...".

> +#[repr(transparent)]
> +pub struct IovIterSource<'data> {
> +    iov: Opaque<bindings::iov_iter>,
> +    /// Represent to the type system that this value contains a pointer to readable data it does
> +    /// not own.
> +    _source: PhantomData<&'data [u8]>,
> +}
> +
> +impl<'data> IovIterSource<'data> {
> +    /// Obtain an `IovIterSource` from a raw pointer.
> +    ///
> +    /// # Safety
> +    ///
> +    /// * For the duration of `'iov`, the `struct iov_iter` must remain valid and must not be
> +    ///   accessed except through the returned reference.
> +    /// * For the duration of `'data`, the buffers backing this IO vector must be valid for
> +    ///   reading.
> +    #[track_caller]
> +    #[inline]
> +    pub unsafe fn from_raw<'iov>(ptr: *mut bindings::iov_iter) -> &'iov mut IovIterSource<'data> {
> +        // SAFETY: The caller ensures that `ptr` is valid.
> +        let data_source = unsafe { (*ptr).data_source };
> +        assert_eq!(data_source, ITER_SOURCE);
> +
> +        // SAFETY: The caller ensures the struct invariants for the right durations.
> +        unsafe { &mut *ptr.cast::<IovIterSource<'data>>() }
> +    }
> +
> +    /// Access this as a raw `struct iov_iter`.
> +    #[inline]
> +    pub fn as_raw(&mut self) -> *mut bindings::iov_iter {
> +        self.iov.get()
> +    }
> +
> +    /// Returns the number of bytes available in this IO vector.
> +    ///
> +    /// Note that this may overestimate the number of bytes. For example, reading from userspace
> +    /// memory could fail with `EFAULT`, which will be treated as the end of the IO vector.
> +    #[inline]
> +    pub fn len(&self) -> usize {
> +        // SAFETY: It is safe to access the `count` field.

Reiterating my comment from v1: why is it safe to access the `count` field?
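
For illustration only, here is one possible shape for such a justification,
as a minimal userspace sketch. `Cursor`, `Layout` and `Iter` are hypothetical
stand-ins and do not mirror the real bindgen layout of `struct iov_iter`.
The point is that the SAFETY comment names the invariant that makes the
union read sound:

```rust
/// Hypothetical view of a C struct that sits inside an anonymous union.
#[repr(C)]
#[derive(Clone, Copy)]
struct Cursor {
    offset: usize,
    count: usize,
}

/// Hypothetical union with two overlapping views of the same two words.
#[repr(C)]
union Layout {
    cursor: Cursor,
    words: [usize; 2],
}

/// # Invariants
///
/// `layout` is always fully initialized. Both views consist only of `usize`
/// words, so every initialized bit pattern is valid for either field.
struct Iter {
    layout: Layout,
}

impl Iter {
    fn new(count: usize) -> Self {
        // INVARIANT: Both words of the union are initialized here.
        Iter {
            layout: Layout {
                cursor: Cursor { offset: 0, count },
            },
        }
    }

    fn len(&self) -> usize {
        // SAFETY: By the type invariants, `layout` is fully initialized and
        // any initialized bit pattern is a valid `usize`, so reading `count`
        // through the `cursor` view cannot observe an invalid value.
        unsafe { self.layout.cursor.count }
    }
}
```

With this sketch, `Iter::new(7).len()` returns 7. Whether the same argument
carries over to the real `struct iov_iter` depends on its actual layout and
on the invariants documented for `IovIterSource`.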

> +        unsafe {
> +            (*self.iov.get())
> +                .__bindgen_anon_1
> +                .__bindgen_anon_1
> +                .as_ref()
> +                .count
> +        }
> +    }
> +
> +    /// Returns whether there are any bytes left in this IO vector.
> +    ///
> +    /// This may return `true` even if there are no more bytes available. For example, reading from
> +    /// userspace memory could fail with `EFAULT`, which will be treated as the end of the IO vector.
> +    #[inline]
> +    pub fn is_empty(&self) -> bool {
> +        self.len() == 0
> +    }
> +
> +    /// Advance this IO vector by `bytes` bytes.
> +    ///
> +    /// If `bytes` is larger than the size of this IO vector, it is advanced to the end.
> +    #[inline]
> +    pub fn advance(&mut self, bytes: usize) {
> +        // SAFETY: `self.iov` is a valid IO vector.
> +        unsafe { bindings::iov_iter_advance(self.as_raw(), bytes) };
> +    }
> +
> +    /// Advance this IO vector backwards by `bytes` bytes.
> +    ///
> +    /// # Safety
> +    ///
> +    /// The IO vector must not be reverted to before its beginning.
> +    #[inline]
> +    pub unsafe fn revert(&mut self, bytes: usize) {
> +        // SAFETY: `self.iov` is a valid IO vector, and `bytes` is in bounds.
> +        unsafe { bindings::iov_iter_revert(self.as_raw(), bytes) };
> +    }
> +
> +    /// Read data from this IO vector.
> +    ///
> +    /// Returns the number of bytes that have been copied.
> +    #[inline]
> +    pub fn copy_from_iter(&mut self, out: &mut [u8]) -> usize {
> +        // SAFETY: We will not write uninitialized bytes to `out`.

Can you provide something to back this claim?
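
To make the concern concrete: the function body is not quoted above, so the
following is a userspace sketch under the assumption that the `&mut [u8]` is
reinterpreted as `&mut [MaybeUninit<u8>]` and handed to a lower-level copy
routine. `fill_raw` and `copy_into` are hypothetical names, not part of the
patch. The cast is only sound if the callee never de-initializes a slot, and
that is the property the SAFETY comment needs to argue for:

```rust
use core::mem::MaybeUninit;

/// Hypothetical stand-in for the low-level copy routine. It writes at most
/// `out.len()` bytes and only ever stores initialized values.
fn fill_raw(src: &[u8], out: &mut [MaybeUninit<u8>]) -> usize {
    let n = src.len().min(out.len());
    for (dst, byte) in out[..n].iter_mut().zip(src) {
        // `MaybeUninit::write` stores an initialized byte into the slot.
        dst.write(*byte);
    }
    n
}

/// Copies from `src` into an already-initialized buffer.
fn copy_into(src: &[u8], out: &mut [u8]) -> usize {
    // SAFETY: `MaybeUninit<u8>` has the same layout as `u8`, so the cast
    // preserves the slice length. `fill_raw` only stores initialized bytes
    // and never de-initializes a slot, so every element of `out` is still
    // initialized when this reborrow ends.
    let raw = unsafe { &mut *(out as *mut [u8] as *mut [MaybeUninit<u8>]) };
    fill_raw(src, raw)
}
```

Calling `copy_into(&[1, 2], &mut buf)` on a zeroed four-byte `buf` returns 2
and leaves `buf` as `[1, 2, 0, 0]`. In the sketch the claim is backed by
pointing at what `fill_raw` does; the comment in the patch would need an
equivalent reference to the behaviour of the C helper it calls.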


Best regards,
Andreas Hindborg