linux-kernel - Re: [PATCH v2] rust: alloc: implement `extend` for `Vec`

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <D8ZKOIHKPBCN.OIHEWU3TUE1W@nvidia.com>
Date: Sun, 06 Apr 2025 21:59:19 +0900
From: "Alexandre Courbot" <acourbot@...dia.com>
To: "Boqun Feng" <boqun.feng@...il.com>
Cc: "Danilo Krummrich" <dakr@...nel.org>, "Miguel Ojeda" <ojeda@...nel.org>,
 "Alex Gaynor" <alex.gaynor@...il.com>, "Gary Guo" <gary@...yguo.net>,
 Björn Roy Baron <bjorn3_gh@...tonmail.com>, "Benno Lossin"
 <benno.lossin@...ton.me>, "Andreas Hindborg" <a.hindborg@...nel.org>,
 "Alice Ryhl" <aliceryhl@...gle.com>, "Trevor Gross" <tmgross@...ch.edu>,
 "Joel Fernandes" <joelagnelf@...dia.com>, "John Hubbard"
 <jhubbard@...dia.com>, <rust-for-linux@...r.kernel.org>,
 <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2] rust: alloc: implement `extend` for `Vec`

Hi Boqun, thanks for the review!

On Sun Apr 6, 2025 at 4:44 AM JST, Boqun Feng wrote:
> Hi Alexandre,
>
> Thanks for the patch.
>
> On Sat, Apr 05, 2025 at 10:51:41PM +0900, Alexandre Courbot wrote:
>> KVec currently has `extend_with` and `extend_from_slice` methods, but no
>> way extend a vector from a regular iterator as provided by the `Extend`
>> trait.
>> 
>> Due to the need to provide the GFP flags, `Extend` cannot be implemented
>> directly, so simply define a homonymous method that takes an extra
>> `flags` argument.
>> 
>> The aforementioned `extend_with` and `extend_from_slice` can then be
>> reimplemented as direct invocations of this new method - maybe they can
>> eventually be removed.
>> 
>> Signed-off-by: Alexandre Courbot <acourbot@...dia.com>
>> ---
>> I was a bit surprised to find no equivalent of the `Extend` trait for
>> KVec, and while I anticipate to be told the reason for this, I also
>> didn't hit any hard wall trying to come with my own implementation so
>> here it is.
>> 
>> I expect the new `extend_with` and `extend_from_slice` to be optimized
>> into something close to their previous implementations, but am not sure
>> how I can simply verify that this is the case - any hint would be
>> appreciated!
>> ---
>> Changes in v2:
>> - Changed the diff algorithm to histogram for a more readable patch.
>> ---
>>  rust/kernel/alloc/kvec.rs | 89 +++++++++++++++++++++++++++++------------------
>>  1 file changed, 56 insertions(+), 33 deletions(-)
>> 
>> diff --git a/rust/kernel/alloc/kvec.rs b/rust/kernel/alloc/kvec.rs
>> index ae9d072741cedbb34bed0be0c20cc75472aa53be..e78cb5ee575ce01e44283f8b4905689fb1e96165 100644
>> --- a/rust/kernel/alloc/kvec.rs
>> +++ b/rust/kernel/alloc/kvec.rs
>> @@ -454,30 +454,64 @@ pub fn reserve(&mut self, additional: usize, flags: Flags) -> Result<(), AllocEr
>>      }
>>  }
>>  
>> +impl<T, A: Allocator> Vec<T, A> {
>> +    /// Extends the vector by the elements of `iter`.
>> +    ///
>> +    /// This uses [`Iterator::size_hint`] to optimize reallocation of memory, but will work even
>> +    /// with imprecise implementations - albeit with potentially more memory reallocations.
>> +    ///
>> +    /// In the kernel most iterators are expected to have a precise `size_hint` implementation, so
>> +    /// this should nicely optimize out in most cases.
>> +    pub fn extend<I>(&mut self, iter: I, flags: Flags) -> Result<(), AllocError>
>> +    where
>> +        I: IntoIterator<Item = T>,
>> +    {
>> +        let mut iter = iter.into_iter();
>> +
>> +        loop {
>> +            let low_bound = match iter.size_hint() {
>> +                // No more items expected, we can return.
>> +                (0, Some(0)) => break,
>> +                // Possibly more items but not certain, tentatively add one.
>> +                (0, _) => 1,
>> +                // More items pending, reserve space for the lower bound.
>> +                (low_bound, _) => low_bound,
>> +            };
>> +
>> +            self.reserve(low_bound, flags)?;
>> +
>
> I want to point out this might cause a behavior change, previously
> extend_with() and extend_with_slice() do a "all-or-nothing" extension
> depending on memory allocation, i.e. if there is enough memory for all
> the new items, do the extension, otherwise do nothing. Your changes here
> make it that extension can fail in-between due to AllocError, that is,
> only part of the `iter` is added. Of course, in practice, both
> slice::Iter and iter::Take will just return the number of all the items
> as the low_bound of .size_hint(), but it's not guaranteed.

That's a very valid point, and one of the reasons why I would like to
see how the code is actually optimized in `extend_with` and
`extend_with_slice`. While the method is designed to handle
imprecise/incorrect implementations of `size_hint`, the expectation is
that outside of very unusual uses the code should be able to be
optimized into a single-allocation, non-loop pass.

We could probably enforce that if we had access to `TrustedLen` or
defined our own equivalent trait for the kernel. The set of iterators
that could be passed as arguments would be more limited, but for the
same reason as above I don't think that would be a big limitation.

> I don't see a direct correct-or-wrong answer for what behavior is
> desired, but if we are moving to a new behavior, we need to make sure
> updating the document of the extend*() function. Plus if failing
> in-between, should we return the `iter` so that users can continue do
> something about the `iter`?

I have updated the documentation with more details about the sub-optimal
and error cases. I am not sure what use a caller would have from the
remaining items - and after all, the currently existing methods also
don't return the iterator upon failure. If we want to preserve the
current behavior, we can always reduce the size of the vector to its
pre-call value on the error path.

>
>> +            // Number of items we effectively added.
>> +            let added_items = self
>> +                .spare_capacity_mut()
>> +                .into_iter()
>> +                // Take a mutable reference to the iterator so we can reuse it in the next
>> +                // iteration of the loop if needed.
>> +                .zip(&mut iter)
>> +                .fold(0, |count, (dst, src)| {
>> +                    dst.write(src);
>> +
>> +                    count + 1
>> +                });
>> +
>> +            // SAFETY:
>> +            // - `self.len() + added_items <= self.capacity()` due to the call to `reserve` above,
>> +            // - items `[self.len()..self.len() + added_items - 1]` are initialized.
>> +            unsafe { self.set_len(self.len() + added_items) };
>> +
>> +            // `size_hint` was incorrect and our iterator ended before its advertized low bound.
>> +            if added_items < low_bound {
>> +                break;
>> +            }
>> +        }
>> +
>> +        Ok(())
>> +    }
>> +}
>> +
>>  impl<T: Clone, A: Allocator> Vec<T, A> {
>>      /// Extend the vector by `n` clones of `value`.
>>      pub fn extend_with(&mut self, n: usize, value: T, flags: Flags) -> Result<(), AllocError> {
>> -        if n == 0 {
>> -            return Ok(());
>> -        }
>> -
>> -        self.reserve(n, flags)?;
>> -
>> -        let spare = self.spare_capacity_mut();
>> -
>> -        for item in spare.iter_mut().take(n - 1) {
>> -            item.write(value.clone());
>> -        }
>> -
>> -        // We can write the last element directly without cloning needlessly.
>> -        spare[n - 1].write(value);
>> -
>> -        // SAFETY:
>> -        // - `self.len() + n < self.capacity()` due to the call to reserve above,
>> -        // - the loop and the line above initialized the next `n` elements.
>> -        unsafe { self.set_len(self.len() + n) };
>> -
>> -        Ok(())
>> +        self.extend(core::iter::repeat(value).take(n), flags)
>
> Would this actually call T::clone() n times instead of n - 1 times?

Indeed - we probably want to use core::iter::repeat_n() here to mimic
the original behavior.

Cheers,
Alex.