lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1ecebc7f-a1a5-488b-832c-40471fa3e61e@proton.me>
Date: Wed, 11 Sep 2024 08:46:11 +0000
From: Benno Lossin <benno.lossin@...ton.me>
To: Danilo Krummrich <dakr@...nel.org>
Cc: ojeda@...nel.org, alex.gaynor@...il.com, wedsonaf@...il.com, boqun.feng@...il.com, gary@...yguo.net, bjorn3_gh@...tonmail.com, a.hindborg@...sung.com, aliceryhl@...gle.com, akpm@...ux-foundation.org, daniel.almeida@...labora.com, faith.ekstrand@...labora.com, boris.brezillon@...labora.com, lina@...hilina.net, mcanal@...lia.com, zhiw@...dia.com, cjia@...dia.com, jhubbard@...dia.com, airlied@...hat.com, ajanulgu@...hat.com, lyude@...hat.com, linux-kernel@...r.kernel.org, rust-for-linux@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH v6 13/26] rust: alloc: implement kernel `Vec` type

On 11.09.24 02:18, Danilo Krummrich wrote:
> On Tue, Sep 10, 2024 at 07:32:06PM +0000, Benno Lossin wrote:
>> On 16.08.24 02:10, Danilo Krummrich wrote:
>>> +    ($elem:expr; $n:expr) => (
>>> +        {
>>> +            $crate::alloc::KVec::from_elem($elem, $n, GFP_KERNEL)
>>
>> Hmm, could it be that one might want to use `kvec!` without
>> `GFP_KERNEL`?
>> Or add additional flags, eg __GFP_ZERO?
> 
> Pretty unlikely, I'd say. __GFP_ZERO in particular wouldn't make much sense,
> since the macro enforces initialization anyways.
> 
> Maybe something like GFP_ATOMIC, but I expect this to be uncommon enough to not
> consider this for this macro for now.

SGTM

>> If you think that supporting this is not immediately necessary or if
>> this is too much for this patchset, then we can also postpone it (maybe
>> it could be a good-first-issue). Do you keep a list of future
>> improvements for the new allocator API somewhere? If not, then I think
>> we should create one (best place would be the issue tracker on GH).
> 
> I would only add it if it turns out to be a common need. As mentioned, I don't
> expect it to be.
> 
> I'd rather keep issues in the source tree. For instance, DRM has them in
> '/Documentation/gpu/todo.rst'.

But then you need to submit patches to change them... That sounds like a
hassle and we already have the precedent to use the github issue
tracker. It is also much better for discoverability for people outside
of the kernel.

>>> +        }
>>> +    );
>>> +    ($($x:expr),+ $(,)?) => (
>>> +        {
>>> +            match $crate::alloc::KBox::new_uninit(GFP_KERNEL) {
>>> +                Ok(b) => Ok($crate::alloc::KVec::from($crate::alloc::KBox::write(b, [$($x),+]))),
>>> +                Err(e) => Err(e),
>>> +            }
>>> +        }
>>> +    );
>>> +}
>>> +
>>> +/// The kernel's [`Vec`] type.
>>> +///
>>> +/// A contiguous growable array type with contents allocated with the kernel's allocators (e.g.
>>> +/// `Kmalloc`, `Vmalloc` or `KVmalloc`), written `Vec<T, A>`.
>>
>> New folks might get confused as to which allocator they should choose.
>> Do we have a sensible default for `Vec`?
> 
> Then they should read the documentation about the kernel's allocators. We
> already link them in rust/kernel/alloc/allocator.rs.
> 
>> If yes, then we at least should document that here. We might also want
>> to make it the default value for the generic parameter.
>> This is also a good idea for `Box`.
> 
> If we really want a default it should be `Kmalloc`, but I really think we should
> force people to make an explicit decision and think about it and don't just go
> with whatever default.
> 
> It makes it also easier to grep for things. :)

SGTM

>>> +///
>>> +/// For non-zero-sized values, a [`Vec`] will use the given allocator `A` for its allocation. For
>>> +/// the most common allocators the type aliases `KVec`, `VVec` and `KVVec` exist.
>>> +///
>>> +/// For zero-sized types the [`Vec`]'s pointer must be `dangling_mut::<T>`; no memory is allocated.
>>> +///
>>> +/// Generally, [`Vec`] consists of a pointer that represents the vector's backing buffer, the
>>> +/// capacity of the vector (the number of elements that currently fit into the vector), it's length
>>> +/// (the number of elements that are currently stored in the vector) and the `Allocator` type used
>>> +/// to allocate (and free) the backing buffer.
>>> +///
>>> +/// A [`Vec`] can be deconstructed into and (re-)constructed from it's previously named raw parts
>>> +/// and manually modified.
>>> +///
>>> +/// [`Vec`]'s backing buffer gets, if required, automatically increased (re-allocated) when elements
>>> +/// are added to the vector.
>>> +///
>>> +/// # Invariants
>>> +///
>>> +/// The [`Vec`] backing buffer's pointer is always properly aligned and either points to memory
>>> +/// allocated with `A` or, for zero-sized types, is a dangling pointer.
>>
>> Just use `self.ptr` instead of "backing buffer".
>>
>>> +///
>>> +/// The length of the vector always represents the exact number of elements stored in the vector.
>>
>> Same here, `self.len`.
>>
>>> +///
>>> +/// The capacity of the vector always represents the absolute number of elements that can be stored
>>
>> Ditto.
>>
>>> +/// within the vector without re-allocation. However, it is legal for the backing buffer to be
>>> +/// larger than `size_of<T>` times the capacity.
>>> +///
>>> +/// The `Allocator` type `A` of the vector is the exact same `Allocator` type the backing buffer was
>>> +/// allocated with (and must be freed with).
>>
>> Please turn this into a bullet-point list.
> 
> Is this a rule? Do we want to make it one?

I am trying to make it one with my safety standard.

>>> +pub struct Vec<T, A: Allocator> {
>>> +    ptr: NonNull<T>,
>>> +    /// Represents the actual buffer size as `cap` times `size_of::<T>` bytes.
>>> +    ///
>>> +    /// Note: This isn't quite the same as `Self::capacity`, which in contrast returns the number of
>>> +    /// elements we can still store without reallocating.
>>> +    ///
>>> +    /// # Invariants
>>> +    ///
>>> +    /// `cap` must be in the `0..=isize::MAX` range.
>>> +    cap: usize,
>>> +    len: usize,
>>> +    _p: PhantomData<A>,
>>> +}



>>> +    /// Creates a new, empty Vec<T, A>.
>>> +    ///
>>> +    /// This method does not allocate by itself.
>>> +    #[inline]
>>> +    pub const fn new() -> Self {
>>> +        Self {
>>> +            ptr: NonNull::dangling(),
>>> +            cap: 0,
>>> +            len: 0,
>>> +            _p: PhantomData::<A>,
>>> +        }
>>> +    }
>>> +
>>> +    /// Returns a slice of `MaybeUninit<T>` for the remaining spare capacity of the vector.
>>
>> Returns
> 
> Hm?

Forgot to finish that sentence, since I couldn't really pinpoint what
exactly I wanted to change. Just ignore it.

>>> +    pub fn spare_capacity_mut(&mut self) -> &mut [MaybeUninit<T>] {
>>> +        // SAFETY: The memory between `self.len` and `self.capacity` is guaranteed to be allocated
>>> +        // and valid, but uninitialized.
>>> +        unsafe {
>>> +            slice::from_raw_parts_mut(
>>> +                self.as_mut_ptr().add(self.len) as *mut MaybeUninit<T>,
>>> +                self.capacity() - self.len,
>>> +            )
>>> +        }



>>> +    ///
>>> +    /// Otherwise:
>>> +    ///
>>> +    /// - `ptr` must have been allocated with the allocator `A`.
>>> +    /// - `ptr` must satisfy or exceed the alignment requirements of `T`.
>>> +    /// - `ptr` must point to memory with a size of at least `size_of::<T>` times the `capacity`
>>> +    ///    bytes.
>>
>> Just write "`size_of::<T>() * capacity` bytes".
>>
>>> +    /// - The allocated size in bytes must not be larger than `isize::MAX`.
>>
>> Should we make this a safety requirement of `Allocator`? I think this is
>> generally the maximum size other allocated things can have anyways.
> 
> 
> 
>>
>>> +    /// - `length` must be less than or equal to `capacity`.
>>> +    /// - The first `length` elements must be initialized values of type `T`.
>>> +    ///
>>> +    /// It is also valid to create an empty `Vec` passing a dangling pointer for `ptr` and zero for
>>> +    /// `cap` and `len`.
>>
>> Can you make this last sentence part of the `if` chain that you have
>> above (ie the one started with "If `T` is a ZST:").
> 
> But `T` isn't necessarily a ZST, but it may be.

I originally meant adding an "else if" part that checks for empty
capacity. But you could just add that to the if at the top ie "If
`capacity == 0` or `T` is a ZST".

>> Just to experiment with the suggestion from Kangrejos to use Rust as the
>> language for safety documentation, here is what it could look like (we
>> should discuss it more before we start using it):
>>
>>     /// # Safety
>>     ///
>>     /// ```ignore
>>     /// assert!(ptr.is_aligned_to(align_of::<T>()));
>>     /// assert!(!ptr.is_null());
>>     /// assert!(length <= capacity);
>>     /// if capacity != 0 && size_of::<T>() != 0 {
>>     ///     assert!(A::did_allocate(ptr));
>>     ///     assert!(size_of::<T>() * capacity <= A::layout_of(ptr).len());
>>     ///     assert!(is_initialized(ptr::slice_from_raw_parts(ptr, length)));
>>     /// }
>>     /// ```
>>
>> I really like how this looks! We might want to add labels/names to each
>> of the conditions and then one could use those in the justifications. A
>> tool could then read those and match them to the requirements of the
>> unsafe operation.
> 
> I need to think about this a bit more, but at a first glance I think I like it.
> 
> The tool would ideally be the compiler itself...

Yes! There is the contracts draft PR that might add support for
this: https://github.com/rust-lang/rust/pull/128045

---
Cheers,
Benno


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ