lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <cabdde17-5191-4ce6-8cf3-7e7e929e5671@proton.me>
Date: Mon, 05 Aug 2024 20:02:57 +0000
From: Benno Lossin <benno.lossin@...ton.me>
To: Matt Gilbride <mattgilbride@...gle.com>, Miguel Ojeda <ojeda@...nel.org>, Alex Gaynor <alex.gaynor@...il.com>, Wedson Almeida Filho <wedsonaf@...il.com>, Boqun Feng <boqun.feng@...il.com>, Gary Guo <gary@...yguo.net>, Björn Roy Baron <bjorn3_gh@...tonmail.com>, Andreas Hindborg <a.hindborg@...sung.com>, Alice Ryhl <aliceryhl@...gle.com>, Greg Kroah-Hartman <gregkh@...uxfoundation.org>, Arve Hjønnevåg <arve@...roid.com>, Todd Kjos <tkjos@...roid.com>, Martijn Coenen <maco@...roid.com>, Joel Fernandes <joel@...lfernandes.org>, Carlos Llamas <cmllamas@...gle.com>, Suren Baghdasaryan <surenb@...gle.com>, Christian Brauner <brauner@...nel.org>
Cc: Rob Landley <rob@...dley.net>, Davidlohr Bueso <dave@...olabs.net>, Michel Lespinasse <michel@...pinasse.org>, rust-for-linux@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v8 6/6] rust: rbtree: add `RBTree::entry`

On 27.07.24 22:30, Matt Gilbride wrote:
> From: Alice Ryhl <aliceryhl@...gle.com>
> 
> This mirrors the entry API [1] from the Rust standard library on
> `RBTree`. This API can be used to access the entry at a specific key and
> make modifications depending on whether the key is vacant or occupied.
> This API is useful because it can often be used to avoid traversing the
> tree multiple times.
> 
> This is used by binder to look up and conditionally access or insert a
> value, depending on whether it is there or not [2].
> 
> Link: https://doc.rust-lang.org/stable/std/collections/btree_map/enum.Entry.html [1]
> Link: https://android-review.googlesource.com/c/kernel/common/+/2849906 [2]
> Signed-off-by: Alice Ryhl <aliceryhl@...gle.com>
> Tested-by: Alice Ryhl <aliceryhl@...gle.com>
> Signed-off-by: Matt Gilbride <mattgilbride@...gle.com>
> ---
>  rust/kernel/rbtree.rs | 302 +++++++++++++++++++++++++++++++++++++-------------
>  1 file changed, 227 insertions(+), 75 deletions(-)
> 
> diff --git a/rust/kernel/rbtree.rs b/rust/kernel/rbtree.rs
> index 5d37aa373685..428f8be8f3a2 100644
> --- a/rust/kernel/rbtree.rs
> +++ b/rust/kernel/rbtree.rs
> @@ -295,12 +295,19 @@ pub fn try_create_and_insert(
>      /// key/value pair). Returns [`None`] if a node with the same key didn't already exist.
>      ///
>      /// This function always succeeds.
> -    pub fn insert(&mut self, RBTreeNode { node }: RBTreeNode<K, V>) -> Option<RBTreeNode<K, V>> {
> -        let node = Box::into_raw(node);
> -        // SAFETY: `node` is valid at least until we call `Box::from_raw`, which only happens when
> -        // the node is removed or replaced.
> -        let node_links = unsafe { addr_of_mut!((*node).links) };
> +    pub fn insert(&mut self, node: RBTreeNode<K, V>) -> Option<RBTreeNode<K, V>> {
> +        match self.raw_entry(&node.node.key) {
> +            RawEntry::Occupied(entry) => Some(entry.replace(node)),
> +            RawEntry::Vacant(entry) => {
> +                entry.insert(node);
> +                None
> +            }
> +        }
> +    }
> 
> +    fn raw_entry(&mut self, key: &K) -> RawEntry<'_, K, V> {
> +        let raw_self: *mut RBTree<K, V> = self;
> +        // The returned `RawEntry` is used to call either `rb_link_node` or `rb_replace_node`.
>          // The parameters of `rb_link_node` are as follows:
>          // - `node`: A pointer to an uninitialized node being inserted.
>          // - `parent`: A pointer to an existing node in the tree. One of its child pointers must be
> @@ -319,62 +326,56 @@ pub fn insert(&mut self, RBTreeNode { node }: RBTreeNode<K, V>) -> Option<RBTree
>          // in the subtree of `parent` that `child_field_of_parent` points at. Once
>          // we find an empty subtree, we can insert the new node using `rb_link_node`.
>          let mut parent = core::ptr::null_mut();
> -        let mut child_field_of_parent: &mut *mut bindings::rb_node = &mut self.root.rb_node;
> -        while !child_field_of_parent.is_null() {
> -            parent = *child_field_of_parent;
> +        let mut child_field_of_parent: &mut *mut bindings::rb_node =
> +            // SAFETY: `raw_self` is a valid pointer to the `RBTree` (created from `self` above).
> +            unsafe { &mut (*raw_self).root.rb_node };
> +        while !(*child_field_of_parent).is_null() {

Why do you manually dereference `child_field_of_parent` here?

> +            let curr = *child_field_of_parent;
> +            // SAFETY: All links fields we create are in a `Node<K, V>`.

I think the SAFETY comment from below that argues via the type invariant
of `Self` is better.

> +            let node = unsafe { container_of!(curr, Node<K, V>, links) };
> 
> -            // We need to determine whether `node` should be the left or right child of `parent`,
> -            // so we will compare with the `key` field of `parent` a.k.a. `this` below.
> -            //
> -            // SAFETY: By the type invariant of `Self`, all non-null `rb_node` pointers stored in `self`
> -            // point to the links field of `Node<K, V>` objects.
> -            let this = unsafe { container_of!(parent, Node<K, V>, links) };
> -
> -            // SAFETY: `this` is a non-null node so it is valid by the type invariants. `node` is
> -            // valid until the node is removed.
> -            match unsafe { (*node).key.cmp(&(*this).key) } {
> -                // We would like `node` to be the left child of `parent`.  Move to this child to check
> -                // whether we can use it, or continue searching, at the next iteration.
> -                //
> -                // SAFETY: `parent` is a non-null node so it is valid by the type invariants.
> -                Ordering::Less => child_field_of_parent = unsafe { &mut (*parent).rb_left },
> -                // We would like `node` to be the right child of `parent`.  Move to this child to check
> -                // whether we can use it, or continue searching, at the next iteration.
> -                //
> -                // SAFETY: `parent` is a non-null node so it is valid by the type invariants.
> -                Ordering::Greater => child_field_of_parent = unsafe { &mut (*parent).rb_right },
> +            // SAFETY: `node` is a non-null node so it is valid by the type invariants.
> +            match key.cmp(unsafe { &(*node).key }) {
> +                // SAFETY: `curr` is a non-null node so it is valid by the type invariants.
> +                Ordering::Less => child_field_of_parent = unsafe { &mut (*curr).rb_left },
> +                // SAFETY: `curr` is a non-null node so it is valid by the type invariants.
> +                Ordering::Greater => child_field_of_parent = unsafe { &mut (*curr).rb_right },
>                  Ordering::Equal => {
> -                    // There is an existing node in the tree with this key, and that node is
> -                    // parent.  Thus, we are replacing parent with a new node.
> -                    //
> -                    // INVARIANT: We are replacing an existing node with a new one, which is valid.
> -                    // It remains valid because we "forgot" it with `Box::into_raw`.
> -                    // SAFETY: All pointers are non-null and valid.
> -                    unsafe { bindings::rb_replace_node(parent, node_links, &mut self.root) };
> -
> -                    // INVARIANT: The node is being returned and the caller may free it, however,
> -                    // it was removed from the tree. So the invariants still hold.
> -                    return Some(RBTreeNode {
> -                        // SAFETY: `this` was a node in the tree, so it is valid.
> -                        node: unsafe { Box::from_raw(this.cast_mut()) },
> -                    });
> +                    return RawEntry::Occupied(OccupiedEntry {
> +                        rbtree: self,
> +                        node_links: curr,
> +                    })
>                  }
>              }
> +            parent = curr;
>          }
> 
> -        // INVARIANT: We are linking in a new node, which is valid. It remains valid because we
> -        // "forgot" it with `Box::into_raw`.
> -        // SAFETY: All pointers are non-null and valid (`*child_field_of_parent` is null, but `child_field_of_parent` is a
> -        // mutable reference).
> -        unsafe { bindings::rb_link_node(node_links, parent, child_field_of_parent) };
> +        RawEntry::Vacant(RawVacantEntry {

RawVacantEntry has Invariants, so missing INVARIANT comment.

> +            rbtree: raw_self,
> +            parent,
> +            child_field_of_parent,
> +            _phantom: PhantomData,
> +        })
> +    }

[...]

> +/// A view into a vacant entry in a [`RBTree`]. It is part of the [`Entry`] enum.
> +pub struct VacantEntry<'a, K, V> {
> +    key: K,
> +    raw: RawVacantEntry<'a, K, V>,
> +}
> +
> +/// Like [`VacantEntry`], but doesn't hold on to the key.a

Typo: trailing 'a'.

> +///
> +/// # Invariants
> +/// - `parent` may be null if the new node becomes the root.
> +/// - `child_field_of_parent` is a valid pointer to the left-child or right-child of `parent`. If `parent` is
> +///     null, it is a pointer to the root of the [`RBTree`].
> +struct RawVacantEntry<'a, K, V> {
> +    rbtree: *mut RBTree<K, V>,
> +    /// The node that will become the parent of the new node if we insert one.
> +    parent: *mut bindings::rb_node,
> +    /// This points to the left-child or right-child field of `parent`, or `root` if `parent` is
> +    /// null.
> +    child_field_of_parent: *mut *mut bindings::rb_node,
> +    _phantom: PhantomData<&'a mut RBTree<K, V>>,
> +}
> +
> +impl<'a, K, V> RawVacantEntry<'a, K, V> {
> +    /// Inserts the given node into the [`RBTree`] at this entry.
> +    ///
> +    /// The `node` must have a key such that inserting it here does not break the ordering of this
> +    /// [`RBTree`].
> +    fn insert(self, node: RBTreeNode<K, V>) -> &'a mut V {
> +        let node = Box::into_raw(node.node);
> +
> +        // SAFETY: `node` is valid at least until we call `Box::from_raw`, which only happens when
> +        // the node is removed or replaced.
> +        let node_links = unsafe { addr_of_mut!((*node).links) };
> +
> +        // INVARIANT: We are linking in a new node, which is valid. It remains valid because we
> +        // "forgot" it with `Box::into_raw`.
> +        // SAFETY: The type invariants of `RawVacantEntry` are exactly the safety requirements of `rb_link_node`.
> +        unsafe { bindings::rb_link_node(node_links, self.parent, self.child_field_of_parent) };
> +
> +        // SAFETY: All pointers are valid. `node` has just been inserted into the tree.
> +        unsafe { bindings::rb_insert_color(node_links, addr_of_mut!((*self.rbtree).root)) };
> +
> +        // SAFETY: The node is valid until we remove it from the tree.
> +        unsafe { &mut (*node).value }
> +    }
> +}
> +
> +impl<'a, K, V> VacantEntry<'a, K, V> {
> +    /// Inserts the given node into the [`RBTree`] at this entry.
> +    pub fn insert(self, value: V, reservation: RBTreeNodeReservation<K, V>) -> &'a mut V {
> +        self.raw.insert(reservation.into_node(self.key, value))
> +    }
> +}
> +
> +/// A view into an occupied entry in a [`RBTree`]. It is part of the [`Entry`] enum.
> +///
> +/// # Invariants
> +/// - `node_links` is a valid, non-null pointer to a tree node in `self.rbtree`
> +pub struct OccupiedEntry<'a, K, V> {
> +    rbtree: &'a mut RBTree<K, V>,
> +    /// The node that this entry corresponds to.
> +    node_links: *mut bindings::rb_node,
> +}
> +
> +impl<'a, K, V> OccupiedEntry<'a, K, V> {
> +    fn node_ptr(&self) -> *mut Node<K, V> {
> +        // SAFETY: By the type invariant of `Self`, all `node_links` pointers stored in `self`
> +        // point to the links field of `Node<K, V>` objects.
> +        unsafe { container_of!(self.node_links, Node<K, V>, links) }.cast_mut()

You should not call `cast_mut` here, see below

> +    }
> +
> +    /// Gets a reference to the value in the entry.
> +    pub fn get(&self) -> &V {
> +        // SAFETY: `self.node_ptr` produces a valid pointer to a node in the tree.

Can you add a `# Guarantees` section to `node_ptr` that states exactly
this?

> +        unsafe { &(*self.node_ptr()).value }
> +    }
> +
> +    /// Gets a mutable reference to the value in the entry.
> +    pub fn get_mut(&mut self) -> &mut V {
> +        // SAFETY: `self.node_ptr` produces a valid pointer to a node in the tree.
> +        unsafe { &mut (*self.node_ptr()).value }

This is sadly UB, you are creating a `&mut` reference from a pointer
that was derived from a `&` reference:
- `node_ptr` takes `&self`, thus it converts the `&mut self` to that.
- `container_of!` inside of `node_ptr` is used to create a read-only
  pointer to the `links` field (it is casted to `*mut`, but that doesn't
  change the fact that you are only allowed to use it for reads)
- `get_mut` turns it again into a `&mut` reference.

One solution is to make `note_ptr` take a `*mut Self`/`*const Self`.

> +    }
> +
> +    /// Converts the entry into a mutable reference to its value.
> +    ///
> +    /// If you need multiple references to the `OccupiedEntry`, see [`self#get_mut`].
> +    pub fn into_mut(self) -> &'a mut V {
> +        // SAFETY: `self.node_ptr` produces a valid pointer to a node in the tree.
> +        unsafe { &mut (*self.node_ptr()).value }
> +    }
> +
> +    /// Remove this entry from the [`RBTree`].
> +    pub fn remove_node(self) -> RBTreeNode<K, V> {
> +        // SAFETY: The node is a node in the tree, so it is valid.
> +        unsafe { bindings::rb_erase(self.node_links, &mut self.rbtree.root) };
> +
> +        // INVARIANT: The node is being returned and the caller may free it, however, it was
> +        // removed from the tree. So the invariants still hold.
> +        RBTreeNode {
> +            // SAFETY: The node was a node in the tree, but we removed it, so we can convert it
> +            // back into a box.
> +            node: unsafe { Box::from_raw(self.node_ptr()) },
> +        }
> +    }
> +
> +    /// Takes the value of the entry out of the map, and returns it.
> +    pub fn remove(self) -> V {
> +        self.remove_node().node.value
> +    }
> +
> +    /// Swap the current node for the provided node.
> +    ///
> +    /// The key of both nodes must be equal.

Is this a safety requirement? Ie if it is violated, can memory bugs
occur, or is it only going to lead to logic bugs?

---
Cheers,
Benno

> +    fn replace(self, node: RBTreeNode<K, V>) -> RBTreeNode<K, V> {
> +        let node = Box::into_raw(node.node);
> +
> +        // SAFETY: `node` is valid at least until we call `Box::from_raw`, which only happens when
> +        // the node is removed or replaced.
> +        let new_node_links = unsafe { addr_of_mut!((*node).links) };
> +
> +        // SAFETY: This updates the pointers so that `new_node_links` is in the tree where
> +        // `self.node_links` used to be.
> +        unsafe {
> +            bindings::rb_replace_node(self.node_links, new_node_links, &mut self.rbtree.root)
> +        };
> +
> +        // SAFETY:
> +        // - `self.node_ptr` produces a valid pointer to a node in the tree.
> +        // - Now that we removed this entry from the tree, we can convert the node to a box.
> +        let old_node = unsafe { Box::from_raw(self.node_ptr()) };
> +
> +        RBTreeNode { node: old_node }
> +    }
> +}
> +
>  struct Node<K, V> {
>      links: bindings::rb_node,
>      key: K,
> 
> --
> 2.46.0.rc1.232.g9752f9e123-goog
> 


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ