lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4231ee9d-66a0-47b4-b285-3e142b142675@proton.me>
Date: Thu, 14 Mar 2024 14:19:43 +0000
From: Benno Lossin <benno.lossin@...ton.me>
To: Matt Gilbride <mattgilbride@...gle.com>, Miguel Ojeda <ojeda@...nel.org>, Alex Gaynor <alex.gaynor@...il.com>, Wedson Almeida Filho <wedsonaf@...il.com>, Boqun Feng <boqun.feng@...il.com>, Gary Guo <gary@...yguo.net>, Björn Roy Baron <bjorn3_gh@...tonmail.com>, Andreas Hindborg <a.hindborg@...sung.com>, Alice Ryhl <aliceryhl@...gle.com>, Greg Kroah-Hartman <gregkh@...uxfoundation.org>, Arve Hjønnevåg <arve@...roid.com>, Todd Kjos <tkjos@...roid.com>, Martijn Coenen <maco@...roid.com>, Joel Fernandes <joel@...lfernandes.org>, Carlos Llamas <cmllamas@...gle.com>, Suren Baghdasaryan <surenb@...gle.com>, Christian Brauner <brauner@...nel.org>
Cc: Rob Landley <rob@...dley.net>, Davidlohr Bueso <dave@...olabs.net>, Michel Lespinasse <michel@...pinasse.org>, rust-for-linux@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 2/6] rust: rbtree: add red-black tree implementation backed by the C version

On 2/19/24 12:48, Matt Gilbride wrote:
> +impl<K, V> RBTree<K, V>
> +where
> +    K: Ord,
> +{
> +    /// Tries to insert a new value into the tree.
> +    ///
> +    /// It overwrites a node if one already exists with the same key and returns it (containing the
> +    /// key/value pair). Returns [`None`] if a node with the same key didn't already exist.
> +    ///
> +    /// Returns an error if it cannot allocate memory for the new node.
> +    pub fn try_create_and_insert(&mut self, key: K, value: V) -> Result<Option<RBTreeNode<K, V>>> {
> +        Ok(self.insert(Self::try_allocate_node(key, value)?))
> +    }
> +
> +    /// Inserts a new node into the tree.
> +    ///
> +    /// It overwrites a node if one already exists with the same key and returns it (containing the
> +    /// key/value pair). Returns [`None`] if a node with the same key didn't already exist.
> +    ///
> +    /// This function always succeeds.
> +    pub fn insert(&mut self, node: RBTreeNode<K, V>) -> Option<RBTreeNode<K, V>> {
> +        let RBTreeNode { node } = node;

You can do this pattern deconstruction directly in the function
signature.

> +        let node = Box::into_raw(node);
> +        // SAFETY: `node` is valid at least until we call `Box::from_raw`, which only happens when
> +        // the node is removed or replaced.
> +        let node_links = unsafe { addr_of_mut!((*node).links) };
> +        let mut new_link: &mut *mut bindings::rb_node = &mut self.rootrb_node;

Is the only reason for this being a double pointer that `rb_link_node`
requires that as its last argument? If yes, then I would just add a
`&mut` there and give this the simpler type.

Also a more fitting name IMO would be `cur_node` or similar.

> +        let mut parent = core::ptr::null_mut();

I would suggest naming this `prev_node` or similar.

> +        while !new_link.is_null() {
> +            // SAFETY: All links fields we create are in a `Node<K, V>`.

Suggestion:
     // SAFETY: By the type invariant of `Self`, all non-null `rb_node` pointers stored in `self`
     // point to the links field of `Node<K, V>` objects.

> +            let this = unsafe { crate::container_of!(*new_link, Node<K, V>, links) };

The `container_of` macro is used 3 times, I think it's nicer to import
it.

> +
> +            parent = *new_link;
> +
> +            // SAFETY: `this` is a non-null node so it is valid by the type invariants. `node` is
> +            // valid until the node is removed.
> +            match unsafe { (*node).key.cmp(&(*this).key) } {
> +                // SAFETY: `parent` is a non-null node so it is valid by the type invariants.
> +                Ordering::Less => new_link = unsafe { &mut (*parent)rb_left },

I would use `*new_link` instead of `parent` here.

> +                // SAFETY: `parent` is a non-null node so it is valid by the type invariants.
> +                Ordering::Greater => new_link = unsafe { &mut (*parent).rb_right },

Ditto.

> +                Ordering::Equal => {
> +                    // INVARIANT: We are replacing an existing node with a new one, which is valid.
> +                    // It remains valid because we "forgot" it with `Box::into_raw`.
> +                    // SAFETY: All pointers are non-null and valid (parent, despite the name, really
> +                    // is the node we're replacing).
> +                    unsafe { bindings::rb_replace_node(parent, node_links, &mut self.root) };

If you use `*new_link` instead of `parent` (and perform the rename) then
you don't need the note in the parenthesis in the safety comment.

> +
> +                    // INVARIANT: The node is being returned and the caller may free it, however,
> +                    // it was removed from the tree. So the invariants still hold.
> +                    return Some(RBTreeNode {
> +                        // SAFETY: `this` was a node in the tree, so it is valid.
> +                        node: unsafe { Box::from_raw(this as _) },

Why do you need this cast? Can it be replaced by `.cast()` or
`.cast_mut()`?

> +                    });
> +                }
> +            }
> +        }
> +
> +        // INVARIANT: We are linking in a new node, which is valid. It remains valid because we
> +        // "forgot" it with `Box::into_raw`.
> +        // SAFETY: All pointers are non-null and valid (`*new_link` is null, but `new_link` is a
> +        // mutable reference).
> +        unsafe { bindings::rb_link_node(node_links, parent, new_link) };
> +
> +        // SAFETY: All pointers are valid. `node` has just been inserted into the tree.
> +        unsafe { bindings::rb_insert_color(node_links, &mut self.root) };
> +        None
> +    }
> +
> +    /// Returns a node with the given key, if one exists.
> +    fn find(&self, key: &K) -> Option<NonNull<Node<K, V>>> {
> +        let mut node = self.root.rb_node;
> +        while !node.is_null() {
> +            // SAFETY: All links fields we create are in a `Node<K, V>`.

See above suggestion.

> +            let this = unsafe { crate::container_of!(node, Node<K, V>, links) };
> +            // SAFETY: `this` is a non-null node so it is valid by the type invariants.
> +            node = match key.cmp(unsafe { &(*this).key }) {
> +                // SAFETY: `node` is a non-null node so it is valid by the type invariants.
> +                Ordering::Less => unsafe { (*node).rb_left },
> +                // SAFETY: `node` is a non-null node so it is valid by the type invariants.
> +                Ordering::Greater => unsafe { (*node).rb_right },
> +                Ordering::Equal => return NonNull::new(this as _),

Why do you need this cast? Can it be replaced by `.cast()` or
`.cast_mut()`?

> +            }
> +        }

If you modify this function to return the parent node if `key` were in
the tree, then you could use this in `insert` instead of having to write
two loops.

> +        None
> +    }
> +
> +    /// Returns a reference to the value corresponding to the key.
> +    pub fn get(&self, key: &K) -> Option<&V> {
> +        // SAFETY: The `find` return value is a node in the tree, so it is valid.
> +        self.find(key).map(|node| unsafe { &node.as_ref().value })
> +    }
> +
> +    /// Returns a mutable reference to the value corresponding to the key.
> +    pub fn get_mut(&mut self, key: &K) -> Option<&mut V> {
> +        // SAFETY: The `find` return value is a node in the tree, so it is valid.
> +        self.find(key)
> +            .map(|mut node| unsafe { &mut node.as_mut().value })
> +    }
> +
> +    /// Removes the node with the given key from the tree.
> +    ///
> +    /// It returns the node that was removed if one exists, or [`None`] otherwise.
> +    fn remove_node(&mut self, key: &K) -> Option<RBTreeNode<K, V>> {
> +        let mut node = self.find(key)?;
> +
> +        // SAFETY: The `find` return value is a node in the tree, so it is valid.
> +        unsafe { bindings::rb_erase(&mut node.as_mut().links, &mut self.root) };
> +
> +        // INVARIANT: The node is being returned and the caller may free it, however, it was
> +        // removed from the tree. So the invariants still hold.
> +        Some(RBTreeNode {
> +            // SAFETY: The `find` return value was a node in the tree, so it is valid.
> +            node: unsafe { Box::from_raw(node.as_ptr()) },
> +        })
> +    }
> +
> +    /// Removes the node with the given key from the tree.
> +    ///
> +    /// It returns the value that was removed if one exists, or [`None`] otherwise.
> +    pub fn remove(&mut self, key: &K) -> Option<V> {
> +        let node = self.remove_node(key)?;
> +        let RBTreeNode { node } = node;
> +        let Node {
> +            links: _,
> +            key: _,
> +            value,
> +        } = *node;
> +        Some(value)

This could be a one-liner:
     self.remove_node(key).map(|node| node.node.value)

> +    }
> +}
> +
> +impl<K, V> Default for RBTree<K, V> {
> +    fn default() -> Self {
> +        Self::new()
> +    }
> +}
> +
> +impl<K, V> Drop for RBTree<K, V> {
> +    fn drop(&mut self) {
> +        // SAFETY: `root` is valid as it's embedded in `self` and we have a valid `self`.
> +        let mut next = unsafe { bindings::rb_first_postorder(&self.root) };
> +
> +        // INVARIANT: The loop invariant is that all tree nodes from `next` in postorder are valid.
> +        while !next.is_null() {
> +            // SAFETY: All links fields we create are in a `Node<K, V>`.
> +            let this = unsafe { crate::container_of!(next, Node<K, V>, links) };
> +
> +            // Find out what the next node is before disposing of the current one.
> +            // SAFETY: `next` and all nodes in postorder are still valid.
> +            next = unsafe { bindings::rb_next_postorder(next) };
> +
> +            // INVARIANT: This is the destructor, so we break the type invariant during clean-up,
> +            // but it is not observable. The loop invariant is still maintained.
> +            // SAFETY: `this` is valid per the loop invariant.
> +            unsafe { drop(Box::from_raw(this as *mut Node<K, V>)) };

Use `.cast_mut()` instead of `as ...`.

> +        }
> +    }
> +}
> +
> +/// A memory reservation for a red-black tree node.
> +///
> +/// It contains the memory needed to hold a node that can be inserted into a red-black tree. One
> +/// can be obtained by directly allocating it ([`RBTree::try_reserve_node`]).
> +pub struct RBTreeNodeReservation<K, V> {
> +    node: Box<MaybeUninit<Node<K, V>>>,
> +}
> +
> +// SAFETY: An [`RBTree`] allows the same kinds of access to its values that a struct allows to its
> +// fields, so we use the same Send condition as would be used for a struct with K and V fields.
> +unsafe impl<K: Send, V: Send> Send for RBTreeNodeReservation<K, V> {}
> +
> +// SAFETY: An [`RBTree`] allows the same kinds of access to its values that a struct allows to its
> +// fields, so we use the same Sync condition as would be used for a struct with K and V fields.
> +unsafe impl<K: Sync, V: Sync> Sync for RBTreeNodeReservation<K, V> {}

The two safety comments are copy-pasted. `RBTreeNodeReservation` does not
allow any kind of access to its values.

> +
> +impl<K, V> RBTreeNodeReservation<K, V> {
> +    /// Initialises a node reservation.
> +    ///
> +    /// It then becomes an [`RBTreeNode`] that can be inserted into a tree.
> +    pub fn into_node(mut self, key: K, value: V) -> RBTreeNode<K, V> {
> +        let node_ptr = self.node.as_mut_ptr();
> +        // SAFETY: `node_ptr` is valid, and so are its fields.
> +        unsafe { addr_of_mut!((*node_ptr).links).write(bindings::rb_node::default()) };
> +        // SAFETY: `node_ptr` is valid, and so are its fields.
> +        unsafe { addr_of_mut!((*node_ptr).key).write(key) };
> +        // SAFETY: `node_ptr` is valid, and so are its fields.
> +        unsafe { addr_of_mut!((*node_ptr).value).write(value) };
> +        let raw = Box::into_raw(self.node);
> +        RBTreeNode {
> +            // SAFETY: The pointer came from a `MaybeUninit<Node>` whose fields have all been
> +            // initialised. Additionally, it has the same layout as `Node`.
> +            node: unsafe { Box::from_raw(raw as _) },
> +        }

Instead of doing `into_raw` and `from_raw`, I would use
`Box::assume_init` (it is unstable via `new_uninit`).

> +    }
> +}
> +
> +/// A red-black tree node.
> +///
> +/// The node is fully initialised (with key and value) and can be inserted into a tree without any
> +/// extra allocations or failure paths.
> +pub struct RBTreeNode<K, V> {
> +    node: Box<Node<K, V>>,
> +}
> +
> +// SAFETY: An [`RBTree`] allows the same kinds of access to its values that a struct allows to its
> +// fields, so we use the same Send condition as would be used for a struct with K and V fields.
> +unsafe impl<K: Send, V: Send> Send for RBTreeNode<K, V> {}
> +
> +// SAFETY: An [`RBTree`] allows the same kinds of access to its values that a struct allows to its
> +// fields, so we use the same Sync condition as would be used for a struct with K and V fields.
> +unsafe impl<K: Sync, V: Sync> Sync for RBTreeNode<K, V> {}

The two safety comments are copy-pasted. `RBTreeNode` does not
allow any kind of access to its values.

-- 
Cheers,
Benno

> 
> --
> 2.44.0.rc0.258.g7320e95886-goog
> 


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ