linux-hardening - Re: [PATCH 1/2] rust: harden index manipulation using ownership

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZuhXjbb45QrefFb4@phenom.ffwll.local>
Date: Mon, 16 Sep 2024 18:06:37 +0200
From: Simona Vetter <simona.vetter@...ll.ch>
To: Alice Ryhl <aliceryhl@...gle.com>
Cc: Miguel Ojeda <ojeda@...nel.org>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Kees Cook <kees@...nel.org>, Boqun Feng <boqun.feng@...il.com>,
	Gary Guo <gary@...yguo.net>,
	Björn Roy Baron <bjorn3_gh@...tonmail.com>,
	Benno Lossin <benno.lossin@...ton.me>,
	Andreas Hindborg <a.hindborg@...sung.com>,
	Trevor Gross <tmgross@...ch.edu>,
	Carlos Llamas <cmllamas@...gle.com>, rust-for-linux@...r.kernel.org,
	linux-hardening@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/2] rust: harden index manipulation using ownership

On Fri, Sep 13, 2024 at 09:00:29PM +0000, Alice Ryhl wrote:
> Rust is very effective at preventing memory safety bugs such as
> use-after-free bugs. Can we also make it effective at preventing bugs
> that arise from using indices incorrectly? As it stands today, accessing
> an array out of bounds is caught and results in a kernel panic, but Rust
> will not stop you from doing it. Accessing an array at the wrong index
> is not caught at all. This patch explores ways to catch these kinds of
> indexing bugs.
> 
> The Android Binder driver has a component where it "translates" the
> contents of transactions sent using the driver, which involves complex
> and error-prone index manipulation. The C driver has had several bugs in
> this area. A few examples:
> 
> 4df153652cc4 ("binder: fix UAF caused by offsets overwrite")
> bdc1c5fac982 ("binder: fix UAF caused by faulty buffer cleanup")
> 16981742717b ("binder: fix incorrect calculation for num_valid")
> a56587065094 ("binder: Set end of SG buffer area properly.")
> 
> Rust is not immune to these issues either. In fact, I've already
> encountered and fixed two bugs like this in Rust Binder. In both cases,
> the bugs could result in `Allocation::cleanup_object` calls on invalid
> data. Rust's safety guarantees prevented these bugs from affecting the
> integrity of kernel itself, but an attacker could use them to e.g.
> manipulate handles that are present in other processes and for example
> obtaining IApplicationThread handle belonging to another app from
> system_server, which in turn allows loading code into that app.
> 
> Ultimately, the root cause of all of these bugs is that there is some
> index in the destination buffer that gets written to twice.
> 
> The primary idea of this new Range API is to utilize Rust's ownership
> semantics to prevent reuse of indices. The idea is that you create one
> big Range for the entire range of indices, and then there are various
> methods to split it into smaller ranges. An example from Rust Binder,
> where several kinds of data are stored next to each other:
> 
> // Split the buffer size into sub-ranges.
> let full_range = Range::new_area(len);
> let (data_range, after_data) = full_range.split_within(aligned_data_size)?;
> let (offsets_range, after_offsets) = after_data.split_within(aligned_offsets_size)?;
> let (buffers_range, after_buffers) = after_offsets.split_within(aligned_buffers_size)?;
> let (secctx_range, after_secctx) = after_buffers.split_within(aligned_secctx_size)?;
> after_secctx.assert_empty()?;
> 
> The Range type is set up so that it cannot be copied or cloned, which
> means that any attempts to use a Range more than once will fail to
> compile. For example, if the line creating `buffers_range` accidentally
> tried to split `after_data` instead of `after_offsets`, then that would
> lead to this compilation error:
> 
> error[E0382]: use of moved value: `after_data`
>     --> /usr/local/google/home/aliceryhl/rust-android-mainline/common/drivers/android/binder/thread.rs:1101:50
>      |
> 1099 | let (data_range, after_data) = full_range.split_within(aligned_data_size)?;
>      |                  ---------- move occurs because `after_data` has type `kernel::range::Range`, which does not implement the `Copy` trait
> 1100 | let (offsets_range, after_offsets) = after_data.split_within(aligned_offsets_size)?;
>      |                                                 ---------------------------------- `after_data` moved due to this method call
> 1101 | let (buffers_range, after_buffers) = after_data.split_within(aligned_buffers_size)?;
>      |                                      ^^^^^^^^^^ value used here after move
>      |
> note: `kernel::range::Range::split_within` takes ownership of the receiver `self`, which moves `after_data`
>     --> /usr/local/google/home/aliceryhl/rust-android-mainline/common/rust/kernel/range.rs:108:29
>      |
> 108  | pub fn split_within(mut self, length: usize) -> Result<(Range, Range), RangeError> {
>      |                         ^^^^
> 
> Basically, the error says that you tried to use the `after_data` range
> twice, which is not allowed because `split_within` destroys the object
> you call it on.
> 
> In Rust Binder, I am using the ranges to prevent reuse of indices in the
> *destination* buffer. To enforce that, all methods for writing to the
> destination buffer are modified to take a `Range` as an argument,
> consuming the range being written to. As ranges can only be split into
> smaller pieces, this enforces that you will never write to the same
> index twice.
> 
> Of course, the API is not completely bullet-proof. If you call
> `Range::new_area` twice, you'll get two overlapping ranges. But we don't
> need it to be completely impossible to misuse. It just needs to be
> really difficult to do so accidentally.
> 
> Binder has one case where it intentionally writes to the same location
> twice: when sending file descriptors over Binder, the driver does not
> know what value the fd will have when transferring the data, so it will
> first write u32::MAX. Much later, it will overwrite it with the real fd.
> There is a `duplicate_range_careful` method for this case.
> 
> It is interesting to compare with the uaccess patchset [1]. The uaccess
> API takes a userspace pointer and length representing a range of bytes
> in userspace, and lets you read the range incrementally. The API design
> prevents reading the same address twice. This helps prevent
> time-of-check-to-time-of-use bugs where userspace modifies the data in
> between two reads, which can cause bugs if the driver assumes that the
> memory is unchanged.

I don't really see an overlap between Range and UserSlice. The former is
about managing a range of indexes, the other about incremental
reading/writing to userspace without screwing up. I think we want both,
since often you get the things you need to carefuly slice/dice with Range
from somewhere else, e.g. from pagecache or the fw loader.

Also where performance doesn't matter so much often the copy_from_user is
one step and then we parse some complex data structure in there where
Range should be useful.

> In fact, both Rust Binder bugs mentioned earlier are caught by this part
> of the uaccess API, as both bugs eventually attempt to read past the end
> of the userspace region, triggering an error. Unfortunately, this
> happens too late, as the previously translated data has already been
> overwritten by the time the error is triggered.
> 
> This patchset is also simliar to Benno's untrusted data patchset [2],
> which explores a different way to write misuse-resistant APIs.

I think you want both: Range to slice up a block of memory into variable
length structure, Untrusted to safely parse each part.
> 
> Link: https://lore.kernel.org/r/20240528-alice-mm-v7-0-78222c31b8f4@google.com [1]
> Link: https://lore.kernel.org/r/20240913112643.542914-1-benno.lossin@proton.me [2]
> Signed-off-by: Alice Ryhl <aliceryhl@...gle.com>

I didn't take a close look at the api since real feedback would mean I
need to type up a user, but this looks great and I want :-)
-Sima


> ---
>  rust/kernel/lib.rs   |   1 +
>  rust/kernel/range.rs | 236 +++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 237 insertions(+)
>  create mode 100644 rust/kernel/range.rs
> 
> diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
> index a7fefc4ae725..72a998cd02e0 100644
> --- a/rust/kernel/lib.rs
> +++ b/rust/kernel/lib.rs
> @@ -48,6 +48,7 @@
>  pub mod page;
>  pub mod prelude;
>  pub mod print;
> +pub mod range;
>  pub mod rbtree;
>  pub mod security;
>  pub mod seq_file;
> diff --git a/rust/kernel/range.rs b/rust/kernel/range.rs
> new file mode 100644
> index 000000000000..8fb9d96ac724
> --- /dev/null
> +++ b/rust/kernel/range.rs
> @@ -0,0 +1,236 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +// Copyright (C) 2024 Google LLC.
> +
> +//! Utilities for working with ranges of indices.
> +
> +/// A range of indices.
> +///
> +/// This utility is useful for ensuring that no index in the range is used more than once.
> +#[derive(Debug)]
> +pub struct Range {
> +    offset: usize,
> +    length: usize,
> +}
> +
> +impl Range {
> +    /// Creates a new `Range` for an area of the given size.
> +    pub fn new_area(size: usize) -> Self {
> +        Self {
> +            offset: 0,
> +            length: size,
> +        }
> +    }
> +
> +    /// Use this range of indices.
> +    ///
> +    /// This destroys the `Range` object, so these indices cannot be used again after this call.
> +    pub fn use_range(self) -> UsedRange {
> +        UsedRange {
> +            offset: self.offset,
> +            length: self.length,
> +        }
> +    }
> +
> +    /// Duplicate this range.
> +    ///
> +    /// Be careful when using this method, as it allows you to use a range twice.
> +    pub fn duplicate_range_careful(&self) -> Range {
> +        Range {
> +            offset: self.offset,
> +            length: self.length,
> +        }
> +    }
> +
> +    /// Peek at the offset without using the range.
> +    ///
> +    /// This doesn't destroy the `Range` object, so be careful that the range is not used twice.
> +    pub fn peek_offset(&self) -> usize {
> +        self.offset
> +    }
> +
> +    /// Peek at the length without using the range.
> +    ///
> +    /// This doesn't destroy the `Range` object, so be careful that the range is not used twice.
> +    pub fn peek_length(&self) -> usize {
> +        self.length
> +    }
> +
> +    /// Peek at the end without using the range.
> +    ///
> +    /// This doesn't destroy the `Range` object, so be careful that the range is not used twice.
> +    pub fn peek_end(&self) -> Result<usize, RangeError> {
> +        self.offset.checked_add(self.length).ok_or(RangeError)
> +    }
> +
> +    /// Truncates this range to the given length.
> +    pub fn truncate(&mut self, length: usize) -> Result<(), RangeError> {
> +        if length > self.length {
> +            return Err(RangeError);
> +        }
> +        self.length = length;
> +        Ok(())
> +    }
> +
> +    /// Assert that this range is aligned properly.
> +    pub fn assert_aligned(&self, alignment: usize) -> Result<(), RangeError> {
> +        if self.offset % alignment == 0 {
> +            Ok(())
> +        } else {
> +            Err(RangeError)
> +        }
> +    }
> +
> +    /// Assert that this range has the expected length.
> +    pub fn assert_length_eq(&self, length: usize) -> Result<(), RangeError> {
> +        if self.length == length {
> +            Ok(())
> +        } else {
> +            Err(RangeError)
> +        }
> +    }
> +
> +    /// Assert that this range is empty.
> +    pub fn assert_empty(self) -> Result<(), RangeError> {
> +        self.assert_length_eq(0)
> +    }
> +
> +    /// Splits the range into two sub-ranges.
> +    ///
> +    /// Fails if the `length` is greater than the range being split.
> +    pub fn split_within(mut self, length: usize) -> Result<(Range, Range), RangeError> {
> +        let left = self.take_from_start(length)?;
> +        Ok((left, self))
> +    }
> +
> +    /// Splits the range into two sub-ranges.
> +    ///
> +    /// Fails if the `position` is not within the current range.
> +    pub fn split_at(mut self, position: usize) -> Result<(Range, Range), RangeError> {
> +        let left = self.take_until(position)?;
> +        Ok((left, self))
> +    }
> +
> +    /// Modify this range by taking the first `length` bytes.
> +    pub fn take_until(&mut self, position: usize) -> Result<Range, RangeError> {
> +        let from_start = Range {
> +            offset: self.offset,
> +            length: position.checked_sub(self.offset).ok_or(RangeError)?,
> +        };
> +
> +        let new_self = Range {
> +            offset: position,
> +            length: self
> +                .length
> +                .checked_sub(from_start.length)
> +                .ok_or(RangeError)?,
> +        };
> +
> +        *self = new_self;
> +
> +        Ok(from_start)
> +    }
> +
> +    /// Modify this range by taking the first `length` bytes.
> +    pub fn take_from_start(&mut self, length: usize) -> Result<Range, RangeError> {
> +        let from_start = Range {
> +            offset: self.offset,
> +            length: length,
> +        };
> +
> +        let new_self = Range {
> +            offset: self.offset.checked_add(length).ok_or(RangeError)?,
> +            length: self.length.checked_sub(length).ok_or(RangeError)?,
> +        };
> +
> +        *self = new_self;
> +
> +        Ok(from_start)
> +    }
> +
> +    /// Split this range into sub-ranges of the given size.
> +    pub fn iter_chunks(self, chunk_size: usize) -> Result<ChunkIter, RangeError> {
> +        if self.length % chunk_size != 0 {
> +            return Err(RangeError);
> +        }
> +
> +        Ok(ChunkIter {
> +            pos: self.offset,
> +            end: self.offset.checked_add(self.length).ok_or(RangeError)?,
> +            chunk_size,
> +        })
> +    }
> +}
> +
> +/// An iterator over ranges of the same size.
> +pub struct ChunkIter {
> +    pos: usize,
> +    end: usize,
> +    chunk_size: usize,
> +}
> +
> +impl Iterator for ChunkIter {
> +    type Item = Range;
> +    fn next(&mut self) -> Option<Range> {
> +        if self.pos >= self.end {
> +            return None;
> +        }
> +
> +        let range = Range {
> +            offset: self.pos,
> +            length: self.chunk_size,
> +        };
> +        self.pos = self.pos + self.chunk_size;
> +
> +        Some(range)
> +    }
> +}
> +
> +/// A version of [`Range`] where the length is a compile-time constant.
> +///
> +/// This can be used to store a `Range` without using up space to store the length.
> +pub struct RangeFixedSize<const LENGTH: usize> {
> +    offset: usize,
> +}
> +
> +impl<const LENGTH: usize> RangeFixedSize<LENGTH> {
> +    /// Create a `RangeFixedSize` from a `Range`.
> +    pub fn from_range(range: Range) -> Result<Self, RangeError> {
> +        if range.length == LENGTH {
> +            Ok(Self {
> +                offset: range.offset,
> +            })
> +        } else {
> +            Err(RangeError)
> +        }
> +    }
> +
> +    /// Convert this back into a `Range`.
> +    pub fn into_range(self) -> Range {
> +        Range {
> +            offset: self.offset,
> +            length: LENGTH,
> +        }
> +    }
> +}
> +
> +/// The return value of [`Range::use_range`].
> +///
> +/// The only way to access the indices in a range is to mark it "used", which converts it into this
> +/// type, destroying the original [`Range`] object.
> +#[derive(Copy, Clone)]
> +pub struct UsedRange {
> +    /// The offset of this range.
> +    pub offset: usize,
> +    /// The length of this range.
> +    pub length: usize,
> +}
> +
> +/// The error type returned when ranges are used incorrectly.
> +pub struct RangeError;
> +
> +impl From<RangeError> for crate::error::Error {
> +    fn from(_range: RangeError) -> crate::error::Error {
> +        crate::error::code::EINVAL
> +    }
> +}
> -- 
> 2.46.0.662.g92d0881bb0-goog
> 

-- 
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch