[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZuhXjbb45QrefFb4@phenom.ffwll.local>
Date: Mon, 16 Sep 2024 18:06:37 +0200
From: Simona Vetter <simona.vetter@...ll.ch>
To: Alice Ryhl <aliceryhl@...gle.com>
Cc: Miguel Ojeda <ojeda@...nel.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Kees Cook <kees@...nel.org>, Boqun Feng <boqun.feng@...il.com>,
Gary Guo <gary@...yguo.net>,
Björn Roy Baron <bjorn3_gh@...tonmail.com>,
Benno Lossin <benno.lossin@...ton.me>,
Andreas Hindborg <a.hindborg@...sung.com>,
Trevor Gross <tmgross@...ch.edu>,
Carlos Llamas <cmllamas@...gle.com>, rust-for-linux@...r.kernel.org,
linux-hardening@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/2] rust: harden index manipulation using ownership
On Fri, Sep 13, 2024 at 09:00:29PM +0000, Alice Ryhl wrote:
> Rust is very effective at preventing memory safety bugs such as
> use-after-free bugs. Can we also make it effective at preventing bugs
> that arise from using indices incorrectly? As it stands today, accessing
> an array out of bounds is caught and results in a kernel panic, but Rust
> will not stop you from doing it. Accessing an array at the wrong index
> is not caught at all. This patch explores ways to catch these kinds of
> indexing bugs.
>
> The Android Binder driver has a component where it "translates" the
> contents of transactions sent using the driver, which involves complex
> and error-prone index manipulation. The C driver has had several bugs in
> this area. A few examples:
>
> 4df153652cc4 ("binder: fix UAF caused by offsets overwrite")
> bdc1c5fac982 ("binder: fix UAF caused by faulty buffer cleanup")
> 16981742717b ("binder: fix incorrect calculation for num_valid")
> a56587065094 ("binder: Set end of SG buffer area properly.")
>
> Rust is not immune to these issues either. In fact, I've already
> encountered and fixed two bugs like this in Rust Binder. In both cases,
> the bugs could result in `Allocation::cleanup_object` calls on invalid
> data. Rust's safety guarantees prevented these bugs from affecting the
> integrity of kernel itself, but an attacker could use them to e.g.
> manipulate handles that are present in other processes and for example
> obtaining IApplicationThread handle belonging to another app from
> system_server, which in turn allows loading code into that app.
>
> Ultimately, the root cause of all of these bugs is that there is some
> index in the destination buffer that gets written to twice.
>
> The primary idea of this new Range API is to utilize Rust's ownership
> semantics to prevent reuse of indices. The idea is that you create one
> big Range for the entire range of indices, and then there are various
> methods to split it into smaller ranges. An example from Rust Binder,
> where several kinds of data are stored next to each other:
>
> // Split the buffer size into sub-ranges.
> let full_range = Range::new_area(len);
> let (data_range, after_data) = full_range.split_within(aligned_data_size)?;
> let (offsets_range, after_offsets) = after_data.split_within(aligned_offsets_size)?;
> let (buffers_range, after_buffers) = after_offsets.split_within(aligned_buffers_size)?;
> let (secctx_range, after_secctx) = after_buffers.split_within(aligned_secctx_size)?;
> after_secctx.assert_empty()?;
>
> The Range type is set up so that it cannot be copied or cloned, which
> means that any attempts to use a Range more than once will fail to
> compile. For example, if the line creating `buffers_range` accidentally
> tried to split `after_data` instead of `after_offsets`, then that would
> lead to this compilation error:
>
> error[E0382]: use of moved value: `after_data`
> --> /usr/local/google/home/aliceryhl/rust-android-mainline/common/drivers/android/binder/thread.rs:1101:50
> |
> 1099 | let (data_range, after_data) = full_range.split_within(aligned_data_size)?;
> | ---------- move occurs because `after_data` has type `kernel::range::Range`, which does not implement the `Copy` trait
> 1100 | let (offsets_range, after_offsets) = after_data.split_within(aligned_offsets_size)?;
> | ---------------------------------- `after_data` moved due to this method call
> 1101 | let (buffers_range, after_buffers) = after_data.split_within(aligned_buffers_size)?;
> | ^^^^^^^^^^ value used here after move
> |
> note: `kernel::range::Range::split_within` takes ownership of the receiver `self`, which moves `after_data`
> --> /usr/local/google/home/aliceryhl/rust-android-mainline/common/rust/kernel/range.rs:108:29
> |
> 108 | pub fn split_within(mut self, length: usize) -> Result<(Range, Range), RangeError> {
> | ^^^^
>
> Basically, the error says that you tried to use the `after_data` range
> twice, which is not allowed because `split_within` destroys the object
> you call it on.
>
> In Rust Binder, I am using the ranges to prevent reuse of indices in the
> *destination* buffer. To enforce that, all methods for writing to the
> destination buffer are modified to take a `Range` as an argument,
> consuming the range being written to. As ranges can only be split into
> smaller pieces, this enforces that you will never write to the same
> index twice.
>
> Of course, the API is not completely bullet-proof. If you call
> `Range::new_area` twice, you'll get two overlapping ranges. But we don't
> need it to be completely impossible to misuse. It just needs to be
> really difficult to do so accidentally.
>
> Binder has one case where it intentionally writes to the same location
> twice: when sending file descriptors over Binder, the driver does not
> know what value the fd will have when transferring the data, so it will
> first write u32::MAX. Much later, it will overwrite it with the real fd.
> There is a `duplicate_range_careful` method for this case.
>
> It is interesting to compare with the uaccess patchset [1]. The uaccess
> API takes a userspace pointer and length representing a range of bytes
> in userspace, and lets you read the range incrementally. The API design
> prevents reading the same address twice. This helps prevent
> time-of-check-to-time-of-use bugs where userspace modifies the data in
> between two reads, which can cause bugs if the driver assumes that the
> memory is unchanged.
I don't really see an overlap between Range and UserSlice. The former is
about managing a range of indexes, the other about incremental
reading/writing to userspace without screwing up. I think we want both,
since often you get the things you need to carefuly slice/dice with Range
from somewhere else, e.g. from pagecache or the fw loader.
Also where performance doesn't matter so much often the copy_from_user is
one step and then we parse some complex data structure in there where
Range should be useful.
> In fact, both Rust Binder bugs mentioned earlier are caught by this part
> of the uaccess API, as both bugs eventually attempt to read past the end
> of the userspace region, triggering an error. Unfortunately, this
> happens too late, as the previously translated data has already been
> overwritten by the time the error is triggered.
>
> This patchset is also simliar to Benno's untrusted data patchset [2],
> which explores a different way to write misuse-resistant APIs.
I think you want both: Range to slice up a block of memory into variable
length structure, Untrusted to safely parse each part.
>
> Link: https://lore.kernel.org/r/20240528-alice-mm-v7-0-78222c31b8f4@google.com [1]
> Link: https://lore.kernel.org/r/20240913112643.542914-1-benno.lossin@proton.me [2]
> Signed-off-by: Alice Ryhl <aliceryhl@...gle.com>
I didn't take a close look at the api since real feedback would mean I
need to type up a user, but this looks great and I want :-)
-Sima
> ---
> rust/kernel/lib.rs | 1 +
> rust/kernel/range.rs | 236 +++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 237 insertions(+)
> create mode 100644 rust/kernel/range.rs
>
> diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
> index a7fefc4ae725..72a998cd02e0 100644
> --- a/rust/kernel/lib.rs
> +++ b/rust/kernel/lib.rs
> @@ -48,6 +48,7 @@
> pub mod page;
> pub mod prelude;
> pub mod print;
> +pub mod range;
> pub mod rbtree;
> pub mod security;
> pub mod seq_file;
> diff --git a/rust/kernel/range.rs b/rust/kernel/range.rs
> new file mode 100644
> index 000000000000..8fb9d96ac724
> --- /dev/null
> +++ b/rust/kernel/range.rs
> @@ -0,0 +1,236 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +// Copyright (C) 2024 Google LLC.
> +
> +//! Utilities for working with ranges of indices.
> +
> +/// A range of indices.
> +///
> +/// This utility is useful for ensuring that no index in the range is used more than once.
> +#[derive(Debug)]
> +pub struct Range {
> + offset: usize,
> + length: usize,
> +}
> +
> +impl Range {
> + /// Creates a new `Range` for an area of the given size.
> + pub fn new_area(size: usize) -> Self {
> + Self {
> + offset: 0,
> + length: size,
> + }
> + }
> +
> + /// Use this range of indices.
> + ///
> + /// This destroys the `Range` object, so these indices cannot be used again after this call.
> + pub fn use_range(self) -> UsedRange {
> + UsedRange {
> + offset: self.offset,
> + length: self.length,
> + }
> + }
> +
> + /// Duplicate this range.
> + ///
> + /// Be careful when using this method, as it allows you to use a range twice.
> + pub fn duplicate_range_careful(&self) -> Range {
> + Range {
> + offset: self.offset,
> + length: self.length,
> + }
> + }
> +
> + /// Peek at the offset without using the range.
> + ///
> + /// This doesn't destroy the `Range` object, so be careful that the range is not used twice.
> + pub fn peek_offset(&self) -> usize {
> + self.offset
> + }
> +
> + /// Peek at the length without using the range.
> + ///
> + /// This doesn't destroy the `Range` object, so be careful that the range is not used twice.
> + pub fn peek_length(&self) -> usize {
> + self.length
> + }
> +
> + /// Peek at the end without using the range.
> + ///
> + /// This doesn't destroy the `Range` object, so be careful that the range is not used twice.
> + pub fn peek_end(&self) -> Result<usize, RangeError> {
> + self.offset.checked_add(self.length).ok_or(RangeError)
> + }
> +
> + /// Truncates this range to the given length.
> + pub fn truncate(&mut self, length: usize) -> Result<(), RangeError> {
> + if length > self.length {
> + return Err(RangeError);
> + }
> + self.length = length;
> + Ok(())
> + }
> +
> + /// Assert that this range is aligned properly.
> + pub fn assert_aligned(&self, alignment: usize) -> Result<(), RangeError> {
> + if self.offset % alignment == 0 {
> + Ok(())
> + } else {
> + Err(RangeError)
> + }
> + }
> +
> + /// Assert that this range has the expected length.
> + pub fn assert_length_eq(&self, length: usize) -> Result<(), RangeError> {
> + if self.length == length {
> + Ok(())
> + } else {
> + Err(RangeError)
> + }
> + }
> +
> + /// Assert that this range is empty.
> + pub fn assert_empty(self) -> Result<(), RangeError> {
> + self.assert_length_eq(0)
> + }
> +
> + /// Splits the range into two sub-ranges.
> + ///
> + /// Fails if the `length` is greater than the range being split.
> + pub fn split_within(mut self, length: usize) -> Result<(Range, Range), RangeError> {
> + let left = self.take_from_start(length)?;
> + Ok((left, self))
> + }
> +
> + /// Splits the range into two sub-ranges.
> + ///
> + /// Fails if the `position` is not within the current range.
> + pub fn split_at(mut self, position: usize) -> Result<(Range, Range), RangeError> {
> + let left = self.take_until(position)?;
> + Ok((left, self))
> + }
> +
> + /// Modify this range by taking the first `length` bytes.
> + pub fn take_until(&mut self, position: usize) -> Result<Range, RangeError> {
> + let from_start = Range {
> + offset: self.offset,
> + length: position.checked_sub(self.offset).ok_or(RangeError)?,
> + };
> +
> + let new_self = Range {
> + offset: position,
> + length: self
> + .length
> + .checked_sub(from_start.length)
> + .ok_or(RangeError)?,
> + };
> +
> + *self = new_self;
> +
> + Ok(from_start)
> + }
> +
> + /// Modify this range by taking the first `length` bytes.
> + pub fn take_from_start(&mut self, length: usize) -> Result<Range, RangeError> {
> + let from_start = Range {
> + offset: self.offset,
> + length: length,
> + };
> +
> + let new_self = Range {
> + offset: self.offset.checked_add(length).ok_or(RangeError)?,
> + length: self.length.checked_sub(length).ok_or(RangeError)?,
> + };
> +
> + *self = new_self;
> +
> + Ok(from_start)
> + }
> +
> + /// Split this range into sub-ranges of the given size.
> + pub fn iter_chunks(self, chunk_size: usize) -> Result<ChunkIter, RangeError> {
> + if self.length % chunk_size != 0 {
> + return Err(RangeError);
> + }
> +
> + Ok(ChunkIter {
> + pos: self.offset,
> + end: self.offset.checked_add(self.length).ok_or(RangeError)?,
> + chunk_size,
> + })
> + }
> +}
> +
> +/// An iterator over ranges of the same size.
> +pub struct ChunkIter {
> + pos: usize,
> + end: usize,
> + chunk_size: usize,
> +}
> +
> +impl Iterator for ChunkIter {
> + type Item = Range;
> + fn next(&mut self) -> Option<Range> {
> + if self.pos >= self.end {
> + return None;
> + }
> +
> + let range = Range {
> + offset: self.pos,
> + length: self.chunk_size,
> + };
> + self.pos = self.pos + self.chunk_size;
> +
> + Some(range)
> + }
> +}
> +
> +/// A version of [`Range`] where the length is a compile-time constant.
> +///
> +/// This can be used to store a `Range` without using up space to store the length.
> +pub struct RangeFixedSize<const LENGTH: usize> {
> + offset: usize,
> +}
> +
> +impl<const LENGTH: usize> RangeFixedSize<LENGTH> {
> + /// Create a `RangeFixedSize` from a `Range`.
> + pub fn from_range(range: Range) -> Result<Self, RangeError> {
> + if range.length == LENGTH {
> + Ok(Self {
> + offset: range.offset,
> + })
> + } else {
> + Err(RangeError)
> + }
> + }
> +
> + /// Convert this back into a `Range`.
> + pub fn into_range(self) -> Range {
> + Range {
> + offset: self.offset,
> + length: LENGTH,
> + }
> + }
> +}
> +
> +/// The return value of [`Range::use_range`].
> +///
> +/// The only way to access the indices in a range is to mark it "used", which converts it into this
> +/// type, destroying the original [`Range`] object.
> +#[derive(Copy, Clone)]
> +pub struct UsedRange {
> + /// The offset of this range.
> + pub offset: usize,
> + /// The length of this range.
> + pub length: usize,
> +}
> +
> +/// The error type returned when ranges are used incorrectly.
> +pub struct RangeError;
> +
> +impl From<RangeError> for crate::error::Error {
> + fn from(_range: RangeError) -> crate::error::Error {
> + crate::error::code::EINVAL
> + }
> +}
> --
> 2.46.0.662.g92d0881bb0-goog
>
--
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
Powered by blists - more mailing lists