lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-Id: <C8826267-710B-4BC4-ADB1-BA029F6F89C5@collabora.com>
Date: Mon, 14 Oct 2024 15:48:56 -0300
From: Daniel Almeida <daniel.almeida@...labora.com>
To: Alice Ryhl <aliceryhl@...gle.com>
Cc: Miguel Ojeda <ojeda@...nel.org>,
 Andrew Morton <akpm@...ux-foundation.org>,
 Dave Chinner <david@...morbit.com>,
 Qi Zheng <zhengqi.arch@...edance.com>,
 Roman Gushchin <roman.gushchin@...ux.dev>,
 Muchun Song <muchun.song@...ux.dev>,
 Boqun Feng <boqun.feng@...il.com>,
 Gary Guo <gary@...yguo.net>,
 Björn Roy Baron <bjorn3_gh@...tonmail.com>,
 Benno Lossin <benno.lossin@...ton.me>,
 Andreas Hindborg <a.hindborg@...nel.org>,
 Trevor Gross <tmgross@...ch.edu>,
 linux-kernel@...r.kernel.org,
 linux-mm@...ck.org,
 rust-for-linux@...r.kernel.org
Subject: Re: [PATCH v2] rust: shrinker: add shrinker abstraction

Hi Alice,

> On 14 Oct 2024, at 13:00, Alice Ryhl <aliceryhl@...gle.com> wrote:
> 
> Rust Binder holds incoming transactions in a read-only mmap'd region
> where it manually manages the pages. These pages are only in use until
> the incoming transaction is consumed by userspace, but the kernel will
> keep the pages around for future transactions. Rust Binder registers a
> shrinker with the kernel so that it can give back these pages if the
> system comes under memory pressure.
> 
> Creating a shrinker is done via the ShrinkerBuilder type. Using a
> builder means that some options (seeks, batch) can be optional, while
> other options (name, private data) can be mandatory. Unlike seeks/batch,
> the private data is not set using a `set_private_data` method, as this
> makes it mandatory to provide a private data pointer.
> 
> The user specifies the callbacks in use by implementing the Shrinker
> trait for the type used for the private data. This requires specifying
> three things: implementations for count_objects and scan_objects, and
> the pointer type that the private data will be wrapped in.
> 
> The return values of count_objects and scan_objects are provided using
> newtypes called CountObjects and ScanObjects respectively. These types
> prevent the user from e.g. returning SHRINK_STOP from count_objects or
> returning SHRINK_EMPTY from scan_objects.
> 
> The CountObjects newtype treats CountObjects::new(0) as "the count is
> unknown" instead of "the count is zero" for consistency with the way
> that the C code works today.
> 
> ShrinkControl is not using Opaque because we need mutable access.
> 
> Support for numa/memcg aware shrinkers is not included here, as they
> cannot usefully be used without having list_lru bindings. Support for
> that will happen as a follow-up.
> 
> Signed-off-by: Alice Ryhl <aliceryhl@...gle.com>
> ---
> Dave, I still had a few outstanding questions in the thread on the last
> version. Most prominently the question on what to do with the builder
> instead. I didn't change it in this version as I'm not sure which
> alternative you prefer.
> ---
> Changes in v2:
> - Rename to ShrinkerBuilder / ShrinkerRegistration.
> - Rename `alloc` to `new`.
> - Update CountOjects to match the way C does it.
> - Change ScanObjects::MAX to SHRINK_STOP-1.
> - Rename gfp_fs to reclaim_fs_allowed, and add reclaim_io_allowed.
> - Remove max check in set_nr_scanned.
> - Comment that numa/memcg aware shrinkers aren't supported yet.
> - Link to v1: https://lore.kernel.org/r/20240912-shrinker-v1-1-18b7f1253553@google.com
> ---
> rust/bindings/bindings_helper.h |   3 +
> rust/kernel/lib.rs              |   1 +
> rust/kernel/shrinker.rs         | 335 ++++++++++++++++++++++++++++++++++++++++
> 3 files changed, 339 insertions(+)
> 
> diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
> index ae82e9c941af..fd6d15f5dde1 100644
> --- a/rust/bindings/bindings_helper.h
> +++ b/rust/bindings/bindings_helper.h
> @@ -18,6 +18,7 @@
> #include <linux/phy.h>
> #include <linux/refcount.h>
> #include <linux/sched.h>
> +#include <linux/shrinker.h>
> #include <linux/slab.h>
> #include <linux/wait.h>
> #include <linux/workqueue.h>
> @@ -31,4 +32,6 @@ const gfp_t RUST_CONST_HELPER_GFP_KERNEL_ACCOUNT = GFP_KERNEL_ACCOUNT;
> const gfp_t RUST_CONST_HELPER_GFP_NOWAIT = GFP_NOWAIT;
> const gfp_t RUST_CONST_HELPER___GFP_ZERO = __GFP_ZERO;
> const gfp_t RUST_CONST_HELPER___GFP_HIGHMEM = ___GFP_HIGHMEM;
> +const gfp_t RUST_CONST_HELPER___GFP_FS = ___GFP_FS;
> +const gfp_t RUST_CONST_HELPER___GFP_IO = ___GFP_IO;
> const blk_features_t RUST_CONST_HELPER_BLK_FEAT_ROTATIONAL = BLK_FEAT_ROTATIONAL;
> diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
> index b5f4b3ce6b48..2e9ca7d413c4 100644
> --- a/rust/kernel/lib.rs
> +++ b/rust/kernel/lib.rs
> @@ -45,6 +45,7 @@
> pub mod prelude;
> pub mod print;
> pub mod rbtree;
> +pub mod shrinker;
> pub mod sizes;
> mod static_assert;
> #[doc(hidden)]
> diff --git a/rust/kernel/shrinker.rs b/rust/kernel/shrinker.rs
> new file mode 100644
> index 000000000000..d87ad08e917c
> --- /dev/null
> +++ b/rust/kernel/shrinker.rs
> @@ -0,0 +1,335 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +// Copyright (C) 2024 Google LLC.
> +
> +//! Shrinker for handling memory pressure.
> +//!
> +//! C header: [`include/linux/shrinker.h`](srctree/include/linux/shrinker.h)
> +
> +use crate::{alloc::AllocError, bindings, c_str, str::CStr, types::ForeignOwnable};
> +
> +use core::{
> +    ffi::{c_int, c_long, c_ulong, c_void},
> +    marker::PhantomData,
> +    ptr::NonNull,
> +};
> +
> +const SHRINK_STOP: c_ulong = bindings::SHRINK_STOP as c_ulong;
> +const SHRINK_EMPTY: c_ulong = bindings::SHRINK_EMPTY as c_ulong;
> +
> +/// The default value for the number of seeks needed to recreate an object.
> +pub const DEFAULT_SEEKS: u32 = bindings::DEFAULT_SEEKS;
> +
> +/// An unregistered shrinker.
> +///
> +/// This type can be used to modify the settings of the shrinker before it is registered.
> +///
> +/// # Invariants
> +///
> +/// The `shrinker` pointer references an unregistered shrinker.
> +pub struct ShrinkerBuilder {
> +    shrinker: NonNull<bindings::shrinker>,
> +}
> +
> +// SAFETY: Moving an unregistered shrinker between threads is okay.
> +unsafe impl Send for ShrinkerBuilder {}
> +// SAFETY: An unregistered shrinker is thread safe.
> +unsafe impl Sync for ShrinkerBuilder {}
> +
> +impl ShrinkerBuilder {
> +    /// Create a new shrinker.
> +    pub fn new(name: &CStr) -> Result<Self, AllocError> {
> +        // TODO: Support numa/memcg aware shrinkers once list_lru is available.
> +        let flags = 0;
> +
> +        // SAFETY: Passing `0` as flags is okay. Using `%s` as the format string is okay when we
> +        // pass a nul-terminated string as the string for `%s` to print.
> +        let ptr = unsafe {
> +            bindings::shrinker_alloc(flags, c_str!("%s").as_char_ptr(), name.as_char_ptr())
> +        };
> +
> +        let shrinker = NonNull::new(ptr).ok_or(AllocError)?;
> +
> +        // INVARIANT: The allocated shrinker is unregistered.
> +        Ok(Self { shrinker })
> +    }
> +
> +    /// Create a new shrinker using format arguments for the name.
> +    pub fn new_fmt(name: core::fmt::Arguments<'_>) -> Result<Self, AllocError> {
> +        // TODO: Support numa/memcg aware shrinkers once list_lru is available.
> +        let flags = 0;
> +
> +        // SAFETY: Passing `0` as flags is okay. Using `%pA` as the format string is okay when we
> +        // pass a `fmt::Arguments` as the value to print.
> +        let ptr = unsafe {
> +            bindings::shrinker_alloc(
> +                flags,
> +                c_str!("%pA").as_char_ptr(),
> +                &name as *const _ as *const c_void,
> +            )
> +        };
> +
> +        let shrinker = NonNull::new(ptr).ok_or(AllocError)?;
> +
> +        // INVARIANT: The allocated shrinker is unregistered.
> +        Ok(Self { shrinker })
> +    }
> +
> +    /// Set the number of seeks needed to recreate an object.
> +    pub fn set_seeks(&mut self, seeks: u32) {
> +        unsafe { (*self.shrinker.as_ptr()).seeks = seeks as c_int };
> +    }
> +
> +    /// Set the batch size for reclaiming on this shrinker.
> +    pub fn set_batch(&mut self, batch: usize) {
> +        unsafe { (*self.shrinker.as_ptr()).batch = batch as c_long };
> +    }
> +
> +    /// Register the shrinker.
> +    ///
> +    /// The provided pointer is used as the private data, and the type `T` determines the callbacks
> +    /// that the shrinker will use.
> +    pub fn register<T: Shrinker>(self, private_data: T::Ptr) -> ShrinkerRegistration<T> {
> +        let shrinker = self.shrinker;
> +        let ptr = shrinker.as_ptr();
> +
> +        // The destructor of `self` calls `shrinker_free`, so skip the destructor.
> +        core::mem::forget(self);
> +
> +        let private_data_ptr = <T::Ptr as ForeignOwnable>::into_foreign(private_data);
> +

I don’t usually comment on style because there's a great deal of personal taste involved.

OTOH, I find this a bit hard to read. Maybe I am not the only one? A line-break per SAFETY
would improve legibility IMHO.

> +        // SAFETY: We own the private data, so we can assign to it.
> +        unsafe { (*ptr).private_data = private_data_ptr.cast_mut() };
> +        // SAFETY: The shrinker is not yet registered, so we can update this field.
> +        unsafe { (*ptr).count_objects = Some(rust_count_objects::<T>) };
> +        // SAFETY: The shrinker is not yet registered, so we can update this field.
> +        unsafe { (*ptr).scan_objects = Some(rust_scan_objects::<T>) };
> +
> +        // SAFETY: The shrinker is unregistered, so it's safe to register it.
> +        unsafe { bindings::shrinker_register(ptr) };
> +
> +        ShrinkerRegistration {
> +            shrinker,
> +            _phantom: PhantomData,
> +        }
> +    }
> +}
> +
> +impl Drop for ShrinkerBuilder {
> +    fn drop(&mut self) {
> +        // SAFETY: The shrinker is a valid but unregistered shrinker, and we will not use it
> +        // anymore.
> +        unsafe { bindings::shrinker_free(self.shrinker.as_ptr()) };
> +    }
> +}
> +
> +/// A shrinker that is registered with the kernel.
> +///
> +/// # Invariants
> +///
> +/// The `shrinker` pointer refers to a registered shrinker using `T` as the private data.
> +pub struct ShrinkerRegistration<T: Shrinker> {
> +    shrinker: NonNull<bindings::shrinker>,
> +    _phantom: PhantomData<T::Ptr>,
> +}
> +
> +// SAFETY: This allows you to deregister the shrinker from a different thread, which means that
> +// private data could be dropped from any thread.
> +unsafe impl<T: Shrinker> Send for ShrinkerRegistration<T> where T::Ptr: Send {}
> +// SAFETY: The only thing you can do with an immutable reference is access the private data, which
> +// is okay to access in parallel as the `Shrinker` trait requires the private data to be `Sync`.
> +unsafe impl<T: Shrinker> Sync for ShrinkerRegistration<T> {}
> +
> +impl<T: Shrinker> ShrinkerRegistration<T> {
> +    /// Access the private data in this shrinker.
> +    pub fn private_data(&self) -> <T::Ptr as ForeignOwnable>::Borrowed<'_> {
> +        // SAFETY: We own the private data, so we can access it.
> +        let private = unsafe { (*self.shrinker.as_ptr()).private_data };
> +        // SAFETY: By the type invariants, the private data is `T`. This access could happen in
> +        // parallel with a shrinker callback, but that's okay as the `Shrinker` trait ensures that
> +        // `T::Ptr` is `Sync`.
> +        unsafe { <T::Ptr as ForeignOwnable>::borrow(private) }
> +    }
> +}
> +
> +impl<T: Shrinker> Drop for ShrinkerRegistration<T> {
> +    fn drop(&mut self) {
> +        // SAFETY: We own the private data, so we can access it.
> +        let private = unsafe { (*self.shrinker.as_ptr()).private_data };
> +        // SAFETY: We will not access the shrinker after this call.
> +        unsafe { bindings::shrinker_free(self.shrinker.as_ptr()) };
> +        // SAFETY: The above call blocked until the completion of any shrinker callbacks, so there
> +        // are no longer any users of the private data.
> +        drop(unsafe { <T::Ptr as ForeignOwnable>::from_foreign(private) });
> +    }
> +}
> +
> +/// Callbacks for a shrinker.
> +pub trait Shrinker {
> +    /// The pointer type used to store the private data of the shrinker.
> +    ///
> +    /// Needs to be `Sync` because the shrinker callback could access this value immutably from
> +    /// several thread in parallel.
> +    type Ptr: ForeignOwnable + Sync;
> +
> +    /// Count the number of freeable items in the cache.
> +    fn count_objects(
> +        me: <Self::Ptr as ForeignOwnable>::Borrowed<'_>,
> +        sc: ShrinkControl<'_>,
> +    ) -> CountObjects;
> +
> +    /// Free some objects in this cache.
> +    fn scan_objects(
> +        me: <Self::Ptr as ForeignOwnable>::Borrowed<'_>,
> +        sc: ShrinkControl<'_>,
> +    ) -> ScanObjects;
> +}
> +
> +/// How many objects are there in the cache?
> +///
> +/// This is used as the return value of [`Shrinker::count_objects`].
> +pub struct CountObjects {
> +    inner: c_ulong,
> +}
> +
> +impl CountObjects {
> +    /// Indicates that the number of objects is zero.
> +    pub const EMPTY: Self = Self {
> +        inner: SHRINK_EMPTY,
> +    };
> +
> +    /// The maximum possible number of freeable objects.
> +    pub const MAX: Self = Self {
> +        // The shrinker code assumes that it can multiply this value by two without overflow.
> +        inner: c_ulong::MAX / 2,
> +    };
> +
> +    /// Creates a new `CountObjects` with the given value.
> +    ///
> +    /// This should be the number of objects that were actually freed. Objects that were scanned
> +    /// but not freed should be counted in `nr_scanned` but not here.

This is apparently wrong? The count_objects callback computes how many objects *can* be freed.

> +    ///
> +    /// If `count` is zero, then this indicates that the real count is unknown. Use
> +    /// `CountObjects::EMPTY` to indicate that the shrinker is empty.
> +    pub fn new(count: usize) -> Self {
> +        if count > Self::MAX.inner as usize {
> +            return Self::MAX;
> +        }

nit: maybe a comment explaining why silently truncating is preferable?
  
> +
> +        Self {
> +            inner: count as c_ulong,
> +        }
> +    }
> +}
> +
> +/// How many objects were freed?
> +///
> +/// This is used as the return value of [`Shrinker::scan_objects`].
> +pub struct ScanObjects {
> +    inner: c_ulong,
> +}
> +
> +impl ScanObjects {
> +    /// Indicates that the shrinker should stop trying to free objects from this cache due to
> +    /// potential deadlocks.
> +    pub const STOP: Self = Self { inner: SHRINK_STOP };
> +
> +    /// The maximum possible number of freeable objects.
> +    pub const MAX: Self = Self {
> +        inner: SHRINK_STOP - 1,
> +    };
> +
> +    /// Creates a new `CountObjects` with the given value.
> +    pub fn from_count(count: usize) -> Self {
> +        if count > Self::MAX.inner as usize {
> +            return Self::MAX;
> +        }
> +
> +        Self {
> +            inner: count as c_ulong,
> +        }
> +    }
> +}
> +
> +/// This struct is used to pass information from page reclaim to the shrinkers.
> +///
> +/// # Invariants
> +///
> +/// `ptr` has exclusive access to a valid `struct shrink_control`.
> +pub struct ShrinkControl<'a> {
> +    ptr: NonNull<bindings::shrink_control>,
> +    _phantom: PhantomData<&'a bindings::shrink_control>,
> +}
> +
> +impl<'a> ShrinkControl<'a> {
> +    /// Create a `ShrinkControl` from a raw pointer.
> +    ///
> +    /// # Safety
> +    ///
> +    /// The pointer should point at a valid `shrink_control` for the duration of 'a.
> +    pub unsafe fn from_raw(ptr: *mut bindings::shrink_control) -> Self {
> +        Self {
> +            // SAFETY: Caller promises that this pointer is valid.
> +            ptr: unsafe { NonNull::new_unchecked(ptr) },
> +            _phantom: PhantomData,
> +        }
> +    }
> +
> +    /// Determines whether it is safe to call into filesystem code.
> +    pub fn reclaim_fs_allowed(&self) -> bool {
> +        // SAFETY: Okay by type invariants.
> +        let mask = unsafe { (*self.ptr.as_ptr()).gfp_mask };
> +
> +        (mask & bindings::__GFP_FS) != 0
> +    }
> +
> +    /// Determines whether it is safe to call into IO code.
> +    pub fn reclaim_io_allowed(&self) -> bool {
> +        // SAFETY: Okay by type invariants.
> +        let mask = unsafe { (*self.ptr.as_ptr()).gfp_mask };
> +
> +        (mask & bindings::__GFP_IO) != 0
> +    }
> +
> +    /// Returns the number of objects that `scan_objects` should try to reclaim.
> +    pub fn nr_to_scan(&self) -> usize {
> +        // SAFETY: Okay by type invariants.
> +        unsafe { (*self.ptr.as_ptr()).nr_to_scan as usize }
> +    }
> +
> +    /// The callback should set this value to the number of objects inspected by the shrinker.
> +    pub fn set_nr_scanned(&mut self, val: usize) {
> +        // SAFETY: Okay by type invariants.
> +        unsafe { (*self.ptr.as_ptr()).nr_scanned = val as c_ulong };
> +    }
> +}
> +
> +unsafe extern "C" fn rust_count_objects<T: Shrinker>(
> +    shrink: *mut bindings::shrinker,
> +    sc: *mut bindings::shrink_control,
> +) -> c_ulong {
> +    // SAFETY: We own the private data, so we can access it.
> +    let private = unsafe { (*shrink).private_data };
> +    // SAFETY: This function is only used with shrinkers where `T` is the type of the private data.
> +    let private = unsafe { <T::Ptr as ForeignOwnable>::borrow(private) };
> +    // SAFETY: The caller passes a valid `sc` pointer.
> +    let sc = unsafe { ShrinkControl::from_raw(sc) };
> +
> +    let ret = T::count_objects(private, sc);
> +    ret.inner
> +}
> +
> +unsafe extern "C" fn rust_scan_objects<T: Shrinker>(
> +    shrink: *mut bindings::shrinker,
> +    sc: *mut bindings::shrink_control,
> +) -> c_ulong {
> +    // SAFETY: We own the private data, so we can access it.
> +    let private = unsafe { (*shrink).private_data };
> +    // SAFETY: This function is only used with shrinkers where `T` is the type of the private data.
> +    let private = unsafe { <T::Ptr as ForeignOwnable>::borrow(private) };
> +    // SAFETY: The caller passes a valid `sc` pointer.
> +    let sc = unsafe { ShrinkControl::from_raw(sc) };
> +
> +    let ret = T::scan_objects(private, sc);
> +    ret.inner
> +}
> 
> ---
> base-commit: 8cf0b93919e13d1e8d4466eb4080a4c4d9d66d7b
> change-id: 20240911-shrinker-f8371af00b68
> 
> Best regards,
> -- 
> Alice Ryhl <aliceryhl@...gle.com>
> 
> 

— Daniel


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ