lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAGSQo03nX2cMc0WZQ2YmUKkOLCH_iu9MeQuLSXfCHmNBbuDJ9g@mail.gmail.com>
Date: Tue, 1 Jul 2025 14:53:52 -0700
From: Matthew Maurer <mmaurer@...gle.com>
To: Danilo Krummrich <dakr@...nel.org>
Cc: Benno Lossin <lossin@...nel.org>, Greg Kroah-Hartman <gregkh@...uxfoundation.org>, 
	Miguel Ojeda <ojeda@...nel.org>, Alex Gaynor <alex.gaynor@...il.com>, 
	Boqun Feng <boqun.feng@...il.com>, Gary Guo <gary@...yguo.net>, 
	Björn Roy Baron <bjorn3_gh@...tonmail.com>, 
	Andreas Hindborg <a.hindborg@...nel.org>, Alice Ryhl <aliceryhl@...gle.com>, 
	Trevor Gross <tmgross@...ch.edu>, "Rafael J. Wysocki" <rafael@...nel.org>, 
	Sami Tolvanen <samitolvanen@...gle.com>, Timur Tabi <ttabi@...dia.com>, linux-kernel@...r.kernel.org, 
	rust-for-linux@...r.kernel.org, Dirk Behme <dirk.behme@...bosch.com>
Subject: Re: [PATCH v8 4/6] rust: debugfs: Support arbitrary owned backing for File

On Tue, Jul 1, 2025 at 1:16 PM Danilo Krummrich <dakr@...nel.org> wrote:
>
> On Tue, Jul 01, 2025 at 10:09:10PM +0200, Benno Lossin wrote:
> > On Tue Jul 1, 2025 at 10:03 PM CEST, Benno Lossin wrote:
> > > On Tue Jul 1, 2025 at 9:58 PM CEST, Danilo Krummrich wrote:
> > >> On 7/1/25 9:46 PM, Benno Lossin wrote:
> > >>> On Tue Jul 1, 2025 at 9:21 PM CEST, Danilo Krummrich wrote:
> > >>>> On Tue, Jul 01, 2025 at 11:11:13AM -0700, Matthew Maurer wrote:
> > >>>>> If we implement *only* pinned files, we run into an additional problem
> > >>>>> - you can't easily extend a pinned vector. This means that you cannot
> > >>>>> have dynamically created devices unless you're willing to put every
> > >>>>> new `File` into its own `Box`, because you aren't allowed to move any
> > >>>>> of the previously allocated `File`s for a resize.
> > >>>>>
> > >>>>> Where previously you would have had
> > >>>>>
> > >>>>> ```
> > >>>>> debug_files: Vec<File>
> > >>>>> ```
> > >>>>>
> > >>>>> you would now have
> > >>>>>
> > >>>>> ```
> > >>>>> debug_files: Vec<PinBox<File<T>>>
> > >>>>> ```
> > >>>>
> > >>>> Stuffing single File instances into a Vec seems like the wrong thing to do.
> > >>>>
> > >>>> Instead you may have instances of some data structure that is created
> > >>>> dynamically in your driver that you want to expose through debugfs.
> > >>>>
> > >>>> Let's say you have (userspace) clients that can be registered arbitrarily, then
> > >>>> you want a Vec<Client>, which contains the client instances. In order to provide
> > >>>> information about the Client in debugfs you then have the client embed things as
> > >>>> discussed above.
> > >>>>
> > >>>>  struct Client {
> > >>>>     id: File<ClientId>,
> > >>>>     data: File<ClientData>,
> > >>>>     ...
> > >>>>  }
> > >>>>
> > >>>> I think that makes much more sense than keeping a Vec<Arc<Client>> *and* a
> > >>>> Vec<File> separately. Also, note that with the above, your Client instances
> > >>>> don't need to be reference counted anymore.
> > >>>>
> > >>>> I think this addresses the concerns below.
> > >>>
> > >>> You still have the issue that `Client` now needs to be pinned and the
> > >>> vector can't be resized. But if you know that it's bounded, then we
> > >>> could just make `Pin<Vec<T>>` work as expected (not relocating the
> > >>> underlying allocation by not exposing `push`, only
> > >>> `push_within_capacity`).
> > >>>
> > >>> We also could have a `SegmentedVec<T>` that doesn't move elements.
> > >>> Essentially it is
> > >>>
> > >>>      enum SegmentedVec<T> {
> > >>>          Cons(Segment<T>, KBox<SegmentedVec<T>>)
> > >>>          Nul,
> > >>>      }
> > >>>
> > >>>      struct Segment<T> {
> > >>>          elements: [T; 16]
> > >>>      }
> > >>>
> > >>> or make the segments variable-sized and grow them accordingly.
> > >>
> > >> That sounds a lot like the perfect application for XArray. :)
> > >
> > > Haha I didn't know this already existed in the kernel :) Yeah then we
> > > should make XArray work for this use-case.
> >
> > Ah wait, I meant for `SegmentedVec<T>` to store multiple `T` in the same
> > allocation, so it would only grow sometimes and amortize the allocations
> > just like `Vec`. It seems to me that XArray only stores pointers, so you
> > have to box everything (which we're trying to avoid IIUC).
>
> Yes, that sounds good. And if the potential number of Client instances can get
> pretty large Vec isn't a good choice anyways. If we really get a ton of clients,
> they should be allocated with a kmem_cache and stored in an XArray, maple tree,
> etc.

OK. I've outlined why I disagree, but it sounds like consensus here is that:
1. You want a `File<T>` that implements `Deref` to expose a `&T`.
2. To enable `T` to take on values which are not `ForeignOwnable`, you
want me to make `create_file` return a `PinInit<File<T>>` (We will
also grow a `T: Sized` bound here.)

You are aware that:
1. A `File` can no longer be placed in a normal Rust collection, you
will either need to box it or use a special collection kind.
2. Adding or removing DebugFS support for some data in a driver may
cause more noise (code edits non-local to the debug logic) than
expected.

The one thing I still don't see a consensus on:

What do you want me to do about standard traits for `File`? If we're
intending it to be heavily used throughout structs, we'll effectively
break `#[derive]` if I don't add support. For example, if we had
```
#[derive(Debug, PartialEq)]
struct FooAttrs {
  name: String,
  size: usize,
}
// In my current patch world someone adds `File`, it's by putting
`FooAttrs` into an `Arc` and passing that into `create_file`, no
modifications to the structure are made.
```
before, we have an obvious semantics for this. If someone adds `File`
with the new API, we get
```
#[derive(Debug, PartialEq)]
#[pin_data]
struct FooAttrs {
  #[pin]
  name: File<String>,
  #[pin]
  size: File<usize>,
}
```

This means that for the `#[derive]` to keep working, `File` needs to
implement these traits. Do we want it to:
A. Purely delegate, so making this sort of change keeps the same
behavior of derived traits
B. Do what makes sense for a `File`, so e.g. Debug might print 'File {
data: "name_of_foo" }', and PartialEq might test for equality of the
dentry if not an error

I'm guessing you want A, but I've been through so many API reworks at
this point that I wanted to ask in case there's  disagreement here.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ