[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Z89vSWdl3skR-N2q@thinkpad>
Date: Mon, 10 Mar 2025 19:01:35 -0400
From: Yury Norov <yury.norov@...il.com>
To: Burak Emir <bqe@...gle.com>
Cc: Rasmus Villemoes <linux@...musvillemoes.dk>,
Viresh Kumar <viresh.kumar@...aro.org>,
Miguel Ojeda <ojeda@...nel.org>,
Alex Gaynor <alex.gaynor@...il.com>,
Boqun Feng <boqun.feng@...il.com>, Gary Guo <gary@...yguo.net>,
Björn Roy Baron <bjorn3_gh@...tonmail.com>,
Benno Lossin <benno.lossin@...ton.me>,
Andreas Hindborg <a.hindborg@...nel.org>,
Alice Ryhl <aliceryhl@...gle.com>, Trevor Gross <tmgross@...ch.edu>,
rust-for-linux@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3] rust: add bindings and API for bitmap.h and bitops.h.
On Mon, Mar 10, 2025 at 11:01:37PM +0100, Burak Emir wrote:
> > > + let ptr = unsafe { bindings::bitmap_zalloc(nbits_u32, flags.as_raw()) };
> > > + // Zero-size allocation is ok and yields a dangling pointer.
> >
> > Zero-sized allocation makes no sense, and usually is a sign of a bug.
> > What for you explicitly allow it?
> >
>
> I copied from dbitmap, but more importantly Rust requires that when
> memory is accessed, it has to be initialized before. So if I used alloc,
> I'd have to initialize manually.
If you have a pointer to a memory area of size 0, it's unsafe to
even dereference it. Luckily, initialization of 0-bit array is a
no-op. I don't understand how to initialize 0 bits.
This dbitmap thing looks messy. It uses atomic set_bit() where
non-atomic should be used. Now it encourages to create 0-sized
bitmaps. Please don't copy anything blindly from there...
[...]
> On Mon, Mar 10, 2025 at 7:12 PM Yury Norov <yury.norov@...il.com> wrote:
> > > + /// Copies all bits from `src` and sets any remaining bits to zero.
> > > + ///
> > > + /// # Panics
> > > + ///
> > > + /// Panics if `src.nbits` has more bits than this bitmap.
> > > + #[inline]
> > > + pub fn copy_from_bitmap_and_extend(&mut self, src: &Bitmap) {
> > > + if self.nbits < src.nbits {
> > > + panic_not_in_bounds_le("src.nbits", self.nbits, src.nbits);
> >
> > The _lt usually stands for 'less than', or '<'. And _le is 'less than or
> > equal', or '<='. But in your code you do exactly opposite. Is that on
> > purpose?
> >
>
> Yeah, I realize this is a bit hard to read. Made the check consistent
> with the panic message call.
>
> > Also, you can make it similar to BUG_ON() semantics, so that it will
> > be a single line of code, not 3:
> >
> > RUST_PANIC("Copy: out of bonds", self.nbits < src.nbits);
> >
> > And to that extend, panic message should be available to all rust
> > subsystems, just like BUG_ON().
>
> A general bound checking macro BUG_ON sounds interesting
> I will have to ask the other folks about this. For now, I'd like
> to keep the simpler but more verbose if & panic.
I don't suggest to use BUG_ON(). I suggest to use similar semantics.
You repeat that 3 lines again and again. It cries to become a nice
macro.
> > > + }
> > > + // SAFETY: nbits == 0 is supported and access to `self` and `src` is within bounds.
> > > + unsafe {
> > > + bindings::bitmap_copy_and_extend(self.as_mut_ptr(), src.as_ptr(), src.nbits as u32, self.nbits as u32)
> > > + };
> > > + }
> > > +
> > > + /// Finds the last bit.
> > > + #[inline]
> > > + pub fn find_last_bit(&self) -> usize {
> > > + // SAFETY: nbits == 0 is supported and access is within bounds.
> > > + unsafe { bindings::_find_last_bit(self.as_ptr(), self.nbits) }
> > > + }
> > > +
> > > + /// Finds the last bit, searching up to `nbits` bits.
> > > + ///
> > > + /// # Panics
> > > + ///
> > > + /// Panics if `nbits` is too large for this bitmap.
> > > + #[inline]
> > > + pub fn find_last_bit_upto(&self, nbits: usize) -> usize {
> >
> > So this is not a binding, right? This is a new function. In C code we
> > don't support partial search. Can you confirm you need a partial
> > search here? What's your use scenario?
> >
> > Really, this should go with the series that converts dbitmap.
> > Otherwise it's hard to understand what you're trying to do.
> >
>
> Tamir asked about these as well. I think I misunderstood the
> C API as supporting partial search, and wanted to make
> sure the functionality is preserved.
>
> I now realize that the intention of the size parameter is to always
> pass the size. Will remove.
OK, good. You may look at cpumasks and nodemasks - those are 2 most
developed wrappers around bitmaps, and good examples of the API usage.
> > > + if self.nbits < nbits {
> > > + panic_not_in_bounds_le("nbits", self.nbits, nbits);
> > > + }
> > > + // SAFETY: nbits == 0 is supported and access is within bounds.
> > > + unsafe { bindings::_find_last_bit(self.as_ptr(), nbits) }
> > > + }
> > > +
> > > + /// Finds the next zero bit, searching up to `nbits`.
> > > + ///
> > > + /// # Panics
> > > + ///
> > > + /// Panics if `nbits` is too large for this bitmap.
> > > + #[inline]
> > > + pub fn find_next_zero_bit_upto(&self, nbits: usize) -> usize {
> >
> > 1. This should be 'find_first_zero_bit'.
> >
> > 2. The same question as to previous function. In this case you will
> > most likely be OK with plain find_first_zero_bit(). So if it returns a
> > number greater than 'nbits', it means that first nbits are empty for
> > sure. Is it a performance trick?
> >
>
> No, I just looked at the find_first_zero implementation and thought
> that it supports partial search. Removing.
In C user provides nbits explicitly, so partial search is allowed
automatically. Here you pack a pointer with size, and it makes your
API less robust. This robustness may or may not be needed by users.
Cpumasks and nodemasks don't need floating nbits, so they just hardcode
it.
You didn't show the actual code, and I don't understand what exactly do
you need. If you want to operate on first nbits, just don't attach size
to the pointer, IMO...
> > > + if self.nbits < nbits {
> > > + panic_not_in_bounds_le("nbits", self.nbits, nbits);
> > > + }
> > > + // SAFETY: nbits == 0 is supported and access is within bounds.
> > > + unsafe {
> > > + bindings::_find_next_zero_bit(self.as_ptr(), nbits, 0 /* offset */)
> >
> > For offset == 0 we have find_first_bit() functions.
> >
>
> Good point, I will remove this and expose only the variant with offset.
> That is the one used by dbitmap anyways.
>
> > > + }
> > > + }
> > > +
> > > + /// Finds the next zero bit, searching up to `nbits` bits, with offset `offset`.
> > > + ///
> > > + /// # Panics
> > > + ///
> > > + /// Panics if `nbits` is too large for this bitmap.
> > > + #[inline]
> > > + pub fn find_next_zero_bit_upto_offset(&self, nbits: usize, offset: usize) -> usize {
> > > + if self.nbits < nbits {
> > > + panic_not_in_bounds_le("nbits", self.nbits, nbits);
> > > + }
> > > + // SAFETY: nbits == 0 and out-of-bounds offset is supported, and access is within bounds.
> >
> > find_bit() functions are all safe against nbits == 0 or
> > offset >= nbits. If you add those panics for hardening reasons - it's
> > OK. If you add them to make your code safer - you don't need them. The
> > C version is already safe.
> >
>
> I assume you are asking about the SAFETY comment, not the bound check.
No, I'm asking about your panic()s. find_bit() functions don't dereference
memory that is out of boundaries, and don't crash the kernel. If nbits is
0, or offset >= nbits, find_bit() returns immediately and doesn't touch
any memory. But you may want to print a message and go panic. This is not
how the original find_bit() works. But you may want to do it for debugging
or hardening reasons.
So I wanted to clarify: is it intentional? Maybe add a comment? Also,
if it's a debugging feature, you most likely need to make it depending
on CONFIG_DEBUG_XXX.
Thanks,
Yury
> The convention with SAFETY comments is in the Rust coding guidelines.
> The comments are there to convince the reader that the
> immediately-following "unsafe {...}" block can never lead to undefined behavior.
> I looked at the C code and documented that it is safe, so the reader
> does not have to. The goal
> is to enable local reasoning, so the reader can check a piece of code
> in isolation.
>
> I call out the nbits == 0 case every time, because we are passing a
> dangling pointer to C.
> It may not be very obvious that this is an ok thing to do, and
> explicit is better than implicit.
>
> > > + unsafe { bindings::_find_next_zero_bit(self.as_ptr(), nbits, offset) }
> > > + }
> > > +}
> > > diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
> > > index efbd7be98dab..be06ffc47473 100644
> > > --- a/rust/kernel/lib.rs
> > > +++ b/rust/kernel/lib.rs
> > > @@ -36,6 +36,7 @@
> > > pub use ffi;
> > >
> > > pub mod alloc;
> > > +pub mod bitmap;
> > > #[cfg(CONFIG_BLOCK)]
> > > pub mod block;
> > > #[doc(hidden)]
> > > --
> > > 2.49.0.rc0.332.g42c0ae87b1-goog
Powered by blists - more mailing lists