[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <DDZ3QBBUM27H.MJS1S3WHWJO0@nvidia.com>
Date: Mon, 03 Nov 2025 22:42:13 +0900
From: "Alexandre Courbot" <acourbot@...dia.com>
To: "Yury Norov" <yury.norov@...il.com>, "Alexandre Courbot"
 <acourbot@...dia.com>
Cc: "Alice Ryhl" <aliceryhl@...gle.com>, "Danilo Krummrich"
 <dakr@...nel.org>, "Miguel Ojeda" <ojeda@...nel.org>, "Joel Fernandes"
 <joelagnelf@...dia.com>, "Jesung Yang" <y.j3ms.n@...il.com>, "Boqun Feng"
 <boqun.feng@...il.com>, "Gary Guo" <gary@...yguo.net>,
 Björn Roy Baron <bjorn3_gh@...tonmail.com>, "Benno Lossin"
 <lossin@...nel.org>, "Andreas Hindborg" <a.hindborg@...nel.org>, "Trevor
 Gross" <tmgross@...ch.edu>, <linux-kernel@...r.kernel.org>,
 <rust-for-linux@...r.kernel.org>
Subject: Re: [PATCH 1/2] rust: add BitInt integer wrapping type
Hi Yury,
On Mon Nov 3, 2025 at 11:17 AM JST, Yury Norov wrote:
> On Fri, Oct 31, 2025 at 10:39:57PM +0900, Alexandre Courbot wrote:
>> Add the `BitInt` type, which is an integer on which the number of bits
>> allowed to be used is restricted, capping its maximal value below that
>> of primitive type is wraps.
>> 
>> This is useful to e.g. enforce guarantees when working with bit fields.
>> 
>> Alongside this type, provide many `From` and `TryFrom` implementations
>> are to reduce friction when using with regular integer types. Proxy
>> implementations of common integer traits are also provided.
>> 
>> Signed-off-by: Alexandre Courbot <acourbot@...dia.com>
>> ---
>>  rust/kernel/lib.rs        |   1 +
>>  rust/kernel/num.rs        |  75 ++++
>>  rust/kernel/num/bitint.rs | 896 ++++++++++++++++++++++++++++++++++++++++++++++
>>  3 files changed, 972 insertions(+)
>  
> ...
>
>> +/// Evaluates to `true` if `$value` can be represented using at most `$num_bits` on `$type`.
>> +///
>> +/// Can be used in const context.
>> +macro_rules! fits_within {
>> +    ($value:expr, $type:ty, $num_bits:expr) => {{
>> +        // Any attempt to create a `BitInt` with more bits used for representation than its backing
>> +        // type (i.e. create an invalid `BitInt`) must be aborted at build time.
>> +        build_assert!(
>> +            <$type>::BITS >= $num_bits,
>> +            "Number of bits requested for representation is larger than backing type."
>> +        );
>> +
>> +        let shift: u32 = <$type>::BITS - $num_bits;
>> +        let v = $value;
>> +
>> +        // The value fits within `NUM_BITS` if shifting it left by the number of unused bits, then
>> +        // right by the same number, doesn't change the value.
>> +        //
>> +        // This method has the benefit of working with both unsigned and signed integers.
>> +        (v << shift) >> shift == v
>
> In C it doesn't work:
>
>         int c = 0x7fffffff;
>         printf("%x\t%x\n", c, (c << 1) >> 1); // 7fffffff	ffffffff
>
> Neither in rust:   
>
>         let c: i32 = 0x7fffffff;
>         println!("{} {}", c, (c << 1) >> 1);  // 2147483647 -1
>
> Or I misunderstand something?
For a short while I though this was indeed not working, to my despair as
this looked like an elegant solution.
But then I considered why we are doing that shift by 1 in the first
place: that would be because we are intent on using only 31 of the 32
bits of the `0x7fffffff` value, in order to encode a signed number.
And the positive `0x7fffffff` value cannot be encoded as a signed
integer of 31 bits, as after ignoring the MSB the next bit (which would
be the sign bit) is `1`. So the post-shift value differs from the
original one, and the creation of a `BitInt<i32, 31>` for that value
fails, which is working as intended.
If OTOH we did the same operation for a `BitInt<u32, 31>`, the
non-signed nature of the value would make the shift-and-back operation
produce the same value as the initial one - allowing the creation of the
`BitInt`, which again is correct because `0x7fffffff` can be
represented as an unsigned value using 31 bits.
I have tested both cases successfully - so this way of validating still
looks correct to me.
>
>> +    }};
>> +}
>> +
>> +/// Trait for primitive integer types that can be used to back a [`BitInt`].
>> +///
>> +/// This is mostly used to lock all the operations we need for [`BitInt`] in a single trait.
>> +pub trait Boundable
>> +where
>> +    Self: Integer
>> +        + Sized
>> +        + Copy
>> +        + core::ops::Shl<u32, Output = Self>
>> +        + core::ops::Shr<u32, Output = Self>
>> +        + core::cmp::PartialEq,
>> +    Self: TryInto<u8> + TryInto<u16> + TryInto<u32> + TryInto<u64>,
>> +    Self: TryInto<i8> + TryInto<i16> + TryInto<i32> + TryInto<i64>,
>> +{
>> +    /// Returns `true` if `value` can be represented with at most `NUM_BITS` on `T`.
>> +    fn fits_within(value: Self, num_bits: u32) -> bool {
>> +        fits_within!(value, Self, num_bits)
>> +    }
>> +}
>> +
>> +/// Inplement `Boundable` for all integers types.
>> +impl<T> Boundable for T
>> +where
>> +    T: Integer
>> +        + Sized
>> +        + Copy
>> +        + core::ops::Shl<u32, Output = Self>
>> +        + core::ops::Shr<u32, Output = Self>
>> +        + core::cmp::PartialEq,
>> +    Self: TryInto<u8> + TryInto<u16> + TryInto<u32> + TryInto<u64>,
>> +    Self: TryInto<i8> + TryInto<i16> + TryInto<i32> + TryInto<i64>,
>> +{
>> +}
>> +
>> +/// Integer type for which only the bits `0..NUM_BITS` are valid.
>> +///
>> +/// # Invariants
>> +///
>> +/// Stored values are represented with at most `NUM_BITS` bits.
>> +///
>> +/// # Examples
>> +///
>> +/// ```
>> +/// use kernel::num::BitInt;
>> +///
>> +/// // An unsigned 8-bit integer, of which only the 4 LSBs can ever be set.
>> +/// // The value `15` is statically validated to fit within the specified number of bits.
>> +/// let v = BitInt::<u8, 4>::new::<15>();
>
> What for do you make user to declare the storage explicitly? From
> end-user perspective, declaring 4-bit variable must imply the most
> suitable storage... C version of the same doesn't allow user to select
> the storage as well:
>
>         _BitInt(4) x = 8;
>
> I can't think out any useful usecase for this... I believe that the
> optimal storage must be chosen by implementation. And it may even be
> different for different arches.
Alice provided a good justification in her reply, but let me elaborate a
bit more.
C allows itself to implicitly cast values when performing operations.
For instance:
    signed char a = -128;
    unsigned int b = 120;
    unsigned short c = b + a;
    printf("%d\n", c);
This absolutely not confusing program is perfectly valid and prints
`65528`.
For the equivalent code in Rust:
    let a = -128i8;
    let b = 120u32;
    let c: u16 = b + a;
I don't even bother printing the result because these 3 lines alone
produce 3 build errors. Rust doesn't let you perform operations on
primitive integers that are not exactly the same type.
So that's the first reason for explicitly passing a type: so you can
perform the operations you want with your `BitInt` without jumping
through cast hoops. This is particularly important to use it with
bitfields, which is the primary goal.
Another reason is that even if you don't care about the size of the
storage, you should care about its signedness, which is part of the
type. I played a bit with C's `_Bitint`, and managed to produce this
wonder:
    _BitInt(8) v = -1;
    printf("%d\n", v);
This programs prints `255`, even though I used the signed "%d"
formatter? Maybe that's because I should make my `_BitInt` explicitly
signed?
    signed _BitInt(8) v = -1;
    printf("%d\n", v);
Nope. Still `255`. Go figure. ¯\_(ツ)_/¯
You cannot do that with our implementation. You either specify
    let v = BitInt::<i8, 8>::new::<-1>();
and get a proper, signed `-1` value that prints properly, or do
    let v = BitInt::<u8, 8>::new::<-1>();
and get the build error you should get. No ambiguity, no surprises.
>
>> +/// assert_eq!(v.get(), 15);
>> +///
>> +/// let v = BitInt::<i8, 4>::new::<-8>();
>> +/// assert_eq!(v.get(), -8);
>> +///
>> +/// // This doesn't build: a `u8` is smaller than the requested 9 bits.
>> +/// // let _ = BitInt::<u8, 9>::new::<10>();
>> +///
>> +/// // This also doesn't build: the requested value doesn't fit within the requested bits.
>> +/// // let _ = BitInt::<i8, 4>::new::<8>();
>> +///
>> +/// // Values can also be validated at runtime. This succeeds because `15` can be represented
>> +/// // with 4 bits.
>> +/// assert!(BitInt::<u8, 4>::try_new(15).is_some());
>> +/// // This fails because `16` cannot be represented with 4 bits.
>> +/// assert!(BitInt::<u8, 4>::try_new(16).is_none());
>
> Nice! Maybe .is_overflow() instead of .is_none(), so that user will
> know that the variable contains truncated value. Just like C does.
>
> Can you please print the exact error that user will get on compile-
> and runtime? How big is the cost of runtime test for overflow? If it
> is indeed nonzero, can you consider making the runtime part
> configurable?
`is_none()` comes from the `Option` type, so we cannot change its name
and it's the idiomatic way to do anyway.
The runtime cost for overflow is the double-shift and comparison with
the initial value.
>
>> +/// // Non-constant expressions can also be validated at build-time thanks to compiler
>> +/// // optimizations. This should be used as a last resort though.
>> +/// let v = BitInt::<u8, 4>::from_expr(15);
>
> Not sure I understand that. Can you confirm my understanding?
>
> 1. For compile-time initialization I use BitInt::<i8, 4>::new::<8>();
> 2. For compile- or runtime initialization: BitInt::<i8, 4>::from_expr(val);
> 3. For runtime-only initialization: BitInt::<i8, 4>::try_new(val);
>
> In this scheme #3 looks excessive...
This example was not very good. The v2 features a better one:
    let v = BitInt::<u32, 4>::from_expr(some_number() & 0xf);
Here assume that `some_number()` returns a random value. You cannot use
the static `new` constructor, as the exact value is not statically
known, so the alternative would be to use `try_new` and check for an
overflow error you know cannot happen because the value is masked with
`0xf` and thus will fit the 4 bits.
`from_expr` allows you to create this `BitInt` infallibly, by relying on
the compiler being smart enough to infer from the mask operation that
the value will indeed satify the constraints of the `BitInt`, and throws
a linker error if it couldn't. If the program builds, there is no need
for error checking and no runtime validation.
>
>> +/// // Common operations are supported against the backing type.
>> +/// assert_eq!(v + 5, 20);
>> +/// assert_eq!(v / 3, 5);
>
> No, v + 5 == 20 for a different reason. There's nothing about 'backing
> storage' here.
>
> v + 5 should be 20 because addition implies typecasting to the wider
> type. In this case, 20 is numeral, or int, and BitInt(4) + int == int.
>
> I tried C23, and it works exactly like that:
>
>     unsigned _BitInt(4) x = 15;
>
>     printf("%d\n", x + 5);                              // 20
>     printf("%d\n", x / 3);                              // 5
>     printf("%d\n", x + (unsigned _BitInt(4))5);         // 4
>     x += 5;
>     printf("%d\n", x);                                  // 4
>
> Rust _must_ do the same thing to at least be arithmetically
> compatible to the big brother. 
Rust doesn't do implicit casts. I do agree that "backing storage" is not
the best choice of words though.
>
> It makes me more confident that this 'backing storage' concept
> brings nothing but confusion.
>
>> +/// // The backing type can be changed while preserving the number of bits used for representation.
>> +/// assert_eq!(v.cast::<u32>(), BitInt::<u32, 4>::new::<15>());
>> +///
>> +/// // We can safely extend the number of bits...
>> +/// assert_eq!(v.extend::<5>(), BitInt::<u8, 5>::new::<15>());
>> +/// // ... but reducing the number of bits fails here as the value doesn't fit anymore.
>> +/// assert_eq!(v.try_shrink::<3>(), None);
>
> Not sure what for you need this. If I need to 'extend', I just assign
> the value to a variable:
>
>         BitInt(3) a = 3;
>         BitInt(10) b;
>         int c;
>
>         b = a + 123;    // extend
>         c = b;          // another extend
>
> How would this 'extend' and 'shrink' work with arrays of BitInts?
I think this discussion would be more productive if we don't rely on
examples in a language that is irrelevant for the current patch. :)
Rust does not allow overloading the `=` operator, so these implicit
conversions from one type to another cannot be performed. Having
dedicated methods is an idiomatic way to do this according to my
experience - an alternative would be to have `From` implementations, but
doing this elegantly would require one language feature (generic const
expressions) that is still not stable.
>
>> +/// // Conversion into primitive types is dependent on the number of useful bits, not the backing
>> +/// // type.
>> +/// //
>> +/// // Event though its backing type is `u32`, this `BitInt` only uses 8 bits and thus can safely
>> +/// // be converted to a `u8`.
>> +/// assert_eq!(u8::from(BitInt::<u32, 8>::new::<128>()), 128u8);
>
> 'Backing type' is useless here too.
>
>> +/// // The same applies to signed values.
>> +/// asserkkt_eq!(i8::from(BitInt::<i32, 8>::new::<127>()), 127i8);
>> +///
>> +/// // This however is not allowed, as 10 bits won't fit into a `u8` (regardless of the actually
>> +/// // contained value).
>> +/// // let _ = u8::from(BitInt::<u32, 10>::new::<10>());
>
> If I explicitly typecast from a wider type, please just let me do
> that. In the above examples you show that .is_some() and .is_none()
> can help user to check for overflow if needed.
>
> Otherwise, user will hack your protection by just converting
> BitInt(8) to u32, and then to BitInt(10).
Doing so would require validating that the value in the `u32` can fit
within the 10-bit `BitInt` one way or the other, so the protection
cannot be bypassed that way.
After comparing this implementation with C's `_BitInt`, I have also come
to a more fundamental divergence between the two.
The C `_BitInt` is used to express numbers with an arbitrary number of
bits - which could be less than a primitive type, but also *more* - for
instance a `_BitInt(4094)` is a valid thing!
Which is really cool, but also not something we need or want in the
kernel. Our purposes here is strictly to limit the width of existing
primitive types to provide extra guarantees about the code. And even if
we wanted to mimic the C `_BitInt`, we simply couldn't without compiler
support as literal values larger than a primitive type cannot even be
expressed.
So although I liked the `BitInt` name, that makes it quite a bit
misleading for our type as users could think that they will have an
equivalent to the C namesake, while the purpose and use is different.
The original `BoundedInt` name was a more accurate fit IMHO, but I hope
we can find something shorter.
Powered by blists - more mailing lists