Message-ID: <f2c2d7ae-08c1-4122-a131-f5a65e9ed3d2@ralfj.de>
Date: Thu, 27 Feb 2025 14:55:22 +0100
From: Ralf Jung <post@...fj.de>
To: David Laight <david.laight.linux@...il.com>
Cc: Ventura Jack <venturajack85@...il.com>,
Kent Overstreet <kent.overstreet@...ux.dev>,
Miguel Ojeda <miguel.ojeda.sandonis@...il.com>, Gary Guo <gary@...yguo.net>,
torvalds@...ux-foundation.org, airlied@...il.com, boqun.feng@...il.com,
ej@...i.de, gregkh@...uxfoundation.org, hch@...radead.org, hpa@...or.com,
ksummit@...ts.linux.dev, linux-kernel@...r.kernel.org,
rust-for-linux@...r.kernel.org
Subject: Re: C aggregate passing (Rust kernel policy)
Hi all,
> ...
>>> Unions in C, C++ and Rust (not Rust "enum"/tagged union) are
>>> generally sharp. In Rust, it requires unsafe Rust to read from
>>> a union.
>>
>> Definitely sharp. At least in Rust we have a very clear specification though,
>> since we do allow arbitrary type punning -- you "just" reinterpret whatever
>> bytes are stored in the union, at whatever type you are reading things. There is
>> also no "active variant" or anything like that, you can use any variant at any
>> time, as long as the bytes are "valid" for the variant you are using. (So for
>> instance if you are trying to read a value 0x03 at type `bool`, that is UB.)
>
> That is actually a big f***ing problem.
> The language has to define the exact behaviour when 'bool' doesn't contain
> 0 or 1.
No, it really does not. If you want a variable that can hold all values in
0..256, use `u8`. The entire point of the `bool` type is to represent values
that can only ever be `true` or `false`. So the language requires that when you
do type-unsafe manipulation of raw bytes, and when you then make the choice of
the `bool` type for that code (which you are not forced to!), then you must
indeed uphold the guarantees of `bool`: the data must be `0x00` or `0x01`.
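A minimal sketch of that rule, for illustration only. The union `ByteOrBool` and the helper `read_flag` are hypothetical names, not taken from the kernel or from any library. The idea is to read the byte at type `u8`, which is valid for every bit pattern, and only produce a `bool` once the value is known to be `0x00` or `0x01`:

```rust
/// Hypothetical union for illustration: one byte viewed as `u8` or as `bool`.
/// `repr(C)` places both fields at offset 0.
#[repr(C)]
union ByteOrBool {
    byte: u8,
    flag: bool,
}

/// Hypothetical helper: returns the stored value as a `bool` only if the
/// underlying byte is a valid `bool` (0x00 or 0x01).
fn read_flag(u: &ByteOrBool) -> Option<bool> {
    // Reading at type `u8` is fine for any initialized byte; there is no
    // "active variant" to respect.
    let raw = unsafe { u.byte };
    match raw {
        0x00 => Some(false),
        0x01 => Some(true),
        // Reading `u.flag` in this case would be UB, so we never do it.
        _ => None,
    }
}

fn main() {
    let a = ByteOrBool { flag: true };
    assert_eq!(read_flag(&a), Some(true));

    let b = ByteOrBool { byte: 0x03 };
    assert_eq!(read_flag(&b), None);

    // Type punning in the other direction is allowed when the bytes are
    // valid for the type being read: 0x00 is a valid `bool`.
    let c = ByteOrBool { byte: 0x00 };
    assert_eq!(unsafe { c.flag }, false);
}
```

Code that needs to carry arbitrary byte values around would simply keep them as `u8` and never name the `bool` type at all.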
> Much the same as the function call interface defines whether it is the caller
> or called code is responsible for masking the high bits of a register that
> contains a 'char' type.
>
> Now the answer could be that 'and' is (or may be) a bit-wise operation.
> But that isn't UB, just an undefined/unexpected result.
>
> I've actually no idea if/when current gcc 'sanitises' bool values.
> A very old version used to generate really crap code (and I mean REALLY)
> because it repeatedly sanitised the values.
> But IMHO bool just shouldn't exist, it isn't a hardware type and is actually
> expensive to get right.
> If you use 'int' with zero meaning false there is pretty much no ambiguity.
We have many types in Rust that are not hardware types. Users can even define
them themselves:

  enum MyBool { MyFalse, MyTrue }

This is, in fact, one of the entire points of higher-level languages like Rust:
to let users define types that represent concepts that are more abstract than
what exists in hardware. Hardware would also tell us that `&i32` and
`*const i32` are basically the same thing, and yet of course there's a world of
difference between those types in Rust.
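To make the last two points concrete, here is a small sketch. The helpers `my_bool_from_byte`, `read_ref` and `read_raw` are hypothetical names introduced only for this example. It shows a checked conversion into a user-defined two-valued type, and that `&i32` and `*const i32` have the same size while carrying very different guarantees:

```rust
use std::mem::size_of;

/// A user-defined two-valued type, like the `MyBool` above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MyBool {
    MyFalse,
    MyTrue,
}

/// Hypothetical helper: checked conversion from a raw byte. Bytes other
/// than 0 and 1 are rejected instead of being reinterpreted.
fn my_bool_from_byte(b: u8) -> Option<MyBool> {
    match b {
        0 => Some(MyBool::MyFalse),
        1 => Some(MyBool::MyTrue),
        _ => None,
    }
}

/// Safe: a `&i32` is guaranteed to be non-null, aligned, and to point to a
/// valid `i32` for as long as the reference lives.
fn read_ref(r: &i32) -> i32 {
    *r
}

/// Unsafe: a `*const i32` carries none of those guarantees, so the caller
/// has to promise that `p` is valid for reads.
unsafe fn read_raw(p: *const i32) -> i32 {
    // SAFETY: validity of `p` is the caller's obligation.
    unsafe { *p }
}

fn main() {
    let x = 42;
    let r: &i32 = &x;
    let p: *const i32 = r;

    // Same representation at the hardware level ...
    assert_eq!(size_of::<&i32>(), size_of::<*const i32>());
    // ... but the type-level guarantee is real: because `&i32` can never be
    // null, `Option<&i32>` needs no extra space for `None`.
    assert_eq!(size_of::<Option<&i32>>(), size_of::<&i32>());

    assert_eq!(read_ref(r), 42);
    assert_eq!(unsafe { read_raw(p) }, 42);

    assert_eq!(my_bool_from_byte(1), Some(MyBool::MyTrue));
    assert_eq!(my_bool_from_byte(3), None);
}
```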
Kind regards,
Ralf