linux-kernel - Re: C aggregate passing (Rust kernel policy)

Message-ID: <780ff858-4f8e-424f-b40c-b9634407dce3@ralfj.de>
Date: Wed, 26 Feb 2025 12:34:14 +0100
From: Ralf Jung <post@...fj.de>
To: Kent Overstreet <kent.overstreet@...ux.dev>,
 Miguel Ojeda <miguel.ojeda.sandonis@...il.com>
Cc: Ventura Jack <venturajack85@...il.com>, Gary Guo <gary@...yguo.net>,
 torvalds@...ux-foundation.org, airlied@...il.com, boqun.feng@...il.com,
 david.laight.linux@...il.com, ej@...i.de, gregkh@...uxfoundation.org,
 hch@...radead.org, hpa@...or.com, ksummit@...ts.linux.dev,
 linux-kernel@...r.kernel.org, rust-for-linux@...r.kernel.org
Subject: Re: C aggregate passing (Rust kernel policy)

Hi all,

(For context, I am the supervisor of the Tree Borrows project and the main 
author of its predecessor, Stacked Borrows. I am also maintaining Miri, a Rust 
UB detection tool that was mentioned elsewhere in this thread. I am happy to 
answer any questions you might have about any of these projects. :)

>> Not sure what I said, but Cc'ing Ralf in case he has time and wants to
>> share something on this (thanks in advance!).
> 
> Yeah, this looks like just the thing. At the conference you were talking
> more about memory provenance in C, if memory serves there was cross
> pollination going on between the C and Rust folks - did anything come of
> the C side?

On the C side, there is a provenance model called pnvi-ae-udi (yeah the name is 
terrible, it's a long story ;), which you can read more about at 
<http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2676.pdf>. My understanding is 
that it will not become part of the standard though; I don't understand the 
politics of WG14 well enough to say what exactly its status is. However, my 
understanding is that that model would require some changes to both clang and 
gcc for them to be compliant (and likely most other C compilers that do any kind 
of non-trivial alias analysis); I am not sure what the plans/timeline are for 
making that happen.

The Rust provenance model 
(https://doc.rust-lang.org/nightly/std/ptr/index.html#strict-provenance) is 
designed to not require changes to the backend, except for fixing things that 
are clear bugs that also affect C code 
(https://github.com/llvm/llvm-project/issues/33896, 
https://github.com/llvm/llvm-project/issues/34577).

I should also emphasize that defining the basic treatment of provenance is a 
necessary, but not sufficient, condition for defining an aliasing model.

>>  From a quick look, Tree Borrows was submitted for publication back in November:
>>
>>      https://jhostert.de/assets/pdf/papers/villani2024trees.pdf
>>      https://perso.crans.org/vanille/treebor/
> 
> That's it.
> 
> This looks fantastic, much further along than the last time I looked.
> The only question I'm trying to answer is whether it's been pushed far
> enough into llvm for the optimization opportunities to be realized - I'd
> quite like to take a look at some generated code.

I'm glad you like it. :)

Rust has informed LLVM about some basic aliasing facts since ~forever, and LLVM 
is using those opportunities all over Rust code. Specifically, Rust has set 
"noalias" (the LLVM equivalent of C "restrict") on all function parameters that 
are references (specifically mutable references without pinning, and shared 
references without interior mutability). Stacked Borrows and Tree Borrows 
retroactively justify this by clarifying the rules that are imposed on unsafe 
Rust, such that if unsafe Rust follows those rules, it also follows LLVM's 
"noalias". Unfortunately, C "restrict" and LLVM "noalias" are not specified very 
precisely, so we can only hope that this connection indeed holds.
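
As a rough illustration of the kind of parameters this applies to, here is a 
minimal sketch. The function name `add_twice` is made up for this example, and 
whether the second load is actually removed depends on the optimization level 
and the LLVM version.

```rust
/// `dst` is a mutable reference and `src` is a shared reference to a type
/// without interior mutability, so both parameters are candidates for
/// LLVM's "noalias" attribute.
fn add_twice(dst: &mut i32, src: &i32) {
    *dst += *src;
    // Since `dst` and `src` may be assumed not to alias, the optimizer is
    // allowed to reuse the value of `*src` loaded above instead of
    // reloading it after the write to `*dst`.
    *dst += *src;
}
```

Safe callers cannot violate the assumption: a call like `add_twice(&mut a, &a)` 
is rejected by the borrow checker. The aliasing models matter for unsafe code 
that could otherwise construct such overlapping references.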

Both Stacked Borrows and Tree Borrows go further than "noalias"; among other 
differences, they impose aliasing requirements on references that stay within a 
function. Most of those extra requirements are not yet used by the optimizer (it 
is not clear how to inform LLVM about them, and Rust's own optimizer doesn't use 
them either). Part of the reason for this is that without a precise model, it is 
hard to be sure which optimizations are correct (in the sense that they do not 
break correct unsafe code) -- and both Stacked Borrows and Tree Borrows are 
still experiments, nothing has been officially decided yet.

Let me also reply to some statements made further up-thread by Ventura Jack (in 
<https://lore.kernel.org/rust-for-linux/CAFJgqgSqMO724SQxinNqVGCGc7=ibUvVq-f7Qk1=S3A47Mr-ZQ@mail.gmail.com/>):

> - Aliasing in Rust is not opt-in or opt-out,
>     it is always on.
>     https://doc.rust-lang.org/nomicon/aliasing.html

This is true, but only for references. There are no aliasing requirements on raw 
pointers. There *are* aliasing requirements if you mix references and raw 
pointers to the same location, so if you want to do arbitrary aliasing you have 
to make sure you use only raw pointers, no references. So unlike in C, you have 
a way to opt out entirely within standard Rust.
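
A minimal sketch of what that opt-out can look like, assuming all accesses to 
the location go through raw pointers. The names `add_twice_raw` and 
`aliased_call` are made up for this example.

```rust
use std::ptr::addr_of_mut;

/// Adds `*src` to `*dst` twice. Raw pointers carry no aliasing
/// requirements, so `dst` and `src` may point to the same location.
///
/// Safety: both pointers must be valid for the accesses performed here.
unsafe fn add_twice_raw(dst: *mut i32, src: *const i32) {
    unsafe {
        *dst += *src;
        // `*src` has to be reloaded here: the write above may have changed it.
        *dst += *src;
    }
}

/// Calls `add_twice_raw` with both arguments pointing to the same local.
fn aliased_call() -> i32 {
    let mut x = 1;
    // Create the raw pointer directly, without going through a reference.
    let p = addr_of_mut!(x);
    unsafe { add_twice_raw(p, p as *const i32) };
    x
}
```

With `x` starting at 1, the first addition yields 2 and the second, which 
observes the updated value, yields 4.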

The ergonomics of working with raw pointers could certainly be improved. The 
experience of kernel developers using Rust could help inform that effort. :) 
Though currently the main issue here is that there's nobody actively pushing for 
this.

> - Rust has not defined its aliasing model.

Correct. But then, neither has C. The C aliasing rules are described in English 
prose that is prone to ambiguities and misinterpretation. The strict aliasing 
analysis implemented in GCC is not compatible with how most people read the 
standard (https://bugs.llvm.org/show_bug.cgi?id=21725). There is no tool to 
check whether code follows the C aliasing rules, and due to the aforementioned 
ambiguities it would be hard to write such a tool and be sure it interprets the 
standard the same way compilers do.

For Rust, we at least have two candidate models that are defined in full 
mathematical rigor, and a tool that is widely used in the community, ensuring 
the models match realistic use of Rust.

> - The aliasing rules in Rust are possibly as hard or
>     harder than for C "restrict", and it is not possible to
>     opt out of aliasing in Rust, which is cited by some
>     as one of the reasons for unsafe Rust being
>     harder than C.

That is not quite correct; it is possible to opt out by using raw pointers.

>     the aliasing rules, may try to rely on MIRI. MIRI is
>     similar to a sanitizer for C, with similar advantages and
>     disadvantages. MIRI uses both the stacked borrow
>     and the tree borrow experimental research models.
>     MIRI, like sanitizers, does not catch everything, though
>     MIRI has been used to find undefined behavior/memory
>     safety bugs in for instance the Rust standard library.

Unlike sanitizers, Miri can actually catch everything. However, since the exact 
details of what is and is not UB in Rust are still being worked out, we cannot 
yet in good conscience promise that "Miri catches all UB". That said, as the 
Miri README states:
"To the best of our knowledge, all Undefined Behavior that has the potential to 
affect a program's correctness is being detected by Miri (modulo bugs), but you 
should consult the Reference for the official definition of Undefined Behavior. 
Miri will be updated with the Rust compiler to protect against UB as it is 
understood by the current compiler, but it makes no promises about future 
versions of rustc."
See the Miri README (https://github.com/rust-lang/miri/?tab=readme-ov-file#miri) 
for further details and caveats regarding non-determinism.

So, the situation for Rust here is a lot better than it is in C. Unfortunately, 
running kernel code in Miri is not currently possible; figuring out how to 
improve that could be an interesting collaboration.

Kind regards,
Ralf