linux-kernel - Re: C aggregate passing (Rust kernel policy)

Message-ID: <f3a83d60-3506-4e20-b202-ef2ea99ef4dc@ralfj.de>
Date: Wed, 26 Feb 2025 17:32:36 +0100
From: Ralf Jung <post@...fj.de>
To: Ventura Jack <venturajack85@...il.com>
Cc: Kent Overstreet <kent.overstreet@...ux.dev>,
 Miguel Ojeda <miguel.ojeda.sandonis@...il.com>, Gary Guo <gary@...yguo.net>,
 torvalds@...ux-foundation.org, airlied@...il.com, boqun.feng@...il.com,
 david.laight.linux@...il.com, ej@...i.de, gregkh@...uxfoundation.org,
 hch@...radead.org, hpa@...or.com, ksummit@...ts.linux.dev,
 linux-kernel@...r.kernel.org, rust-for-linux@...r.kernel.org
Subject: Re: C aggregate passing (Rust kernel policy)

Hi VJ,

>>
>>> - Rust has not defined its aliasing model.
>>
>> Correct. But then, neither has C. The C aliasing rules are described in English
>> prose that is prone to ambiguities and misinterpretation. The strict aliasing
>> analysis implemented in GCC is not compatible with how most people read the
>> standard (https://bugs.llvm.org/show_bug.cgi?id=21725). There is no tool to
>> check whether code follows the C aliasing rules, and due to the aforementioned
>> ambiguities it would be hard to write such a tool and be sure it interprets the
>> standard the same way compilers do.
>>
>> For Rust, we at least have two candidate models that are defined in full
>> mathematical rigor, and a tool that is widely used in the community, ensuring
>> the models match realistic use of Rust.
> 
> But it is much more significant for Rust than for C, at least with
> regard to C's "restrict", since "restrict" is rarely used in C, while
> aliasing optimizations are pervasive in Rust. For C's "strict aliasing",
> I think you have a good point, but "strict aliasing" is still easier to
> reason about in my opinion than C's "restrict". Especially if you
> never have any type casts of any kind nor union type punning.

Is it easier to reason about? At least GCC got it wrong, making no-aliasing 
assumptions that are not justified by most people's interpretation of the model:
https://bugs.llvm.org/show_bug.cgi?id=21725
(But yes that does involve unions.)

>>> - The aliasing rules in Rust are possibly as hard or
>>>      harder than for C "restrict", and it is not possible to
>>>      opt out of aliasing in Rust, which is cited by some
>>>      as one of the reasons for unsafe Rust being
>>>      harder than C.
>>
>> That is not quite correct; it is possible to opt-out by using raw pointers.
> 
> Again, I did have this list item:
> 
> - Applies to certain pointer kinds in Rust, namely
>      Rust "references".
>      Rust pointer kinds:
>      https://doc.rust-lang.org/reference/types/pointer.html
> 
> where I wrote that the aliasing rules apply to Rust "references".

Okay, fair. But it is easy to misunderstand the other items in your list in 
isolation.
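
To make the raw-pointer opt-out concrete, here is a minimal sketch. The function name `bump_both` is made up for illustration and is not taken from any kernel or library code. Both arguments may point to the same `i32`; since they are raw pointers rather than references, the compiler may not assume they are distinct.

```rust
/// Increments through both pointers and returns the final value behind `a`.
/// `bump_both` is a hypothetical name used only for this illustration.
///
/// # Safety
/// Both pointers must be valid for reads and writes of an `i32`.
/// They are allowed to alias each other.
unsafe fn bump_both(a: *mut i32, b: *mut i32) -> i32 {
    // SAFETY: the caller guarantees validity; raw pointers carry no
    // no-alias assumption, so `a == b` is fine here.
    unsafe {
        *a += 1;
        *b += 1;
        *a
    }
}
```

Calling this with two copies of one pointer to a zero-initialized `i32` yields 2. The same function written with two `&mut i32` parameters could not be called that way from safe code, because the borrow checker rejects two live mutable references to the same value.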

> 
>>>      the aliasing rules, may try to rely on MIRI. MIRI is
>>>      similar to a sanitizer for C, with similar advantages and
>>>      disadvantages. MIRI uses both the stacked borrow
>>>      and the tree borrow experimental research models.
>>>      MIRI, like sanitizers, does not catch everything, though
>>>      MIRI has been used to find undefined behavior/memory
>>>      safety bugs in for instance the Rust standard library.
>>
>> Unlike sanitizers, Miri can actually catch everything. However, since the exact
>> details of what is and is not UB in Rust are still being worked out, we cannot
>> yet in good conscience promise that "Miri catches all UB". However, as
>> the Miri README states:
>> "To the best of our knowledge, all Undefined Behavior that has the potential to
>> affect a program's correctness is being detected by Miri (modulo bugs), but you
>> should consult the Reference for the official definition of Undefined Behavior.
>> Miri will be updated with the Rust compiler to protect against UB as it is
>> understood by the current compiler, but it makes no promises about future
>> versions of rustc."
>> See the Miri README (https://github.com/rust-lang/miri/?tab=readme-ov-file#miri)
>> for further details and caveats regarding non-determinism.
>>
>> So, the situation for Rust here is a lot better than it is in C. Unfortunately,
>> running kernel code in Miri is not currently possible; figuring out how to
>> improve that could be an interesting collaboration.
> 
> I do not believe that you are correct when you write:
> 
>      "Unlike sanitizers, Miri can actually catch everything."
> 
> Critically and very importantly, unless I am mistaken about MIRI, and
> similar to sanitizers, MIRI only checks with runtime tests. That means
> that MIRI will not catch any undefined behavior that a test does
> not encounter. If a project's test coverage is poor, MIRI will not
> check a lot of the code when run with those tests. Please do
> correct me if I am mistaken about this. I am guessing that you
> meant this as well, but I do not get the impression that it is
> clear from your post.

Okay, I may have misunderstood what you mean by "catch everything". All 
sanitizers miss some UB that actually occurs in the given execution. This is 
because they are inserted in the pipeline after a bunch of compiler-specific 
choices have already been made, potentially masking some UB. I'm not aware of a 
sanitizer for sequence point violations. I am not aware of a sanitizer for 
strict aliasing or restrict. I am not aware of a sanitizer that detects UB due 
to out-of-bounds pointer arithmetic (I am not talking about OOB accesses; just 
the arithmetic is already UB), or UB due to violations of "pointer lifetime end 
zapping", or UB due to comparing pointers derived from different allocations. Is 
there a sanitizer that correctly models what exactly happens when a struct with 
padding gets copied? The padding must be reset to be considered "uninitialized", 
even if the entire struct was zero-initialized before. Most compilers implement 
such a copy as memcpy; a sanitizer would then miss this UB.
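
For comparison, the pointer-arithmetic case has a direct Rust counterpart. This is only a sketch, and `far_past_end` is a hypothetical helper name. As documented for the standard library, `add` on a raw pointer requires the result to stay within the same allocation (or one past its end), whereas `wrapping_add` has no such requirement. To my understanding Miri reports the out-of-bounds `add` even if the resulting pointer is never dereferenced.

```rust
/// Computes an address 8 bytes past the start of a 4-byte buffer without
/// ever dereferencing it. `far_past_end` is a hypothetical helper name.
fn far_past_end(buf: &[u8; 4]) -> *const u8 {
    // Defined behavior: `wrapping_add` does not require the result to stay
    // inside the allocation, as long as the pointer is never dereferenced.
    buf.as_ptr().wrapping_add(8)

    // By contrast, the following would be UB through the arithmetic alone,
    // because the result lies more than one byte past the end of `buf`:
    //     unsafe { buf.as_ptr().add(8) }
}
```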

In contrast, Miri checks for all the UB that the Rust compiler relies on 
anywhere -- anything it missed would be a critical bug in either Miri or the compiler.
But yes, it only does so on the code paths you are actually testing. And yes, it 
is very slow.
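
The code-path limitation can be sketched as follows. The function `read_at_buggy` is a deliberately broken, hypothetical example: its `else` branch performs an out-of-bounds read, but a test that only passes in-bounds indices never executes that branch, so I would expect a run under `cargo miri test` to report nothing for it.

```rust
/// Deliberately buggy, hypothetical example: the `else` branch is UB
/// whenever it runs, but only inputs with `idx >= data.len()` reach it.
fn read_at_buggy(data: &[u8], idx: usize) -> u8 {
    if idx < data.len() {
        data[idx]
    } else {
        // BUG: out-of-bounds pointer arithmetic and read. Miri can only
        // flag this when some test actually takes this branch.
        unsafe { *data.as_ptr().add(idx) }
    }
}
```

A test suite that calls `read_at_buggy(&[10, 20, 30], 1)` and nothing else passes with or without Miri. The bug stays hidden until some test exercises an out-of-range index.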

Kind regards,
Ralf