linux-kernel - Re: C aggregate passing (Rust kernel policy)

Message-ID: <dd28fe6e2c174f605a104723a5ab8d5445fe8002.camel@tugraz.at>
Date: Wed, 26 Feb 2025 20:07:22 +0100
From: Martin Uecker <uecker@...raz.at>
To: Ralf Jung <post@...fj.de>, Ventura Jack <venturajack85@...il.com>
Cc: Kent Overstreet <kent.overstreet@...ux.dev>, Miguel Ojeda
	 <miguel.ojeda.sandonis@...il.com>, Gary Guo <gary@...yguo.net>, 
	torvalds@...ux-foundation.org, airlied@...il.com, boqun.feng@...il.com, 
	david.laight.linux@...il.com, ej@...i.de, gregkh@...uxfoundation.org, 
	hch@...radead.org, hpa@...or.com, ksummit@...ts.linux.dev, 
	linux-kernel@...r.kernel.org, rust-for-linux@...r.kernel.org
Subject: Re: C aggregate passing (Rust kernel policy)

On Wednesday, 26.02.2025 at 17:32 +0100, Ralf Jung wrote:
> Hi VJ,
> 
> > > 
> > > > - Rust has not defined its aliasing model.
> > > 
> > > Correct. But then, neither has C. The C aliasing rules are described in English
> > > prose that is prone to ambiguities and misinterpretation. The strict aliasing
> > > analysis implemented in GCC is not compatible with how most people read the
> > > standard (https://bugs.llvm.org/show_bug.cgi?id=21725). There is no tool to
> > > check whether code follows the C aliasing rules, and due to the aforementioned
> > > ambiguities it would be hard to write such a tool and be sure it interprets the
> > > standard the same way compilers do.
> > > 
> > > For Rust, we at least have two candidate models that are defined in full
> > > mathematical rigor, and a tool that is widely used in the community, ensuring
> > > the models match realistic use of Rust.
> > 
> > But it is much more significant for Rust than for C, at least in
> > regards to C's "restrict", since "restrict" is rarely used in C, while
> > aliasing optimizations are pervasive in Rust. For C's "strict aliasing",
> > I think you have a good point, but "strict aliasing" is still easier to
> > reason about in my opinion than C's "restrict". Especially if you
> > never have any type casts of any kind nor union type punning.
> 
> Is it easier to reason about? At least GCC got it wrong, making no-aliasing 
> assumptions that are not justified by most people's interpretation of the model:
> https://bugs.llvm.org/show_bug.cgi?id=21725
> (But yes that does involve unions.)

Did you mean to say LLVM got this wrong? As far as I know,
the GCC TBAA code is more correct than LLVM's. It handles
type-changing stores correctly, which LLVM does not implement.
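
To illustrate what a type-changing store is, here is a minimal sketch.
The helper name retype_store is hypothetical and the code is not taken
from the GCC or LLVM test suites. It assumes the usual reading of the
effective type rules: malloc'ed storage has no declared type, so a store
through a float lvalue may change the effective type from int to float.

```c
#include <stdlib.h>

/* Hypothetical helper, for illustration only.
 * Stores an int into allocated memory, then reuses the same bytes
 * for a float. Returns the float read back; *first_out receives the
 * int that was read before the type changed. */
float retype_store(int first, float second, int *first_out)
{
    size_t size = sizeof(int) > sizeof(float) ? sizeof(int) : sizeof(float);
    void *mem = malloc(size);   /* allocated storage has no declared type */
    if (mem == NULL) {
        *first_out = 0;
        return 0.0f;
    }

    int *ip = mem;
    *ip = first;                /* this store makes the effective type int */
    *first_out = *ip;           /* reading it back as int is fine */

    float *fp = mem;
    *fp = second;               /* type-changing store: effective type is now float */
    float result = *fp;         /* reading as float is fine; reading *ip here would not be */

    free(mem);
    return result;
}
```

A compiler that assumes the int store and the float store can never
refer to the same memory could reorder them, which is the kind of
miscompilation at issue.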

> 
> > > > - The aliasing rules in Rust are possibly as hard or
> > > >      harder than for C "restrict", and it is not possible to
> > > >      opt out of aliasing in Rust, which is cited by some
> > > >      as one of the reasons for unsafe Rust being
> > > >      harder than C.
> > > 
> > > That is not quite correct; it is possible to opt-out by using raw pointers.
> > 
> > Again, I did have this list item:
> > 
> > - Applies to certain pointer kinds in Rust, namely
> >      Rust "references".
> >      Rust pointer kinds:
> >      https://doc.rust-lang.org/reference/types/pointer.html
> > 
> > where I wrote that the aliasing rules apply to Rust "references".
> 
> Okay, fair. But it is easy to misunderstand the other items in your list in 
> isolation.
> 
> > 
> > > >      the aliasing rules, may try to rely on MIRI. MIRI is
> > > >      similar to a sanitizer for C, with similar advantages and
> > > >      disadvantages. MIRI uses both the stacked borrow
> > > >      and the tree borrow experimental research models.
> > > >      MIRI, like sanitizers, does not catch everything, though
> > > >      MIRI has been used to find undefined behavior/memory
> > > >      safety bugs in for instance the Rust standard library.
> > > 
> > > Unlike sanitizers, Miri can actually catch everything. However, since the exact
> > > details of what is and is not UB in Rust are still being worked out, we cannot
> > > yet in good conscience promise that "Miri catches all UB". However, as
> > > the Miri README states:
> > > "To the best of our knowledge, all Undefined Behavior that has the potential to
> > > affect a program's correctness is being detected by Miri (modulo bugs), but you
> > > should consult the Reference for the official definition of Undefined Behavior.
> > > Miri will be updated with the Rust compiler to protect against UB as it is
> > > understood by the current compiler, but it makes no promises about future
> > > versions of rustc."
> > > See the Miri README (https://github.com/rust-lang/miri/?tab=readme-ov-file#miri)
> > > for further details and caveats regarding non-determinism.
> > > 
> > > So, the situation for Rust here is a lot better than it is in C. Unfortunately,
> > > running kernel code in Miri is not currently possible; figuring out how to
> > > improve that could be an interesting collaboration.
> > 
> > I do not believe that you are correct when you write:
> > 
> >      "Unlike sanitizers, Miri can actually catch everything."
> > 
> > Critically and very importantly, unless I am mistaken about MIRI, and
> > similar to sanitizers, MIRI only checks with runtime tests. That means
> > that MIRI will not catch any undefined behavior that a test does
> > not encounter. If a project's test coverage is poor, MIRI will not
> > check a lot of the code when run with those tests. Please do
> > correct me if I am mistaken about this. I am guessing that you
> > meant this as well, but I do not get the impression that it is
> > clear from your post.
> 
> Okay, I may have misunderstood what you mean by "catch everything". All 
> sanitizers miss some UB that actually occurs in the given execution. This is 
> because they are inserted in the pipeline after a bunch of compiler-specific 
> choices have already been made, potentially masking some UB. I'm not aware of a 
> sanitizer for sequence point violations. I am not aware of a sanitizer for 
> strict aliasing or restrict. I am not aware of a sanitizer that detects UB due 
> to out-of-bounds pointer arithmetic (I am not talking about OOB accesses; just 
> the arithmetic is already UB), or UB due to violations of "pointer lifetime end 
> zapping", or UB due to comparing pointers derived from different allocations. Is 
> there a sanitizer that correctly models what exactly happens when a struct with 
> padding gets copied? The padding must be reset to be considered "uninitialized", 
> even if the entire struct was zero-initialized before. Most compilers implement 
> such a copy as memcpy; a sanitizer would then miss this UB.

Note that reading padding bytes in C is not UB. Regarding
uninitialized variables, only reading an automatic variable whose
address is never taken is UB in C. Although I suspect that compilers
have compliance issues here.
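
A small sketch of the padding case, with hypothetical names (struct
padded, sum_bytes, copy_struct). It assumes a typical ABI where padding
follows the char member; the code itself does not depend on that.

```c
#include <stddef.h>

struct padded {
    char tag;   /* typically followed by padding so that value is aligned */
    int value;
};

/* Hypothetical helper: sums the object representation byte by byte.
 * Any object may be accessed through unsigned char, so this also reads
 * the padding bytes. Their values are unspecified after a member store,
 * but reading them is not undefined. */
unsigned sum_bytes(const struct padded *p)
{
    const unsigned char *bytes = (const unsigned char *)p;
    unsigned sum = 0;
    for (size_t i = 0; i < sizeof *p; i++)
        sum += bytes[i];
    return sum;
}

/* Hypothetical helper: copy by assignment. The named members are
 * preserved; the padding bytes of the result may hold any value. */
struct padded copy_struct(struct padded src)
{
    struct padded dst = src;
    return dst;
}
```

After a memset to zero and before any member store, sum_bytes() returns
0. Once a member has been written, only the members can be relied on,
not the padding.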

But yes, sanitizers are still rather poor.

Martin

> 
> In contrast, Miri checks for all the UB that is used anywhere in the Rust 
> compiler -- everything else would be a critical bug in either Miri or the compiler.
> But yes, it only does so on the code paths you are actually testing. And yes, it 
> is very slow.
> 
> Kind regards,
> Ralf
>