Message-ID: <54b3b2b1-3a0f-4e39-9661-4d1b947663f3@ralfj.de>
Date: Wed, 26 Feb 2025 17:06:21 +0100
From: Ralf Jung <post@...fj.de>
To: Ventura Jack <venturajack85@...il.com>,
Miguel Ojeda <miguel.ojeda.sandonis@...il.com>
Cc: Alice Ryhl <aliceryhl@...gle.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Kent Overstreet <kent.overstreet@...ux.dev>, Gary Guo <gary@...yguo.net>,
airlied@...il.com, boqun.feng@...il.com, david.laight.linux@...il.com,
ej@...i.de, gregkh@...uxfoundation.org, hch@...radead.org, hpa@...or.com,
ksummit@...ts.linux.dev, linux-kernel@...r.kernel.org,
rust-for-linux@...r.kernel.org
Subject: Re: C aggregate passing (Rust kernel policy)
Hi all,
> You are right that I should have written "currently tied", not "tied", and
> I do hope and assume that the work with aliasing will result
> in some sorts of specifications.
>
> The language reference directly referring to LLVM's aliasing rules,
> and that the preprint paper also refers to LLVM, does indicate a tie-in,
> even if that tie-in is incidental and not desired. With more than one
> major compiler, such tie-ins are easier to avoid.
>
> https://doc.rust-lang.org/stable/reference/behavior-considered-undefined.html
> "Breaking the pointer aliasing rules
> http://llvm.org/docs/LangRef.html#pointer-aliasing-rules
> . Box<T>, &mut T and &T follow LLVM’s scoped noalias
> http://llvm.org/docs/LangRef.html#noalias
> model, except if the &T contains an UnsafeCell<U>.
> References and boxes must not be dangling while they are
> live. The exact liveness duration is not specified, but some
> bounds exist:"
The papers mention LLVM since LLVM places a key constraint on the Rust model:
every program that is well-defined in Rust must also be well-defined in
LLVM+noalias. We could design our models in a vacuum and come up
with something theoretically beautiful, but the fact of the matter is that Rust
wants LLVM's noalias-based optimizations, and so a model that cannot justify
those is pretty much dead on arrival.
Not sure if that qualifies as us "tying" ourselves to LLVM -- mostly it just
ensures that in our papers we don't come up with a nonsense model that's useless
in practice. :)
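To make the noalias constraint concrete, here is a minimal sketch of the kind of function where it matters. The function name `bump_twice` is made up for illustration. Whether a given rustc/LLVM version actually performs the optimization described in the comments depends on the compiler version and flags, so treat the comments as describing what the aliasing model is meant to permit, not what is guaranteed to happen.

```rust
/// Hypothetical example: `counter` is a `&mut i32` and `step` is a `&i32`.
/// Under Rust's aliasing rules the two cannot refer to the same memory,
/// which is the property that LLVM's `noalias` attribute expresses.
fn bump_twice(counter: &mut i32, step: &i32) {
    *counter += *step;
    // Because writing through `counter` cannot change `*step`, the compiler
    // is allowed to reuse the value of `*step` loaded above instead of
    // reloading it from memory here.
    *counter += *step;
}

fn main() {
    let mut counter = 10;
    let step = 3;
    bump_twice(&mut counter, &step);
    println!("{counter}");
}
```

An aliasing model that declared a program passing overlapping `counter` and `step` pointers to be well-defined could not justify that load elimination.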
The only real tie that exists is that LLVM is the main codegen backend for Rust,
so we strongly care about what it takes to get LLVM to generate good code. We
are aware of this as a potential concern for over-fitting the model, and are
trying to take that into account. So far, the main case of over-fitting we have
is that we often make something allowed (not UB) in Rust "because we
can", because it is not UB in LLVM -- and that is a challenge for gcc-rs
whenever C has more UB than LLVM, and GCC follows C (some cases where this
occurs: comparing dead/dangling pointers with "==", comparing entirely unrelated
pointers with "<", doing memcpy with a size of 0 [but C is allowing this soon so
GCC will have to adjust anyway], creating but never using an out-of-bounds
pointer with `wrapping_offset`). But I think that's fine (for gcc-rs to work, it
puts pressure on GCC to support these operations efficiently without UB, which I
don't think is a bad thing); it gets concerning only once we make *more* things
UB than we would otherwise for no good reason other than "LLVM says so". I don't
think we are doing that. I think what we did in the aliasing model is entirely
reasonable and can be justified based on optimization benefits and the structure
of how Rust lifetimes and function scopes interact, but this is a subjective
judgment call and reasonable people could disagree on this.
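For readers who want to see the four operations listed above in Rust form, here is a rough sketch. The function names are invented for illustration, and the zero-size copy uses `ptr::copy_nonoverlapping` on an empty array as a stand-in for "memcpy with a size of 0". The result of the dangling-pointer comparison is deliberately not asserted, since the allocator may or may not reuse the freed address.

```rust
use std::ptr;

/// Compare a dangling pointer with a live one using `==`.
/// This only compares addresses; neither pointer is dereferenced.
fn compare_dangling() -> bool {
    let dangling: *const i32 = {
        let b = Box::new(1);
        &*b as *const i32
    }; // `b` is dropped here, so `dangling` now points to freed memory.
    let live = Box::new(2);
    let live_ptr: *const i32 = &*live;
    dangling == live_ptr
}

/// Order two pointers into entirely unrelated allocations with `<`.
fn unrelated_pointers_are_ordered() -> bool {
    let a = 1i32;
    let b = 2i32;
    let pa: *const i32 = &a;
    let pb: *const i32 = &b;
    // Two distinct live locals have distinct addresses, so exactly one
    // of the two orderings holds.
    (pa < pb) != (pb < pa)
}

/// Copy zero bytes; the destination must come back unchanged.
fn zero_size_copy() -> [u8; 4] {
    let src: [u8; 0] = [];
    let mut dst = [7u8; 4];
    // SAFETY: both pointers are non-null and aligned, and zero bytes are copied.
    unsafe { ptr::copy_nonoverlapping(src.as_ptr(), dst.as_mut_ptr(), 0) };
    dst
}

/// Create an out-of-bounds pointer with `wrapping_offset`, never use it
/// while it is out of bounds, then step back in bounds and read.
fn out_of_bounds_round_trip() -> u8 {
    let data = [1u8, 2, 3];
    let start = data.as_ptr();
    let far = start.wrapping_offset(100); // out of bounds, never dereferenced
    let back = far.wrapping_offset(-100); // same address as `start` again
    // SAFETY: `back` points at `data[0]` and was derived from `data`.
    unsafe { *back }
}

fn main() {
    println!("dangling == live: {}", compare_dangling());
    println!("unrelated pointers ordered: {}", unrelated_pointers_are_ordered());
    println!("after zero-size copy: {:?}", zero_size_copy());
    println!("round trip read: {}", out_of_bounds_round_trip());
}
```

The first, second, and fourth functions need no `unsafe` block for the operation in question itself; the C counterparts of these operations are the ones described above as UB.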
The bigger problem is people doing interesting memory management shenanigans via
FFI, and it not being clear whether and how LLVM has considered those
shenanigans in their model, so on the Rust side we can't tell users "this is
fine" until we have an "ok" from the LLVM side -- and meanwhile people do use
those same patterns in C without worrying about it. It can then take a while
until we have convinced LLVM to officially give us (and clang) the guarantees
that clang users have been assuming already for a while.
Kind regards,
Ralf