Message-ID: <CAFJgqgTtR9MCU5nmhEbEB5oiEELCUGeFYy23jmxQXGDZ1Re3Rw@mail.gmail.com>
Date: Wed, 26 Feb 2025 07:26:18 -0700
From: Ventura Jack <venturajack85@...il.com>
To: Kent Overstreet <kent.overstreet@...ux.dev>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>, Alice Ryhl <aliceryhl@...gle.com>,
Gary Guo <gary@...yguo.net>, airlied@...il.com, boqun.feng@...il.com,
david.laight.linux@...il.com, ej@...i.de, gregkh@...uxfoundation.org,
hch@...radead.org, hpa@...or.com, ksummit@...ts.linux.dev,
linux-kernel@...r.kernel.org, miguel.ojeda.sandonis@...il.com,
rust-for-linux@...r.kernel.org
Subject: Re: C aggregate passing (Rust kernel policy)
On Tue, Feb 25, 2025 at 4:34 PM Kent Overstreet
<kent.overstreet@...ux.dev> wrote:
>
> On Tue, Feb 25, 2025 at 01:24:42PM -0800, Linus Torvalds wrote:
> > What we do know works is hard rules based on provenance. All compilers
> > will happily do sane alias analysis based on "this is a variable that
> > I created, I know it cannot alias with anything else, because I didn't
> > expose the address to anything else".
>
> Yep. That's what all this is based on.
>
> > So *provenance*-based aliasing works, but it only works in contexts
> > where you can see the provenance. Having some way to express
> > provenance across functions (and not *just* at allocation time) might
> > be a good model.
>
> We have that! That's exactly what lifetime annotations are.
>
> We don't have that for raw pointers, but I'm not sure that would ever be
> needed since you use raw pointers in small and localized places, and a
> lot of the places where aliasing comes up in C (e.g. memmove()) you
> express differently in Rust, with slices and indices.
>
> (You want to drop from references to raw pointers at the last possible
> moment).
The Rust community in general warns a lot against unsafe Rust, and
encourages developers to write as little unsafe Rust as possible,
or to avoid it entirely. Multiple blog posts have also been written
claiming that unsafe Rust is harder than both C and C++.
I have linked some of those blog posts in other emails, and I will
link them again upon request.
And there have been undefined behavior/memory safety bugs
in Rust projects, both in the Rust standard library (which has a lot
of unsafe Rust relative to many other Rust projects) and in
other Rust projects.
https://nvd.nist.gov/vuln/detail/CVE-2024-27308
Amazon Web Services, possibly the biggest employer of Rust
developers, started a project last year for formal verification of the
Rust standard library.
However, for various reasons, such as the general difficulty of
formal verification, the project is crowd-sourced.
https://aws.amazon.com/blogs/opensource/verify-the-safety-of-the-rust-standard-library/
"Verifying the Rust libraries is difficult because: 1/ lack of a
specification, 2/ lack of an existing verification mechanism
in the Rust ecosystem, 3/ the large size of the verification
problem, and 4/ the unknowns of scalable verification. Given
the magnitude and scope of the effort, we believe that a single
team would be unable to make significant inroads. Our
approach is to create a community owned effort."
All in all, unsafe Rust appears very difficult in practice, and tools
like Miri, while very good, do not catch everything, and share
many of the advantages and disadvantages of sanitizers.
Would unsafe Rust have been substantially easier if Rust did not
have pervasive aliasing optimizations? If a successor language
to Rust also includes the safe-unsafe divide, but does not have
pervasive aliasing optimizations, that may yield an indication of
an answer to that question. Especially if such a language only
uses aliasing optimizations when the compiler, not the
programmer, proves it is safe to do those optimizations.
Rust is very unlikely to drop its aliasing optimizations, since they are
one major reason why Rust has often had comparable, or sometimes
better, performance than C and C++ in some benchmarks, despite
Rust having some runtime checks, as I understand it.
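To illustrate the aliasing point, here is a minimal sketch. The function
names are hypothetical, made up for this example. It only shows what the
types allow the compiler to assume, not what any particular compiler
version actually emits.

```rust
// Hypothetical illustration; the function names are invented for this sketch.

// Two `&mut i32` parameters can never alias in safe Rust, so the
// compiler is allowed to assume `*a` is still 1 after the store to `*b`.
fn with_refs(a: &mut i32, b: &mut i32) -> i32 {
    *a = 1;
    *b = 2;
    *a + *b
}

// Raw pointers carry no such guarantee: `a` and `b` may point to the
// same object, so the value of `*a` has to be re-read after the store
// through `b`.
//
// Safety: both pointers must be valid for reads and writes and aligned.
unsafe fn with_raw(a: *mut i32, b: *mut i32) -> i32 {
    unsafe {
        *a = 1;
        *b = 2;
        *a + *b
    }
}

fn main() {
    let (mut x, mut y) = (0, 0);
    // Distinct objects: 1 + 2.
    println!("{}", with_refs(&mut x, &mut y)); // prints 3

    let mut z = 0;
    let p: *mut i32 = &mut z;
    // Same object passed twice: both stores hit `z`, so 2 + 2.
    println!("{}", unsafe { with_raw(p, p) }); // prints 4
}
```

Passing the same object twice to with_refs is rejected by the borrow
checker. The difficulty in unsafe Rust is that nothing rejects code that
creates two aliasing `&mut` from raw pointers, even though the compiler
may still optimize as if they did not alias.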
> And besides, a lot of the places where aliasing comes up in C are
> already gone in Rust, there's a lot of little things that help.
> Algebraic data types are a big one, since a lot of the sketchy hackery
> that goes on in C where aliasing is problematic is just working around
> the lack of ADTs.
Algebraic data types/tagged unions, together with pattern matching,
are indeed excellent. But they are independent of Rust's novel features.
They are part of the functional programming tradition, and they have
been added to many old and new mainstream programming
languages. They are low-hanging fruit. They help not only with
avoiding undefined behavior/memory safety bugs, but also with
general correctness, maintainability, etc.
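As a small illustration of tagged unions with pattern matching, here is
a sketch in Rust. The Request type and the byte_range function are
hypothetical, invented for this example and not taken from any kernel
code.

```rust
// Hypothetical request type: each variant carries only the fields that
// are meaningful for it, so there is no "which union member is valid?"
// question as with a hand-rolled tag + union in C.
enum Request {
    Flush,
    Read { offset: u64, len: u64 },
    Write { offset: u64, data: Vec<u8> },
}

// Half-open byte range [start, end) touched by a request, if any.
// The `match` must cover every variant; adding a new variant to
// `Request` turns this function into a compile error until it is handled.
fn byte_range(req: &Request) -> Option<(u64, u64)> {
    match req {
        Request::Flush => None,
        Request::Read { offset, len } => Some((*offset, *offset + *len)),
        Request::Write { offset, data } => {
            Some((*offset, *offset + data.len() as u64))
        }
    }
}

fn main() {
    let reqs = [
        Request::Flush,
        Request::Read { offset: 4096, len: 512 },
        Request::Write { offset: 0, data: vec![0u8; 16] },
    ];
    for r in &reqs {
        println!("{:?}", byte_range(r));
    }
}
```

The exhaustiveness check is the part that helps with general
correctness and maintainability, independent of any memory safety
benefit.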
C seems to avoid features that would bring it closer to C++, and C
is seemingly kept simple, but otherwise it should not be difficult to
add them to C. C's simplicity makes it easier to write new C compilers.
Though these days people often write backends for GCC or LLVM,
as I understand it.
If you, the Linux kernel community, really want this low-hanging
fruit, I suspect that you might be able to get the C standards
people to add it. It would be little effort for a lot of benefit in all
your new or refactored C code.
C++ has std::variant, but no pattern matching. Neither of the two
pattern matching proposals for C++26 was accepted, but C++29
will almost certainly have pattern matching.
Curiously, standard C++ does not have C's "restrict" keyword, though
major compilers offer "__restrict" as an extension.
> > But in the absence of knowledge, and in the absence of
> > compiler-imposed rules (and "unsafe" is by *definition* that absence),
> > I think the only rule that works is "don't assume they don't alias".
>
> Well, for the vast body of Rust code that's been written that just
> doesn't seem to be the case, and I think it's been pretty well
> demonstrated that anything we can do in C, we can also do just as
> effectively in Rust.
>
> treeborrow is already merged into Miri - this stuff is pretty far along.
>
> Now if you're imagining directly translating all the old grotty C code I
> know you have in your head - yeah, that won't work. But we already knew
> that.
Yet the Rust community encourages avoiding unsafe Rust whenever
that is possible, and many in the Rust community have claimed
that unsafe Rust is harder than C and C++. And, unlike for C, there
is still only one major Rust compiler and no specification.
As for tree borrows, it is not yet used by default in Miri as far as
I can tell. When I ran Miri against an example with UB, I got a
warning that said that the Stacked Borrows rules are still
experimental. I am guessing that you have to use a flag to enable
tree borrows.
Best, VJ.