Message-ID: <CAFJgqgSqMO724SQxinNqVGCGc7=ibUvVq-f7Qk1=S3A47Mr-ZQ@mail.gmail.com>
Date: Sun, 23 Feb 2025 08:30:06 -0700
From: Ventura Jack <venturajack85@...il.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Kent Overstreet <kent.overstreet@...ux.dev>, Gary Guo <gary@...yguo.net>, airlied@...il.com, 
	boqun.feng@...il.com, david.laight.linux@...il.com, ej@...i.de, 
	gregkh@...uxfoundation.org, hch@...radead.org, hpa@...or.com, 
	ksummit@...ts.linux.dev, linux-kernel@...r.kernel.org, 
	miguel.ojeda.sandonis@...il.com, rust-for-linux@...r.kernel.org
Subject: Re: C aggregate passing (Rust kernel policy)

To avoid confusion, I would like to clarify some
aspects of aliasing.
In case you do not already know about this,
I suspect that you may find it valuable.

I am not an expert at Rust, so for any Rust experts
out there, please feel free to point out any errors
or mistakes that I make in the following.

The Rustonomicon is (as I gather) the semi-official
documentation site for unsafe Rust.

Aliasing in C and Rust:

C "strict aliasing":
- Is not a keyword.
- Based on "type compatibility".
- Is turned off by default in the kernel by using
    a compiler flag.

C "restrict":
- Is a keyword, applied to pointers.
- Is an opt-in promise to the compiler that the
    pointed-to object is not also accessed through
    other pointers (no aliasing).
- Is seldom used in practice, since many find
    it difficult to use correctly and avoid
    undefined behavior.

Rust aliasing:
- Is not a keyword.
- Applies to certain pointer kinds in Rust, namely
    Rust "references".
    Rust pointer kinds:
    https://doc.rust-lang.org/reference/types/pointer.html
- The aliasing rules in Rust are not opt-in or opt-out,
    they always apply.
    https://doc.rust-lang.org/nomicon/aliasing.html
- Rust has not defined its aliasing model.
    https://doc.rust-lang.org/nomicon/references.html
        "Unfortunately, Rust hasn't actually
        defined its aliasing model.
        While we wait for the Rust devs to specify
        the semantics of their language, let's use
        the next section to discuss what aliasing is
        in general, and why it matters."
    There is active experimental research on
    defining the aliasing model, including tree borrows
    and stacked borrows.
- The aliasing model not being defined makes
    it harder to reason about and work with
    unsafe Rust, and therefore harder to avoid
    undefined behavior/memory safety bugs.
- Rust "references" are common and widespread.
- If the aliasing rules are broken, undefined
    behavior and lack of memory safety can
    happen.
- In safe Rust, an attempt to break the aliasing rules
    results, depending on which types and functions
    are used, in either a compile-time error or a UB-safe
    runtime error. For instance, RefCell::borrow_mut()
    panics if the value is already borrowed. If all the
    unsafe Rust code, and any safe Rust code that the
    unsafe Rust code relies on, is implemented correctly,
    there is no risk of undefined behavior/memory safety
    bugs when working in safe Rust.

    There are a few caveats that I ignore here, like type
    system holes allowing UB in safe Rust, and no
    stack overflow protection if #![no_std] is used.
    Rust for Linux uses #![no_std].
- The correctness of unsafe Rust code can rely on
    safe Rust code being correct.
    https://doc.rust-lang.org/nomicon/working-with-unsafe.html
        "Because it relies on invariants of a struct field,
        this unsafe code does more than pollute a whole
        function: it pollutes a whole module. Generally,
        the only bullet-proof way to limit the scope of
        unsafe code is at the module boundary with privacy."
- In unsafe Rust, it is the programmer's responsibility
    to obey the aliasing rules, though the type system
    can offer limited help.
- The aliasing rules in Rust are possibly as hard as or
    harder than those for C "restrict", and it is not
    possible to opt out of the aliasing rules in Rust.
    Some cite this as one of the reasons for unsafe Rust
    being harder than C.
- It is necessary to have some understanding of the
    aliasing rules for Rust in order to work with
    unsafe Rust in general.
- Many find unsafe Rust harder than C:
    https://chadaustin.me/2024/10/intrusive-linked-list-in-rust/
    https://lucumr.pocoo.org/2022/1/30/unsafe-rust/
    https://youtube.com/watch?v=DG-VLezRkYQ
    Unsafe Rust being harder than C and C++ is a common
    sentiment in the Rust community, possibly the large
    majority view.
- Some Rust developers, instead of trying to understand
    the aliasing rules, may try to rely on Miri. Miri is
    similar to a sanitizer for C, with similar advantages and
    disadvantages. Miri can check code against either the
    Stacked Borrows or the Tree Borrows experimental
    research model.
    Miri, like sanitizers, does not catch everything, though
    Miri has been used to find undefined behavior/memory
    safety bugs in, for instance, the Rust standard library.
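
As a minimal sketch of the "UB-safe runtime error" case
from the list above (my own illustration, not taken from the
Rustonomicon): RefCell tracks borrows at run time and refuses
a second exclusive borrow while the first is still alive.
The helper name double_borrow is hypothetical, and I use
try_borrow_mut() so that the refusal shows up as an Err
rather than as a panic.

```rust
use std::cell::RefCell;

/// Tries to take two exclusive borrows of the same RefCell at once.
/// Returns whether the first and the second attempt succeeded.
fn double_borrow(cell: &RefCell<i32>) -> (bool, bool) {
    let first = cell.try_borrow_mut();
    let first_ok = first.is_ok();
    // `first` is still alive here, so a second exclusive borrow
    // would alias it. RefCell refuses it with an Err instead of
    // allowing undefined behavior.
    let second_ok = cell.try_borrow_mut().is_ok();
    drop(first);
    (first_ok, second_ok)
}

fn main() {
    let cell = RefCell::new(1);
    let (first_ok, second_ok) = double_borrow(&cell);
    println!("first: {first_ok}, second: {second_ok}");

    // Once the earlier borrow is gone, borrowing works again.
    *cell.borrow_mut() += 1;
    println!("value: {}", cell.borrow());
}
```

With plain references instead of RefCell, the equivalent
mistake (two live `&mut` to the same value) would be rejected
at compile time by the borrow checker.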

So if you do not wish to deal with aliasing rules, you
may need to avoid the pieces of code that contain unsafe
Rust.
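
To make the difference between references and raw pointers
concrete, here is a small sketch of my own (the function names
are hypothetical). As far as I understand the current
experimental models, raw pointers that are derived from the
same `&mut` may alias each other, so the raw-pointer version
has to re-read its source after every write. The reference
version can assume that `dst` and `src` never overlap, and
safe code cannot call it with overlapping arguments at all.

```rust
/// Adds `*src` to `*dst` twice. The two pointers MAY alias.
///
/// # Safety
/// Both pointers must be valid, aligned, and point to initialized i32s.
unsafe fn add_twice(dst: *mut i32, src: *const i32) {
    unsafe {
        // If dst and src alias, the first write changes what the
        // second read of *src sees.
        *dst += *src;
        *dst += *src;
    }
}

/// Same operation with references. `&mut` promises exclusive access,
/// so `src` cannot point at the same value as `dst`.
fn add_twice_ref(dst: &mut i32, src: &i32) {
    *dst += *src;
    *dst += *src;
}

fn main() {
    // Aliasing raw pointers, both derived from the same `&mut x`.
    let mut x = 1;
    let p: *mut i32 = &mut x;
    unsafe { add_twice(p, p as *const i32) };
    println!("{x}"); // 1 + 1 = 2, then 2 + 2 = 4

    // Non-aliasing references.
    let mut a = 1;
    let b = 1;
    add_twice_ref(&mut a, &b);
    println!("{a}"); // 1 + 1 + 1 = 3

    // add_twice_ref(&mut a, &a); // rejected by the borrow checker
}
```

Turning `p` into two references that are live at the same time
(one `&mut i32` and one `&i32`) inside the unsafe code would
break the aliasing rules, even though the raw-pointer version
above is, as far as I can tell, fine.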

Best, VJ.

On Sat, Feb 22, 2025 at 12:18 PM Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
>
> On Sat, 22 Feb 2025 at 10:54, Kent Overstreet <kent.overstreet@...ux.dev> wrote:
> >
> > If that work is successful it could lead to significant improvements in
> > code generation, since aliasing causes a lot of unnecessary spills and
> > reloads - VLIW could finally become practical.
>
> No.
>
> Compiler people think aliasing matters. It very seldom does. And VLIW
> will never become practical for entirely unrelated reasons (read: OoO
> is fundamentally superior to VLIW in general purpose computing).
>
> Aliasing is one of those bug-bears where compiler people can make
> trivial code optimizations that look really impressive. So compiler
> people *love* having simplistic aliasing rules that don't require real
> analysis, because the real analysis is hard (not just expensive, but
> basically unsolvable).
>
> And they matter mainly on bad CPUs and HPC-style loads, or on trivial
> example code. And for vectorization.
>
> And the sane model for those was to just have the HPC people say what
> the aliasing rules were (ie the C "restrict" keyword), but because it
> turns out that nobody wants to use that, and because one of the main
> targets was HPC where there was a very clear type distinction between
> integer indexes and floating point arrays, some "clever" person
> thought "why don't we use that obvious distinction to say that things
> don't alias". Because then you didn't have to add "restrict" modifiers
> to your compiler benchmarks, you could just use the existing syntax
> ("double *").
>
> And so they made everything worse for everybody else, because it made
> C HPC code run as fast as the old Fortran code, and the people who
> cared about DGEMM and BLAS were happy. And since that was how you
> defined supercomputer speeds (before AI), that largely pointless
> benchmark was a BigDeal(tm).
>
> End result: if you actually care about HPC and vectorization, just use
> 'restrict'. If you want to make it better (because 'restrict'
> certainly isn't perfect either), extend on the concept. Don't make
> things worse for everybody else by introducing stupid language rules
> that are fundamentally based on "the compiler can generate code better
> by relying on undefined behavior".
>
> The C standards body has been much too eager to embrace "undefined behavior".
>
> In original C, it was almost entirely about either hardware
> implementation issues or about "you got your pointer arithmetic wrong,
> and the source code is undefined, so the result is undefined".
> Together with some (very unfortunate) order of operations and sequence
> point issues.
>
> But instead of trying to tighten that up (which *has* happened: the
> sequence point rules _have_ actually become better!) and turning the
> language into a more reliable one by making for _fewer_ undefined or
> platform-defined things, many C language features have been about
> extending on the list of undefined behaviors.
>
> The kernel basically turns all that off, as much as possible. Overflow
> isn't undefined in the kernel. Aliasing isn't undefined in the kernel.
> Things like that.
>
> And making the rules stricter makes almost no difference for code
> generation in practice. Really. The arguments for the garbage that is
> integer overflow or 'strict aliasing' in C were always just wrong.
>
> When 'integer overflow' means that you can _sometimes_ remove one
> single ALU operation in *some* loops, but the cost of it is that you
> potentially introduced some seriously subtle security bugs, I think we
> know it was the wrong thing to do.
>
>              Linus
