Message-ID: <ed7ef66dbde453035117c3f2acb1daefa5bd19eb.camel@tugraz.at>
Date: Wed, 26 Feb 2025 21:00:12 +0100
From: Martin Uecker <uecker@...raz.at>
To: Linus Torvalds <torvalds@...ux-foundation.org>, Ralf Jung
<post@...fj.de>, "Paul E. McKenney" <paulmck@...nel.org>
Cc: Alice Ryhl <aliceryhl@...gle.com>, Ventura Jack
<venturajack85@...il.com>, Kent Overstreet <kent.overstreet@...ux.dev>,
Gary Guo <gary@...yguo.net>, airlied@...il.com, boqun.feng@...il.com,
david.laight.linux@...il.com, ej@...i.de, gregkh@...uxfoundation.org,
hch@...radead.org, hpa@...or.com, ksummit@...ts.linux.dev,
linux-kernel@...r.kernel.org, miguel.ojeda.sandonis@...il.com,
rust-for-linux@...r.kernel.org
Subject: Re: C aggregate passing (Rust kernel policy)
On Wednesday, 26.02.2025 at 09:59 -0800, Linus Torvalds wrote:
> On Wed, 26 Feb 2025 at 05:54, Ralf Jung <post@...fj.de> wrote:
> >
> > The only approach we know that we can actually
> > pull through systematically (in the sense of "at least in principle, we can
> > formally prove this correct") is to define the "visible behavior" of the source
> > program, the "visible behavior" of the generated assembly, and promise that they
> > are the same.
>
> That's literally what I ask for with that "naive" code generation, you
> just stated it much better.
The model is exactly the same as in C. One defines "observable
behavior" (to use C terminology) and the compiler can do whatever it
wants as long as it preserves this.
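For instance (just a toy sketch, nothing kernel-specific): a compiler
may fold the loop below into "return 5050;", but it must keep the load
from the volatile object, because volatile accesses are part of the
observable behavior.

#include <stdio.h>

volatile int device_reg;        /* stands in for a device register */

int sum100(void)
{
        int s = 0;
        for (int i = 1; i <= 100; i++)
                s += i;
        return s;               /* may be compiled as "return 5050;" */
}

int poll_once(void)
{
        return device_reg;      /* this load has to be emitted */
}

int main(void)
{
        printf("%d %d\n", sum100(), poll_once());
        return 0;
}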
Regarding undefined behavior, the idea the C standard originally had
was that compilers do something "naive" (e.g. whatever the architecture
does for the operation) or at least something reasonable. This worked
well until modern optimizers started to exploit the existence of UB
rather aggressively. C and Rust are in the same boat here.
As Ralf said, the difference is that Rust makes it much harder to
accidentally trigger UB.
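A textbook example of that kind of exploitation (a sketch, not code
from anywhere in particular): since signed overflow is UB, the
compiler may assume it cannot happen and fold the check away.

/* With optimization this may be compiled as "return 0;": the compiler
 * assumes x + 1 does not overflow (overflow would be UB), so
 * "x + 1 < x" can never be true.  The "naive" translation, using the
 * architecture's wrapping add, would return 1 for x == INT_MAX. */
int detect_overflow(int x)
{
        return x + 1 < x;
}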
>
> I think some of the C standards problems came from the fact that at
> some point the standards people decided that the only way to specify
> the language was from a high-level language _syntax_ standpoint.
>
> Which is odd, because a lot of the original C semantics came from
> basically a "this is how the result works". It's where a lot of the
> historical C architecture-defined (and undefined) details come from:
> things like how integer division rounding happens, how shifts bigger
> than the word size are undefined, etc. But most tellingly, it's how
> "volatile" was defined.
The compilers changed here, not the C standard. Of course, later the
compiler people in ISO WG14 may have pushed back against *removing*
UB or even against clarifying things (e.g. TS 6010 is not in C23
because the compiler people want to evaluate the impact on
optimization first).
>
> I suspect that what happened is that the C++ people hated the volatile
> definition *so* much (because of how they changed what an "access"
> means), that they then poisoned the C standards body against
> specifying behavior in terms of how the code *acts*, and made all
> subsequent C standards rules be about some much more abstract
> higher-level model that could not ever talk about actual code
> generation, only about syntax.
The model has not changed since at least C89.
For example, see "5.1.2.3 Program execution" in this draft
for C89:
https://www.open-std.org/JTC1/sc22/wg14/www/docs/n1256.pdf
C++ was not standardized until 1998.
> And that was a fundamental shift, and not a good one.
>
> It caused basically insurmountable problems for the memory model
> descriptions. Paul McKenney tried to introduce the RCU memory model
> requirements into the C memory model discussion, and it was entirely
> impossible. You can't describe memory models in terms of types and
> syntax abstractions. You *have* to talk about what it means for the
> actual code generation.
The C model for concurrency indeed came to C11 from C++. It is defined
in terms of accesses to memory objects and when those accesses
become visible to other threads.
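In C11 terms this looks roughly like the sketch below (plain
<stdatomic.h> / <threads.h>, no kernel primitives): the release store
to the flag makes the earlier plain store to data visible to the
thread whose acquire load reads the flag.

#include <stdatomic.h>
#include <stdio.h>
#include <threads.h>

static int data;                /* plain, non-atomic object */
static atomic_int ready;        /* synchronizes the two threads */

static int producer(void *arg)
{
        (void)arg;
        data = 42;              /* plain store */
        atomic_store_explicit(&ready, 1, memory_order_release);
        return 0;
}

static int consumer(void *arg)
{
        (void)arg;
        while (!atomic_load_explicit(&ready, memory_order_acquire))
                ;               /* spin until the flag is visible */
        printf("%d\n", data);   /* must print 42: the store to data
                                   happens before this load */
        return 0;
}

int main(void)
{
        thrd_t p, c;
        thrd_create(&p, producer, NULL);
        thrd_create(&c, consumer, NULL);
        thrd_join(p, NULL);
        thrd_join(c, NULL);
        return 0;
}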
>
> The reason? The standards people wanted to describe the memory model
> not at a "this is what the program does" level, but at the "this is
> the type system and the syntactic rules" level. So the RCU accesses
> had to be defined in terms of the type system, but the actual language
> rules for the RCU accesses are about how the data is then used after
> the load.
If your point is that this should be phrased in terms of atomic
accesses instead of accesses to atomic objects, then I absolutely
agree with you. This is something I tried to get fixed, but it
is difficult. The concurrency work mostly happens in WG21
and not WG14.
But still, the fundamental definition of the model is in terms
of accesses and when those become visible to other threads, and
not in terms of syntax and types.
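To make the distinction concrete (a sketch; the __atomic builtin is a
GCC/Clang extension, not ISO C): in ISO C atomicity hangs off the
_Atomic type of the object, whereas an access-based formulation would
let you perform an atomic load from a plain int.

#include <stdatomic.h>

_Atomic int a;          /* ISO C: atomicity is a property of the type */
int b;                  /* plain object */

int load_object_based(void)
{
        return atomic_load_explicit(&a, memory_order_relaxed);
}

int load_access_based(void)
{
        /* GCC/Clang builtin: the *access* is atomic, the object is a
         * plain int.  ISO C has no standard spelling for this. */
        return __atomic_load_n(&b, __ATOMIC_RELAXED);
}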
>
> (We have various memory model documentation in
> tools/memory-model/Documentation and that goes into the RCU rules in
> *much* more detail, but simplified and much shortened: a
> "rcu_dereference()" could be seen as a much weaker form of
> "load_acquire": it's a barrier only to accesses that are
> data-dependencies, and if you turn a data dependency into a control
> dependency you have to then add specific barriers.
>
> When a variable access is no longer about "this loads this value from
> memory", but is something much more high-level, trying to describe
> that is complete chaos. Plus the description gets to be so abstract
> that nobody understands it - neither the user of the language nor the
> person implementing the compiler.
>
> So I am personally - after having seen that complete failure as a
> by-stander - 100% convinced that the semantics of a language *should*
> be defined in terms of behavior, not in terms of syntax and types.
> Sure, you have to describe the syntax and type system *too*, but then
> you use those to explain the behavior and use the behavior to explain
> what the allowable optimizations are.
>
> > So the Rust compiler promises nothing about the shape of the assembly
> > you will get, only about its "visible" behavior
>
> Oh, absolutely. That should be the basic rule of optimization: you can
> do anything AT ALL, as long as the visible behavior is the same.
>
> > (and which exact memory access occurs when is generally
> > not considered "visible").
>
> .. but this really has to be part of it. It's obviously part of it
> when there might be aliases, but it's also part of it when there is
> _any_ question about threading and/or memory ordering.
>
> And just as an example: threading fundamentally introduces a notion of
> "aliasing" because different *threads* can access the same location
> concurrently. And that actually has real effects that a good language
> absolutely needs to deal with, even when there is absolutely *no*
> memory ordering or locking in the source code.
>
> For example, it means that you cannot ever widen stores unless you
> know that the data you are touching is thread-local. Because the bytes
> *next* to you may not be things that you control.
>
> It also *should* mean that a language must never *ever* rematerialize
> memory accesses (again, unless thread-local).
>
> Seriously - I consider memory access rematerialization a huge bug, and
> both a security and correctness issue. I think it should be expressly
> forbidden in *any* language that claims to be reliable.
> Rematerialization of memory accesses is a bug, and is *hugely* visible
> in the end result. It introduces active security issues and makes
> TOCTOU (Time-of-check to time-of-use) a much bigger problem than it
> needs to be.
Rematerialization or widening is essentially forbidden by
the C++ / C memory model.
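As a concrete illustration of the widening case (a sketch of the
miscompilation the model rules out): a one-byte store must not be
implemented as a read-modify-write of the surrounding word, because
another thread may legitimately own the neighbouring byte.

struct flags {
        char a;         /* only ever written by thread 1 */
        char b;         /* only ever written by thread 2 */
};

void set_a(struct flags *f)
{
        /* Must be a 1-byte store.  Widening it to "load the word,
         * modify, store the word back" would invent a write to f->b,
         * which can lose a concurrent update of f->b by the other
         * thread, i.e. the compiler would introduce a data race that
         * the source program does not contain. */
        f->a = 1;
}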
>
> So memory accesses need to be part of the "visible" rules.
>
> I claim that C got that right with "volatile". What C got wrong was to
> move away from that concept, and _only_ have "volatile" defined in
> those terms. Because "volatile" on its own is not very good (and that
> "not very good" has nothing to do with the mess that C++ made of it).
I don't get your point. The compiler needs to preserve observable
behavior (which includes volatile accesses), while the concurrency
model is defined in terms of the visibility of stored values as seen
by loads in other threads. This visibility does not imply observable
behavior, so non-volatile accesses do not have to be preserved by
optimizations. Still, this model fundamentally constrains
optimizations, e.g. by ruling out the widening of stores you mention
above. I think this is basically how this *has* to work, or at least
I do not see how it could be done differently.
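For example (again just a sketch): the two plain loads below may be
merged into one, and the dead plain store may be dropped, but the
volatile accesses have to stay exactly as written.

volatile int vreg;
int shared;

int plain_twice(void)
{
        return shared + shared;         /* may become a single load */
}

int volatile_twice(void)
{
        return vreg + vreg;             /* both loads must be emitted */
}

void plain_dead_store(int x)
{
        shared = x;                     /* may be removed ...          */
        shared = x + 1;                 /* ... keeping only this store */
}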
I think C++ messed up a lot (including time-travel UB, uninitialized
variables, aliasing rules and much more), but I do not see
the problem here.
Martin