linux-kernel - Re: C aggregate passing (Rust kernel policy)

Message-ID: <p4bawegz52nu3v2l25gnj5gh34patcxeggcdbom327wh3dhxyq@cp735olb55ps>
Date: Fri, 28 Feb 2025 11:21:47 -0500
From: Kent Overstreet <kent.overstreet@...ux.dev>
To: Boqun Feng <boqun.feng@...il.com>
Cc: Ralf Jung <post@...fj.de>, David Laight <david.laight.linux@...il.com>, 
	Steven Rostedt <rostedt@...dmis.org>, Linus Torvalds <torvalds@...ux-foundation.org>, 
	Martin Uecker <uecker@...raz.at>, "Paul E. McKenney" <paulmck@...nel.org>, 
	Alice Ryhl <aliceryhl@...gle.com>, Ventura Jack <venturajack85@...il.com>, 
	Gary Guo <gary@...yguo.net>, airlied@...il.com, ej@...i.de, gregkh@...uxfoundation.org, 
	hch@...radead.org, hpa@...or.com, ksummit@...ts.linux.dev, 
	linux-kernel@...r.kernel.org, miguel.ojeda.sandonis@...il.com, rust-for-linux@...r.kernel.org
Subject: Re: C aggregate passing (Rust kernel policy)

On Fri, Feb 28, 2025 at 08:13:09AM -0800, Boqun Feng wrote:
> On Fri, Feb 28, 2025 at 11:04:28AM -0500, Kent Overstreet wrote:
> > On Fri, Feb 28, 2025 at 07:46:23AM -0800, Boqun Feng wrote:
> > > On Fri, Feb 28, 2025 at 10:41:12AM -0500, Kent Overstreet wrote:
> > > > On Fri, Feb 28, 2025 at 08:44:58AM +0100, Ralf Jung wrote:
> > > > > Hi,
> > > > > 
> > > > > > > I guess you can sum this up to:
> > > > > > > 
> > > > > > >    The compiler should never assume it's safe to read a global more than the
> > > > > > >    code specifies, but if the code reads a global more than once, it's fine
> > > > > > >    to cache the multiple reads.
> > > > > > > 
> > > > > > > Same for writes, but I find WRITE_ONCE() used less often than READ_ONCE().
> > > > > > > And when I do use it, it is more to prevent write tearing as you mentioned.
> > > > > > 
> > > > > > Except that (IIRC) it is actually valid for the compiler to write something
> > > > > > entirely unrelated to a memory location before writing the expected value.
> > > > > > (eg use it instead of stack for a register spill+reload.)
> > > > > > > Note that gcc doesn't do that - but the standard lets it do it.
> > > > > 
> > > > > Whether the compiler is permitted to do that depends heavily on what exactly
> > > > > the code looks like, so it's hard to discuss this in the abstract.
> > > > > If inside some function, *all* writes to a given location are atomic (I
> > > > > think that's what you call WRITE_ONCE?), then the compiler is *not* allowed
> > > > > to invent any new writes to that memory. The compiler has to assume that
> > > > > there might be concurrent reads from other threads, whose behavior could
> > > > > change from the extra compiler-introduced writes. The spec (in C, C++, and
> > > > > Rust) already works like that.
> > > > > 
> > > > > OTOH, the moment you do a single non-atomic write (i.e., a regular "*ptr =
> > > > > val;" or memcpy or so), that is a signal to the compiler that there cannot
> > > > > be any concurrent accesses happening at the moment, and therefore it can
> > > > > (and likely will) introduce extra writes to that memory.
> > > > 
> > > > Is that how it really works?
> > > > 
> > > > I'd expect the atomic writes to have what we call "compiler barriers"
> > > > before and after; IOW, the compiler can do whatever it wants with non
> > > 
> > > If the atomic writes are relaxed, they shouldn't have "compiler
> > > barriers" before or after, e.g. our kernel atomics don't have such
> > > compiler barriers. And WRITE_ONCE() is basically relaxed atomic writes.
> > 
> > Then perhaps we need a better definition of ATOMIC_RELAXED?
> > 
> > I've always taken ATOMIC_RELAXED to mean "may be reordered with accesses
> > to other memory locations". What you're describing seems likely to cause
> 
> You lost me on this one. If RELAXED means "reordering is allowed", then
> why would compiler barriers be implied by it?

Yes, "compiler barrier" is the wrong term here.

> > e.g. if you allocate a struct, memset() it to zero it out, then publish
> > it, then do a WRITE_ONCE()...
> 
> How do you publish it? If you mean:
> 
> 	// assume gp == NULL initially.
> 
> 	*x = 0;
> 	smp_store_release(gp, x);
> 
> 	WRITE_ONCE(*x, 1);
> 
> and the other thread does
> 
> 	x = smp_load_acquire(gp);
> 	if (x) {
> 		r1 = READ_ONCE(*x);
> 	}
> 
> r1 can be either 0 or 1.

So if the compiler does obey the store_release barrier, then we're ok.

IOW, that has to override the "compiler sees the non-atomic store as a
hint..." rule - but since we're moving toward concurrency described by
the type system rather than by helpers, I wonder whether that will
actually be the case.
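
For reference, the publish pattern quoted above can be written in
userspace C11 roughly as follows. This is only a sketch under stated
assumptions: atomic_init() stands in for the plain "*x = 0" store, a
release store to a global gp stands in for smp_store_release(), and
relaxed accesses stand in for WRITE_ONCE()/READ_ONCE(). The names
writer, reader and run_publish_example are made up for illustration and
the kernel primitives are not implemented this way.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for the published pointer "gp". */
static atomic_int *_Atomic gp;

static void *writer(void *arg)
{
	atomic_int *x = arg;

	/* Non-atomic initialisation, playing the role of "*x = 0". */
	atomic_init(x, 0);
	/* Publish the pointer; plays the role of smp_store_release(). */
	atomic_store_explicit(&gp, x, memory_order_release);
	/* Relaxed store after publication; plays the role of WRITE_ONCE(*x, 1). */
	atomic_store_explicit(x, 1, memory_order_relaxed);
	return NULL;
}

static void *reader(void *arg)
{
	int *r1 = arg;
	/* Plays the role of smp_load_acquire(). */
	atomic_int *x = atomic_load_explicit(&gp, memory_order_acquire);

	if (x)
		*r1 = atomic_load_explicit(x, memory_order_relaxed);
	return NULL;
}

/* Returns r1: 0 or 1 if the reader saw the pointer, -1 if it ran too early. */
int run_publish_example(void)
{
	pthread_t w, r;
	int r1 = -1;
	int rc;
	atomic_int *x = malloc(sizeof(*x));

	assert(x != NULL);
	/* Start from "gp == NULL", as the quoted example assumes. */
	atomic_store_explicit(&gp, (atomic_int *)NULL, memory_order_relaxed);

	rc = pthread_create(&w, NULL, writer, x);
	assert(rc == 0);
	rc = pthread_create(&r, NULL, reader, &r1);
	assert(rc == 0);
	(void)rc;

	/* Joining both threads orders their accesses before the free(). */
	pthread_join(w, NULL);
	pthread_join(r, NULL);
	free(x);
	return r1;
}
```

If the reader observes the pointer, the acquire load pairs with the
release store, so the initialisation is visible and r1 is 0 or 1. A
return of -1 only means the reader ran before the pointer was published.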

Also, what's the situation with reads? Can we end up in a situation
where a non-atomic read causes the compiler to do erroneous things with an
atomic_load(..., relaxed)?
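
To make the question concrete, the code shape I have in mind is
something like the sketch below. READ_ONCE_SKETCH and WRITE_ONCE_SKETCH
are hypothetical, simplified volatile-cast macros (the kernel's real
READ_ONCE()/WRITE_ONCE() do more), and shared and mixed_reads are
made-up names. It only shows a plain read and a marked read of the same
location sitting in one function; it does not settle what a compiler is
allowed to conclude from the plain one.

```c
#include <assert.h>

/* Simplified stand-ins, assumed for illustration only. */
#define READ_ONCE_SKETCH(x)	(*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE_SKETCH(x, v)	(*(volatile __typeof__(x) *)&(x) = (v))

static int shared;

int mixed_reads(void)
{
	/* Plain read: the compiler may merge, repeat or move this load. */
	int plain = shared;
	/* Marked read: the volatile access must be performed as written. */
	int once = READ_ONCE_SKETCH(shared);

	return plain + once;
}
```

Single-threaded, both reads see the same value, so after
WRITE_ONCE_SKETCH(shared, 21) the function returns 42. The interesting
case is when another thread writes shared concurrently.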