linux-kernel - Re: [RFC] tools/memory-model: Rule out OOTA

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <a3bf910f-509a-4ad3-a1cc-4b14ef9b3259@rowland.harvard.edu>
Date: Tue, 7 Jan 2025 11:09:55 -0500
From: Alan Stern <stern@...land.harvard.edu>
To: Jonas Oberhauser <jonas.oberhauser@...weicloud.com>
Cc: paulmck@...nel.org, parri.andrea@...il.com, will@...nel.org,
	peterz@...radead.org, boqun.feng@...il.com, npiggin@...il.com,
	dhowells@...hat.com, j.alglave@....ac.uk, luc.maranget@...ia.fr,
	akiyks@...il.com, dlustig@...dia.com, joel@...lfernandes.org,
	urezki@...il.com, quic_neeraju@...cinc.com, frederic@...nel.org,
	linux-kernel@...r.kernel.org, lkmm@...ts.linux.dev,
	hernan.poncedeleon@...weicloud.com
Subject: Re: [RFC] tools/memory-model: Rule out OOTA

On Mon, Jan 06, 2025 at 10:40:03PM +0100, Jonas Oberhauser wrote:
> The current LKMM allows out-of-thin-air (OOTA), as evidenced in the following
> example shared on this list a few years ago:
> 
>   P0(int *a, int *b, int *x, int *y) {
>   	int r1;
> 
>   	r1 = READ_ONCE(*x);
>   	smp_rmb();
>   	if (r1 == 1) {
>   		*a = *b;
>   	}
>   	smp_wmb();
>   	WRITE_ONCE(*y, 1);
>   }
> 
>   P1(int *a, int *b, int *x, int *y) {
>   	int r1;
> 
>   	r1 = READ_ONCE(*y);
>   	smp_rmb();
>   	if (r1 == 1) {
>   		*b = *a;
>   	}
>   	smp_wmb();
>   	WRITE_ONCE(*x, 1);
>   }
> 
>   exists b=42
> 
> The root cause is an interplay between plain accesses and rw-fences, i.e.,
> smp_rmb() and smp_wmb(): while smp_rmb() and smp_wmb() provide sufficient
> ordering for plain accesses to rule out data races, they do not in the current
> formalization generally actually order the plain accesses, allowing, e.g., the
> load and stores to *b to proceed in any order even if P1 reads from P0; and in
> particular, the marked accesses around those plain accesses are also not
> ordered, which causes this OOTA.

That's right.  The memory model deliberately tries to avoid placing 
restrictions on plain accesses, whenever it can.

In the example above, for instance, I think it's more interesting to ask

	exists 0:r1=1 /\ 1:r1=1

than to concentrate on a and b.

OOTA is a very difficult subject.  It can be approached only by making 
the memory model take all sorts of compiler optimizations into account, 
and doing this for all possible optimizations is not feasible.

(For example, in a presentation to the C++ working group last year, Paul 
and I didn't try to show how to extend the C++ memory model to exclude 
OOTA [other than by fiat, as it does now].  Instead we argued that with 
the existing memory model, no reasonable compiler would ever produce an 
executable that could exhibit OOTA and so the memory model didn't need 
to be changed.)

> In this patch, we choose the rather conservative approach of forcing only the
> order of these marked accesses, specifically, when a marked read r is
> separated from a plain read r' by an smp_rmb() (or r' has an address
> dependency on r or is r'), on which a write w' depends, and w' is either plain
> and seperated by a subsequent write w by an smp_wmb() (or w' is w), then r
> precedes w in ppo.

Is this really valid?  In the example above, if there were no other 
references to a or b in the rest of the program, the compiler could 
eliminate them entirely.  (Whether the result could count as OOTA is 
open to question, but that's not the point.)  Is it not possible that a 
compiler might find other ways to defeat your intentions?

In any case, my feeling is that memory models for higher languages 
(i.e., anything above the assembler level) should not try very hard to 
address the question of OOTA.  And for LKMM, OOTA involving _plain_ 
accesses is doubly out of bounds.

Your proposed change seems to add a significant complication to the 
memory model for a not-very-clear benefit.

Alan