linux-kernel - RE: "Verifying and Optimizing Compact NUMA-Aware Locks on Weak Memory Models"

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <1326fa48d44b4571b436c07ae9f32d83@huawei.com>
Date:   Mon, 12 Sep 2022 10:46:38 +0000
From:   Jonas Oberhauser <jonas.oberhauser@...wei.com>
To:     Alan Stern <stern@...land.harvard.edu>,
        Hernan Luis Ponce de Leon <hernanl.leon@...wei.com>
CC:     Boqun Feng <boqun.feng@...il.com>,
        Peter Zijlstra <peterz@...radead.org>,
        "Paul E. McKenney" <paulmck@...nel.org>,
        "parri.andrea@...il.com" <parri.andrea@...il.com>,
        "will@...nel.org" <will@...nel.org>,
        "npiggin@...il.com" <npiggin@...il.com>,
        "dhowells@...hat.com" <dhowells@...hat.com>,
        "j.alglave@....ac.uk" <j.alglave@....ac.uk>,
        "luc.maranget@...ia.fr" <luc.maranget@...ia.fr>,
        "akiyks@...il.com" <akiyks@...il.com>,
        "dlustig@...dia.com" <dlustig@...dia.com>,
        "joel@...lfernandes.org" <joel@...lfernandes.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-arch@...r.kernel.org" <linux-arch@...r.kernel.org>
Subject: RE: "Verifying and Optimizing Compact NUMA-Aware Locks on Weak Memory
 Models"

Hi Alan,

Sorry for the confusion.

>  [...] it's certainly true (in all of these
models) than for any finite number N, there is a feasible execution in which a loop runs for more than N iterations before the termination condition eventually becomes true.

Indeed; but more interestingly, as the Theorem 5.3 in "making weak memory models fair" states, under certain conditions it suffices to look at graphs where N=1 to decide whether a loop can run forever (despite all writes propagating to all cores eventually) or will always terminate.

And since herd can generate all such graphs, herd could be extended to make that decision and output it, just like many other tools already do.

To illuminate this on an example, consider the program sent by Peter earlier:
	WRITE_ONCE(foo, 1);		while (!READ_ONCE(foo));

Without the assumption that fr is prefix finite, the graph with infinitely many reads of thread 2 all reading the initial value (and hence being fr-before the store foo=1) would be allowed. However, the tools used to analyze the qspinlock all assume that fr is prefix finite, and hence that such a graph with infinitely many fr-before events does not exist. Because of that, all of the tools will say that this loop always terminates.

However, if you change the code into the following:

	WRITE_ONCE(foo, 1);		WRITE_ONCE(foo, 0); while (!READ_ONCE(foo));

then even under the assumption of fr-prefix-finiteness, the coherence order in which WRITE_ONCE(foo, 1); is overwritten by WRITE_ONCE(foo, 0); of thread 2 would lead to non-terminating behaviors, and these are detected by those tools. Furthermore, if we synchronize the two stores as follows:

	while (! READ_ONCE(bar)) {}; WRITE_ONCE(foo, 1);		WRITE_ONCE(foo, 0); smp_store_release(&bar,1); while (!READ_ONCE(foo));

then the graphs with that co become prohibited as they all have hb cycles, and the tools again will not detect any liveness violations. But if we go further and relax the release barrier as below

	while (! READ_ONCE(bar)) {}; WRITE_ONCE(foo, 1);		WRITE_ONCE(foo, 0); WRITE_ONCE(bar,1); while (!READ_ONCE(foo));

then the hb cycle disappears, and the coherence order in which foo=0 overwrites foo=1 becomes allowed. Again, the tools will detect this and state that thread 2 could be stuck in its while loop forever.

In each of these cases, the decision can be made by looking for a graph in which the loop is executed for one iteration which reads from foo=0, and checking whether that store is co-maximal. So in some sense this is all that is needed to express the idea that a program can run forever, even though only in some very limited (but common) circumstances, namely that the loop iteration that repeats forever does not create modifications to state that are observed outside the loop. Luckily this is a very common case, so these checks have proven quite useful in practice.

The same kind of check could be implemented in herd together with either the implicit assumption that fr is prefix finite (like in other tools) or with some special syntax like

prefix-finite fr | co as fairness

which hopefully clears up the question below:

> > From an engineering perspective, I think the only issue is that cat
> > *currently* does not have any syntax for this,

> Syntax for what?  The difference between wmb and mb?
> [...]

Thanks for your patience and hoping I explained it more clearly,

jonas