linux-kernel - Re: "Verifying and Optimizing Compact NUMA-Aware Locks on Weak Memory Models"

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <YyHoG8jS/6shACws@rowland.harvard.edu>
Date:   Wed, 14 Sep 2022 10:41:31 -0400
From:   Alan Stern <stern@...land.harvard.edu>
To:     Hernan Luis Ponce de Leon <hernanl.leon@...wei.com>
Cc:     Jonas Oberhauser <jonas.oberhauser@...wei.com>,
        Joel Fernandes <joel@...lfernandes.org>,
        Boqun Feng <boqun.feng@...il.com>,
        Peter Zijlstra <peterz@...radead.org>,
        "Paul E. McKenney" <paulmck@...nel.org>,
        "parri.andrea@...il.com" <parri.andrea@...il.com>,
        "will@...nel.org" <will@...nel.org>,
        "npiggin@...il.com" <npiggin@...il.com>,
        "dhowells@...hat.com" <dhowells@...hat.com>,
        "j.alglave@....ac.uk" <j.alglave@....ac.uk>,
        "luc.maranget@...ia.fr" <luc.maranget@...ia.fr>,
        "akiyks@...il.com" <akiyks@...il.com>,
        "dlustig@...dia.com" <dlustig@...dia.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-arch@...r.kernel.org" <linux-arch@...r.kernel.org>
Subject: Re: "Verifying and Optimizing Compact NUMA-Aware Locks on Weak
 Memory Models"

On Mon, Sep 12, 2022 at 11:10:17AM +0000, Hernan Luis Ponce de Leon wrote:
> I think it is somehow possible to show the liveness violation using herd7 and the following variant of the code
> 
> ------------------------------------------------------------------------
> C Liveness
> {
>   atomic_t x = ATOMIC_INIT(0);
>   atomic_t y = ATOMIC_INIT(0);
> }
> 
> 
> P0(atomic_t *x) {
>   // clear_pending_set_locked
>   int r0 = atomic_fetch_add(2,x) ;
> }
> 
> P1(atomic_t *x, int *z, int *y) {
>   // this store breaks liveness
>   WRITE_ONCE(*y, 1);
>   // queued_spin_trylock
>   int r0 = atomic_read(x);
>   // barrier after the initialisation of nodes
>   smp_wmb();
>   // xchg_tail
>   int r1 = atomic_cmpxchg_relaxed(x,r0,42);
>   // link node into the waitqueue
>   WRITE_ONCE(*z, 1);
> }
> 
> P2(atomic_t *x,int *z, int *y) {
>   // node initialisation
>   WRITE_ONCE(*z, 2);
>   // queued_spin_trylock
>   int r0 = atomic_read(x);
>   // barrier after the initialisation of nodes
>   smp_wmb();
>   // if we read z==2 we expect to read this store
>   WRITE_ONCE(*y, 0);
>   // xchg_tail
>   int r1 = atomic_cmpxchg_relaxed(x,r0,24);
>   // spinloop
>   int r2 = READ_ONCE(*y);  
>   int r3 = READ_ONCE(*z);  
> }
> 
> exists (z=2 /\ y=1 /\ 2:r2=1 /\ 2:r3=2)
> ------------------------------------------------------------------------
> 
> Condition "2:r3=2" forces the spinloop to read from the first write in P2 and "z=2" forces this write 
> to be last in the coherence order. Conditions "2:r2=1" and "y=1" force the same for the other read.
> herd7 says this behavior is allowed by LKMM, showing that liveness can be violated.
> 
> In all the examples above, if we use mb() instead of wmb(), LKMM does not accept
> the behavior and thus liveness is guaranteed.

That's right.

The issue may be somewhat confused by the fact that there have been 
_two_ separate problems covered in this discussion.  One has to do with 
the use of relaxed vs. non-relaxed atomic accesses, and the other -- 
this one -- has to do with liveness (eventual termination of a spin 
loop).

You can see the distinction quite clearly by noticing in the litmus test 
above, the variable x plays absolutely no role.  There are no 
dependencies from it, it isn't accessed by any instructions that include 
an acquire or release memory barrier, and it isn't used in the "exists" 
clause.  If we remove x from the test (and remove P0, which is now 
vacuous), and we also remove the unneeded reads at the end of P2 
(unneeded because they observe the co-final values stored in y and z), 
what remains is:

P1(int *z, int *y) {
	WRITE_ONCE(*y, 1);
	smp_wmb();
	WRITE_ONCE(*z, 1);
}

P2(int *z, int *y) {
	WRITE_ONCE(*z, 2);
	smp_wmb();
	WRITE_ONCE(*y, 0);
}

exists (z=2 /\ y=1)

which is exactly the 2+2W pattern.  As others have pointed out, this 
pattern is permitted by LKML because we never found a good reason to 
forbid it, even though it cannot occur on any real hardware that 
supports Linux.

On the other hand, the simplicity of this "liveness" test leads me to 
wonder if it isn't missing some crucial feature of the actual qspinlock 
implementation.

Alan