linux-kernel - RE: "Verifying and Optimizing Compact NUMA-Aware Locks on Weak Memory Models"

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <e8b6b7222a894984b4d66cdcc6435efe@huawei.com>
Date:   Sat, 10 Sep 2022 20:41:30 +0000
From:   Hernan Luis Ponce de Leon <hernanl.leon@...wei.com>
To:     Alan Stern <stern@...land.harvard.edu>
CC:     Jonas Oberhauser <jonas.oberhauser@...wei.com>,
        Boqun Feng <boqun.feng@...il.com>,
        Peter Zijlstra <peterz@...radead.org>,
        "Paul E. McKenney" <paulmck@...nel.org>,
        "parri.andrea@...il.com" <parri.andrea@...il.com>,
        "will@...nel.org" <will@...nel.org>,
        "npiggin@...il.com" <npiggin@...il.com>,
        "dhowells@...hat.com" <dhowells@...hat.com>,
        "j.alglave@....ac.uk" <j.alglave@....ac.uk>,
        "luc.maranget@...ia.fr" <luc.maranget@...ia.fr>,
        "akiyks@...il.com" <akiyks@...il.com>,
        "dlustig@...dia.com" <dlustig@...dia.com>,
        "joel@...lfernandes.org" <joel@...lfernandes.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-arch@...r.kernel.org" <linux-arch@...r.kernel.org>
Subject: RE: "Verifying and Optimizing Compact NUMA-Aware Locks on Weak Memory
 Models"

> You were quoting Jonas here, right?  The email doesn't make this obvious
> because it doesn't have two levels of "> > " markings.

Yes, I was quoting Jonas. 
It seems my mail client did not format the email correctly and I did not notice.
Sorry for that.

> In general, _no_ two distinct relations in the LKMM have the same propagation
> properties.  If wmb always behaved the same way as mb, we wouldn't use two
> separate words for them.

I understand that relations with different names are intended to be different.
What I meant was 
	"wmb gives weaker propagation guarantees than mb and because of this, liveness of qspinlock is not guaranteed in LKMM" 

> 
> > The claim is based on these relations from the memory model
> >
> > let strong-fence = mb | gp
> > ...
> > let cumul-fence = [Marked] ; (A-cumul(strong-fence | po-rel) | wmb |
> > 	po-unlock-lock-po) ; [Marked]
> > let prop = [Marked] ; (overwrite & ext)? ; cumul-fence* ;
> > 	[Marked] ; rfe? ; [Marked]
> 
> Please be more specific.  What difference between mb and wmb are you
> concerned about?  

Since the code uses wmb, there are certain pairs of events that will not be in strong-fence.
Since strong-fence contributes to cumul-fence, cumul-fence to prop, and prop to hb, 
the pair of events related by hb is less that if the code would use mb instead.
Because of this, there are executions (in particular the one that violates liveness) that have 
an acyclic hb relation, but would create a cycle (and thus the memory model would not accept 
the behavior) if mb would have been used.

> Can you give a small litmus test that illustrates this
> difference?  Can you explain in more detail how this difference affects the
> qspinlock implementation?

Here is a litmus test showing the problem (I hope the comment are enough to relate it back to qspinlock)

C Liveness
{
  atomic_t x = ATOMIC_INIT(0);
  atomic_t y = ATOMIC_INIT(0);
}

P0(atomic_t *x) {
  // clear_pending_set_locked
  int r0 = atomic_fetch_add(2,x) ;
}

P1(atomic_t *x, int *z) {
  // queued_spin_trylock
  int r0 = atomic_read(x);
  // barrier after the initialization of nodes
  smp_wmb();
  // xchg_tail
  int r1 = atomic_cmpxchg_relaxed(x,r0,42);
  // link node into the waitqueue
  WRITE_ONCE(*z, 1);
}

P2(atomic_t *x,atomic_t *z) {
  // node initialization
  WRITE_ONCE(*z, 2);
  // queued_spin_trylock
  int r0 = atomic_read(x);
  // barrier after the initialization of nodes
  smp_wmb();
  // xchg_tail
  int r1 = atomic_cmpxchg_relaxed(x,r0,24);
}

exists (0:r0 = 24 /\ 1:r0 = 26 /\ z=2)

herd7 says that the behavior is observable. 
However if you change wmb by mb, it is not observable anymore.

> 
> > From an engineering perspective, I think the only issue is that cat
> > *currently* does not have any syntax for this,
> 
> Syntax for what?  The difference between wmb and mb?
> 
> >  nor does herd currently
> > implement the await model checking techniques proposed in those works
> > (c.f. Theorem 5.3. in the "making weak memory models fair" paper,
> > which says that for this kind of loop, iff the mo-maximal reads in
> > some graph are read in a loop iteration that does not exit the loop,
> > the loop can run forever). However GenMC and I believe also Dat3M and
> > recently also Nidhugg support such techniques. It may not even be too
> > much effort to implement something like this in herd if desired.
> 
> I believe that herd has no way to express the idea of a program running forever.
> On the other hand, it's certainly true (in all of these
> models) than for any finite number N, there is a feasible execution in which a
> loop runs for more than N iterations before the termination condition eventually
> becomes true.

Here I was again quoting Jonas.
I think his intention was to make a distinction between graph based semantics and tools.
While herd7 cannot reason about this kind of property, it is possible to define the property 
using graph based semantics and there are tools already using this.

Hernan