linux-kernel - Re: [PATCH] doc: Update wake_up() & co. memory-barrier guarantees

Message-ID: <20180627141522.GA10533@andrea>
Date:   Wed, 27 Jun 2018 16:15:22 +0200
From:   Andrea Parri <andrea.parri@...rulasolutions.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     linux-kernel@...r.kernel.org, linux-doc@...r.kernel.org,
        Alan Stern <stern@...land.harvard.edu>,
        Will Deacon <will.deacon@....com>,
        Boqun Feng <boqun.feng@...il.com>,
        Nicholas Piggin <npiggin@...il.com>,
        David Howells <dhowells@...hat.com>,
        Jade Alglave <j.alglave@....ac.uk>,
        Luc Maranget <luc.maranget@...ia.fr>,
        "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
        Akira Yokosawa <akiyks@...il.com>,
        Daniel Lustig <dlustig@...dia.com>,
        Jonathan Corbet <corbet@....net>,
        Ingo Molnar <mingo@...hat.com>,
        Randy Dunlap <rdunlap@...radead.org>
Subject: Re: [PATCH] doc: Update wake_up() & co. memory-barrier guarantees

> So I'm not actually sure how many people rely on the RCsc transitive
> smp_mb() here. People certainly rely on the RELEASE semantics, and the
> code itself requires the store->load ordering, together that gives us
> the smp_mb() because that's simply the only barrier we have.
> 
> And looking at smp_mb__after_spinlock() again, we really only need the
> RCsc thing for rq->lock, not for the wakeups. The wakeups really only
> need that RCpc RELEASE + store->load thing (which we don't have).
> 
> So yes, smp_mb(), however the below still makes more sense to me, or am
> I just being obtuse again?

While trying to integrate these remarks into v1 and looking again at the
comment before smp_mb__after_spinlock(), I remembered a discussion where
Boqun suggested some improvements for this comment, so I wrote the commit
included at the end of this email.

This raises the following two issues:

 1) First, the problem of integrating the resulting comment into v1,
    where I've been talking about _full_ barriers associated with the
    wakeup functions but where these are actually implemented as:
    
       spin_lock(s);
       smp_mb__after_spinlock();

 2) Second, the problem of formalizing the requirements described in
    that comment (remark: informally, the LKMM currently states that
    the sequence "spin_lock(s); smp_mb__after_spinlock();" generates
    a full barrier; in particular, this sequence orders
    
       {STORE,LOAD} -> {STORE,LOAD}

    according to the current LKMM).
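
As a rough illustration of the STORE -> LOAD case mentioned above, the
requirement can be phrased as a store-buffering litmus test in the
tools/memory-model format. This is only a sketch and has not been checked
with herd7; the test name and the variable names x, y and s are made up
for illustration.

```
C SB+after-spinlock-sketch

(*
 * Hypothetical test: does "spin_lock(s); smp_mb__after_spinlock();"
 * order P0's earlier store to x before its later load from y?
 *)

{}

P0(int *x, int *y, spinlock_t *s)
{
	int r0;

	WRITE_ONCE(*x, 1);
	spin_lock(s);
	smp_mb__after_spinlock();
	r0 = READ_ONCE(*y);
	spin_unlock(s);
}

P1(int *x, int *y)
{
	int r1;

	WRITE_ONCE(*y, 1);
	smp_mb();
	r1 = READ_ONCE(*x);
}

exists (0:r0=0 /\ 1:r1=0)
```

If the sequence really acts as a full barrier, as the informal LKMM
statement above says, the "exists" clause should not be satisfiable.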

For (1), I could simply replace each occurrence of "executes a full memory
barrier" with "executes the sequence spin_lock(s); smp_mb__after_spinlock()";
I haven't really thought about (2) yet, but please notice that something as
simple as

   let mb = [...]  |
            ([W] ; po? ; [LKW] ; fencerel(After-spinlock) ; [R])

would _not_ guarantee "RCsc transitivity" ...

A different approach (that could solve both problems at once) would be to
follow the current formalization in LKMM and to modify the comment before
smp_mb__after_spinlock() accordingly (say, informally, "it's required that
spin_lock(); smp_mb__after_spinlock() provides a full barrier").
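
For reference, the three-CPU snippet from the commit at the end of this
email could be written as a litmus test roughly as follows. Again a
sketch only, not checked with herd7; the test name is made up, and the
register and variable names simply mirror those in the proposed comment.

```
C ISA2+after-spinlock-sketch

(*
 * Hypothetical test mirroring the snippet in the proposed comment:
 * P0 and P1 are the two critical sections, P2 is the outside observer.
 *)

{}

P0(int *x, spinlock_t *s)
{
	spin_lock(s);
	WRITE_ONCE(*x, 1);
	spin_unlock(s);
}

P1(int *x, int *y, spinlock_t *s)
{
	int r1;

	spin_lock(s);
	smp_mb__after_spinlock();
	r1 = READ_ONCE(*x);
	WRITE_ONCE(*y, 1);
	spin_unlock(s);
}

P2(int *x, int *y)
{
	int r2;
	int r3;

	r2 = READ_ONCE(*y);
	smp_rmb();
	r3 = READ_ONCE(*x);
}

exists (1:r1=1 /\ 2:r2=1 /\ 2:r3=0)
```

The "exists" clause encodes the outcome that the comment describes as
possible without RCsc transitivity: r1=1, r2=1 and r3=0.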

Thoughts?

  Andrea

From c3648d5022bedcd356198efa65703e01541cbd3f Mon Sep 17 00:00:00 2001
From: Andrea Parri <andrea.parri@...rulasolutions.com>
Date: Wed, 27 Jun 2018 10:53:30 +0200
Subject: [PATCH 2/3] locking: Fix comment for smp_mb__after_spinlock()

Boqun reported that the snippet described in the header comment for
smp_mb__after_spinlock() can be misleading, because acquire/release
chains already provide us with the underlying guarantee (due to the
A-cumulativity of release).

This commit fixes the comment following Boqun's example in [1].

It's worth noting here that LKMM maintainers are currently actively
debating whether to enforce RCsc transitivity of locking operations
"by definition" [2]; should the guarantee be enforced in the future,
the requirement for smp_mb__after_spinlock() could be simplified to
include only the STORE->LOAD ordering requirement.

[1] http://lkml.kernel.org/r/20180312085600.aczjkpn73axzs2sb@tardis
[2] http://lkml.kernel.org/r/Pine.LNX.4.44L0.1711271553490.1424-100000@iolanthe.rowland.org
    http://lkml.kernel.org/r/Pine.LNX.4.44L0.1806211322160.2381-100000@iolanthe.rowland.org

Reported-and-Suggested-by: Boqun Feng <boqun.feng@...il.com>
Signed-off-by: Andrea Parri <andrea.parri@...rulasolutions.com>
Cc: Peter Zijlstra <peterz@...radead.org>
Cc: Ingo Molnar <mingo@...hat.com>
Cc: Will Deacon <will.deacon@....com>
Cc: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
---
 include/linux/spinlock.h | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/include/linux/spinlock.h b/include/linux/spinlock.h
index 1e8a464358384..c74828fe8d75c 100644
--- a/include/linux/spinlock.h
+++ b/include/linux/spinlock.h
@@ -121,22 +121,22 @@ do {								\
  *
  *   - it must ensure the critical section is RCsc.
  *
- * The latter is important for cases where we observe values written by other
- * CPUs in spin-loops, without barriers, while being subject to scheduling.
+ * The latter requirement guarantees that stores from two critical sections
+ * in different CPUs are ordered even outside the critical sections.  As an
+ * example illustrating this property, consider the following snippet:
  *
- * CPU0			CPU1			CPU2
+ * CPU0			CPU1				CPU2
  *
- *			for (;;) {
- *			  if (READ_ONCE(X))
- *			    break;
- *			}
- * X=1
- *			<sched-out>
- *						<sched-in>
- *						r = X;
+ * spin_lock(s);	spin_lock(s);			r2 = READ_ONCE(Y);
+ * WRITE_ONCE(X, 1);	smp_mb__after_spinlock();	smp_rmb();
+ * spin_unlock(s);	r1 = READ_ONCE(X);		r3 = READ_ONCE(X);
+ *			WRITE_ONCE(Y, 1);
+ *			spin_unlock(s);
  *
- * without transitivity it could be that CPU1 observes X!=0 breaks the loop,
- * we get migrated and CPU2 sees X==0.
+ * Without RCsc transitivity, it is allowed that CPU0's critical section
+ * precedes CPU1's critical section (r1=1) and that CPU2 observes CPU1's
+ * store to Y (r2=1) while it does not observe CPU0's store to X (r3=0),
+ * despite the presence of the smp_rmb().
  *
  * Since most load-store architectures implement ACQUIRE with an smp_mb() after
  * the LL/SC loop, they need no further barriers. Similarly all our TSO
-- 
2.7.4