Message-Id: <1386638883-25379-5-git-send-email-paulmck@linux.vnet.ibm.com>
Date: Mon, 9 Dec 2013 17:28:01 -0800
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To: linux-kernel@...r.kernel.org
Cc: mingo@...nel.org, laijs@...fujitsu.com, dipankar@...ibm.com,
akpm@...ux-foundation.org, mathieu.desnoyers@...icios.com,
josh@...htriplett.org, niv@...ibm.com, tglx@...utronix.de,
peterz@...radead.org, rostedt@...dmis.org, dhowells@...hat.com,
edumazet@...gle.com, darren@...art.com, fweisbec@...il.com,
sbw@....edu, "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Ingo Molnar <mingo@...hat.com>,
Oleg Nesterov <oleg@...hat.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Will Deacon <will.deacon@....com>,
Tim Chen <tim.c.chen@...ux.intel.com>,
Waiman Long <waiman.long@...com>,
Andrea Arcangeli <aarcange@...hat.com>,
Andi Kleen <andi@...stfloor.org>,
Michel Lespinasse <walken@...gle.com>,
Davidlohr Bueso <davidlohr.bueso@...com>,
Rik van Riel <riel@...hat.com>,
Peter Hurley <peter@...leysoftware.com>,
"H. Peter Anvin" <hpa@...or.com>, Arnd Bergmann <arnd@...db.de>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>
Subject: [PATCH v5 tip/core/locking 5/7] Documentation/memory-barriers.txt: Downgrade UNLOCK+LOCK
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Historically, an UNLOCK+LOCK pair executed by one CPU, by one task,
or on a given lock variable has implied a full memory barrier. In a
recent LKML thread (http://www.spinics.net/lists/linux-mm/msg65653.html),
the wisdom of this historical approach was called into question, in part
due to the memory-ordering complexities of low-handoff-overhead queued
locks on x86 systems.
This patch therefore removes this guarantee from the documentation, and
further documents how to restore it via a new smp_mb__after_unlock_lock()
primitive.
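
For example, a sketch (illustrative only, not part of this patch; the
lock and the variables A and B are hypothetical) of restoring the full
barrier across an UNLOCK+LOCK pair on the same lock:

	/* Sketch only: mylock, A, and B are hypothetical. */
	spin_lock(&mylock);
	ACCESS_ONCE(*A) = a;
	spin_unlock(&mylock);		/* UNLOCK */
	spin_lock(&mylock);		/* LOCK */
	smp_mb__after_unlock_lock();	/* full barrier: same lock variable */
	ACCESS_ONCE(*B) = b;
	spin_unlock(&mylock);

Without the smp_mb__after_unlock_lock(), another CPU might observe the
store to B before the store to A; with it, all CPUs observe STORE *A,
UNLOCK, LOCK, STORE *B in that order.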
Signed-off-by: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@...hat.com>
Cc: Peter Zijlstra <peterz@...radead.org>
Cc: Oleg Nesterov <oleg@...hat.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Will Deacon <will.deacon@....com>
Cc: Tim Chen <tim.c.chen@...ux.intel.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>
Cc: Thomas Gleixner <tglx@...utronix.de>
Cc: Waiman Long <waiman.long@...com>
Cc: Andrea Arcangeli <aarcange@...hat.com>
Cc: Andi Kleen <andi@...stfloor.org>
Cc: Michel Lespinasse <walken@...gle.com>
Cc: Davidlohr Bueso <davidlohr.bueso@...com>
Cc: Rik van Riel <riel@...hat.com>
Cc: Peter Hurley <peter@...leysoftware.com>
Cc: "H. Peter Anvin" <hpa@...or.com>
Cc: Arnd Bergmann <arnd@...db.de>
Cc: Benjamin Herrenschmidt <benh@...nel.crashing.org>
---
Documentation/memory-barriers.txt | 51 +++++++++++++++++++++++++++++++++------
1 file changed, 44 insertions(+), 7 deletions(-)
diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index a0763db314ff..efb791d33e5a 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -1626,7 +1626,10 @@ for each construct. These operations all imply certain barriers:
operation has completed.
Memory operations issued before the LOCK may be completed after the LOCK
- operation has completed.
+ operation has completed. An smp_mb__before_spinlock(), combined
+ with a following LOCK, acts as an smp_wmb(). Note the "w":
+ this is smp_wmb(), not smp_mb(). The smp_mb__before_spinlock()
+ primitive is free on many architectures.
(2) UNLOCK operation implication:
@@ -1646,9 +1649,6 @@ for each construct. These operations all imply certain barriers:
All LOCK operations issued before an UNLOCK operation will be completed
before the UNLOCK operation.
- All UNLOCK operations issued before a LOCK operation will be completed
- before the LOCK operation.
-
(5) Failed conditional LOCK implication:
Certain variants of the LOCK operation may fail, either due to being
@@ -1656,9 +1656,6 @@ for each construct. These operations all imply certain barriers:
signal whilst asleep waiting for the lock to become available. Failed
locks do not imply any sort of barrier.
-Therefore, from (1), (2) and (4) an UNLOCK followed by an unconditional LOCK is
-equivalent to a full barrier, but a LOCK followed by an UNLOCK is not.
-
[!] Note: one of the consequences of LOCKs and UNLOCKs being only one-way
barriers is that the effects of instructions outside of a critical section
may seep into the inside of the critical section.
@@ -1677,6 +1674,40 @@ may occur as:
LOCK, STORE *B, STORE *A, UNLOCK
+An UNLOCK followed by a LOCK may -not- be assumed to be a full memory
+barrier because it is possible for a preceding UNLOCK to pass a later LOCK
+from the viewpoint of the CPU, but not from the viewpoint of the compiler.
+Note that deadlocks cannot be introduced by this interchange because if
+such a deadlock threatened, the UNLOCK would simply complete. If it is
+necessary for an UNLOCK-LOCK pair to produce a full barrier, the LOCK
+can be followed by an smp_mb__after_unlock_lock() invocation. This will
+produce a full barrier if either (a) the UNLOCK and the LOCK are executed
+by the same CPU or task, or (b) the UNLOCK and LOCK act on the same
+lock variable. The smp_mb__after_unlock_lock() primitive is free on
+many architectures. Without smp_mb__after_unlock_lock(), the UNLOCK
+and LOCK can cross:
+
+ *A = a;
+ UNLOCK
+ LOCK
+ *B = b;
+
+may occur as:
+
+ LOCK, STORE *B, STORE *A, UNLOCK
+
+With smp_mb__after_unlock_lock(), they cannot, so that:
+
+ *A = a;
+ UNLOCK
+ LOCK
+ smp_mb__after_unlock_lock();
+ *B = b;
+
+will always occur as:
+
+ STORE *A, UNLOCK, LOCK, STORE *B
+
Locks and semaphores may not provide any guarantee of ordering on UP compiled
systems, and so cannot be counted on in such a situation to actually achieve
anything at all - especially with respect to I/O accesses - unless combined
@@ -1903,6 +1934,7 @@ However, if the following occurs:
 	UNLOCK M [1]
 	ACCESS_ONCE(*D) = d;		ACCESS_ONCE(*E) = e;
 					LOCK M [2]
+					smp_mb__after_unlock_lock();
 					ACCESS_ONCE(*F) = f;
 					ACCESS_ONCE(*G) = g;
 					UNLOCK M [2]
@@ -1920,6 +1952,11 @@ But assuming CPU 1 gets the lock first, CPU 3 won't see any of:
*F, *G or *H preceding LOCK M [2]
*A, *B, *C, *E, *F or *G following UNLOCK M [2]
+Note that the smp_mb__after_unlock_lock() is critically important
+here: without it, CPU 3 might well see some of the above orderings,
+because the accesses are not guaranteed to be seen in order unless
+CPU 3 itself holds lock M.
+
LOCKS VS I/O ACCESSES
---------------------
--
1.8.1.5
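
As an aside for readers, a sketch (illustrative only, not part of the
patch; the lock and variables are hypothetical) of how the
smp_mb__before_spinlock() mentioned in the first hunk might be used:

	/* Sketch only: mylock, X, and Y are hypothetical. */
	*X = x;				/* store before the LOCK */
	smp_mb__before_spinlock();	/* with the LOCK below, acts as smp_wmb() */
	spin_lock(&mylock);		/* LOCK */
	*Y = y;				/* store inside the critical section */
	spin_unlock(&mylock);

Because the pair acts as smp_wmb() rather than smp_mb(), other CPUs
will not see the store to Y before the store to X, but no ordering of
loads is implied -- hence the "w".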
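
Similarly, a sketch (again with hypothetical names) of CPU 2's side of
the lock M handoff shown in the final hunk:

	/* Sketch only: M and *E through *H are hypothetical; CPU 2's code. */
	ACCESS_ONCE(*E) = e;
	spin_lock(&M);			/* LOCK M [2] */
	smp_mb__after_unlock_lock();	/* pairs with CPU 1's UNLOCK M [1] */
	ACCESS_ONCE(*F) = f;
	ACCESS_ONCE(*G) = g;
	spin_unlock(&M);		/* UNLOCK M [2] */
	ACCESS_ONCE(*H) = h;

The barrier is what allows CPU 3, which never acquires M, to see CPU
1's critical section ordered before the stores to F and G; without it,
that ordering is guaranteed only to a CPU that itself holds M.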