linux-kernel - Re: [PATCH v5 tip/core/locking 5/7] Documentation/memory-barriers.txt: Downgrade UNLOCK+LOCK

Message-ID: <20131210171247.GQ4208@linux.vnet.ibm.com>
Date:	Tue, 10 Dec 2013 09:12:47 -0800
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	linux-kernel@...r.kernel.org, mingo@...nel.org,
	laijs@...fujitsu.com, dipankar@...ibm.com,
	akpm@...ux-foundation.org, mathieu.desnoyers@...icios.com,
	josh@...htriplett.org, niv@...ibm.com, tglx@...utronix.de,
	rostedt@...dmis.org, dhowells@...hat.com, edumazet@...gle.com,
	darren@...art.com, fweisbec@...il.com, sbw@....edu,
	Ingo Molnar <mingo@...hat.com>,
	Oleg Nesterov <oleg@...hat.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Will Deacon <will.deacon@....com>,
	Tim Chen <tim.c.chen@...ux.intel.com>,
	Waiman Long <waiman.long@...com>,
	Andrea Arcangeli <aarcange@...hat.com>,
	Andi Kleen <andi@...stfloor.org>,
	Michel Lespinasse <walken@...gle.com>,
	Davidlohr Bueso <davidlohr.bueso@...com>,
	Rik van Riel <riel@...hat.com>,
	Peter Hurley <peter@...leysoftware.com>,
	"H. Peter Anvin" <hpa@...or.com>, Arnd Bergmann <arnd@...db.de>,
	Benjamin Herrenschmidt <benh@...nel.crashing.org>
Subject: Re: [PATCH v5 tip/core/locking 5/7]
 Documentation/memory-barriers.txt: Downgrade UNLOCK+LOCK

On Tue, Dec 10, 2013 at 02:14:22PM +0100, Peter Zijlstra wrote:
> On Mon, Dec 09, 2013 at 05:28:01PM -0800, Paul E. McKenney wrote:
> > +An UNLOCK followed by a LOCK may -not- be assumed to be a full memory
> > +barrier because it is possible for a preceding UNLOCK to pass a later LOCK
> > +from the viewpoint of the CPU, but not from the viewpoint of the compiler.
> > +Note that deadlocks cannot be introduced by this interchange because if
> > +such a deadlock threatened, the UNLOCK would simply complete.
> 
> For me it's easier to read if we start a new paragraph here.

Works for me!

> > If it is
> > +necessary for an UNLOCK-LOCK pair to produce a full barrier, the LOCK
> > +can be followed by an smp_mb__after_unlock_lock() invocation.  This will
> > +produce a full barrier if either (a) the UNLOCK and the LOCK are executed
> > +by the same CPU or task, or (b) the UNLOCK and LOCK act on the same
> > +lock variable.  The smp_mb__after_unlock_lock() primitive is free on
> > +many architectures. 
> 
> The way I read the above it says that you need
> smp_mb__after_unlock_lock() when the UNLOCK and LOCK are on the same
> variable. That doesn't make sense, I thought that was the one case we
> all agreed on it would indeed be a full barrier without extra trickery.

On x86, sure, but smp_mb__after_unlock_lock() is nothingness on x86
anyway.  Other architectures might benefit from requiring that the
smp_mb__after_unlock_lock() be used in this case.

> So I would expect something like:
> 
> "If it is necessary for an UNLOCK-LOCK pair to produce a full barrier,
> you must either ensure they operate on the same lock variable, or place
> smp_mb__after_unlock_lock() after the LOCK."

I respectfully disagree.  Requiring the smp_mb__after_unlock_lock()
isn't going to hurt anything and making the full-barrier assumption
explicit will make it a lot easier to inspect this stuff.

> > Without smp_mb__after_unlock_lock(), the UNLOCK
> > +and LOCK can cross:
> > +
> > +	*A = a;
> > +	UNLOCK
> > +	LOCK
> > +	*B = b;
> > +
> > +may occur as:
> > +
> > +	LOCK, STORE *B, STORE *A, UNLOCK
> > +
> > +With smp_mb__after_unlock_lock(), they cannot, so that:
> > +
> > +	*A = a;
> > +	UNLOCK
> > +	LOCK
> > +	smp_mb__after_unlock_lock();
> > +	*B = b;
> > +
> > +will always occur as:
> > +
> > +	STORE *A, UNLOCK, LOCK, STORE *B
> > +
> 
> Since we introduced the concept of lock variables -- since it now
> matters if the UNLOCK and LOCK act on the same one or not, we should
> reflect that in the above examples (and maybe throughout the document).
> 
> That is; we should clarify:
> 
>   *A = a
>   UNLOCK x
>   LOCK y
>   *B = b
> 
> Being different from:
> 
>   *A = a
>   UNLOCK x
>   LOCK x
>   *B = b
> 
> I also find the wording slightly weird in that LOCK and UNLOCK are
> stopped from crossing by smp_mb__after_unlock_lock(). They are not; what
> it stops is *B = b from moving up and the rest from moving down. The
> UNLOCK and LOCK can still cross -- they happened before we issued the
> barrier after all.

Good point -- the UNLOCK and LOCK are guaranteed to be ordered only
if they both operate on the same lock variable.  OK, I will make the
example use different lock variables and show the different outcomes.
How about the following?

	If it is necessary for an UNLOCK-LOCK pair to
	produce a full barrier, the LOCK can be followed by an
	smp_mb__after_unlock_lock() invocation.  This will produce a
	full barrier if either (a) the UNLOCK and the LOCK are executed
	by the same CPU or task, or (b) the UNLOCK and LOCK act on the
	same lock variable.  The smp_mb__after_unlock_lock() primitive is
	free on many architectures.  Without smp_mb__after_unlock_lock(),
	the UNLOCK and LOCK can cross:

		*A = a;
		UNLOCK M
		LOCK N
		*B = b;

	could occur as:

		LOCK N, STORE *B, STORE *A, UNLOCK M

	With smp_mb__after_unlock_lock(), the two stores cannot be
	reordered across the UNLOCK-LOCK pair, so that:

		*A = a;
		UNLOCK M
		LOCK N
		smp_mb__after_unlock_lock();
		*B = b;

	will always occur as either of the following:

		STORE *A, UNLOCK M, LOCK N, STORE *B
		STORE *A, LOCK N, UNLOCK M, STORE *B

	If the UNLOCK and LOCK were instead both operating on the same
	lock variable, only the first of these two alternatives can occur.

							Thanx, Paul
