Date:	Fri, 11 Dec 2015 13:26:47 +0100
From:	Peter Zijlstra <peterz@...radead.org>
To:	Will Deacon <will.deacon@....com>
Cc:	Andrew Pinski <andrew.pinski@...iumnetworks.com>,
	Davidlohr Bueso <dbueso@...e.de>,
	Thomas Gleixner <tglx@...utronix.de>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Ingo Molnar <mingo@...nel.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	"linux-arm-kernel@...ts.infradead.org" 
	<linux-arm-kernel@...ts.infradead.org>, david.daney@...ium.com
Subject: Re: FW: Commit 81a43adae3b9 (locking/mutex: Use acquire/release
 semantics) causing failures on arm64 (ThunderX)

On Fri, Dec 11, 2015 at 12:18:00PM +0000, Will Deacon wrote:
> On Fri, Dec 11, 2015 at 01:13:19PM +0100, Peter Zijlstra wrote:
> > On Fri, Dec 11, 2015 at 12:04:19PM +0000, Will Deacon wrote:
> > > I think Andrew meant the atomic_xchg_acquire at the start of osq_lock,
> > > as opposed to "compare and swap". In which case, it does look like
> > > there's a bug here because there is nothing to order the initialisation
> > > of the node fields with publishing of the node, whether that's
> > > indirectly as a result of setting the tail to the current CPU or
> > > directly as a result of the WRITE_ONCE.
> > 
> > Agreed, this does indeed look like a bug. If confirmed please write a
> > shiny changelog and I'll queue asap.
> 
> Yup. I've failed to reproduce the issue locally, so we'll need to wait
> for Andrew and/or David to get back to us first.
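
To make the suspected problem concrete, here is a minimal userspace sketch in
C11 atomics. It is not the kernel code: struct qnode, tail and publish_sketch()
are hypothetical names, and the tail is a plain pointer here rather than an
encoded CPU number. With an ACQUIRE-only exchange, nothing orders the plain
stores that initialise the node before the exchange that publishes it. One
possible way to get that ordering (not necessarily the fix that ends up
queued) is to give the exchange a RELEASE half as well:

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-CPU queue node. */
struct qnode {
    struct qnode *next;
    int locked;
    int cpu;
};

/* Hypothetical queue tail; NULL means the queue is empty. */
static _Atomic(struct qnode *) tail;

/*
 * Initialise the node, then publish it as the new tail.
 * Returns the previous tail (NULL if the queue was empty).
 */
struct qnode *publish_sketch(struct qnode *node, int cpu)
{
    /* Plain stores: these must be visible before the node is reachable. */
    node->locked = 0;
    node->next = NULL;
    node->cpu = cpu;

    /*
     * memory_order_acquire alone would leave the stores above unordered
     * against this exchange. The release half of acq_rel orders them
     * before publication; the acquire half keeps later accesses after it.
     */
    return atomic_exchange_explicit(&tail, node, memory_order_acq_rel);
}
```

The first caller gets NULL back (empty queue); a second caller gets the first
caller's node back and can rely on its fields having been initialised.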

While we're there, the acquire in osq_wait_next() seems somewhat ill
documented too.

I _think_ we need ACQUIRE semantics there because we want to strictly
order the lock-unqueue A,B,C steps and we get that with:

 A: SC
 B: ACQ
 C: Relaxed
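
The A, B, C mapping above can be sketched with C11 atomics as follows. This is
a simplified userspace illustration under stated assumptions, not the kernel
implementation: struct node and unqueue_sketch() are hypothetical names, the
retry loops of the real unqueue path are omitted, and the NULL case in step B
simply returns where the real code would keep waiting or fix up the tail.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the queue node. */
struct node {
    _Atomic(struct node *) next;
    _Atomic(struct node *) prev;
};

/*
 * Unlink `node`, which currently sits right after `prev`.
 * Returns 0 on success, -1 if step A finds that prev->next
 * no longer points at node.
 */
int unqueue_sketch(struct node *node, struct node *prev)
{
    struct node *expected = node;
    struct node *next;

    /* Step A: undo prev->next with a fully ordered (SC) compare-and-swap. */
    if (!atomic_compare_exchange_strong_explicit(&prev->next, &expected,
                                                 (struct node *)NULL,
                                                 memory_order_seq_cst,
                                                 memory_order_seq_cst))
        return -1;

    /* Step B: take our own next pointer with ACQUIRE semantics. */
    next = atomic_exchange_explicit(&node->next, (struct node *)NULL,
                                    memory_order_acquire);
    if (next == NULL)
        return 0; /* simplification: no successor to relink */

    /* Step C: relaxed stores, ordered after step B by its ACQUIRE. */
    atomic_store_explicit(&next->prev, prev, memory_order_relaxed);
    atomic_store_explicit(&prev->next, next, memory_order_relaxed);
    return 0;
}
```

The ACQUIRE in step B is what keeps the relaxed stores of step C from being
reordered before it, while the SC operation in step A orders A before B.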

Similarly for unlock we want the WRITE_ONCE to happen after
osq_wait_next, but in that case we can even rely on the control
dependency there.


As noted in a previous email, the ACQUIRE for osq_wait_next() does not
come from its use in lock, since it's on the fail path, and trylock
failure doesn't imply any barriers.

Nor should it have RELEASE semantics for its use in unlock, since we
already have that covered by the xchg() done prior.