Date:   Thu, 10 Aug 2017 13:49:34 -0700
From:   "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Boqun Feng <boqun.feng@...il.com>,
        Waiman Long <longman@...hat.com>,
        Ingo Molnar <mingo@...hat.com>, linux-kernel@...r.kernel.org,
        Pan Xinhui <xinhui@...ux.vnet.ibm.com>,
        Andrea Parri <parri.andrea@...il.com>,
        Will Deacon <will.deacon@....com>
Subject: Re: [RESEND PATCH v5] locking/pvqspinlock: Relax cmpxchg's to
 improve performance on some archs

On Thu, Aug 10, 2017 at 11:13:17AM +0200, Peter Zijlstra wrote:
> On Thu, Aug 10, 2017 at 04:12:13PM +0800, Boqun Feng wrote:
> 
> > > Or is the reason this doesn't work on PPC that it's RCpc?
> 
> So that :-)
> 
> > Here is an example why PPC needs a sync() before the cmpxchg():
> > 
> > 	https://marc.info/?l=linux-kernel&m=144485396224519&w=2
> > 
> > and Paul McKenney's detailed explanation about why this could happen:
> > 
> > 	https://marc.info/?l=linux-kernel&m=144485909826241&w=2
> > 
> > (Somehow, I feel like he was answering a question similar to the one
> > you ask here ;-))
> 
> Yes, and I had vague memories of having gone over this before, but
> couldn't quickly find things. Thanks!
> 
> > And I think aarch64 doesn't have a problem here because it is "(other)
> > multi-copy atomic". Will?
> 
> Right, it's the RCpc vs RCsc thing. The ARM64 release is, as you say,
> multi-copy atomic, whereas the PPC lwsync is not.
> 
> This still leaves us with the situation that we need an smp_mb() between
> smp_store_release() and a possibly failing cmpxchg() if we want to
> guarantee the cmpxchg()'s load comes after the store-release.

For whatever it is worth, this is why C11 allows specifying one
memory-order strength for the success case and another for the failure
case.  But it is not immediately clear that we need another level
of combinatorial API explosion...

							Thanx, Paul
