netdev - Re: [PATCH v2 0/9] Remove spin_unlock

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <20170706152110.GZ2393@linux.vnet.ibm.com>
Date:   Thu, 6 Jul 2017 08:21:10 -0700
From:   "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:     David Laight <David.Laight@...LAB.COM>
Cc:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "netfilter-devel@...r.kernel.org" <netfilter-devel@...r.kernel.org>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "oleg@...hat.com" <oleg@...hat.com>,
        "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
        "mingo@...hat.com" <mingo@...hat.com>,
        "dave@...olabs.net" <dave@...olabs.net>,
        "manfred@...orfullife.com" <manfred@...orfullife.com>,
        "tj@...nel.org" <tj@...nel.org>, "arnd@...db.de" <arnd@...db.de>,
        "linux-arch@...r.kernel.org" <linux-arch@...r.kernel.org>,
        "will.deacon@....com" <will.deacon@....com>,
        "peterz@...radead.org" <peterz@...radead.org>,
        "stern@...land.harvard.edu" <stern@...land.harvard.edu>,
        "parri.andrea@...il.com" <parri.andrea@...il.com>,
        "torvalds@...ux-foundation.org" <torvalds@...ux-foundation.org>
Subject: Re: [PATCH v2 0/9] Remove spin_unlock_wait()

On Thu, Jul 06, 2017 at 02:12:24PM +0000, David Laight wrote:
> From: Paul E. McKenney
> > Sent: 06 July 2017 00:30
> > There is no agreed-upon definition of spin_unlock_wait()'s semantics,
> > and it appears that all callers could do just as well with a lock/unlock
> > pair.  This series therefore removes spin_unlock_wait() and changes
> > its users to instead use a lock/unlock pair.  The commits are as follows,
> > in three groups:
> > 
> > 1-7.	Change uses of spin_unlock_wait() and raw_spin_unlock_wait()
> > 	to instead use a spin_lock/spin_unlock pair.  These may be
> > 	applied in any order, but must be applied before any later
> > 	commits in this series.  The commit logs state why I believe
> > 	that these commits won't noticeably degrade performance.
> 
> I can't help feeling that it might be better to have a spin_lock_sync()
> call that is equivalent to a spin_lock/spin_unlock pair.
> The default implementation being an inline function that does exactly that.
> This would let an architecture implement a more efficient version.
> 
> It might even be that this is the defined semantics of spin_unlock_wait().

That was in fact my first proposal, see the comment header in current
mainline for spin_unlock_wait() in include/linux/spinlock.h.  But Linus
quite rightly pointed out that if spin_unlock_wait() was to be defined in
this way, we should get rid of spin_unlock_wait() entirely, especially
given that there are not very many calls to spin_unlock_wait() and
also given that none of them are particularly performance critical.
Hence the current patch set, which does just that.

> Note that it can only be useful to do a spin_lock/unlock pair if it is
> impossible for another code path to try to acquire the lock.
> (Or, at least, the code can't care if the lock is acquired just after.)

Indeed!  As Oleg Nesterov pointed out, a spin_lock()/spin_unlock() pair
is sort of like synchronize_rcu() for a given lock, where that lock's
critical sections play the role of RCU read-side critical sections.
So anything before the pair is visible to later critical sections, and
anything in prior critical sections is visible to anything after the pair.
But again, as Linus pointed out, if we are going to have these semantics,
just do spin_lock() immediately followed by spin_unlock().

> So if it can de determined that the lock isn't held (a READ_ONCE()
> might be enough) the lock itself need not be acquired (with the
> associated slow bus cycles).

If you want the full semantics of a spin_lock()/spin_unlock() pair,
you need a full memory barrier before the READ_ONCE(), even on x86.
Without that memory barrier, you don't get the effect of the release
implicit in spin_unlock().

For weaker architectures, such as PowerPC and ARM, a READ_ONCE()
does -not- suffice, not at all, even with smp_mb() before and after.
I encourage you to take a look at arch_spin_unlock_wait() in arm64
and powerpc if youi are interested.  There were also some lengthy LKML
threads discussing this about 18 months ago that could be illuminating.

And yes, there are architecture-specific optimizations for an
empty spin_lock()/spin_unlock() critical section, and the current
arch_spin_unlock_wait() implementations show some of these optimizations.
But I expect that performance benefits would need to be demonstrated at
the system level.

							Thanx, Paul