linux-kernel - Re: sem_lock() vs qspinlocks

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:	Fri, 20 May 2016 23:06:18 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Boqun Feng <boqun.feng@...il.com>,
	Davidlohr Bueso <dave@...olabs.net>,
	Manfred Spraul <manfred@...orfullife.com>,
	Waiman Long <Waiman.Long@....com>,
	Ingo Molnar <mingo@...nel.org>, ggherdovich@...e.com,
	Mel Gorman <mgorman@...hsingularity.net>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Paul McKenney <paulmck@...ux.vnet.ibm.com>,
	Will Deacon <will.deacon@....com>
Subject: Re: sem_lock() vs qspinlocks

On Fri, May 20, 2016 at 10:00:45AM -0700, Linus Torvalds wrote:
> On Fri, May 20, 2016 at 9:04 AM, Peter Zijlstra <peterz@...radead.org> wrote:
> > +        * queued_spin_lock_slowpath() can ACQUIRE the lock before
> > +        * issuing the unordered store that sets _Q_LOCKED_VAL.
> 
> Ugh. This was my least favorite part of the queued locks, and I really
> liked the completely unambiguous semantics of the x86 atomics-based
> versions we used to have.
> 
> But I guess we're stuck with it.

Yeah, I wasn't too happy either when I realized it today. We _could_
make these atomic ops, but then we're just making things slower for this
one weird case :/

> That said, it strikes me that almost all of the users of
> "spin_is_locked()" are using it for verification purposes (not locking
> correctness),

Right; although I feel those people should be using
lockdep_assert_held() for this instead. That not only compiles out when
!LOCKDEP but also asserts the current task/cpu is the lock owner, not
someone else.

> and that the people who are playing games with locking
> correctness are few and already have to play *other* games anyway.
> 
> See for example "ipc_smp_acquire__after_spin_is_unlocked()", which has
> a big comment atop of it that now becomes nonsensical with this patch.

Not quite; we still need that I think.

We now have:

	spin_lock(A);
	smp_mb();
	while (!spin_is_locked(B))
		cpu_relax();
	smp_rmb();

And that control dependency together with the rmb form a load-acquire on
the unlocked B, which matches the release of the spin_unlock(B) and
ensures we observe the whole previous critical section we waited for.

The new smp_mb() doesn't help with that.

> Now, I'd take Peter's patch as-is, because I don't think any of this
> matters from a *performance* standpoint, and Peter's patch is much
> smaller and simpler.

I would suggest you do this and also mark it for stable v4.2 and later.

> But the reason I think it might be a good thing to introduce those
> spin_lock_synchronize() and splin_lock_acquire_after_unlock() concepts
> would be to make it very very clear what those subtle implementations
> in mutexes and the multi-level locks in the ipc layer are doing and
> what they rely on.

We can always do the fancy stuff on top, but that isn't going to need
backporting to all stable trees, this is.

I'll think a little more on the explicit document vs simple thing.