[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20160520210618.GK3193@twins.programming.kicks-ass.net>
Date: Fri, 20 May 2016 23:06:18 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Boqun Feng <boqun.feng@...il.com>,
Davidlohr Bueso <dave@...olabs.net>,
Manfred Spraul <manfred@...orfullife.com>,
Waiman Long <Waiman.Long@....com>,
Ingo Molnar <mingo@...nel.org>, ggherdovich@...e.com,
Mel Gorman <mgorman@...hsingularity.net>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Paul McKenney <paulmck@...ux.vnet.ibm.com>,
Will Deacon <will.deacon@....com>
Subject: Re: sem_lock() vs qspinlocks
On Fri, May 20, 2016 at 10:00:45AM -0700, Linus Torvalds wrote:
> On Fri, May 20, 2016 at 9:04 AM, Peter Zijlstra <peterz@...radead.org> wrote:
> > + * queued_spin_lock_slowpath() can ACQUIRE the lock before
> > + * issuing the unordered store that sets _Q_LOCKED_VAL.
>
> Ugh. This was my least favorite part of the queued locks, and I really
> liked the completely unambiguous semantics of the x86 atomics-based
> versions we used to have.
>
> But I guess we're stuck with it.
Yeah, I wasn't too happy either when I realized it today. We _could_
make these atomic ops, but then we're just making things slower for this
one weird case :/
> That said, it strikes me that almost all of the users of
> "spin_is_locked()" are using it for verification purposes (not locking
> correctness),
Right; although I feel those people should be using
lockdep_assert_held() for this instead. That not only compiles out when
!LOCKDEP but also asserts the current task/cpu is the lock owner, not
someone else.
> and that the people who are playing games with locking
> correctness are few and already have to play *other* games anyway.
>
> See for example "ipc_smp_acquire__after_spin_is_unlocked()", which has
> a big comment atop of it that now becomes nonsensical with this patch.
Not quite; we still need that I think.
We now have:
spin_lock(A);
smp_mb();
while (!spin_is_locked(B))
cpu_relax();
smp_rmb();
And that control dependency together with the rmb form a load-acquire on
the unlocked B, which matches the release of the spin_unlock(B) and
ensures we observe the whole previous critical section we waited for.
The new smp_mb() doesn't help with that.
> Now, I'd take Peter's patch as-is, because I don't think any of this
> matters from a *performance* standpoint, and Peter's patch is much
> smaller and simpler.
I would suggest you do this and also mark it for stable v4.2 and later.
> But the reason I think it might be a good thing to introduce those
> spin_lock_synchronize() and splin_lock_acquire_after_unlock() concepts
> would be to make it very very clear what those subtle implementations
> in mutexes and the multi-level locks in the ipc layer are doing and
> what they rely on.
We can always do the fancy stuff on top, but that isn't going to need
backporting to all stable trees, this is.
I'll think a little more on the explicit document vs simple thing.
Powered by blists - more mailing lists