Message-ID: <55B66ACB.6010702@hp.com>
Date: Mon, 27 Jul 2015 13:30:51 -0400
From: Waiman Long <waiman.long@...com>
To: Davidlohr Bueso <dave@...olabs.net>
CC: Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
"H. Peter Anvin" <hpa@...or.com>, x86@...nel.org,
linux-kernel@...r.kernel.org, Scott J Norton <scott.norton@...com>,
Douglas Hatch <doug.hatch@...com>
Subject: Re: [PATCH v3 2/7] locking/pvqspinlock: Add pending bit support
On 07/26/2015 08:56 PM, Davidlohr Bueso wrote:
> On Wed, 2015-07-22 at 16:12 -0400, Waiman Long wrote:
>> Like the native qspinlock, using the pending bit to acquire the lock
>> when it is lightly loaded is faster than going through the PV queuing
>> process, which is even slower than the native queuing process. It also
>> avoids loading two additional cachelines (the MCS and PV nodes).
>>
>> This patch adds pending bit support for the PV qspinlock. The pending
>> bit code has a smaller spin threshold (1<<10). It will fall back to
>> the queuing method if it cannot acquire the lock within a certain
>> time limit.
> Can we infer that this new spin threshold is the metric to detect these
> "light loads"? If so, I cannot help but wonder if there is some more
> straightforward/ad-hoc way of detecting this, ie some pv_<> function.
> That would also save a lot of time as it would not be time based.
> Although it might be a more costly call altogether, I dunno.
I used the term "light load" to refer to the condition that at most 2
competing threads are trying to acquire the lock. In that case, the
pending code will be used. Once there are 3 or more competing threads,
it will switch back to the regular queuing code. This is the same
mechanism as in the native code; the only difference is the addition of
a loop counter to make sure that a thread won't spend too much time
spinning.
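
To put numbers on it, the lock word is laid out roughly like this in
the common (NR_CPUS < 16K) configuration - take it as an illustration,
not the patch itself:

	/*
	 * Illustrative qspinlock value layout:
	 *
	 *  bits  0- 7: locked byte - owned by the current lock holder
	 *  bits  8-15: pending     - 1-slot "queue" for the 2nd contender
	 *  bits 16-31: tail        - CPU/index of the last queued waiter
	 *
	 * One holder plus one pending spinner covers two competing
	 * threads without touching the MCS/PV node cachelines; a third
	 * contender sees pending/tail set and must queue.
	 */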
> Some comments about this 'loop' threshold.
>
>> +static int pv_pending_lock(struct qspinlock *lock, u32 val)
>> +{
>> +	int loop = PENDING_SPIN_THRESHOLD;
>> +	u32 new, old;
>> +
>> +	/*
>> +	 * wait for in-progress pending->locked hand-overs
>> +	 */
>> +	if (val == _Q_PENDING_VAL) {
>> +		while (((val = atomic_read(&lock->val)) == _Q_PENDING_VAL) &&
>> +			loop--)
>> +			cpu_relax();
>> +	}
>> +
>> +	/*
>> +	 * trylock || pending
>> +	 */
>> +	for (;;) {
>> +		if (val & ~_Q_LOCKED_MASK)
>> +			goto queue;
>> +		new = _Q_LOCKED_VAL;
>> +		if (val == new)
>> +			new |= _Q_PENDING_VAL;
>> +		old = atomic_cmpxchg(&lock->val, val, new);
>> +		if (old == val)
>> +			break;
>> +		if (loop-- <= 0)
>> +			goto queue;
>> +	}
> So I'm not clear about the semantics of what (should) occur when the
> threshold is exhausted. In the trylock/pending loop above, you
> immediately return 0, indicating we want to queue. OK, but below:
This is in the lock slowpath, so it can't return a lock failure.
Returning 0 just tells the caller to fall back to the regular queuing
code, which keeps going until the lock is acquired.
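
In other words, the call site looks roughly like this (a sketch of the
intent, not the exact patch code):

	/*
	 * A zero return from pv_pending_lock() means "fall through to
	 * the MCS queuing code", never "lock acquisition failed".
	 */
	if (pv_pending_lock(lock, val))
		return;		/* lock taken via trylock or pending bit */

	/* ... continue with the regular PV MCS queuing code ... */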
>> +
>> +	if (new == _Q_LOCKED_VAL)
>> +		goto gotlock;
>> +	/*
>> +	 * We are pending, wait for the owner to go away.
>> +	 */
>> +	while (((val = smp_load_acquire(&lock->val.counter)) & _Q_LOCKED_MASK)
>> +		&& (loop-- > 0))
>> +		cpu_relax();
>> +
>> +	if (!(val & _Q_LOCKED_MASK)) {
>> +		clear_pending_set_locked(lock);
>> +		goto gotlock;
>> +	}
>> +	/*
>> +	 * Clear the pending bit and fall back to queuing
>> +	 */
>> +	clear_pending(lock);
> ... you call clear_pending before returning. Is this intentional? Smells
> fishy.
The pending bit acts as a 1-slot waiting queue. So if the vCPU has to
fall back to regular queuing, it must clear the bit first; otherwise
the slot would stay occupied and no later contender could ever take
the pending path.
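
To be concrete, clearing here only drops the pending bit and leaves the
locked byte and the tail untouched, e.g. (a sketch; the exact form
depends on the pending-bit width):

	static __always_inline void clear_pending(struct qspinlock *lock)
	{
		/*
		 * Drop just the pending bit so the 1-slot queue becomes
		 * available to the next contender; the locked byte and
		 * the tail are left alone. (We know the bit is set, so
		 * subtracting it is equivalent to clearing it.)
		 */
		atomic_add(-_Q_PENDING_VAL, &lock->val);
	}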
>
> And basically afaict all this chunk of code does is spin until loop is
> exhausted, and break out when we get the lock. Ie, something like this
> is a lot cleaner:
>
> 	while (loop--) {
> 		/*
> 		 * We are pending, wait for the owner to go away.
> 		 */
> 		val = smp_load_acquire(&lock->val.counter);
> 		if (!(val & _Q_LOCKED_MASK)) {
> 			clear_pending_set_locked(lock);
> 			goto gotlock;
> 		}
>
> 		cpu_relax();
> 	}
>
> 	/*
> 	 * Clear the pending bit and fall back to queuing
> 	 */
> 	clear_pending(lock);
>
Yes, we could change the loop to that. I was just following the same
logic as in the native code.
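
For reference, the native pending wait in kernel/locking/qspinlock.c is
roughly the following, just without the bound (quoting from memory):

	/*
	 * we're pending, wait for the owner to go away.
	 *
	 * *,1,1 -> *,1,0
	 */
	while ((val = smp_load_acquire(&lock->val.counter)) & _Q_LOCKED_MASK)
		cpu_relax();

	/*
	 * take ownership and clear the pending bit.
	 *
	 * *,1,0 -> *,0,1
	 */
	clear_pending_set_locked(lock);

The PV version only adds the loop counter so that a pending vCPU gives
up and queues (where it can be halted) instead of spinning forever on a
preempted lock holder.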
Cheers,
Longman