Date:	Wed, 11 Jun 2014 17:22:28 -0400
From:	"Long, Wai Man" <waiman.long@...com>
To:	Peter Zijlstra <peterz@...radead.org>
CC:	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>,
	"H. Peter Anvin" <hpa@...or.com>, linux-arch@...r.kernel.org,
	x86@...nel.org, linux-kernel@...r.kernel.org,
	virtualization@...ts.linux-foundation.org,
	xen-devel@...ts.xenproject.org, kvm@...r.kernel.org,
	Paolo Bonzini <paolo.bonzini@...il.com>,
	Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
	Boris Ostrovsky <boris.ostrovsky@...cle.com>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Rik van Riel <riel@...hat.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Raghavendra K T <raghavendra.kt@...ux.vnet.ibm.com>,
	David Vrabel <david.vrabel@...rix.com>,
	Oleg Nesterov <oleg@...hat.com>,
	Gleb Natapov <gleb@...hat.com>,
	Scott J Norton <scott.norton@...com>,
	Chegu Vinod <chegu_vinod@...com>
Subject: Re: [PATCH v11 06/16] qspinlock: prolong the stay in the pending
 bit path


On 6/11/2014 6:26 AM, Peter Zijlstra wrote:
> On Fri, May 30, 2014 at 11:43:52AM -0400, Waiman Long wrote:
>> ---
>>   kernel/locking/qspinlock.c |   18 ++++++++++++++++--
>>   1 files changed, 16 insertions(+), 2 deletions(-)
>>
>> diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
>> index fc7fd8c..7f10758 100644
>> --- a/kernel/locking/qspinlock.c
>> +++ b/kernel/locking/qspinlock.c
>> @@ -233,11 +233,25 @@ void queue_spin_lock_slowpath(struct qspinlock *lock, u32 val)
>>   	 */
>>   	for (;;) {
>>   		/*
>> -		 * If we observe any contention; queue.
>> +		 * If we observe that the queue is not empty or both
>> +		 * the pending and lock bits are set, queue
>>   		 */
>> -		if (val & ~_Q_LOCKED_MASK)
>> +		if ((val & _Q_TAIL_MASK) ||
>> +		    (val == (_Q_LOCKED_VAL|_Q_PENDING_VAL)))
>>   			goto queue;
>>   
>> +		if (val == _Q_PENDING_VAL) {
>> +			/*
>> +			 * Pending bit is set, but not the lock bit.
>> +			 * Assuming that the pending bit holder is going to
>> +			 * set the lock bit and clear the pending bit soon,
>> +			 * it is better to wait than to exit at this point.
>> +			 */
>> +			cpu_relax();
>> +			val = atomic_read(&lock->val);
>> +			continue;
>> +		}
>> +
>>   		new = _Q_LOCKED_VAL;
>>   		if (val == new)
>>   			new |= _Q_PENDING_VAL;
>
> So, again, you just posted a new version without replying to the
> previous discussion; so let me try again, what's wrong with the proposal
> here:
>
>    lkml.kernel.org/r/20140417163640.GT11096@...ns.programming.kicks-ass.net
>
>

I thought I had answered you before; maybe the message was lost or my answer was
incomplete. Anyway, I will try to respond to your question again here.

> Wouldn't something like:
>
>	while (atomic_read(&lock->val) == _Q_PENDING_VAL)
>		cpu_relax();
>
> before the cmpxchg loop have gotten you all this?

That is not exactly the same. The wait loop will exit as soon as any other bit is set or
the pending bit is cleared. In that case, we still need the same check at the beginning
of the for loop in order to avoid doing an extra cmpxchg that is not necessary.


> I just tried this on my code and I cannot see a difference.

As I said before, I did see a difference with that change. I think it
depends on the CPU chip used for testing. I ran my test on a 10-core
Westmere-EX chip, running my microbenchmark on different pairs of cores
within the same chip. With the patch, the results varied from 779.5ms up
to 1192ms. Without it, the lowest value I can get is still close to
800ms, but the highest can be up to 1800ms or so. So I believe it is
just a matter of timing that you did not observe on your test machine.

-Longman

