linux-kernel - Re: [PATCH 2/5] x86,smp: proportional backoff for ticket spinlocks

Message-ID: <50ECA3AA.7000101@redhat.com>
Date:	Tue, 08 Jan 2013 17:54:34 -0500
From:	Rik van Riel <riel@...hat.com>
To:	Eric Dumazet <eric.dumazet@...il.com>
CC:	linux-kernel@...r.kernel.org, aquini@...hat.com, walken@...gle.com,
	lwoodman@...hat.com, jeremy@...p.org,
	Jan Beulich <JBeulich@...ell.com>, knoel@...hat.com,
	chegu_vinod@...com, raghavendra.kt@...ux.vnet.ibm.com,
	mingo@...hat.com
Subject: Re: [PATCH 2/5] x86,smp: proportional backoff for ticket spinlocks

On 01/08/2013 05:50 PM, Eric Dumazet wrote:
> On Tue, 2013-01-08 at 17:32 -0500, Rik van Riel wrote:
>> Subject: x86,smp: proportional backoff for ticket spinlocks
>>
>> Simple fixed value proportional backoff for ticket spinlocks.
>> By pounding on the cacheline with the spin lock less often,
>> bus traffic is reduced. In cases of a data structure with
>> embedded spinlock, the lock holder has a better chance of
>> making progress.
>>
>> If we are next in line behind the current holder of the
>> lock, we do a fast spin, so as not to waste any time when
>> the lock is released.
>>
>> The number 50 is likely to be wrong for many setups, and
>> this patch is mostly to illustrate the concept of proportional
>> backoff. The next patch automatically tunes the delay value.
>>
>> Signed-off-by: Rik van Riel <riel@...hat.com>
>> Signed-off-by: Michel Lespinasse <walken@...gle.com>
>> ---
>>   arch/x86/kernel/smp.c |   23 ++++++++++++++++++++---
>>   1 files changed, 20 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c
>> index 20da354..aa743e9 100644
>> --- a/arch/x86/kernel/smp.c
>> +++ b/arch/x86/kernel/smp.c
>> @@ -117,11 +117,28 @@ static bool smp_no_nmi_ipi = false;
>>    */
>>   void ticket_spin_lock_wait(arch_spinlock_t *lock, struct __raw_tickets inc)
>>   {
>> +	__ticket_t head = inc.head, ticket = inc.tail;
>> +	__ticket_t waiters_ahead;
>> +	unsigned loops;
>> +
>>   	for (;;) {
>> -		cpu_relax();
>> -		inc.head = ACCESS_ONCE(lock->tickets.head);
>> +		waiters_ahead = ticket - head - 1;
>> +		/*
>> +		 * We are next after the current lock holder. Check often
>> +		 * to avoid wasting time when the lock is released.
>> +		 */
>> +		if (!waiters_ahead) {
>> +			do {
>> +				cpu_relax();
>> +			} while (ACCESS_ONCE(lock->tickets.head) != ticket);
>> +			break;
>> +		}
>> +		loops = 50 * waiters_ahead;
>> +		while (loops--)
>> +			cpu_relax();
>>
>> -		if (inc.head == inc.tail)
>> +		head = ACCESS_ONCE(lock->tickets.head);
>> +		if (head == ticket)
>>   			break;
>>   	}
>>   }
>>
>
> Reviewed-by: Eric Dumazet <edumazet@...gle.com>
>
> In my tests, I used the following formula :
>
> loops = 50 * ((ticket - head) - 1/2);
>
> or :
>
> delta = ticket - head;
> loops = delay * delta - (delay >> 1);

I suppose that rounding down the delta might give more
stable results, because the waiter undersleeps less
often.
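
For comparison, here is a minimal user-space sketch of the two delay
calculations discussed above. It is not kernel code: ticket_t is a
hypothetical 16-bit stand-in for __ticket_t, the helper names
loops_patch() and loops_half() are made up for illustration, and
delay is passed in rather than hard-coded to 50. The casts back to
ticket_t keep the subtraction correct when the ticket counter wraps.

```c
#include <stdint.h>

/* Assumption: 16-bit unsigned tickets, standing in for __ticket_t. */
typedef uint16_t ticket_t;

/*
 * Delay as in the posted patch: proportional to the number of
 * waiters ahead of us, so the next-in-line waiter gets no delay.
 */
static unsigned loops_patch(ticket_t head, ticket_t ticket, unsigned delay)
{
	/* Truncate to ticket_t so the distance survives counter wraparound. */
	ticket_t waiters_ahead = (ticket_t)(ticket - head - 1);

	return delay * waiters_ahead;
}

/*
 * Delay as in the quoted formula: delay * delta - delay / 2,
 * so even the next-in-line waiter backs off for half a delay unit.
 */
static unsigned loops_half(ticket_t head, ticket_t ticket, unsigned delay)
{
	ticket_t delta = (ticket_t)(ticket - head);

	return delay * delta - (delay >> 1);
}
```

With delay = 50, a waiter directly behind the lock holder gets 0
loops from the first helper and 25 from the second; with two waiters
ahead of it, the values are 100 and 125.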

> (And I didn't use the special :
>
> 	if (!waiters_ahead) {
> 		do {
> 			cpu_relax();
> 		} while (ACCESS_ONCE(lock->tickets.head) != ticket);
> 		break;
> 	}
>
> Because it means this won't help machines with 2 cpus.
>
> (or more generally if there _is_ contention, but with
> one lock holder and one lock waiter)

Machines with 2 CPUs should not need help, because the
cpu_relax() alone gives enough of a pause that the lock
holder can make progress.

It may be interesting to try out your rounding-down of
delta, to see if that makes things better.

-- 
All rights reversed