linux-kernel - Re: [PATCH 2/2] locking/qspinlock: Limit # of spins in _Q_PENDING

Message-ID: <f8650ae4-046e-edd2-3122-9ecbeef8a261@redhat.com>
Date:   Tue, 10 Apr 2018 14:53:53 -0400
From:   Waiman Long <longman@...hat.com>
To:     Will Deacon <will.deacon@....com>
Cc:     Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        linux-kernel@...r.kernel.org, boqun.feng@...il.com,
        catalin.marinas@....com,
        "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Subject: Re: [PATCH 2/2] locking/qspinlock: Limit # of spins in _Q_PENDING_VAL
 wait loop

On 04/10/2018 02:26 PM, Will Deacon wrote:
> Hi Waiman,
>
> On Mon, Apr 09, 2018 at 02:08:52PM -0400, Waiman Long wrote:
>> A locker in the pending code path spins indefinitely while waiting
>> for the _Q_PENDING_VAL to _Q_LOCK_VAL transition. There is a concern
>> that lock starvation can happen if concurrent lockers are able to
>> take the lock in the cmpxchg loop without queuing and pass it around
>> amongst themselves.
>>
>> To ensure forward progress while still taking advantage of using
>> the pending code path without queuing, the code is now modified
>> to do a limited number of spins before aborting the effort and
>> going to queue itself.
>>
>> Ideally, the spinning time should be at least a few times the typical
>> cacheline load time from memory, which I think can be as low as 100ns
>> or so per cacheline load on the newest systems and up to several
>> hundred ns on older systems.
>>
>> Signed-off-by: Waiman Long <longman@...hat.com>
>> ---
>>  kernel/locking/qspinlock.c | 19 +++++++++++++++++--
>>  1 file changed, 17 insertions(+), 2 deletions(-)
>>
>> diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
>> index 634a49b..35367cc 100644
>> --- a/kernel/locking/qspinlock.c
>> +++ b/kernel/locking/qspinlock.c
>> @@ -82,6 +82,15 @@
>>  #endif
>>  
>>  /*
>> + * The pending bit spinning loop count.
>> + * This parameter can be overridden by another architecture specific
>> + * constant. Default is 512.
>> + */
>> +#ifndef _Q_PENDING_LOOP
>> +#define _Q_PENDING_LOOP	(1 << 9)
>> +#endif
> I really dislike heuristics like this because there's never a good number
> to choose and it almost certainly varies between systems and workloads
> rather than just by architecture. However, I've also not managed to come
> up with something better.

I share your concern about heuristics like this, but I can't think of
another easy way out.

> If I rewrite your code slightly to look like:
>
> 	if (val == _Q_PENDING_VAL) {
> 		int cnt = _Q_PENDING_LOOP;
> 		val = atomic_cond_read_relaxed(&lock->val, (VAL != _Q_PENDING_VAL) || !cnt--);
> 	}
>
> then architectures that implement atomic_cond_read_relaxed as something
> more interesting than a spinning loop will probably be happy with
> _Q_PENDING_LOOP == 1.

Right. That is why I stated that _Q_PENDING_LOOP is an
architecture-specific constant.
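
To make the bounded wait concrete, here is a minimal user-space sketch of the idea using C11 atomics. It is not the kernel code: PENDING_VAL, PENDING_LOOP and bounded_pending_wait() are hypothetical stand-ins for _Q_PENDING_VAL, _Q_PENDING_LOOP and the atomic_cond_read_relaxed() call in the snippet above, and the plain polling loop is only what a simple spinning implementation might reduce to.

```c
#include <stdatomic.h>

/* Hypothetical stand-ins for the kernel's _Q_PENDING_VAL and _Q_PENDING_LOOP. */
#define PENDING_VAL   (1u << 8)
#define PENDING_LOOP  (1 << 9)   /* 512 polls, the proposed default */

/*
 * Poll *lock with relaxed loads until it no longer reads PENDING_VAL
 * or the spin budget is used up. Returns the last value read; a caller
 * that still sees PENDING_VAL would give up and queue itself.
 */
static unsigned int bounded_pending_wait(atomic_uint *lock, int budget)
{
	unsigned int val = atomic_load_explicit(lock, memory_order_relaxed);

	while (val == PENDING_VAL && budget-- > 0)
		val = atomic_load_explicit(lock, memory_order_relaxed);

	return val;
}
```

With a pure polling loop like this, the budget is the only thing bounding the wait, which is why its value matters. If the conditional read can instead wait for an event between checks, each iteration already covers a long interval and a budget of 1 may be enough.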

Cheers,
Longman