Message-ID: <56BE14C2.5080603@hpe.com>
Date:	Fri, 12 Feb 2016 12:22:10 -0500
From:	Waiman Long <waiman.long@....com>
To:	Jason Low <jason.low2@...com>
CC:	Ingo Molnar <mingo@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	<linux-kernel@...r.kernel.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Ding Tianhong <dingtianhong@...wei.com>,
	Jason Low <jason.low2@....com>,
	Davidlohr Bueso <dave@...olabs.net>,
	"Paul E. McKenney" <paulmck@...ibm.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Will Deacon <Will.Deacon@....com>,
	Tim Chen <tim.c.chen@...ux.intel.com>
Subject: Re: [PATCH 0/2] locking/mutex: Enable optimistic spinning of lock
 waiter

On 02/09/2016 04:44 PM, Jason Low wrote:
> On Tue, 2016-02-09 at 14:47 -0500, Waiman Long wrote:
>> This patchset is a variant of PeterZ's "locking/mutex: Avoid spinner
>> vs waiter starvation" patch. The major difference is that the
>> waiter-spinner won't enter the OSQ used by the spinners. Instead,
>> it will spin directly on the lock in parallel with the head of the
>> OSQ. There will be a bit more contention on the lock cacheline, but
>> that shouldn't have a noticeable impact on system performance.
>>
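To make that concrete, here is a toy userspace model of the race
(illustrative only; struct toy_mutex and waiter_spin() are made-up
names, not code from the patch): the designated waiter and the OSQ
head both hammer the same lock word with a compare-and-swap, which is
where the extra lock-cacheline traffic comes from.

  #include <stdatomic.h>
  #include <stdbool.h>

  /* Toy lock word: 0 = unlocked, 1 = locked. */
  struct toy_mutex { atomic_int locked; };

  /*
   * The waiter at the head of the wait list spins directly on the
   * lock word, racing with the OSQ head, instead of joining the OSQ.
   * A real implementation would also stop spinning once the lock
   * owner is no longer running on a CPU.
   */
  static bool waiter_spin(struct toy_mutex *m, int max_spins)
  {
          for (int i = 0; i < max_spins; i++) {
                  int unlocked = 0;
                  if (atomic_compare_exchange_weak(&m->locked,
                                                   &unlocked, 1))
                          return true;   /* waiter beat the OSQ head */
          }
          return false;                  /* give up, go back to sleep */
  }
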
>> This patchset tries to address 2 issues with Peter's patch:
>>
>>   1) Ding Tianhong still finds that hanging tasks can happen in some cases.
>>   2) Jason Low found that there was a performance regression for some
>>      AIM7 workloads.
> This might help address the hang that Ding reported.
>
> Performance-wise, this patchset reduced AIM7 fserver throughput on the
> 8-socket machine by more than 70% at 1000+ users.
>
>                 | fserver JPM
> ----------------+------------
> baseline        |    ~450000
> Peter's patch   |    ~410000
> This patchset   |    ~100000
>
> My guess is that it is less efficient when waiters spin for and acquire
> the lock, and this patchset further increases the chance that waiters,
> rather than the fastpath optimistic spinners, do so.
>
> Jason
>

That was just a configuration error: the CPU frequency scaling governor
wasn't set to performance. With the performance governor, the
patchset's throughput was comparable to that of Peter's patch.
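
(For reference, the governor can be forced to performance by writing
to each CPU's cpufreq file in sysfs. A minimal sketch, assuming the
usual /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor layout,
which can vary with the cpufreq driver:

  #include <glob.h>
  #include <stdio.h>

  /* Write "performance" to every CPU's scaling_governor (needs root). */
  int main(void)
  {
          glob_t g;

          if (glob("/sys/devices/system/cpu/cpu*/cpufreq/scaling_governor",
                   0, NULL, &g))
                  return 1;              /* no cpufreq sysfs entries */
          for (size_t i = 0; i < g.gl_pathc; i++) {
                  FILE *f = fopen(g.gl_pathv[i], "w");
                  if (!f)
                          continue;
                  fputs("performance", f);
                  fclose(f);
          }
          globfree(&g);
          return 0;
  }

The same effect can of course be had from the shell or with the
cpupower tool.)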

Cheers,
Longman
