linux-kernel - Re: [PATCH 0/2] locking/mutex: Enable optimistic spinning of lock waiter

Message-ID: <1455054274.2976.31.camel@j-VirtualBox>
Date:	Tue, 09 Feb 2016 13:44:34 -0800
From:	Jason Low <jason.low2@...com>
To:	Waiman Long <Waiman.Long@....com>
Cc:	Ingo Molnar <mingo@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	linux-kernel@...r.kernel.org,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Ding Tianhong <dingtianhong@...wei.com>,
	Jason Low <jason.low2@....com>,
	Davidlohr Bueso <dave@...olabs.net>,
	"Paul E. McKenney" <paulmck@...ibm.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Will Deacon <Will.Deacon@....com>,
	Tim Chen <tim.c.chen@...ux.intel.com>, jason.low2@....com
Subject: Re: [PATCH 0/2] locking/mutex: Enable optimistic spinning of lock
 waiter

On Tue, 2016-02-09 at 14:47 -0500, Waiman Long wrote:
> This patchset is a variant of PeterZ's "locking/mutex: Avoid spinner
> vs waiter starvation" patch. The major difference is that the
> waiter-spinner won't enter into the OSQ used by the spinners. Instead,
> it will spin directly on the lock in parallel with the queue head
> of the OSQ. So there will be a bit more cacheline contention on the
> lock cacheline, but that shouldn't cause noticeable impact on system
> performance.
> 
> This patchset tries to address 2 issues with Peter's patch:
> 
>  1) Ding Tianhong still find that hanging task could happen in some cases.
>  2) Jason Low found that there was performance regression for some AIM7
>     workloads.

This might help address the hang that Ding reported.

Performance-wise, this patchset reduced AIM7 fserver throughput on the
8-socket machine by more than 70% at 1000+ users.

                | fserver JPM
----------------+------------
baseline        | ~450000
Peter's patch   | ~410000
This patchset   | ~100000
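
For reference, the relative changes implied by these numbers can be
worked out with a few lines of Python. This is only a sketch: the helper
name relative_change is illustrative, and the inputs are the rounded,
approximate JPM values from the table, so the percentages are
approximate as well.

```python
def relative_change(baseline: float, value: float) -> float:
    """Return the change of `value` relative to `baseline`, in percent."""
    return (value / baseline - 1.0) * 100.0


# Approximate fserver JPM figures taken from the table (rounded values).
BASELINE_JPM = 450_000
results = {
    "Peter's patch": 410_000,
    "This patchset": 100_000,
}

for name, jpm in results.items():
    # A negative result means throughput dropped relative to baseline.
    print(f"{name}: {relative_change(BASELINE_JPM, jpm):+.1f}% vs. baseline")
```

With these rounded inputs, Peter's patch comes out at roughly a 9% drop
and this patchset at roughly a 78% drop relative to baseline.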

My guess is that waiters spinning on and acquiring the lock is less
efficient than the fastpath optimistic spinners doing so, and this
patchset further increases the chance that waiters, rather than the
fastpath optimistic spinners, acquire the lock.

Jason