linux-kernel - Re: [PATCH v5 3/3] locking/qrwlock: Don't contend with readers when setting _QW

Message-ID: <558C65A0.5040005@hp.com>
Date:	Thu, 25 Jun 2015 16:33:36 -0400
From:	Waiman Long <waiman.long@...com>
To:	Peter Zijlstra <peterz@...radead.org>
CC:	Ingo Molnar <mingo@...hat.com>, Arnd Bergmann <arnd@...db.de>,
	linux-arch@...r.kernel.org, linux-kernel@...r.kernel.org,
	Will Deacon <will.deacon@....com>,
	Scott J Norton <scott.norton@...com>,
	Douglas Hatch <doug.hatch@...com>
Subject: Re: [PATCH v5 3/3] locking/qrwlock: Don't contend with readers when
 setting _QW_WAITING

On 06/25/2015 02:35 PM, Peter Zijlstra wrote:
> On Fri, Jun 19, 2015 at 11:50:02AM -0400, Waiman Long wrote:
>> The current cmpxchg() loop that sets the _QW_WAITING flag for writers
>> in queue_write_lock_slowpath() contends with incoming readers, which
>> can cause extra, wasteful cmpxchg() operations. This patch changes
>> the code to do a byte cmpxchg() to eliminate contention with new
>> readers.
>>
>> A multithreaded microbenchmark running 5M read_lock/write_lock loops
>> on an 8-socket 80-core Westmere-EX machine running a 4.0-based kernel
>> with the qspinlock patch has the following execution times (in ms)
>> with and without the patch:
>>
>> With R:W ratio = 5:1
>>
>> 	Threads	   w/o patch	with patch	% change
>> 	-------	   ---------	----------	--------
>> 	   2	     990 	    895		  -9.6%
>> 	   3	    2136 	   1912		 -10.5%
>> 	   4	    3166	   2830		 -10.6%
>> 	   5	    3953	   3629		  -8.2%
>> 	   6	    4628	   4405		  -4.8%
>> 	   7	    5344	   5197		  -2.8%
>> 	   8	    6065	   6004		  -1.0%
>> 	   9	    6826	   6811		  -0.2%
>> 	  10	    7599	   7599		   0.0%
>> 	  15	    9757	   9766		  +0.1%
>> 	  20	   13767	  13817		  +0.4%
>>
>> With a small number of contending threads, this patch can improve
>> locking performance by up to 10%. With more contending threads,
>> however, the gain diminishes.
>>
>> With the extended qrwlock structure defined in asm-generic/qrwlock,
>> the queue_write_unlock() function is also simplified to a
>> smp_store_release() call.
>>
>> Signed-off-by: Waiman Long<Waiman.Long@...com>
> This one does not in fact apply, seeing how I applied a previous
> version.
>
> Please send an incremental patch if you still want to change things to
> this form.

I saw that Ingo has merged a previous version of the patch. I am fine
with that version. As Will is working on a qrwlock patch to enable ARM
to use it, I will let him make the structure move to qrwlock.h if he
chooses to do so.

Cheers,
Longman
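
For readers following the thread, the byte-cmpxchg idea from the quoted
patch description can be sketched roughly as below. This is a userspace
illustration, not the kernel code: the struct and function names are
hypothetical, the constant values are chosen for illustration, it uses
the GCC/Clang __atomic builtins instead of the kernel's cmpxchg() and
smp_store_release(), and it assumes a little-endian machine so that the
writer byte is the lowest-addressed byte of the lock word.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values: writer state lives in the low byte,
 * the reader count in the bits above it. */
#define QW_WAITING 0x01u   /* a writer is waiting for readers to drain */
#define QW_LOCKED  0xffu   /* a writer holds the lock */
#define QW_WMASK   0xffu   /* mask covering the writer byte */
#define QR_BIAS    0x100u  /* increment for one reader */

/* Hypothetical layout: the writer byte overlays the low byte of the
 * lock word (little-endian assumption). */
struct qrwlock_sketch {
    union {
        uint32_t cnts;
        struct {
            uint8_t wmode;    /* 0, QW_WAITING or QW_LOCKED */
            uint8_t rcnt[3];  /* reader count bytes */
        };
    };
};

/* Word-sized attempt: fails whenever a reader changes cnts between
 * the load and the compare-exchange, so the caller has to retry. */
static inline bool set_waiting_word(struct qrwlock_sketch *l)
{
    uint32_t old = __atomic_load_n(&l->cnts, __ATOMIC_RELAXED);

    if (old & QW_WMASK)
        return false;  /* another writer is already waiting or locked */
    return __atomic_compare_exchange_n(&l->cnts, &old, old | QW_WAITING,
                                       false, __ATOMIC_RELAXED,
                                       __ATOMIC_RELAXED);
}

/* Byte-sized attempt: only the writer byte is compared, so changes to
 * the reader count cannot make it fail. */
static inline bool set_waiting_byte(struct qrwlock_sketch *l)
{
    uint8_t expected = 0;

    return __atomic_compare_exchange_n(&l->wmode, &expected, QW_WAITING,
                                       false, __ATOMIC_RELAXED,
                                       __ATOMIC_RELAXED);
}

/* Once all readers are gone, the waiting writer takes the lock. */
static inline bool take_lock_after_waiting(struct qrwlock_sketch *l)
{
    uint32_t expected = QW_WAITING;

    return __atomic_compare_exchange_n(&l->cnts, &expected, QW_LOCKED,
                                       false, __ATOMIC_ACQUIRE,
                                       __ATOMIC_RELAXED);
}

/* With a separate writer byte, unlock is a single release store. */
static inline void write_unlock_sketch(struct qrwlock_sketch *l)
{
    assert(__atomic_load_n(&l->wmode, __ATOMIC_RELAXED) == QW_LOCKED);
    __atomic_store_n(&l->wmode, 0, __ATOMIC_RELEASE);
}
```

The point of the byte variant is that its success depends only on the
writer byte, so a stream of readers incrementing the count does not
force the writer to repeat the cmpxchg.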