Message-ID: <alpine.LFD.2.00.1001060119010.3630@localhost.localdomain>
Date:	Wed, 6 Jan 2010 01:39:17 -0800 (PST)
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
cc:	Minchan Kim <minchan.kim@...il.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Peter Zijlstra <peterz@...radead.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>, cl@...ux-foundation.org,
	"hugh.dickins" <hugh.dickins@...cali.co.uk>,
	Nick Piggin <nickpiggin@...oo.com.au>,
	Ingo Molnar <mingo@...e.hu>
Subject: Re: [RFC][PATCH 6/8] mm: handle_speculative_fault()



On Wed, 6 Jan 2010, KAMEZAWA Hiroyuki wrote:
>
>      9.08%  multi-fault-all  [kernel]                  [k] down_read_trylock

Btw, the "trylock" implementation of rwsemaphores doesn't look very good 
on x86, even after teaching x64-64 to use the xadd versions of rwsems.

The reason? It sadly does a "load/inc/cmpxchg" sequence to do an "add if 
there are no writers", and that may be the obvious model, but it sucks 
from a cache access standpoint.

Why? Because it starts the sequence with a plain read. So if it doesn't 
have the cacheline, it will likely get it first in non-exclusive mode, and 
then the cmpxchg a few instructions later will need to turn it into an 
exclusive ownership.

So now that trylock may well end up doing two bus accesses rather than 
one.
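
To make the pattern concrete, here is a minimal userspace C11 sketch of 
what that "obvious" sequence amounts to. It is not the kernel code, and 
the bias and the "count <= 0 means writer" convention are only 
illustrative here:

#include <stdatomic.h>
#include <stdbool.h>

#define RWSEM_ACTIVE_READ_BIAS	1L	/* illustrative value only */

/* The "obvious" model: plain load first, then loop on cmpxchg. */
static bool naive_down_read_trylock(atomic_long *count)
{
	/* Plain read: pulls the cacheline in shared (non-exclusive) mode. */
	long old = atomic_load_explicit(count, memory_order_relaxed);

	for (;;) {
		long new = old + RWSEM_ACTIVE_READ_BIAS;
		if (new <= 0)		/* writer active or pending: give up */
			return false;
		/* The cmpxchg now has to turn the line into exclusive
		 * ownership - potentially a second bus transaction. */
		if (atomic_compare_exchange_weak_explicit(count, &old, new,
							  memory_order_acquire,
							  memory_order_relaxed))
			return true;
		/* A failed cmpxchg refreshed 'old' with the current count. */
	}
}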

It's possible that an OoO x86 CPU might notice the "read followed by 
r-m-w" pattern and just turn it into an exclusive cache fetch immediately, 
but I don't think they are quite that smart. But who knows? 

Anyway, for the above reason it might actually be better to _avoid_ the 
load entirely, and just make __down_read_trylock() assume the rwsem starts 
out unlocked - replace the initial memory load with just loading a 
constant.

That way, it will do the cmpxchg first, and if it wasn't unlocked and had 
other readers active, it will end up doing an extra cmpxchg, but still 
hopefully avoid the extra bus cycles.
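
In the same sketch notation (reusing the includes and 
RWSEM_ACTIVE_READ_BIAS above, with an illustrative RWSEM_UNLOCKED_VALUE 
of 0), the only difference is the starting value of 'old':

#define RWSEM_UNLOCKED_VALUE	0L	/* illustrative value only */

/* Optimistic variant: assume the rwsem is unlocked and go straight
 * for the cmpxchg, so the very first access to the cacheline is the
 * exclusive one.  If the guess was wrong, the failed cmpxchg hands
 * back the real count and the loop retries exactly as before. */
static bool optimistic_down_read_trylock(atomic_long *count)
{
	long old = RWSEM_UNLOCKED_VALUE;	/* no memory read here */

	for (;;) {
		long new = old + RWSEM_ACTIVE_READ_BIAS;
		if (new <= 0)
			return false;		/* writer active or pending */
		if (atomic_compare_exchange_weak_explicit(count, &old, new,
							  memory_order_acquire,
							  memory_order_relaxed))
			return true;
	}
}

The uncontended case succeeds on the first cmpxchg; the case with other 
readers already active pays one extra cmpxchg, which is the trade-off 
described above.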

So it might be worth testing this trivial patch on top of my other one.

UNTESTED. But the patch is small and simple, so maybe it works anyway. It 
would be interesting to hear if it makes any difference. Better? Worse? Or 
is it a "There's no difference at all, Linus. You're on drugs with that 
whole initial bus cycle thing"?

		Linus

---
 arch/x86/include/asm/rwsem.h |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/rwsem.h b/arch/x86/include/asm/rwsem.h
index 625baca..275c0a1 100644
--- a/arch/x86/include/asm/rwsem.h
+++ b/arch/x86/include/asm/rwsem.h
@@ -123,7 +123,6 @@ static inline int __down_read_trylock(struct rw_semaphore *sem)
 {
 	__s32 result, tmp;
 	asm volatile("# beginning __down_read_trylock\n\t"
-		     "  movl      %0,%1\n\t"
 		     "1:\n\t"
 		     "  movl	     %1,%2\n\t"
 		     "  addl      %3,%2\n\t"
@@ -133,7 +132,7 @@ static inline int __down_read_trylock(struct rw_semaphore *sem)
 		     "2:\n\t"
 		     "# ending __down_read_trylock\n\t"
 		     : "+m" (sem->count), "=&a" (result), "=&r" (tmp)
-		     : "i" (RWSEM_ACTIVE_READ_BIAS)
+		     : "i" (RWSEM_ACTIVE_READ_BIAS), "1" (RWSEM_UNLOCKED_VALUE)
 		     : "memory", "cc");
 	return result >= 0 ? 1 : 0;
 }
--
