Message-ID: <alpine.LFD.2.00.1001060119010.3630@localhost.localdomain>
Date: Wed, 6 Jan 2010 01:39:17 -0800 (PST)
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
cc: Minchan Kim <minchan.kim@...il.com>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Peter Zijlstra <peterz@...radead.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>, cl@...ux-foundation.org,
"hugh.dickins" <hugh.dickins@...cali.co.uk>,
Nick Piggin <nickpiggin@...oo.com.au>,
Ingo Molnar <mingo@...e.hu>
Subject: Re: [RFC][PATCH 6/8] mm: handle_speculative_fault()
On Wed, 6 Jan 2010, KAMEZAWA Hiroyuki wrote:
>
> 9.08% multi-fault-all [kernel] [k] down_read_trylock
Btw, the "trylock" implementation of rwsemaphores doesn't look very good
on x86, even after teaching x86-64 to use the xadd versions of rwsems.
The reason? It sadly does a "load/inc/cmpxchg" sequence to do an "add if
there are no writers", and that may be the obvious model, but it sucks
from a cache access standpoint.
Why? Because it starts the sequence with a plain read. So if it doesn't
have the cacheline, it will likely get it first in non-exclusive mode, and
then the cmpxchg a few instructions later will need to turn it into an
exclusive ownership.
So now that trylock may well end up doing two bus accesses rather than
one.
It's possible that an OoO x86 CPU might notice the "read followed by
r-m-w" pattern and just turn it into an exclusive cache fetch immediately,
but I don't think they are quite that smart. But who knows?
Anyway, for the above reason it might actually be better to _avoid_ the
load entirely, and just make __down_read_trylock() assume the rwsem starts
out unlocked - replace the initial memory load with just loading a
constant.
That way, it will do the cmpxchg first, and if it wasn't unlocked and had
other readers active, it will end up doing an extra cmpxchg, but still
hopefully avoid the extra bus cycles.
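To make the two access patterns concrete, here is a sketch in portable C11
atomics rather than the kernel's inline asm (function names and the simplified
bias constants are illustrative, not the actual rwsem code; the real x86 read
bias is not 1):

```c
#include <stdatomic.h>
#include <stdbool.h>

#define RWSEM_UNLOCKED_VALUE    0L
#define RWSEM_ACTIVE_READ_BIAS  1L   /* simplified; negative count = writer */

/* Current pattern: plain load first, then cmpxchg.  The load can pull the
 * cacheline in shared state, so the cmpxchg must then upgrade it to
 * exclusive - potentially two bus transactions. */
static bool down_read_trylock_loadfirst(atomic_long *count)
{
	long old = atomic_load_explicit(count, memory_order_relaxed);
	do {
		if (old < 0)	/* writer active or pending */
			return false;
	} while (!atomic_compare_exchange_weak(count, &old,
					       old + RWSEM_ACTIVE_READ_BIAS));
	return true;
}

/* Proposed pattern: guess "unlocked" so the first memory access is the
 * cmpxchg itself, which fetches the line in exclusive state up front.
 * A wrong guess costs one extra cmpxchg iteration, since a failed
 * compare_exchange reloads the current value into 'old'. */
static bool down_read_trylock_optimistic(atomic_long *count)
{
	long old = RWSEM_UNLOCKED_VALUE;	/* constant, no memory load */
	do {
		if (old < 0)
			return false;
	} while (!atomic_compare_exchange_weak(count, &old,
					       old + RWSEM_ACTIVE_READ_BIAS));
	return true;
}
```

Both functions return the same results; the difference is only in which
bus transaction touches the line first in the common uncontended case.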
So it might be worth testing this trivial patch on top of my other one.
UNTESTED. But the patch is small and simple, so maybe it works anyway. It
would be interesting to hear if it makes any difference. Better? Worse? Or
is it a "There's no difference at all, Linus. You're on drugs with that
whole initial bus cycle thing".
Linus
---
arch/x86/include/asm/rwsem.h | 3 +--
1 files changed, 1 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/rwsem.h b/arch/x86/include/asm/rwsem.h
index 625baca..275c0a1 100644
--- a/arch/x86/include/asm/rwsem.h
+++ b/arch/x86/include/asm/rwsem.h
@@ -123,7 +123,6 @@ static inline int __down_read_trylock(struct rw_semaphore *sem)
{
__s32 result, tmp;
asm volatile("# beginning __down_read_trylock\n\t"
- " movl %0,%1\n\t"
"1:\n\t"
" movl %1,%2\n\t"
" addl %3,%2\n\t"
@@ -133,7 +132,7 @@ static inline int __down_read_trylock(struct rw_semaphore *sem)
"2:\n\t"
"# ending __down_read_trylock\n\t"
: "+m" (sem->count), "=&a" (result), "=&r" (tmp)
- : "i" (RWSEM_ACTIVE_READ_BIAS)
+ : "i" (RWSEM_ACTIVE_READ_BIAS), "1" (RWSEM_UNLOCKED_VALUE)
: "memory", "cc");
return result >= 0 ? 1 : 0;
}
--