Message-ID: <alpine.LFD.2.00.1001121708100.17145@localhost.localdomain>
Date: Tue, 12 Jan 2010 17:24:45 -0800 (PST)
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: "H. Peter Anvin" <hpa@...or.com>
cc: Ingo Molnar <mingo@...e.hu>, Thomas Gleixner <tglx@...utronix.de>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: x86: avoid read-cycle on down_read_trylock

We don't want to start the lock sequence with a plain read, since that
will cause the cacheline to be initially brought in as a shared line,
only to be turned into an exclusive one immediately afterwards.

So in order to avoid unnecessary bus traffic, just start off assuming
that the lock is unlocked, which is the common case anyway. That way,
the first access to the lock will be the actual locked cycle.

This speeds up the lock ping-pong case, since it now has fewer bus cycles.
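
As a C-level sketch of the difference (GCC __atomic builtins purely for
illustration - the real code is the inline asm in the patch below, and
the RWSEM_* values here assume the 32-bit layout):

#define RWSEM_UNLOCKED_VALUE	0L
#define RWSEM_ACTIVE_READ_BIAS	1L

static int down_read_trylock_sketch(long *count)
{
	/*
	 * The old code started with "old = *count;" - a plain read that
	 * pulls the cacheline in shared state, only to upgrade it a
	 * moment later. Instead, guess "unlocked", so the first access
	 * to the line is the locked cmpxchg itself and it arrives
	 * exclusive.
	 */
	long old = RWSEM_UNLOCKED_VALUE;

	for (;;) {
		long new = old + RWSEM_ACTIVE_READ_BIAS;

		if (new <= 0)
			return 0;	/* write-locked: fail the trylock */
		if (__atomic_compare_exchange_n(count, &old, new, false,
						__ATOMIC_ACQUIRE,
						__ATOMIC_RELAXED))
			return 1;	/* got the read lock */
		/* failed cmpxchg leaves the observed value in 'old': retry */
	}
}
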
The reason down_read_trylock() is so important is that the main rwsem
usage is mmap_sem, and the page fault case - which is the most common case
by far - takes it with a "down_read_trylock()". That, in turn, is because
in case it is locked we want to do the exception table lookup (so that we
get a nice oops rather than a deadlock if we happen to get a page fault
while holding the mmap lock for writing).

So while "trylock" is normally not a very common operation, for rwsems it
ends up being the _normal_ way to get the lock.
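
For reference, that fault-path pattern looks roughly like this,
paraphrased and simplified from arch/x86/mm/fault.c of this era (the
user-mode test is abbreviated here):

	if (unlikely(!down_read_trylock(&mm->mmap_sem))) {
		/*
		 * Contended. A kernel-mode fault with no exception-table
		 * fixup entry may mean we faulted while already holding
		 * mmap_sem for writing: oops cleanly rather than deadlock.
		 */
		if (!user_mode(regs) && !search_exception_tables(regs->ip)) {
			bad_area_nosemaphore(regs, error_code, address);
			return;
		}
		down_read(&mm->mmap_sem);	/* now it's safe to block */
	}
	/* ... find_vma() and handle_mm_fault() as usual ... */
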
Tested-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
Signed-off-by: Linus Torvalds <torvalds@...ux-foundation.org>
---
This is on top of Peter's cleanup of my asm-cleanup patch.

On Hiroyuki-san's load, this trivial change improved his (admittedly
_very_ artificial) page-fault benchmark by about 2%. The profile hit of
down_read_trylock() went from 9.08% down to 7.73%. So the trylock itself
seems to have improved by 15%+ from this.

All numbers above are meaningless, but the point is that the effect of
this cacheline access pattern can be real.
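
Side note on the constraint trick in the patch: with the initial
"mov %0,%1" gone, something still has to put a starting value into
operand 1 (the "=&a" output, i.e. %eax) before the cmpxchg loop runs.
That's what the added "1" (RWSEM_UNLOCKED_VALUE) input does - a matching
constraint tells gcc to pre-load operand 1's register with the unlocked
value on entry to the asm. A toy example of the same GNU C construct,
purely for illustration:

	static inline int matching_constraint_demo(void)
	{
		int x;

		/* "0" ties the input to output operand 0, so gcc loads
		 * 41 into the chosen register before the asm body runs */
		asm("add $1,%0" : "=r" (x) : "0" (41));
		return x;	/* 42 */
	}
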
diff --git a/arch/x86/include/asm/rwsem.h b/arch/x86/include/asm/rwsem.h
index 4136200..e9480be 100644
--- a/arch/x86/include/asm/rwsem.h
+++ b/arch/x86/include/asm/rwsem.h
@@ -123,7 +123,6 @@ static inline int __down_read_trylock(struct rw_semaphore *sem)
 {
 	__s32 result, tmp;
 	asm volatile("# beginning __down_read_trylock\n\t"
-		     " mov %0,%1\n\t"
 		     "1:\n\t"
 		     " mov %1,%2\n\t"
 		     " add %3,%2\n\t"
@@ -133,7 +132,7 @@ static inline int __down_read_trylock(struct rw_semaphore *sem)
 		     "2:\n\t"
 		     "# ending __down_read_trylock\n\t"
 		     : "+m" (sem->count), "=&a" (result), "=&r" (tmp)
-		     : "i" (RWSEM_ACTIVE_READ_BIAS)
+		     : "i" (RWSEM_ACTIVE_READ_BIAS), "1" (RWSEM_UNLOCKED_VALUE)
 		     : "memory", "cc");
 	return result >= 0 ? 1 : 0;
 }
--