Date:	Wed,  9 Sep 2015 23:36:40 +0200
From:	Rasmus Villemoes <linux@...musvillemoes.dk>
To:	Thomas Gleixner <tglx@...utronix.de>,
	Davidlohr Bueso <dave@...olabs.net>,
	kbuild test robot <fengguang.wu@...el.com>,
	Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc:	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...nel.org>,
	Rasmus Villemoes <linux@...musvillemoes.dk>,
	linux-kernel@...r.kernel.org
Subject: [PATCH] futex: eliminate cache miss from futex_hash()

futex_hash() references two global variables: the base pointer
futex_queues and the size of the array futex_hashsize. The latter is
marked __read_mostly, while the former is not, so they are likely to
end up very far from each other. This means that futex_hash() is
likely to encounter two cache misses.

We could mark futex_queues as __read_mostly as well, but that doesn't
guarantee they'll end up next to each other (and even if they do, they
may still end up in different cache lines). So put the two variables
in a small singleton struct with sufficient alignment and mark that as
__read_mostly.
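
As a rough userspace illustration of why the alignment matters (not kernel
code): the sketch below assumes a 64-byte cache line and uses the GCC/Clang
aligned attribute in place of the kernel's __aligned() macro. The names
CACHELINE, struct bucket, futex_data and same_cacheline() are made up for
the example. A struct whose size is a power of two that divides the line
size, and which is aligned to its own size, cannot straddle two lines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHELINE 64	/* assumption: typical x86-64 cache line size */

struct bucket;		/* opaque stand-in for struct futex_hash_bucket */

/* Both fields live in one object aligned to its own size. */
static struct {
	struct bucket *queues;
	unsigned long  hashsize;
} futex_data __attribute__((aligned(2 * sizeof(long))));

/* The struct size must divide the line size for the argument to hold. */
_Static_assert(CACHELINE % sizeof(futex_data) == 0,
	       "struct size must divide the cache line size");

/* Return nonzero if both addresses fall in the same cache line. */
static int same_cacheline(const void *a, const void *b)
{
	return (uintptr_t)a / CACHELINE == (uintptr_t)b / CACHELINE;
}
```

Calling same_cacheline(&futex_data.queues, &futex_data.hashsize) should
return nonzero wherever the linker places futex_data, which is the property
the patch relies on.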

A diff of the disassembly shows what I'd expect:

 :      31 d1                   xor    %edx,%ecx
 :      c1 ca 12                ror    $0x12,%edx
 :      29 d1                   sub    %edx,%ecx
-:      48 8b 15 25 c8 e5 00    mov    0xe5c825(%rip),%rdx        # ffffffff81f149c8 <futex_hashsize>
+:      48 8b 15 35 c8 e5 00    mov    0xe5c835(%rip),%rdx        # ffffffff81f149d8 <__futex_data+0x8>
 :      31 c8                   xor    %ecx,%eax
 :      c1 c9 08                ror    $0x8,%ecx
 :      29 c8                   sub    %ecx,%eax
 :      48 83 ea 01             sub    $0x1,%rdx
 :      48 21 d0                and    %rdx,%rax
 :      48 c1 e0 06             shl    $0x6,%rax
-:      48 03 05 e4 5e 02 01    add    0x1025ee4(%rip),%rax        # ffffffff820de0a0 <futex_queues>
+:      48 03 05 14 c8 e5 00    add    0xe5c814(%rip),%rax        # ffffffff81f149d0 <__futex_data>
 :      c3                      retq
 :      0f 1f 00                nopl   (%rax)
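
For readers who prefer C to disassembly, the tail of the function above
(sub $0x1, and, shl $0x6, add) corresponds roughly to the sketch below. It
is a userspace approximation, not the kernel source: struct bucket and
bucket_for() are hypothetical names, the 64-byte bucket size is inferred
from the shl $0x6, and the table size is assumed to be a power of two, as
the mask implies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct futex_hash_bucket; 64 bytes matches the shl $0x6. */
struct bucket {
	char pad[64];
};

/*
 * Pick a bucket: mask the hash with (hashsize - 1), then index the
 * array. Both the base pointer and the size are loaded on every call,
 * which is why keeping them in one cache line helps.
 */
static struct bucket *bucket_for(struct bucket *queues,
				 unsigned long hashsize, uint32_t hash)
{
	return &queues[hash & (hashsize - 1)];
}
```

With an 8-entry table, a hash value of 13 selects entry 13 & 7 = 5, i.e. a
byte offset of 5 << 6 = 320 from the base.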

Signed-off-by: Rasmus Villemoes <linux@...musvillemoes.dk>
---
Resending since this was never picked up, and I assume it's actually
OK. Also, this time the alignment is spelled 2*sizeof(long) to avoid
wasting 8 bytes on 32-bit.

 kernel/futex.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/kernel/futex.c b/kernel/futex.c
index 6e443efc65f4..dfc86e93c31d 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -255,9 +255,18 @@ struct futex_hash_bucket {
 	struct plist_head chain;
 } ____cacheline_aligned_in_smp;
 
-static unsigned long __read_mostly futex_hashsize;
+/*
+ * The base of the bucket array and its size are always used together
+ * (after initialization only in futex_hash()), so ensure that they
+ * reside in the same cacheline.
+ */
+static struct {
+	struct futex_hash_bucket *queues;
+	unsigned long            hashsize;
+} __futex_data __read_mostly __aligned(2*sizeof(long));
+#define futex_queues   (__futex_data.queues)
+#define futex_hashsize (__futex_data.hashsize)
 
-static struct futex_hash_bucket *futex_queues;
 
 /*
  * Fault injections for futexes.
-- 
2.1.3
