Date:	Sat, 12 Sep 2015 11:59:36 +0200
From:	Ingo Molnar <mingo@...nel.org>
To:	Davidlohr Bueso <dave@...olabs.net>
Cc:	Rasmus Villemoes <linux@...musvillemoes.dk>,
	Thomas Gleixner <tglx@...utronix.de>,
	kbuild test robot <fengguang.wu@...el.com>,
	Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
	Peter Zijlstra <peterz@...radead.org>,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] futex: eliminate cache miss from futex_hash()


* Davidlohr Bueso <dave@...olabs.net> wrote:

> On Wed, 09 Sep 2015, Rasmus Villemoes wrote:
> 
> >futex_hash() references two global variables: the base pointer
> >futex_queues and the size of the array futex_hashsize. The latter is
> >marked __read_mostly, while the former is not, so they are likely to
> >end up very far from each other. This means that futex_hash() is
> >likely to encounter two cache misses.
> >
> >We could mark futex_queues as __read_mostly as well, but that doesn't
> >guarantee they'll end up next to each other (and even if they do, they
> >may still end up in different cache lines). So put the two variables
> >in a small singleton struct with sufficient alignment and mark that as
> >__read_mostly.
> 
> This really doesn't have much practical effect -- not even on larger
> boxes, where such things matter. For instance, I ran the patch on a
> 60-core IvyBridge with 'perf bench futex', where futex-hash
> particularly benefits from good data layout (i.e., our current SMP alignment).
> 
> http://linux-scalability.org/futex-__futex_data/
> 
> I think we should leave it as is.

But ... given that these are shared values (cached on all CPUs), this 
change would only be measurable in such a benchmark if the cache footprint of the 
test were just about to overflow the size of the CPU cache and the one extra cache 
line caused cache thrashing. That is very unlikely.

So such a change seems to make sense unless you can argue that it's _bad_ to move 
them closer to each other.

Thanks,

	Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
