Message-ID: <20250417153451.b50sWh_Z@linutronix.de>
Date: Thu, 17 Apr 2025 17:34:51 +0200
From: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To: linux-kernel@...r.kernel.org
Cc: André Almeida <andrealmeid@...lia.com>,
	Darren Hart <dvhart@...radead.org>,
	Davidlohr Bueso <dave@...olabs.net>, Ingo Molnar <mingo@...hat.com>,
	Juri Lelli <juri.lelli@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Valentin Schneider <vschneid@...hat.com>,
	Waiman Long <longman@...hat.com>
Subject: Re: [PATCH v11 15/19] futex: Implement FUTEX2_NUMA

On 2025-04-07 18:52:25 [+0200], To linux-kernel@...r.kernel.org wrote:
> On 2025-04-07 17:57:38 [+0200], To linux-kernel@...r.kernel.org wrote:
> > --- a/kernel/futex/core.c
> > +++ b/kernel/futex/core.c
> > @@ -332,15 +337,35 @@ __futex_hash(union futex_key *key, struct futex_private_hash *fph)
> …
> > +	if (node == FUTEX_NO_NODE) {
> > +		/*
> > +		 * In case of !FLAGS_NUMA, use some unused hash bits to pick a
> > +		 * node -- this ensures regular futexes are interleaved across
> > +		 * the nodes and avoids having to allocate multiple
> > +		 * hash-tables.
> > +		 *
> > +		 * NOTE: this isn't perfectly uniform, but it is fast and
> > +		 * handles sparse node masks.
> > +		 */
> > +		node = (hash >> futex_hashshift) % nr_node_ids;
> 
> forgot to mention earlier: This % nr_node_ids turns into div and it is
> visible in perf top while looking at __futex_hash(). We could round it
> down to a power-of-two (which should be the case in my 1, 2 and 4 based
> NUMA world) and then we could use AND instead.
> ARM does not support NUMA or div so it is not a concern.
> 
> Maybe a fast path for 1/2/4 would make sense since that is the most common
> case. In case you consider it, I could run tests to see how significant it
> is. It might pop up in "perf bench futex hash" but not be significant in
> the general use case. I had some hacks and those did not improve the
> numbers as much as I hoped for.

Since I'm cleaning up: I'm getting approximately a 1% improvement with
this, so I am not pursuing it.
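
For anyone following along, here is a minimal userspace sketch of the
power-of-two fast path discussed above. It is an illustration only, not the
hack that was measured. pick_node_mod(), pick_node_fast() and their
parameters are hypothetical stand-ins for the expression in __futex_hash(),
futex_hashshift and nr_node_ids, and it assumes nr_nodes >= 1 and
hashshift < 32.

```c
#include <stdint.h>

/*
 * Baseline, mirroring the quoted patch: the modulo by a runtime value
 * typically compiles to a div instruction.
 * hashshift and nr_nodes are hypothetical stand-ins for the kernel's
 * futex_hashshift and nr_node_ids.
 */
static inline unsigned int pick_node_mod(uint32_t hash, unsigned int hashshift,
					 unsigned int nr_nodes)
{
	return (hash >> hashshift) % nr_nodes;
}

/*
 * Sketch of the fast path: if nr_nodes is a power of two (1, 2, 4, ...),
 * x % nr_nodes equals x & (nr_nodes - 1), so the div can be skipped.
 * Any other node count falls back to the modulo.
 * Assumes nr_nodes >= 1.
 */
static inline unsigned int pick_node_fast(uint32_t hash, unsigned int hashshift,
					  unsigned int nr_nodes)
{
	uint32_t bits = hash >> hashshift;

	if ((nr_nodes & (nr_nodes - 1)) == 0)
		return bits & (nr_nodes - 1);

	return bits % nr_nodes;
}
```

Both helpers return the same node for every input; the only difference is
whether a div is issued for power-of-two node counts. The extra branch is
part of the cost, which may be one reason the gain stays small.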

Sebastian
