Message-ID: <20250407165224.z0FmVaXX@linutronix.de>
Date: Mon, 7 Apr 2025 18:52:24 +0200
From: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To: linux-kernel@...r.kernel.org
Cc: André Almeida <andrealmeid@...lia.com>,
	Darren Hart <dvhart@...radead.org>,
	Davidlohr Bueso <dave@...olabs.net>, Ingo Molnar <mingo@...hat.com>,
	Juri Lelli <juri.lelli@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Valentin Schneider <vschneid@...hat.com>,
	Waiman Long <longman@...hat.com>
Subject: Re: [PATCH v11 15/19] futex: Implement FUTEX2_NUMA

On 2025-04-07 17:57:38 [+0200], To linux-kernel@...r.kernel.org wrote:
> --- a/kernel/futex/core.c
> +++ b/kernel/futex/core.c
> @@ -332,15 +337,35 @@ __futex_hash(union futex_key *key, struct futex_private_hash *fph)
…
> +	if (node == FUTEX_NO_NODE) {
> +		/*
> +		 * In case of !FLAGS_NUMA, use some unused hash bits to pick a
> +		 * node -- this ensures regular futexes are interleaved across
> +		 * the nodes and avoids having to allocate multiple
> +		 * hash-tables.
> +		 *
> +		 * NOTE: this isn't perfectly uniform, but it is fast and
> +		 * handles sparse node masks.
> +		 */
> +		node = (hash >> futex_hashshift) % nr_node_ids;

I forgot to mention this earlier: the "% nr_node_ids" turns into a div, and
it is visible in perf top when looking at __futex_hash(). We could round
nr_node_ids down to a power of two (which should already be the case in my
1-, 2- and 4-node NUMA world) and then use AND instead.
ARM does not support NUMA or div, so it is not a concern there.

Maybe a fast path for 1/2/4 nodes would make sense, since those are the
most common cases. If you consider it, I could run tests to see how
significant it is. It might show up in "perf bench futex hash" but not be
significant in the general use case. I had some hacks, and those did not
improve the numbers as much as I had hoped.
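
For illustration only, a minimal userspace sketch of what such a fast path
could look like. pick_node() and is_pow2() are hypothetical names that do not
appear in the patch; hash_bits stands in for (hash >> futex_hashshift) and
nr_nodes for nr_node_ids. It assumes the power-of-two check is cheap enough
to do inline, whereas a real implementation would more likely precompute a
mask once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if n is a non-zero power of two. */
static inline bool is_pow2(uint32_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Hypothetical stand-in for the node selection in __futex_hash().
 * hash_bits models (hash >> futex_hashshift), nr_nodes models nr_node_ids.
 */
static inline uint32_t pick_node(uint32_t hash_bits, uint32_t nr_nodes)
{
	assert(nr_nodes > 0);

	/* Fast path: for 1, 2, 4, ... nodes, x % n equals x & (n - 1). */
	if (is_pow2(nr_nodes))
		return hash_bits & (nr_nodes - 1);

	/* Slow path: generic modulo, which typically compiles to a div. */
	return hash_bits % nr_nodes;
}
```

Both paths return the same node for a power-of-two node count, so the result
of the existing code would not change; whether the extra branch actually beats
the div is exactly what the measurement would have to show.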

> +		if (!node_possible(node)) {
> +			node = find_next_bit_wrap(node_possible_map.bits,
> +						  nr_node_ids, node);
> +		}
> +	}
> +
> +	return &futex_queues[node][hash & futex_hashmask];
>  }
>  
>  /**

Sebastian
