Message-ID: <20241115172035.795842-1-bigeasy@linutronix.de>
Date: Fri, 15 Nov 2024 17:58:41 +0100
From: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To: linux-kernel@...r.kernel.org
Cc: André Almeida <andrealmeid@...lia.com>,
Darren Hart <dvhart@...radead.org>,
Davidlohr Bueso <dave@...olabs.net>,
Ingo Molnar <mingo@...hat.com>,
Juri Lelli <juri.lelli@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>,
Valentin Schneider <vschneid@...hat.com>,
Waiman Long <longman@...hat.com>
Subject: [RFC PATCH v3 0/9] futex: Add support task local hash maps.
Hi,
this is a follow-up to
https://lore.kernel.org/ZwVOMgBMxrw7BU9A@jlelli-thinkpadt14gen4.remote.csb
and adds support for a task local futex_hash_bucket. It can be created
via prctl().
This version supports resize at runtime. The fun part is limited to
FUTEX_LOCK_PI, which means any other waiter will break.
I posted performance numbers of "perf bench futex hash" at
https://lore.kernel.org/all/20241101110810.R3AnEqdu@linutronix.de/
I didn't collect any new ones. While the performance of the default 16
buckets looks worse than that of 512 (after that the performance hardly
changes, while before that it doubles), be aware that those are now task
local (and not shared with others), and that seems to be sufficient in
general.
For systems with 512 CPUs and one db application we can use the
resize. So either the application needs to resize it or we offer auto
resize based on threads and CPUs. But be aware that workloads like
"xz huge_file.tar" will happily acquire all CPUs in the system and only
use a few locks in total, and not very often. So it would probably
perform as well with two hash buckets as with 512 in this scenario.
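To illustrate the point about few locks, here is a minimal user-space
sketch, not code from this series. It assumes a slot is picked by
hashing only the futex address and masking with a power-of-two slot
count. The mixer is a splitmix64-style finalizer used as a stand-in for
the kernel's jhash2(), and mix64(), bucket_index() and slots_in_use()
are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in mixer (splitmix64-style finalizer); NOT the kernel's jhash2(). */
static uint64_t mix64(uint64_t x)
{
	x ^= x >> 30;
	x *= 0xbf58476d1ce4e5b9ULL;
	x ^= x >> 27;
	x *= 0x94d049bb133111ebULL;
	x ^= x >> 31;
	return x;
}

/*
 * Hypothetical helper: map a futex address to a slot index.
 * nr_slots must be a non-zero power of two so that masking works.
 */
static size_t bucket_index(uintptr_t uaddr, size_t nr_slots)
{
	assert(nr_slots != 0 && (nr_slots & (nr_slots - 1)) == 0);
	return (size_t)(mix64((uint64_t)uaddr) & (nr_slots - 1));
}

/*
 * Count how many distinct slots a set of futex addresses occupies.
 * Limited to nr_slots <= 512 to keep the sketch free of allocations.
 */
static size_t slots_in_use(const uintptr_t *uaddrs, size_t n, size_t nr_slots)
{
	unsigned char used[512] = { 0 };
	size_t i, count = 0;

	assert(nr_slots <= 512);
	for (i = 0; i < n; i++) {
		size_t idx = bucket_index(uaddrs[i], nr_slots);

		if (!used[idx]) {
			used[idx] = 1;
			count++;
		}
	}
	return count;
}
```

With only three lock addresses, slots_in_use() can never return more
than 3, whether nr_slots is 2 or 512. The extra slots stay empty, which
is why a larger table buys little for such a workload.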
v2…v3 https://lore.kernel.org/all/20241028121921.1264150-1-bigeasy@linutronix.de/:
- The default auto size for auto creation is 16.
- For the private hash jhash2 is used and only for the address.
- My "perf bench futex hash" hacks have been added.
- The structure moved from signal's struct to mm.
- It is possible to resize it at runtime.
v1…v2 https://lore.kernel.org/all/20241026224306.982896-1-bigeasy@linutronix.de/:
- Moved to struct signal_struct and is used process wide.
- Automatically allocated once the first thread is created.
Sebastian