Message-ID: <20250416162921.513656-1-bigeasy@linutronix.de>
Date: Wed, 16 Apr 2025 18:29:00 +0200
From: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To: linux-kernel@...r.kernel.org
Cc: André Almeida <andrealmeid@...lia.com>,
Darren Hart <dvhart@...radead.org>,
Davidlohr Bueso <dave@...olabs.net>,
Ingo Molnar <mingo@...hat.com>,
Juri Lelli <juri.lelli@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>,
Valentin Schneider <vschneid@...hat.com>,
Waiman Long <longman@...hat.com>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Subject: [PATCH v12 00/21] futex: Add support task local hash maps, FUTEX2_NUMA and FUTEX2_MPOL
this is a follow up on
https://lore.kernel.org/ZwVOMgBMxrw7BU9A@jlelli-thinkpadt14gen4.remote.csb
and adds support for task local futex_hash_bucket.
This is the local hash map series with PeterZ FUTEX2_NUMA and
FUTEX2_MPOL. This went through some testing now with the selftests…
The complete tree is at
https://git.kernel.org/pub/scm/linux/kernel/git/bigeasy/staging.git/log/?h=futex_local_v12
https://git.kernel.org/pub/scm/linux/kernel/git/bigeasy/staging.git futex_local_v12
v11…v12: https://lore.kernel.org/all/20250407155742.968816-1-bigeasy@linutronix.de
- Moved futex_hash_put() in futex_lock_pi() before
rt_mutex_pre_schedule() for obvious reasons.
- Use __GFP_NOWARN while allocating the local hash to suppress warnings
about failures especially if huge values were used and vmalloc
refuses.
- The "immutable" mode is its own patch. The basic infrastructure patch
enforces a "0" for prctl()'s arg4. The "immutable mode" allows only 0
(disabled) or 1 (enabled) as argument.
The "perf bench" futex benchmarks gain "bucket" and "immutable" support.
- The position of the node member after the uaddr was computed in units
of u32. Added a cast to (void *) to get the math right.
- Added FUTEX2_MPOL to FUTEX2_VALID_MASK assuming that we want to expose
it. However the mpol seems not to work here but it is likely that my
setup is proper.
- If the user specified FUTEX_NO_NODE as node then the node is updated
to a valid node number. The node value is only written back to the
user if it has been changed.
While this only avoids the unnecessary write back if the user supplied
a valid node number, the whole interface is slightly racy: if
FUTEX_NO_NODE is supplied and two futex_wait() invocations run in
parallel, then the first invocation can set node to 0 and the second
to 1. The following callers will stick to node 1 but the first one
will remain waiting on the wrong node.
- Added selftests for private hash and the NUMA bits.
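To illustrate the (void *) cast mentioned above: adding a byte count to
a u32 pointer advances by that many u32 elements, not by that many bytes.
The following is a minimal userspace sketch, not the kernel code. The
helper names and the layout (a 32-bit futex word immediately followed by
a 32-bit node word) are assumptions made for illustration. Portable C
uses unsigned char * where the kernel can rely on GNU void * arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper showing the pitfall: futex_bytes is a size in
 * bytes, but arithmetic on a uint32_t pointer scales it by
 * sizeof(uint32_t), so the result lands futex_bytes * 4 bytes away.
 */
static uint32_t *node_addr_u32_units(uint32_t *uaddr, size_t futex_bytes)
{
	assert(uaddr != NULL);
	return uaddr + futex_bytes;
}

/*
 * Hypothetical helper showing the fix: convert to a byte pointer
 * first, so that futex_bytes is applied as a byte offset.
 */
static uint32_t *node_addr_byte_units(uint32_t *uaddr, size_t futex_bytes)
{
	assert(uaddr != NULL);
	return (uint32_t *)((unsigned char *)uaddr + futex_bytes);
}
```

With a buffer "uint32_t buf[8]" and a futex size of sizeof(uint32_t),
the byte-based helper returns &buf[1], the word right after the futex,
while the u32-based helper returns &buf[4].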
v10…v11: https://lore.kernel.org/all/20250312151634.2183278-1-bigeasy@linutronix.de
- PeterZ' fixups and changes to the local hash series have been folded
into the earlier patches so that things are not added first and then
renamed or changed in functionality later.
- vmalloc_huge() has been implemented on top of vmalloc_huge_node()
and the NOMMU bits have been adjusted. akpm asked for this.
- wake_up_var() has been removed from __futex_pivot_hash(). It is
enough to wake the userspace waiter after the final put so it can
perform the resize itself.
- Changed the logic in futex_pivot_pending() so it does not block for
the user. It waits for __futex_pivot_hash() which follows the logic
in __futex_pivot_hash().
- Updated kernel doc for __futex_hash().
- Patches 17+ are new:
- Wire up PR_FUTEX_HASH_SET_SLOTS in "perf bench futex"
- Add "immutable" mode to PR_FUTEX_HASH_SET_SLOTS to avoid resizing
the local hash any further. This avoids rcuref usage which is
noticeable in "perf bench futex hash"
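For readers who want to try the interface from userspace, here is a
hedged sketch of a wrapper around the new prctl(). It is not taken from
the series. The wrapper name is hypothetical, the option name
PR_FUTEX_HASH and the argument order (arg3 = number of slots, arg4 =
immutable) are assumptions, and the code only issues the call if the
installed uapi headers define the constants. Otherwise it returns
-ENOSYS instead of guessing numeric values.

```c
#include <errno.h>
#include <sys/prctl.h>

/*
 * Hypothetical wrapper: request a private futex hash with "slots"
 * buckets. "immutable" must be 0 (disabled) or 1 (enabled), matching
 * the restriction described in this cover letter.
 *
 * Returns 0 on success or a negative errno value on failure.
 */
static int futex_hash_set_slots(unsigned long slots, unsigned long immutable)
{
	/* Reject anything other than 0 or 1 before entering the kernel. */
	if (immutable > 1)
		return -EINVAL;

#if defined(PR_FUTEX_HASH) && defined(PR_FUTEX_HASH_SET_SLOTS)
	/* Assumed argument order: arg3 = slots, arg4 = immutable. */
	if (prctl(PR_FUTEX_HASH, PR_FUTEX_HASH_SET_SLOTS, slots, immutable) == -1)
		return -errno;
	return 0;
#else
	/* Headers predate this series, so the option values are unknown. */
	(void)slots;
	return -ENOSYS;
#endif
}
```

A caller would invoke futex_hash_set_slots(16, 0) early in the process
and treat any negative return value as "feature not available", falling
back to the default hash.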
Peter Zijlstra (8):
mm: Add vmalloc_huge_node()
futex: Move futex_queue() into futex_wait_setup()
futex: Pull futex_hash() out of futex_q_lock()
futex: Create hb scopes
futex: Create futex_hash() get/put class
futex: Create private_hash() get/put class
futex: Implement FUTEX2_NUMA
futex: Implement FUTEX2_MPOL
Sebastian Andrzej Siewior (13):
rcuref: Provide rcuref_is_dead()
futex: Acquire a hash reference in futex_wait_multiple_setup()
futex: Decrease the waiter count before the unlock operation
futex: Introduce futex_q_lockptr_lock()
futex: Create helper function to initialize a hash slot
futex: Add basic infrastructure for local task local hash
futex: Allow automatic allocation of process wide futex hash
futex: Allow to resize the private local hash
futex: Allow to make the private hash immutable
tools headers: Synchronize prctl.h ABI header
tools/perf: Allow to select the number of hash buckets
selftests/futex: Add futex_priv_hash
selftests/futex: Add futex_numa_mpol
include/linux/futex.h | 36 +-
include/linux/mm_types.h | 7 +-
include/linux/mmap_lock.h | 4 +
include/linux/rcuref.h | 22 +-
include/linux/vmalloc.h | 9 +-
include/uapi/linux/futex.h | 10 +-
include/uapi/linux/prctl.h | 6 +
init/Kconfig | 10 +
io_uring/futex.c | 4 +-
kernel/fork.c | 24 +
kernel/futex/core.c | 802 ++++++++++++++++--
kernel/futex/futex.h | 73 +-
kernel/futex/pi.c | 306 ++++---
kernel/futex/requeue.c | 480 +++++------
kernel/futex/waitwake.c | 201 +++--
kernel/sys.c | 4 +
mm/nommu.c | 18 +-
mm/vmalloc.c | 11 +-
tools/include/uapi/linux/prctl.h | 44 +-
tools/perf/bench/Build | 1 +
tools/perf/bench/futex-hash.c | 7 +
tools/perf/bench/futex-lock-pi.c | 5 +
tools/perf/bench/futex-requeue.c | 6 +
tools/perf/bench/futex-wake-parallel.c | 9 +-
tools/perf/bench/futex-wake.c | 4 +
tools/perf/bench/futex.c | 65 ++
tools/perf/bench/futex.h | 5 +
.../selftests/futex/functional/.gitignore | 6 +-
.../selftests/futex/functional/Makefile | 4 +-
.../futex/functional/futex_numa_mpol.c | 232 +++++
.../futex/functional/futex_priv_hash.c | 315 +++++++
.../testing/selftests/futex/functional/run.sh | 7 +
.../selftests/futex/include/futex2test.h | 34 +
33 files changed, 2199 insertions(+), 572 deletions(-)
create mode 100644 tools/perf/bench/futex.c
create mode 100644 tools/testing/selftests/futex/functional/futex_numa_mpol.c
create mode 100644 tools/testing/selftests/futex/functional/futex_priv_hash.c
--
2.49.0