Message-ID: <986dcbc0-0505-496a-ae75-e0c1bd7c2725@igalia.com>
Date: Thu, 8 May 2025 17:32:24 -0300
From: André Almeida <andrealmeid@...lia.com>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
linux-kernel@...r.kernel.org
Cc: Darren Hart <dvhart@...radead.org>, Davidlohr Bueso <dave@...olabs.net>,
Ingo Molnar <mingo@...hat.com>, Juri Lelli <juri.lelli@...hat.com>,
Peter Zijlstra <peterz@...radead.org>, Thomas Gleixner <tglx@...utronix.de>,
Valentin Schneider <vschneid@...hat.com>, Waiman Long <longman@...hat.com>
Subject: Re: [PATCH v12 14/21] futex: Allow to resize the private local hash
On 16/04/2025 13:29, Sebastian Andrzej Siewior wrote:
> The mm_struct::futex_hash_lock guards the futex_hash_bucket assignment/
> replacement. The futex_hash_allocate()/ PR_FUTEX_HASH_SET_SLOTS
> operation can now be invoked at runtime and resize an already existing
> internal private futex_hash_bucket to another size.
>
> The reallocation is based on an idea by Thomas Gleixner: The initial
> allocation of struct futex_private_hash sets the reference count
> to one. Every user acquires a reference on the local hash before using
> it and drops it after it has enqueued itself on the hash bucket. No
> reference is held while the task is scheduled out waiting for the
> wake up.
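
For illustration, the per-use reference pattern above could be modeled in
plain userspace C. This is a rough single-threaded sketch, not kernel code:
the real implementation uses rcuref_get()/rcuref_put() on fph->users, and
the struct layout and helper names here are made up.

```c
#include <stdbool.h>

/* Toy stand-in for struct futex_private_hash; the initial allocation
 * holds one reference (users == 1). */
struct fph { int users; };

/* Acquire a reference before using the hash. A count of 0 models the
 * rcuref DEAD state: a resize is in flight and the caller must retry. */
static bool fph_get(struct fph *h)
{
    if (h->users == 0)
        return false;
    h->users++;
    return true;
}

/* Drop the reference after enqueueing on a bucket; no reference is
 * held while the task sleeps waiting for the wake up. */
static void fph_put(struct fph *h)
{
    h->users--;
}
```

A waiter would call fph_get(), enqueue itself on a bucket, fph_put(), and
only then schedule out — so the sleep itself pins nothing.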
> The resize process allocates a new struct futex_private_hash and drops
> the initial reference. Under mm_struct::futex_hash_lock it is checked
> whether the reference counter of the currently used mm_struct::futex_phash
> is marked DEAD. If so, all users enqueued on the current private hash are
> requeued on the new private hash and the new private hash is set as
> mm_struct::futex_phash. Otherwise the newly allocated private hash is
> saved as mm_struct::futex_phash_new and the rehashing and reassigning
> are delayed until a futex_hash() caller observes the reference counter
> marked DEAD.
> The replacement is not performed at rcuref_put() time because certain
> callers, such as futex_wait_queue(), drop their reference after setting
> the task state; acquiring the sleeping futex_hash_lock at that point
> would destroy that state.
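
The deferred pivot could be sketched like this. Again a hypothetical
single-threaded model: mm_model, fph_drop() and try_pivot() are invented
names, and the real code does the waiter requeueing and RCU publication
under mm->futex_hash_lock.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the private hash; fields loosely mirror the patch. */
struct fph { int users; int slots; };

struct mm_model {
    struct fph *phash;       /* currently published private hash */
    struct fph *phash_new;   /* replacement parked until the old one dies */
};

/* Drop one reference; true means the hash is now "DEAD" (no users left). */
static bool fph_drop(struct fph *h)
{
    return --h->users == 0;
}

/* In the real code this runs from the futex_hash() path with
 * mm->futex_hash_lock held: the swap only happens once the old hash
 * has no users left. */
static bool try_pivot(struct mm_model *mm)
{
    if (!mm->phash_new || mm->phash->users != 0)
        return false;            /* users still enqueued: defer */
    mm->phash = mm->phash_new;   /* requeue waiters + publish in reality */
    mm->phash_new = NULL;
    return true;
}
```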
>
> The user can change the number of slots with PR_FUTEX_HASH_SET_SLOTS
> multiple times. Both an increase and a decrease are allowed, and the
> request blocks until the assignment is done.
>
> The private hash allocated at thread creation is changed from 16 to
> 16 <= 4 * number_of_threads <= global_hash_size
> where number_of_threads cannot exceed the number of online CPUs. Should
> the user issue PR_FUTEX_HASH_SET_SLOTS, the auto scaling is disabled.
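
If I read the scaling rule right, the default sizing could be sketched as
below (userspace C). The round-down to a power of two is my assumption,
since futex_hash_allocate() rejects sizes that are not powers of two.

```c
/* Sketch of the auto-scaling rule quoted above: slots = 4 * threads,
 * threads capped at the number of online CPUs, result clamped to
 * [16, global_hash_size], then rounded down to a power of two
 * (the rounding direction is assumed, not taken from the patch). */
static unsigned int auto_hash_slots(unsigned int threads,
                                    unsigned int online_cpus,
                                    unsigned int global_hash_size)
{
    unsigned int n = threads < online_cpus ? threads : online_cpus;
    unsigned int slots = 4u * n;

    if (slots < 16)
        slots = 16;
    if (slots > global_hash_size)
        slots = global_hash_size;

    /* clear low set bits until only the top one remains */
    while (slots & (slots - 1))
        slots &= slots - 1;
    return slots;
}
```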
>
> [peterz: reorganize the code to avoid state tracking and simplify new
> object handling, block the user until changes are in effect, allow
> increase and decrease of the hash].
>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
> ---
> include/linux/futex.h | 3 +-
> include/linux/mm_types.h | 4 +-
> kernel/futex/core.c | 290 ++++++++++++++++++++++++++++++++++++---
> kernel/futex/requeue.c | 5 +
> 4 files changed, 281 insertions(+), 21 deletions(-)
>
[...]
> static int futex_hash_allocate(unsigned int hash_slots, bool custom)
> @@ -1273,16 +1442,23 @@ static int futex_hash_allocate(unsigned int hash_slots, bool custom)
> if (hash_slots && (hash_slots == 1 || !is_power_of_2(hash_slots)))
> return -EINVAL;
>
> - if (mm->futex_phash)
> - return -EALREADY;
> -
> - if (!thread_group_empty(current))
> - return -EINVAL;
> + /*
> + * Once we've disabled the global hash there is no way back.
> + */
> + scoped_guard(rcu) {
> + fph = rcu_dereference(mm->futex_phash);
> + if (fph && !fph->hash_mask) {
> + if (custom)
> + return -EBUSY;
> + return 0;
> + }
> + }
>
> fph = kvzalloc(struct_size(fph, queues, hash_slots), GFP_KERNEL_ACCOUNT | __GFP_NOWARN);
> if (!fph)
> return -ENOMEM;
>
> + rcuref_init(&fph->users, 1);
> fph->hash_mask = hash_slots ? hash_slots - 1 : 0;
> fph->custom = custom;
> fph->mm = mm;
> @@ -1290,26 +1466,102 @@ static int futex_hash_allocate(unsigned int hash_slots, bool custom)
> for (i = 0; i < hash_slots; i++)
> futex_hash_bucket_init(&fph->queues[i], fph);
>
> - mm->futex_phash = fph;
If (hash_slots == 0), do we still need to do all of this work below? I
thought that using the global hash would allow us to skip it.
> + if (custom) {
> + /*
> + * Only let prctl() wait / retry; don't unduly delay clone().
> + */
> +again:
> + wait_var_event(mm, futex_pivot_pending(mm));
> + }
> +
> + scoped_guard(mutex, &mm->futex_hash_lock) {
> + struct futex_private_hash *free __free(kvfree) = NULL;
> + struct futex_private_hash *cur, *new;
> +
> + cur = rcu_dereference_protected(mm->futex_phash,
> + lockdep_is_held(&mm->futex_hash_lock));
> + new = mm->futex_phash_new;
> + mm->futex_phash_new = NULL;
> +
> + if (fph) {
> + if (cur && !new) {
> + /*
> + * If we have an existing hash, but do not yet have
> + * allocated a replacement hash, drop the initial
> + * reference on the existing hash.
> + */
> + futex_private_hash_put(cur);
> + }
> +
> + if (new) {
> + /*
> + * Two updates raced; throw out the lesser one.
> + */
> + if (futex_hash_less(new, fph)) {
> + free = new;
> + new = fph;
> + } else {
> + free = fph;
> + }
> + } else {
> + new = fph;
> + }
> + fph = NULL;
> + }
> +
> + if (new) {
> + /*
> + * Will set mm->futex_phash_new on failure;
> + * futex_private_hash_get() will try again.
> + */
> + if (!__futex_pivot_hash(mm, new) && custom)
> + goto again;
Is it safe to use a goto inside a scoped_guard(){}?