Message-ID: <20250326153703.n_PgVpun@linutronix.de>
Date: Wed, 26 Mar 2025 16:37:03 +0100
From: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To: linux-kernel@...r.kernel.org
Cc: André Almeida <andrealmeid@...lia.com>,
Darren Hart <dvhart@...radead.org>,
Davidlohr Bueso <dave@...olabs.net>, Ingo Molnar <mingo@...hat.com>,
Juri Lelli <juri.lelli@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>,
Valentin Schneider <vschneid@...hat.com>,
Waiman Long <longman@...hat.com>
Subject: Re: [PATCH v10 18/21] futex: Rework SET_SLOTS
On 2025-03-12 16:16:31 [+0100], To linux-kernel@...r.kernel.org wrote:
I am folding and testing and
…
> +static bool futex_pivot_pending(struct mm_struct *mm)
> +{
> + struct futex_private_hash *fph;
> +
> + guard(rcu)();
> +
> + if (!mm->futex_phash_new)
> + return false;
> +
> + fph = rcu_dereference(mm->futex_phash);
> + return !rcuref_read(&fph->users);
> +}
…
> +static int futex_hash_allocate(unsigned int hash_slots, bool custom)
…
> /*
> - * Will set mm->futex_phash_new on failure;
> - * futex_get_private_hash() will try again.
> + * Only let prctl() wait / retry; don't unduly delay clone().
> */
> - __futex_pivot_hash(mm, fph);
> +again:
> + wait_var_event(mm, futex_pivot_pending(mm));
This wait condition should be !futex_pivot_pending(mm). Otherwise it
blocks, because wait_var_event() only returns once its condition is
true. We want to wait until the current futex_phash_new assignment is
gone and the ::users counter is >0.
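To illustrate the polarity issue, here is a minimal userspace sketch. It
is not kernel code: struct phash_model, model_pivot_pending() and
model_demo() are hypothetical stand-ins that only mirror the logic of
the quoted futex_pivot_pending(), assuming wait_var_event() lets the
waiter continue once its condition evaluates to true.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of the two values the predicate reads. */
struct phash_model {
	bool new_pending;	/* models mm->futex_phash_new != NULL */
	unsigned int users;	/* models rcuref_read(&fph->users) */
};

/* Mirrors the quoted futex_pivot_pending(): true while a pivot is stalled. */
static bool model_pivot_pending(const struct phash_model *m)
{
	if (!m->new_pending)
		return false;
	return m->users == 0;
}

/* Walks through the two interesting states; a waiter proceeds on "true". */
static void model_demo(void)
{
	struct phash_model idle = { .new_pending = false, .users = 1 };
	struct phash_model stalled = { .new_pending = true, .users = 0 };

	/* As posted: condition is false when nothing is pending, so it blocks. */
	assert(model_pivot_pending(&idle) == false);
	/* Negated: nothing pending means the waiter may continue. */
	assert(!model_pivot_pending(&idle) == true);
	/* Negated: a stalled pivot keeps the waiter asleep. */
	assert(!model_pivot_pending(&stalled) == false);
}
```

Calling model_demo() from a small main() is enough to run the checks.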
This brings me to the wake condition of which we have two:
> @@ -207,6 +203,7 @@ static bool __futex_pivot_hash(struct mm_struct *mm,
> }
> rcu_assign_pointer(mm->futex_phash, new);
> kvfree_rcu(fph, rcu);
> + wake_up_var(mm);
> return true;
> }
>
> @@ -262,7 +259,8 @@ void futex_private_hash_put(struct futex_private_hash *fph)
> * Ignore the result; the DEAD state is picked up
> * when rcuref_get() starts failing via rcuref_is_dead().
> */
> - bool __maybe_unused ignore = rcuref_put(&fph->users);
> + if (rcuref_put(&fph->users))
> + wake_up_var(fph->mm);
> }
The one in __futex_pivot_hash() makes sense because ::futex_phash_new is
NULL and the users counter is set to one.
The wake in futex_private_hash_put() doesn't make sense. At this point
we have ::futex_phash_new set and rcuref_read() returns 0, so the
waiter just schedules again after the wake.
Therefore we could remove the wake from futex_private_hash_put().
However, if there is no further futex operation (unlikely) then we are
stuck in wait_var_event() forever. Therefore I would suggest:
diff --git a/kernel/futex/core.c b/kernel/futex/core.c
index 65523f3cfe32e..64c7be8df955c 100644
--- a/kernel/futex/core.c
+++ b/kernel/futex/core.c
@@ -210,7 +210,6 @@ static bool __futex_pivot_hash(struct mm_struct *mm,
}
rcu_assign_pointer(mm->futex_phash, new);
kvfree_rcu(fph, rcu);
- wake_up_var(mm);
return true;
}
@@ -1522,10 +1521,10 @@ static bool futex_pivot_pending(struct mm_struct *mm)
guard(rcu)();
if (!mm->futex_phash_new)
- return false;
+ return true;
fph = rcu_dereference(mm->futex_phash);
- return !rcuref_read(&fph->users);
+ return rcuref_is_dead(&fph->users);
}
static bool futex_hash_less(struct futex_private_hash *a,
-> Attempt to replace if there is no replacement pending
   (futex_phash_new == NULL).
-> If there is a replacement pending (futex_phash_new != NULL) then wait
   until the current private hash is DEAD. This happens once the last
   user is gone, and that final put gives the wakeup.
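The two rules above can be sketched as a userspace truth table. This is
only an illustration under assumptions: struct pivot_model and
model_wait_done() are hypothetical names, and users_dead merely stands
in for what rcuref_is_dead(&fph->users) would report.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; field names are illustrative only. */
struct pivot_model {
	bool new_pending;	/* models mm->futex_phash_new != NULL */
	bool users_dead;	/* models rcuref_is_dead(&fph->users) */
};

/* Mirrors the predicate after the suggested change: true ends the wait. */
static bool model_wait_done(const struct pivot_model *m)
{
	if (!m->new_pending)
		return true;	/* nothing queued: attempt the replacement */
	return m->users_dead;	/* queued: wait for the last user to drop */
}

/* Checks the three reachable combinations of the model. */
static void model_wait_demo(void)
{
	struct pivot_model none = { .new_pending = false, .users_dead = false };
	struct pivot_model busy = { .new_pending = true, .users_dead = false };
	struct pivot_model dead = { .new_pending = true, .users_dead = true };

	assert(model_wait_done(&none));		/* proceed immediately */
	assert(!model_wait_done(&busy));	/* keep waiting */
	assert(model_wait_done(&dead));		/* last put wakes the waiter */
}
```

In this model the wake from futex_private_hash_put() is what moves the
"busy" state to "dead", which is why that wake stays in place.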
Sebastian