Message-ID: <87msh2b891.ffs@tglx>
Date: Wed, 11 Dec 2024 15:32:26 +0100
From: Thomas Gleixner <tglx@...utronix.de>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
linux-kernel@...r.kernel.org
Cc: André Almeida <andrealmeid@...lia.com>, Darren Hart
<dvhart@...radead.org>, Davidlohr Bueso <dave@...olabs.net>, Ingo Molnar
<mingo@...hat.com>, Juri Lelli <juri.lelli@...hat.com>, Peter Zijlstra
<peterz@...radead.org>, Valentin Schneider <vschneid@...hat.com>, Waiman
Long <longman@...hat.com>, Sebastian Andrzej Siewior
<bigeasy@...utronix.de>
Subject: Re: [PATCH v4 06/11] futex: Allow to re-allocate the private hash
bucket.
On Tue, Dec 10 2024 at 23:27, Thomas Gleixner wrote:
> Why does unqueue() work w/o a hash bucket reference?
>
> unqueue(q)
> {

This actually needs a

        guard(rcu);

to protect against a concurrent rehashing.

> retry:
>         lock_ptr = READ_ONCE(q->lock_ptr);
>         // Wake up ?
>         if (!lock_ptr)
>                 return 0;
>
>         spin_lock(lock_ptr);
>
>         // This covers both requeue and rehash operations
>         if (lock_ptr != q->lock_ptr) {
>                 spin_unlock(lock_ptr);
>                 goto retry;
>         }
>
>         __unqueue(q);
>         spin_unlock(lock_ptr);
> }
>
> Nothing in unqueue() requires a reference on the hash. The lock pointer
> logic covers both requeue and rehash operations. They are equivalent,
> no?
>
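The lock pointer revalidation quoted above can be tried out in userspace. What follows is a minimal sketch, not kernel code: struct bucket, struct waiter, bucket_init(), enqueue() and move_waiter() are hypothetical stand-ins, pthread mutexes replace the hash bucket spinlocks, and buckets are assumed to never be freed, which is the guarantee guard(rcu) would have to provide in the kernel.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a futex hash bucket. */
struct bucket {
	pthread_mutex_t lock;
	int nqueued;
};

/* Hypothetical stand-in for struct futex_q. */
struct waiter {
	_Atomic(pthread_mutex_t *) lock_ptr;	/* NULL once woken or unqueued */
	struct bucket *hb;			/* stable only while *lock_ptr is held */
	int queued;
};

static void bucket_init(struct bucket *hb)
{
	pthread_mutex_init(&hb->lock, NULL);
	hb->nqueued = 0;
}

static void enqueue(struct waiter *w, struct bucket *hb)
{
	pthread_mutex_lock(&hb->lock);
	w->hb = hb;
	w->queued = 1;
	hb->nqueued++;
	atomic_store(&w->lock_ptr, &hb->lock);
	pthread_mutex_unlock(&hb->lock);
}

/* Move a queued waiter to another bucket, as requeue or rehash would. */
static void move_waiter(struct waiter *w, struct bucket *from, struct bucket *to)
{
	/* Take both locks in address order to avoid ABBA deadlocks. */
	struct bucket *first = (uintptr_t)from < (uintptr_t)to ? from : to;
	struct bucket *second = first == from ? to : from;

	assert(from != to);
	pthread_mutex_lock(&first->lock);
	pthread_mutex_lock(&second->lock);
	from->nqueued--;
	to->nqueued++;
	w->hb = to;
	atomic_store(&w->lock_ptr, &to->lock);
	pthread_mutex_unlock(&second->lock);
	pthread_mutex_unlock(&first->lock);
}

/* Returns 1 if the waiter was removed here, 0 if it was already gone. */
static int unqueue(struct waiter *w)
{
	for (;;) {
		pthread_mutex_t *lock_ptr = atomic_load(&w->lock_ptr);

		if (!lock_ptr)
			return 0;

		pthread_mutex_lock(lock_ptr);
		/* The waiter may have moved between the load and the lock. */
		if (lock_ptr != atomic_load(&w->lock_ptr)) {
			pthread_mutex_unlock(lock_ptr);
			continue;
		}

		w->hb->nqueued--;
		w->queued = 0;
		atomic_store(&w->lock_ptr, NULL);
		pthread_mutex_unlock(lock_ptr);
		return 1;
	}
}
```

In this sketch unqueue() never looks at which table the bucket belongs to. It only trusts w->hb after the lock pointer comparison has succeeded under the lock, so a move caused by requeue and a move caused by rehash look the same to it.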
> wake() is not really different. It needs to change how the private
> retry works:
>
> wake_op()
> {
> retry:
>         get_key(key1);
>         get_key(key2);
>
> retry_private:
>         double_get_and_lock(&hb1, &hb2, &key1, &key2);
>         .....
>         double_unlock_and_put(&hb1, &hb2);
>         .....
> }
>
> Moving the retry_private label before the point where the hash bucket
> is retrieved and locked is required in some other places too. And some
> places use q.lock_ptr under the assumption that it can't change, which
> probably needs reevaluation of the hash bucket. Other stuff like
> lock_pi() needs a separation of unlocking the hash bucket and dropping
> the reference.
>
> But those are all minor changes.
>
> All of them can be done on a per function basis before adding the actual
> private hash muck, which makes the whole thing reviewable. This patch
> definitely does not qualify as reviewable.
>
> All you need are implementations for hb_get_and_lock/unlock_and_put()
> plus the double variants and a hash_put() helper. Those implementations
> use the global hash until all places are mopped up and then you can add
> the private magic in exactly those places.
>
> There is not a single place where you need magic state fixups in the
> middle of the functions or conditional locking, which turns out to be
> not sufficient.
>
> The required helpers are:
>
> hb_get_and_lock(key)
> {
>         if (private(key))
>                 hb = private_hash(key); // Gets a reference
>         else
>                 hb = hash_bucket(global_hash, key);
>         hb_lock(hb);
>         return hb;
> }
>
> hb_unlock_and_put(hb)
> {
>         hb_unlock(hb);
>         if (private(hb))
>                 hb_private_put(hb);
> }
>
> The double lock/unlock variants are equivalent.
>
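The double variants mentioned above can be sketched in userspace as well. This is only an illustration under assumptions: it uses the global table alone, so the reference part of get/put is a no-op, keys are plain integers passed by value, pthread mutexes replace the bucket spinlocks, and struct bucket with its held flag is a hypothetical stand-in that exists purely to make the locking observable.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define NBUCKETS 4u

/* Hypothetical stand-in for a futex hash bucket. */
struct bucket {
	pthread_mutex_t lock;
	int held;	/* set while the bucket lock is held */
};

static struct bucket global_hash[NBUCKETS] = {
	{ PTHREAD_MUTEX_INITIALIZER, 0 },
	{ PTHREAD_MUTEX_INITIALIZER, 0 },
	{ PTHREAD_MUTEX_INITIALIZER, 0 },
	{ PTHREAD_MUTEX_INITIALIZER, 0 },
};

static struct bucket *hash_bucket(unsigned int key)
{
	return &global_hash[key % NBUCKETS];
}

/* Look up both buckets and lock them; the same bucket is locked only once. */
static void double_get_and_lock(struct bucket **hb1, struct bucket **hb2,
				unsigned int key1, unsigned int key2)
{
	struct bucket *a = hash_bucket(key1);
	struct bucket *b = hash_bucket(key2);

	if (a == b) {
		pthread_mutex_lock(&a->lock);
	} else if ((uintptr_t)a < (uintptr_t)b) {
		/* Address order gives every caller the same lock order. */
		pthread_mutex_lock(&a->lock);
		pthread_mutex_lock(&b->lock);
	} else {
		pthread_mutex_lock(&b->lock);
		pthread_mutex_lock(&a->lock);
	}

	a->held = 1;
	b->held = 1;
	*hb1 = a;
	*hb2 = b;
}

/* Unlock both buckets; a private hash would drop its references here. */
static void double_unlock_and_put(struct bucket *hb1, struct bucket *hb2)
{
	hb1->held = 0;
	hb2->held = 0;
	pthread_mutex_unlock(&hb1->lock);
	if (hb1 != hb2)
		pthread_mutex_unlock(&hb2->lock);
}
```

Because lookup and locking live in one helper, a retry label placed in front of double_get_and_lock() picks up a fresh bucket on every pass, which is what moving the private retry is about.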
> private_hash(key)
> {
>         scoped_guard(rcu) {
>                 hash = rcu_deref(current->mm->futex.hash);

This actually requires:

                if (!hash)
                        return global_hash;

otherwise this results in a NULL pointer dereference, aka. unprivileged
DoS when a single threaded process invokes sys_futex(...) directly.

That raises the question whether current->mm->futex.hash should be
initialized with &global_hash in the first place, and whether
&global_hash should have a reference count too, one which can never go
to zero. That would simplify the whole logic there.
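
A minimal userspace sketch of that idea, under stated assumptions: struct futex_hash, struct mm, mm_init(), hash_get() and hash_put() here are hypothetical stand-ins, C11 atomics replace the kernel reference counting, and the RCU protected lookup plus the race against a concurrent replacement of the private hash are left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a futex hash table with a reference count. */
struct futex_hash {
	atomic_int refs;
};

/* Stand-in for the mm; hash plays the role of current->mm->futex.hash. */
struct mm {
	struct futex_hash *hash;
};

/* The global hash holds one base reference which is never dropped. */
static struct futex_hash global_hash = { 1 };

/* Every mm starts out pointing at the global hash, so hash is never NULL. */
static void mm_init(struct mm *mm)
{
	mm->hash = &global_hash;
}

/* Same path for global and private: no NULL check, no private() test. */
static struct futex_hash *hash_get(struct mm *mm)
{
	/* The kernel would dereference this pointer under RCU. */
	struct futex_hash *hash = mm->hash;

	atomic_fetch_add(&hash->refs, 1);
	return hash;
}

/* Returns true when the last reference is gone and the hash can be freed. */
static bool hash_put(struct futex_hash *hash)
{
	int old = atomic_fetch_sub(&hash->refs, 1);

	assert(old > 0);
	return old == 1;
}
```

With the base reference in place hash_put() can never report the global hash as freeable, while a private hash reaches zero once its owner drops the initial reference.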
Thanks,
tglx