Message-ID: <4fcb119f-9a5d-417b-aeaa-977a3a794b4c@redhat.com>
Date: Wed, 9 Jul 2025 14:19:06 -0400
From: Waiman Long <llong@...hat.com>
To: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>,
Will Deacon <will@...nel.org>, Boqun Feng <boqun.feng@...il.com>,
Jonathan Corbet <corbet@....net>
Cc: linux-kernel@...r.kernel.org,
Linus Torvalds <torvalds@...ux-foundation.org>, Jann Horn <jannh@...gle.com>
Subject: Re: [PATCH] locking/mutex: Disable preemption in
__mutex_unlock_slowpath()
On 7/9/25 2:05 PM, Waiman Long wrote:
> Jann reported a possible UAF scenario where a task in the mutex_unlock()
> path had released the mutex and was about to acquire the wait_lock
> to check for waiters. In the interim, another task could come in,
> acquire and release the mutex, and then free the memory object holding
> the mutex.
>
> Thread A Thread B
> ======== ========
> eventpoll_release_file
> mutex_lock
> [success on trylock fastpath]
> __ep_remove
> ep_refcount_dec_and_test
> [drop refcount from 2 to 1]
>
> ep_eventpoll_release
> ep_clear_and_put
> mutex_lock
> __mutex_lock_slowpath
> __mutex_lock
> __mutex_lock_common
> __mutex_add_waiter
> [enqueue waiter]
> [set MUTEX_FLAG_WAITERS]
>
> mutex_unlock
> __mutex_unlock_slowpath
> atomic_long_try_cmpxchg_release
> [reads MUTEX_FLAG_WAITERS]
> [drops lock ownership]
>
> __mutex_trylock
> [success]
> __mutex_remove_waiter
> ep_refcount_dec_and_test
> [drop refcount from 1 to 0]
> mutex_unlock
> ep_free
> kfree(ep)
>
> raw_spin_lock_irqsave(&lock->wait_lock)
> *** UAF WRITE ***
>
> This race condition is possible especially if a preemption happens right
> after releasing the lock but before acquiring the wait_lock. Rwsem's
> __up_write() and __up_read() helpers have already disabled preemption
> to minimize this vulnerable time period, so do the same for
> __mutex_unlock_slowpath() to minimize the chance of this race condition.
>
> Also add a note in Documentation/locking/mutex-design.rst to suggest
> that callers can use kfree_rcu() to delay the actual memory free to
> eliminate this UAF scenario.
>
> Signed-off-by: Waiman Long <longman@...hat.com>
> ---
> Documentation/locking/mutex-design.rst | 6 ++++--
> kernel/locking/mutex.c | 6 ++++++
> 2 files changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/locking/mutex-design.rst b/Documentation/locking/mutex-design.rst
> index 7c30b4aa5e28..51a3a28ca830 100644
> --- a/Documentation/locking/mutex-design.rst
> +++ b/Documentation/locking/mutex-design.rst
> @@ -117,8 +117,10 @@ the structure anymore.
>
> The mutex user must ensure that the mutex is not destroyed while a
> release operation is still in progress - in other words, callers of
> -mutex_unlock() must ensure that the mutex stays alive until mutex_unlock()
> -has returned.
> +mutex_unlock() must ensure that the mutex stays alive until
> +mutex_unlock() has returned. One possible way to do that is to use
> +kfree_rcu() or its variants to delay the actual freeing of the
> +memory object containing the mutex.
>
> Interfaces
> ----------
> diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
> index a39ecccbd106..d33f36d305fb 100644
> --- a/kernel/locking/mutex.c
> +++ b/kernel/locking/mutex.c
> @@ -912,9 +912,15 @@ static noinline void __sched __mutex_unlock_slowpath(struct mutex *lock, unsigne
> * Release the lock before (potentially) taking the spinlock such that
> * other contenders can get on with things ASAP.
> *
> + * Preemption is disabled to minimize the time gap between releasing
> + * the lock and acquiring the wait_lock. Callers should consider
> + * using kfree_rcu() if the memory object holding the mutex may be
> + * freed right after another task's mutex_unlock() call, to ensure
> + * that a UAF cannot happen.
> + *
> * Except when HANDOFF, in that case we must not clear the owner field,
> * but instead set it to the top waiter.
> */
> + guard(preempt)();
> owner = atomic_long_read(&lock->owner);
> for (;;) {
> MUTEX_WARN_ON(__owner_task(owner) != current);
Note that Linus' patch should fix this particular eventpoll UAF race
condition, but there may be other code paths where similar situations
can happen.
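
For callers that want to follow the kfree_rcu() suggestion in the
documentation change, the pattern might look like the sketch below.
This is an illustration only, not compiled kernel code; struct foo,
foo_release(), and the field names are hypothetical.

```c
/* Hypothetical object embedding a mutex. */
struct foo {
	struct mutex lock;
	struct rcu_head rcu;	/* needed by kfree_rcu() */
	/* ... payload ... */
};

static void foo_release(struct foo *f)
{
	/*
	 * Instead of kfree(f), defer the free past an RCU grace period
	 * so that a concurrent mutex_unlock() on f->lock, which may
	 * still touch f->lock.wait_lock after dropping ownership,
	 * cannot write to freed memory.
	 */
	kfree_rcu(f, rcu);
}
```

With this pattern, the unlocking task's late access to the wait_lock
falls within the grace period, so the memory is still valid even if
another task drops the last reference in the meantime.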
Cheers,
Longman