Message-ID: <4fcb119f-9a5d-417b-aeaa-977a3a794b4c@redhat.com>
Date: Wed, 9 Jul 2025 14:19:06 -0400
From: Waiman Long <llong@...hat.com>
To: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>,
 Will Deacon <will@...nel.org>, Boqun Feng <boqun.feng@...il.com>,
 Jonathan Corbet <corbet@....net>
Cc: linux-kernel@...r.kernel.org,
 Linus Torvalds <torvalds@...ux-foundation.org>, Jann Horn <jannh@...gle.com>
Subject: Re: [PATCH] locking/mutex: Disable preemption in
 __mutex_unlock_slowpath()

On 7/9/25 2:05 PM, Waiman Long wrote:
> Jann reported a possible UAF scenario where a task in the mutex_unlock()
> path had released the mutex and was about to acquire the wait_lock
> to check for waiters. In the interim, another task could come in,
> acquire and release the mutex, and then free the memory object holding
> the mutex.
>
> Thread A                      Thread B
> ========                      ========
>                                eventpoll_release_file
>                                  mutex_lock
>                                    [success on trylock fastpath]
>                                  __ep_remove
>                                    ep_refcount_dec_and_test
>                                      [drop refcount from 2 to 1]
>
> ep_eventpoll_release
>    ep_clear_and_put
>      mutex_lock
>        __mutex_lock_slowpath
>          __mutex_lock
>            __mutex_lock_common
>              __mutex_add_waiter
>                [enqueue waiter]
>                [set MUTEX_FLAG_WAITERS]
>
>                                  mutex_unlock
>                                    __mutex_unlock_slowpath
>                                      atomic_long_try_cmpxchg_release
>                                        [reads MUTEX_FLAG_WAITERS]
>                                        [drops lock ownership]
>
>              __mutex_trylock
>                [success]
>              __mutex_remove_waiter
>      ep_refcount_dec_and_test
>        [drop refcount from 1 to 0]
>      mutex_unlock
>      ep_free
>        kfree(ep)
>
>                                      raw_spin_lock_irqsave(&lock->wait_lock)
>                                        *** UAF WRITE ***
>
> This race condition is especially likely if a preemption happens right
> after releasing the lock but before acquiring the wait_lock. Rwsem's
> __up_write() and __up_read() helpers have already disabled preemption
> to minimize this vulnerable time period; do the same for
> __mutex_unlock_slowpath() to minimize the chance of this race condition.
>
> Also add a note in Documentation/locking/mutex-design.rst to suggest
> that callers can use kfree_rcu() to delay the actual freeing of the
> memory and so eliminate this UAF scenario.
>
> Signed-off-by: Waiman Long <longman@...hat.com>
> ---
>   Documentation/locking/mutex-design.rst | 6 ++++--
>   kernel/locking/mutex.c                 | 6 ++++++
>   2 files changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/locking/mutex-design.rst b/Documentation/locking/mutex-design.rst
> index 7c30b4aa5e28..51a3a28ca830 100644
> --- a/Documentation/locking/mutex-design.rst
> +++ b/Documentation/locking/mutex-design.rst
> @@ -117,8 +117,10 @@ the structure anymore.
>   
>   The mutex user must ensure that the mutex is not destroyed while a
>   release operation is still in progress - in other words, callers of
> -mutex_unlock() must ensure that the mutex stays alive until mutex_unlock()
> -has returned.
> +mutex_unlock() must ensure that the mutex stays alive until
> +mutex_unlock() has returned. One possible way to do that is to use
> +kfree_rcu() or its variants to delay the actual freeing of the memory
> +object containing the mutex.
>   
>   Interfaces
>   ----------
> diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
> index a39ecccbd106..d33f36d305fb 100644
> --- a/kernel/locking/mutex.c
> +++ b/kernel/locking/mutex.c
> @@ -912,9 +912,15 @@ static noinline void __sched __mutex_unlock_slowpath(struct mutex *lock, unsigne
>   	 * Release the lock before (potentially) taking the spinlock such that
>   	 * other contenders can get on with things ASAP.
>   	 *
> +	 * Preemption is disabled to minimize the time gap between releasing
> +	 * the lock and acquiring the wait_lock. Callers may consider using
> +	 * kfree_rcu() if the memory holding the mutex may be freed after
> +	 * another mutex_unlock() call to ensure that UAF will not happen.
> +	 *
>   	 * Except when HANDOFF, in that case we must not clear the owner field,
>   	 * but instead set it to the top waiter.
>   	 */
> +	guard(preempt)();
>   	owner = atomic_long_read(&lock->owner);
>   	for (;;) {
>   		MUTEX_WARN_ON(__owner_task(owner) != current);
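
To make the shape of this race easier to recognize elsewhere, here is a
minimal, hypothetical sketch (not taken from eventpoll; struct foo and
foo_put() are made-up names) of the pattern the commit message describes:
a refcounted object embeds the mutex, and whichever task drops the last
reference frees the object right after its own mutex_unlock(), while
another task may still be inside __mutex_unlock_slowpath() on that mutex.

	struct foo {
		struct mutex	lock;
		refcount_t	refcnt;
		/* ... payload protected by @lock ... */
	};

	static void foo_put(struct foo *f)
	{
		bool last;

		mutex_lock(&f->lock);
		last = refcount_dec_and_test(&f->refcnt);
		mutex_unlock(&f->lock);

		/*
		 * If another task is still between dropping lock ownership
		 * and taking f->lock.wait_lock in __mutex_unlock_slowpath(),
		 * this kfree() frees the memory it is about to write to.
		 */
		if (last)
			kfree(f);
	}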

Note that Linus' patch should fix this particular eventpoll UAF race 
condition, but there may be other code paths where similar situations 
can happen.
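
For such code paths, one way to apply the suggestion in the documentation
hunk above (again only a sketch, reusing the hypothetical struct foo from
the earlier example) is to add an rcu_head and hand the final free to
kfree_rcu(), so the actual kfree() is deferred until after an RCU grace
period, per the note added to mutex-design.rst:

	struct foo {
		struct mutex	lock;
		refcount_t	refcnt;
		struct rcu_head	rcu;
		/* ... payload protected by @lock ... */
	};

	static void foo_put(struct foo *f)
	{
		bool last;

		mutex_lock(&f->lock);
		last = refcount_dec_and_test(&f->refcnt);
		mutex_unlock(&f->lock);

		if (last)
			kfree_rcu(f, rcu);	/* freed only after an RCU grace period */
	}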

Cheers,
Longman

