Message-ID: <20250709180550.147205-1-longman@redhat.com>
Date: Wed,  9 Jul 2025 14:05:50 -0400
From: Waiman Long <longman@...hat.com>
To: Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...hat.com>,
	Will Deacon <will@...nel.org>,
	Boqun Feng <boqun.feng@...il.com>,
	Jonathan Corbet <corbet@....net>
Cc: linux-kernel@...r.kernel.org,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Jann Horn <jannh@...gle.com>,
	Waiman Long <longman@...hat.com>
Subject: [PATCH] locking/mutex: Disable preemption in __mutex_unlock_slowpath()

Jann reported a possible UAF scenario where a task in the mutex_unlock()
path has released the mutex and is about to acquire the wait_lock to
check for waiters. In the interim, another task can come in, acquire
and release the mutex, and then free the memory object holding the
mutex.

Thread A                      Thread B
========                      ========
                              eventpoll_release_file
                                mutex_lock
                                  [success on trylock fastpath]
                                __ep_remove
                                  ep_refcount_dec_and_test
                                    [drop refcount from 2 to 1]

ep_eventpoll_release
  ep_clear_and_put
    mutex_lock
      __mutex_lock_slowpath
        __mutex_lock
          __mutex_lock_common
            __mutex_add_waiter
              [enqueue waiter]
              [set MUTEX_FLAG_WAITERS]

                                mutex_unlock
                                  __mutex_unlock_slowpath
                                    atomic_long_try_cmpxchg_release
                                      [reads MUTEX_FLAG_WAITERS]
                                      [drops lock ownership]

            __mutex_trylock
              [success]
            __mutex_remove_waiter
    ep_refcount_dec_and_test
      [drop refcount from 1 to 0]
    mutex_unlock
    ep_free
      kfree(ep)

                                    raw_spin_lock_irqsave(&lock->wait_lock)
                                      *** UAF WRITE ***

This race condition can happen especially if a preemption occurs right
after releasing the lock but before acquiring the wait_lock. Rwsem's
__up_write() and __up_read() helpers already disable preemption to
minimize this vulnerable time window; do the same in
__mutex_unlock_slowpath() to reduce the chance of this race condition.
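
For reference, a simplified sketch of what the change amounts to (not
the actual kernel code): the scope-based guard(preempt)() used in the
patch below is equivalent to bracketing the rest of the slow path with
an explicit preempt_disable()/preempt_enable() pair, so the window
between dropping lock ownership and taking the wait_lock cannot be
stretched by a preemption:

	preempt_disable();
	owner = atomic_long_read(&lock->owner);
	/* ... drop lock ownership via atomic_long_try_cmpxchg_release() ... */
	raw_spin_lock_irqsave(&lock->wait_lock, flags);
	/* ... wake up the first waiter, if any ... */
	raw_spin_unlock_irqrestore(&lock->wait_lock, flags);
	preempt_enable();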

Also add a note to Documentation/locking/mutex-design.rst suggesting
that callers can use kfree_rcu() to delay the actual freeing of the
memory object and so eliminate this UAF scenario.
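
To make that suggestion concrete, a minimal sketch of the kfree_rcu()
pattern follows; struct foo, its fields and foo_put() are hypothetical
names used only for illustration and are not part of this patch:

	#include <linux/mutex.h>
	#include <linux/rcupdate.h>
	#include <linux/refcount.h>
	#include <linux/slab.h>

	struct foo {
		struct mutex lock;
		struct rcu_head rcu;	/* needed by kfree_rcu() */
		refcount_t refs;
	};

	static void foo_put(struct foo *f)
	{
		if (refcount_dec_and_test(&f->refs)) {
			/*
			 * Delay the actual free until after an RCU grace
			 * period so that another task still inside
			 * mutex_unlock(&f->lock) cannot hit a UAF.
			 */
			kfree_rcu(f, rcu);
		}
	}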

Signed-off-by: Waiman Long <longman@...hat.com>
---
 Documentation/locking/mutex-design.rst | 6 ++++--
 kernel/locking/mutex.c                 | 6 ++++++
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/Documentation/locking/mutex-design.rst b/Documentation/locking/mutex-design.rst
index 7c30b4aa5e28..51a3a28ca830 100644
--- a/Documentation/locking/mutex-design.rst
+++ b/Documentation/locking/mutex-design.rst
@@ -117,8 +117,10 @@ the structure anymore.
 
 The mutex user must ensure that the mutex is not destroyed while a
 release operation is still in progress - in other words, callers of
-mutex_unlock() must ensure that the mutex stays alive until mutex_unlock()
-has returned.
+mutex_unlock() must ensure that the mutex stays alive until
+mutex_unlock() has returned. One possible way to do that is to use
+kfree_rcu() or its variants to delay the actual freeing of the
+memory object containing the mutex.
 
 Interfaces
 ----------
diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index a39ecccbd106..d33f36d305fb 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -912,9 +912,15 @@ static noinline void __sched __mutex_unlock_slowpath(struct mutex *lock, unsigne
 	 * Release the lock before (potentially) taking the spinlock such that
 	 * other contenders can get on with things ASAP.
 	 *
+	 * Preemption is disabled to minimize the time gap between releasing
+	 * the lock and acquiring the wait_lock. Callers should consider using
+	 * kfree_rcu() if the memory object holding the mutex may be freed
+	 * right after another mutex_unlock() call, so that a UAF cannot happen.
+	 *
 	 * Except when HANDOFF, in that case we must not clear the owner field,
 	 * but instead set it to the top waiter.
 	 */
+	guard(preempt)();
 	owner = atomic_long_read(&lock->owner);
 	for (;;) {
 		MUTEX_WARN_ON(__owner_task(owner) != current);
-- 
2.50.0

