Message-ID: <CAHk-=wimw8A1ReDPMyAVPrB3rEzenkk-u21RN123BGmnGBwjiQ@mail.gmail.com>
Date: Wed, 9 Jul 2025 11:19:22 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Waiman Long <longman@...hat.com>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>, 
	Will Deacon <will@...nel.org>, Boqun Feng <boqun.feng@...il.com>, Jonathan Corbet <corbet@....net>, 
	linux-kernel@...r.kernel.org, Jann Horn <jannh@...gle.com>
Subject: Re: [PATCH] locking/mutex: Disable preemption in __mutex_unlock_slowpath()

On Wed, 9 Jul 2025 at 11:06, Waiman Long <longman@...hat.com> wrote:
>
> This race condition is possible especially if a preemption happens right
> after releasing the lock but before acquiring the wait_lock. Rwsem's
> __up_write() and __up_read() helpers have already disabled
> preemption to minimize this vulnerable time period, do the same for
> __mutex_unlock_slowpath() to minimize the chance of this race condition.

I think this patch is actively detrimental, in that it only helps hide
the bug. The bug still exists, it's just harder to hit.

Maybe that is worth it as a "hardening" thing, but I feel this makes
people believe even *more* that they can use mutexes for object
lifetimes.

And that's a fundamentally buggy assumption. Locking is about mutual
exclusion, not lifetimes. There are very specific things where
"release the lock" also releases an object, but they should be
considered very very special.

All objects that aren't purely thread-local should have lifetimes that
depend *solely* on refcounts. Anything else is typically a serious
bug, or needs to be actively discouraged, not encouraged like this.
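
A minimal userspace sketch of that rule, assuming C11 atomics and pthreads rather than the kernel's mutex and refcount APIs. The names (struct obj, obj_get, obj_put, obj_add, obj_frees) are hypothetical and appear neither in the patch nor in the kernel. The mutex inside the object only serializes access to 'value'. The memory is released by whoever drops the last reference, so an unlocker that is still inside pthread_mutex_unlock() keeps the object alive through its own reference.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical object: 'lock' guards 'value', 'refcount' guards the memory. */
struct obj {
	atomic_int refcount;
	pthread_mutex_t lock;
	int value;
};

/* Counts frees so the behavior can be observed; illustration only. */
static atomic_int obj_frees;

/* Allocate an object; the caller owns the initial reference. */
struct obj *obj_new(int value)
{
	struct obj *o = malloc(sizeof(*o));

	if (!o)
		return NULL;
	atomic_init(&o->refcount, 1);
	pthread_mutex_init(&o->lock, NULL);
	o->value = value;
	return o;
}

/* Take an extra reference; the caller must already hold one. */
struct obj *obj_get(struct obj *o)
{
	atomic_fetch_add(&o->refcount, 1);
	return o;
}

/* Drop a reference; only the final put tears the object down. */
void obj_put(struct obj *o)
{
	int old = atomic_fetch_sub(&o->refcount, 1);

	assert(old > 0);
	if (old == 1) {
		pthread_mutex_destroy(&o->lock);
		free(o);
		atomic_fetch_add(&obj_frees, 1);
	}
}

/* Mutual exclusion only: the caller's reference keeps 'o' valid until
 * after pthread_mutex_unlock() has returned. */
int obj_add(struct obj *o, int delta)
{
	int v;

	pthread_mutex_lock(&o->lock);
	o->value += delta;
	v = o->value;
	pthread_mutex_unlock(&o->lock);
	return v;
}
```

In this sketch nothing ever frees the object on the strength of "I hold the lock" or "I just took and released the lock". A thread that wants the object gone simply calls obj_put() on its own reference.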

Even with things like RCU, the lifetime of the object should be about
refcounts, and the RCU thing should be purely about "I'm doing
asynchronous lookups that aren't protected by any locks outside this
object, so I may see objects that are being torn down".

I absolutely detest the notion of "let's make locking be tied to
object lifetimes".

Note that locks *outside* the object are obviously very very normal,
but then the *lock* has a totally different lifetime entirely, and the
lifetime of the lock has nothing to do with the lifetime of the
object.
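
The "lock outside the object" case can be sketched the same way, again as a userspace illustration with hypothetical names (struct entry, table_lock, table_insert, table_lookup, table_remove) and not as kernel code. The lock belongs to the table and has static storage, so it outlives every entry. Unlocking it never touches an entry's memory, and each entry's lifetime is still decided by its refcount alone.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

#define TABLE_SLOTS 8

/* Hypothetical refcounted entry; it contains no lock of its own. */
struct entry {
	atomic_int refcount;
	int key;
};

/* The lock lives with the table, not with any entry. */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static struct entry *table[TABLE_SLOTS];

/* Drop a reference; the final put frees the entry. */
void entry_put(struct entry *e)
{
	int old = atomic_fetch_sub(&e->refcount, 1);

	assert(old > 0);
	if (old == 1)
		free(e);
}

/* Insert a new entry; the table holds the initial reference. */
int table_insert(int key)
{
	struct entry *e = malloc(sizeof(*e));
	int ret = -1;

	if (!e)
		return -1;
	atomic_init(&e->refcount, 1);
	e->key = key;

	pthread_mutex_lock(&table_lock);
	for (size_t i = 0; i < TABLE_SLOTS; i++) {
		if (!table[i]) {
			table[i] = e;
			ret = 0;
			break;
		}
	}
	pthread_mutex_unlock(&table_lock);

	if (ret)
		free(e);	/* table full; the entry was never published */
	return ret;
}

/* Look up an entry and take a reference while the table lock pins it. */
struct entry *table_lookup(int key)
{
	struct entry *found = NULL;

	pthread_mutex_lock(&table_lock);
	for (size_t i = 0; i < TABLE_SLOTS; i++) {
		if (table[i] && table[i]->key == key) {
			found = table[i];
			atomic_fetch_add(&found->refcount, 1);
			break;
		}
	}
	pthread_mutex_unlock(&table_lock);
	return found;
}

/* Unlink under the lock, then drop the table's reference after unlocking. */
int table_remove(int key)
{
	struct entry *victim = NULL;

	pthread_mutex_lock(&table_lock);
	for (size_t i = 0; i < TABLE_SLOTS; i++) {
		if (table[i] && table[i]->key == key) {
			victim = table[i];
			table[i] = NULL;
			break;
		}
	}
	pthread_mutex_unlock(&table_lock);

	if (!victim)
		return -1;
	entry_put(victim);
	return 0;
}
```

A caller that obtained an entry from table_lookup() can keep using it after table_remove() has run, because the caller's own reference, not the table lock, is what keeps the memory valid.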

Please don't confuse the two. This was eventpoll being completely
broken. As usual. We've had less eventpoll breakage lately than we
historically used to have, but that's hopefully because that horrid
pile is not actively developed any more, and is slowly - oh so very
slowly - getting fixed.

I'm very sad that io_uring ended up with eventpoll support, but that
was sold on the premise that it makes it easier to incrementally turn
some eventpoll user into an io_uring user. I certainly hope it leads
to less epoll use in the long run, rather than more.

             Linus