Message-ID: <CAHk-=wj4gifTA94-11JXKj5Q5TSieu2LXgOauNDC9gbOQRcZeg@mail.gmail.com>
Date: Fri, 11 Jul 2025 15:30:05 -0700
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Boqun Feng <boqun.feng@...il.com>
Cc: Waiman Long <longman@...hat.com>, Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>, Will Deacon <will@...nel.org>, linux-kernel@...r.kernel.org,
Jann Horn <jannh@...gle.com>
Subject: Re: [PATCH] locking/mutex: Add debug code to help catching violation of mutex lifetime rule
On Fri, 11 Jul 2025 at 15:20, Boqun Feng <boqun.feng@...il.com> wrote:
>
> Meta question: are we able to construct a case that shows this can help
> detect the issue?
Well, the thing that triggered this was hopefully fixed by
8c2e52ebbe88 ("eventpoll: don't decrement ep refcount while still
holding the ep mutex"), but I think Jann figured that one out by code
inspection.
I doubt it can be triggered in real life without something like
Waiman's patch, but *with* Waiman's patch, and commit 8c2e52ebbe88
reverted (and obviously with CONFIG_KASAN and CONFIG_DEBUG_MUTEXES
enabled), doing lots of concurrent epoll closes would hopefully then
trigger the warning.
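
For illustration only, here is a minimal userspace sketch of what such a stress loop might look like. It is not from the thread and is untested against the actual bug. It assumes that "concurrent epoll closes" means racing close() on the epoll fd against close() on a file that epoll instance is watching, which is a guess based on the title of 8c2e52ebbe88. The names stress_epoll_close(), struct round, close_epoll() and close_watched() are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* One round: an epoll instance watching a single eventfd. */
struct round {
	int epfd;			/* the epoll file descriptor */
	int evfd;			/* the watched eventfd */
	pthread_barrier_t start;	/* releases both closers together */
};

/* Thread A: close the epoll fd itself. */
static void *close_epoll(void *arg)
{
	struct round *r = arg;

	pthread_barrier_wait(&r->start);
	close(r->epfd);
	return NULL;
}

/* Thread B: close the file that the epoll instance is watching. */
static void *close_watched(void *arg)
{
	struct round *r = arg;

	pthread_barrier_wait(&r->start);
	close(r->evfd);
	return NULL;
}

/*
 * Run up to 'iterations' rounds; return how many completed.
 * Each round registers an eventfd with a fresh epoll instance and then
 * closes both descriptors from two threads at (nearly) the same time.
 */
int stress_epoll_close(int iterations)
{
	int done;

	for (done = 0; done < iterations; done++) {
		struct round r;
		struct epoll_event ev = { .events = EPOLLIN };
		pthread_t a, b;
		int rc;

		r.epfd = epoll_create1(0);
		r.evfd = eventfd(0, 0);
		if (r.epfd < 0 || r.evfd < 0 ||
		    epoll_ctl(r.epfd, EPOLL_CTL_ADD, r.evfd, &ev) < 0) {
			/* Setup failed: clean up and stop early. */
			if (r.epfd >= 0)
				close(r.epfd);
			if (r.evfd >= 0)
				close(r.evfd);
			break;
		}

		rc = pthread_barrier_init(&r.start, NULL, 2);
		assert(rc == 0);
		rc = pthread_create(&a, NULL, close_epoll, &r);
		assert(rc == 0);
		rc = pthread_create(&b, NULL, close_watched, &r);
		assert(rc == 0);

		pthread_join(a, NULL);
		pthread_join(b, NULL);
		pthread_barrier_destroy(&r.start);
	}
	return done;
}
```

Calling stress_epoll_close() with a large iteration count from main(), ideally from several processes at once, would be the obvious way to use it. Any resulting warning would show up in the kernel log, not in the program's output, and only on a kernel built as described above.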
Of course, to then find *other* potential bugs would be the whole
point, and some of these bugs are definitely of the kind where the
race condition doesn't actually trigger under any real load, because
real loads are unlikely to end up doing that kind of
"release all these objects concurrently".
But it might be interesting to try that "can you even recreate the bug
fixed by 8c2e52ebbe88" with this. Because if that one *known* bug
can't be found by this, then it's obviously unlikely to help find
others.
That said, it does seem like an obvious trivial thing to stress, which
is why that patch by Waiman has my suggested-by...
Linus