linux-kernel - Re: [PATCH] locking/mutex: Add debug code to help catching violation of mutex lifetime rule

Message-ID: <aHHHZJ5sKGscTCqo@tardis.local>
Date: Fri, 11 Jul 2025 19:24:36 -0700
From: Boqun Feng <boqun.feng@...il.com>
To: Waiman Long <llong@...hat.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...hat.com>, Will Deacon <will@...nel.org>,
	linux-kernel@...r.kernel.org, Jann Horn <jannh@...gle.com>
Subject: Re: [PATCH] locking/mutex: Add debug code to help catching violation
 of mutex lifetime rule

On Fri, Jul 11, 2025 at 09:48:13PM -0400, Waiman Long wrote:
> On 7/11/25 8:42 PM, Waiman Long wrote:
> > 
> > On 7/11/25 7:28 PM, Boqun Feng wrote:
> > > On Fri, Jul 11, 2025 at 03:30:05PM -0700, Linus Torvalds wrote:
> > > > On Fri, 11 Jul 2025 at 15:20, Boqun Feng <boqun.feng@...il.com> wrote:
> > > > > Meta question: are we able to construct a case that shows
> > > > > this can help
> > > > > detect the issue?
> > > > Well, the thing that triggered this was hopefully fixed by
> > > > 8c2e52ebbe88 ("eventpoll: don't decrement ep refcount while still
> > > > holding the ep mutex"), but I think Jann figured that one out by code
> > > > inspection.
> > > > 
> > > > I doubt it can be triggered in real life without something like
> > > > Waiman's patch, but *with* Waiman's patch, and commit 8c2e52ebbe88
> > > > reverted (and obviously with CONFIG_KASAN and CONFIG_DEBUG_MUTEXES
> > > > enabled), doing lots of concurrent epoll closes would hopefully then
> > > > trigger the warning.
> > > > 
> > > > Of course, to then find *other* potential bugs would be the whole
> > > > point, and some of these kinds of bugs are definitely of the kind
> > > > where the race condition doesn't actually trigger in any real load,
> > > > because it's unlikely that real loads end up doing that kind of
> > > > "release all these objects concurrently".
> > > > 
> > > > But it might be interesting to try that "can you even recreate the bug
> > > > fixed by 8c2e52ebbe88" with this. Because if that one *known* bug
> > > > can't be found by this, then it's obviously unlikely to help find
> > > > others.
> > > > 
> > > Yeah, I guess I asked the question because there is no clear link from
> > > the bug scenario to an extra context switch, that is, even if the
> > > context switch didn't happen, the bug would trigger if
> > > __mutex_unlock_slowpath() took too long after giving the ownership to
> > > someone else. So my instinct was: would cond_resched() be slow enough
> > > ;-)
> > > 
> > > > But I agree it's a trivial thing to do, and I think another thing we can
> > > do is adding a kasan_check_byte(lock) at the end of
> > > __mutex_unlock_slowpath(), because conceptually the mutex should be
> > > valid throughout the whole __mutex_unlock_slowpath() function, i.e.
> > > 
> > >     void __mutex_unlock_slowpath(...)
> > >     {
> > >         ...
> > > >         raw_spin_unlock_irqrestore_wake(&lock->wait_lock, flags, &wake_q);
> > >         // <- conceptually "lock" should still be valid here.
> > >         // so if anyone free the memory of the mutex, it's going
> > >         // to be a problem.
> > >         kasan_check_byte(lock);
> > >     }
> > > 
> > > I think this may also give us a good chance of finding more bugs, one of
> > > the reasons is that raw_spin_unlock_irqrestore_wake() has a
> > > preempt_enable() at last, which may trigger a context switch.
> > > 
> > > Regards,
> > > Boqun
> > 
> > I think this is a good idea. We should extend that to add the check in
> > > rwsem as well. Will post a patch to do that.
> 
> Digging into it some more, I think adding kasan_check_byte() may not be
> necessary. If KASAN is enabled, it will instrument the locking code
> including __mutex_unlock_slowpath(). I checked the generated assembly code,
> it has 2 __kasan_check_read() and 4 __kasan_check_write() calls. Adding an

The point is that we want to check the memory at the end of
__mutex_unlock_slowpath(), so it's an extra check.

Also, since KASAN instruments all memory accesses, what you saw may
not be the instrumentation of "lock" but of something else, for example,
wake_q_init() in raw_spin_unlock_irqrestore_wake().

Actually, I have 3 extensions to the idea:

First, it occurs to me that we could just put the kasan_check_byte() in
the outermost function, for example, mutex_unlock().

Second, I wonder whether KASAN has a way to tag a pointer parameter of a
function, for example for mutex_unlock():

	void mutex_unlock(struct mutex * __ref lock)
	{
		...
	}

so that a kasan_check_byte(lock) is generated automatically whenever the
function returns.

I actually tried to use __cleanup to implement __ref, like

	#define __ref __cleanup(kasan_check_byte)

but it seems the "cleanup" attribute doesn't work on function parameters ;(

Third, I went on to implement an always_alive():

	#define always_alive(ptr)                                                              \
	       typeof(ptr) __UNIQUE_ID(always_alive_guard) __cleanup(kasan_check_byte) = ptr;

and you can use it in mutex_unlock():

	void mutex_unlock(struct mutex *lock)
	{
		always_alive(lock);
		...
	}

This also guarantees that we emit a kasan_check_byte() at the very end.
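
For anyone who wants to play with the scope-exit behaviour outside the
kernel, here is a minimal userspace sketch of the same idea. It rests on
several assumptions: check_byte() is a hypothetical stand-in for
kasan_check_byte() that only records what it was asked to check,
struct fake_mutex, fake_mutex_unlock() and demo() are made-up names, and
__LINE__ pasting replaces the kernel's __UNIQUE_ID(). One detail worth
noting: per the GCC documentation, a cleanup handler is passed a pointer to
the guarded variable rather than its value, so this sketch goes through a
small wrapper that dereferences once before checking.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for kasan_check_byte(): records what was checked. */
static const void *last_checked;
static int checks_run;

static void check_byte(const void *addr)
{
	last_checked = addr;
	checks_run++;
}

/*
 * A cleanup handler is passed the address of the guarded variable,
 * so dereference once to recover the pointer stored in it.
 */
static void always_alive_cleanup(const void **guard)
{
	check_byte(*guard);
}

#define GUARD_CAT_(a, b) a##b
#define GUARD_CAT(a, b) GUARD_CAT_(a, b)

/* Userspace approximation; __LINE__ replaces the kernel's __UNIQUE_ID(). */
#define always_alive(ptr)						\
	const void *GUARD_CAT(always_alive_guard_, __LINE__)		\
		__attribute__((cleanup(always_alive_cleanup), unused)) = (ptr)

struct fake_mutex {
	int locked;
};

static int checks_seen_in_body;

static void fake_mutex_unlock(struct fake_mutex *lock)
{
	assert(lock != NULL);
	always_alive(lock);

	lock->locked = 0;
	/* The guard has not fired yet: it runs when this scope ends. */
	checks_seen_in_body = checks_run;
}

/* Returns 1 if the check ran exactly once, after the body, on the mutex. */
static int demo(void)
{
	struct fake_mutex m = { .locked = 1 };
	int before = checks_run;

	fake_mutex_unlock(&m);
	return m.locked == 0 &&
	       checks_seen_in_body == before &&
	       checks_run == before + 1 &&
	       last_checked == (const void *)&m;
}
```

Calling demo() from a main() should return 1 with GCC or Clang: the
check fires once, only after the body of fake_mutex_unlock() has finished,
and it sees the address of the mutex rather than the address of the guard
variable.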

Regards,
Boqun

> extra kasan_check_byte() can be redundant.
> 
> Cheers,
> Longman
>