Message-ID: <aHHipJuS1yyhhkbT@tardis-2.local>
Date: Fri, 11 Jul 2025 21:20:52 -0700
From: Boqun Feng <boqun.feng@...il.com>
To: Waiman Long <llong@...hat.com>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...hat.com>, Will Deacon <will@...nel.org>,
	linux-kernel@...r.kernel.org, Jann Horn <jannh@...gle.com>
Subject: Re: [PATCH] locking/mutex: Add debug code to help catching violation
 of mutex lifetime rule

On Fri, Jul 11, 2025 at 11:16:43PM -0400, Waiman Long wrote:
> > > > > I think this may also give us a good chance of finding more bugs, one of
> > > > > the reasons is that raw_spin_unlock_irqrestore_wake() has a
> > > > > preempt_enable() at last, which may trigger a context switch.
> > > > > 
> > > > > Regards,
> > > > > Boqun
> > > > I think this is a good idea. We should extend that to add the check in
> > > > rwsem as well. Will post a patch to do that.
> > > Digging into it some more, I think adding kasan_check_byte() may not be
> > > necessary. If KASAN is enabled, it will instrument the locking code,
> > > including __mutex_unlock_slowpath(). I checked the generated assembly code;
> > > it has 2 __kasan_check_read() and 4 __kasan_check_write() calls. Adding an
> > The point is we want to check the memory at the end of
> > __mutex_unlock_slowpath(), so it's an extra checking.
> 
> It is likely that the instrumented kasan_check* calls can be invoked near
> the beginning when the lock is first accessed, as I don't see any
> kasan_check*() around the inlined raw_spin_unlock_irqrestore_wake() call.
> 

One thing to notice is that the __kasan_check_{read,write}() calls are the
instrumentation for atomic operations (because the compiler cannot generate
instrumentation for asm): try_cmpxchg() instruments two
kasan_check_write()s and atomic_long_read() instruments one
kasan_check_read(). We have 2 try_cmpxchg() and 2 atomic_long_read()
in __mutex_unlock_slowpath(), which is why you saw 4 kasan_check_write()
and 2 kasan_check_read().

Compiler instrumentation, on the other hand, uses functions like:

	__asan_report_store8_noabort()
	__asan_report_load8_noabort()

And I checked the assembly code of __mutex_unlock_slowpath(), and it
turns out the compiler does instrument the memory operations on the stack:

    3b1c: 94000000      bl      0x3b1c <__mutex_unlock_slowpath+0x188>
                0000000000003b1c:  R_AARCH64_CALL26     wake_up_q
    3b20: 52800028      mov     w8, #0x1                // =1
    3b24: f9000be8      str     x8, [sp, #0x10]
    3b28: 38776b08      ldrb    w8, [x24, x23]
    3b2c: 34000068      cbz     w8, 0x3b38 <__mutex_unlock_slowpath+0x1a4>
    3b30: aa1303e0      mov     x0, x19
    3b34: 94000000      bl      0x3b34 <__mutex_unlock_slowpath+0x1a0>
                0000000000003b34:  R_AARCH64_CALL26     __asan_report_store8_noabort
    3b38: f9000ff4      str     x20, [sp, #0x18]

	^ this is the "head->lastp = &head->first;" in wake_q_init(),
	and x19 is sp + 0x18, if you believe me ;-)

> So if we want a check at the end, we may have to manually add one.

But yes, we want to add a conceptual check at the very end, enforcing
that "the mutex must be valid throughout the whole mutex_unlock()
function".

> 
> > 
> > Also, since KASAN will instrument all memory accesses, what you saw may
> > not be the instrumentation on "lock" but something else, for example,
> > wake_q_init() in raw_spin_unlock_irqrestore_wake().
> 
> The wake_q memory is on the stack, and I don't believe the compiler will
> generate kasan_check*() calls for that. I also don't see any kasan_check*()
> call when the wake_q is being manipulated.
> 

See above.

> > Actually, I have 3 extensions to the idea:
> > 
> > First, it occurs to me that we could just put the kasan_check_byte() at
> > the outermost level, for example, in mutex_unlock().
> > 
> > Second, I wonder whether kasan has a way to tag a pointer parameter of
> > a function, for example for mutex_unlock():
> > 
> > 	void mutex_unlock(struct mutex * __ref lock)
> > 	{
> > 		...
> > 	}
> > 
> > so that a kasan_check_byte(lock) is auto-generated whenever the
> > function returns.
> > 

I'm more curious whether kasan can support this.

> > I actually tried to use __cleanup to implement __ref, like
> > 
> > 	#define __ref __cleanup(kasan_check_byte)
> > 
> > but it seems the "cleanup" attribute doesn't work on function parameters ;(
> > 
> > Third, I went to implement an always_alive():
> > 
> > 	#define always_alive(ptr)                                                              \
> > 	       typeof(ptr) __UNIQUE_ID(always_alive_guard) __cleanup(kasan_check_byte) = ptr;
> > 
> > and you can use in mutex_unlock():
> > 
> > 	void mutex_unlock(struct mutex *lock)
> > 	{
> > 		always_alive(lock);
> > 		...
> > 	}
> > 
> > This also guarantees that we emit a kasan_check_byte() at the very end.
> 
> Adding a kasan_check_byte() test at the end of unlock addresses a
> locking-specific problem, and we don't have that many instances where such
> a check is needed. So it may not be worth the effort to devise a special
> mechanism just for that. Adding a simple macro to abstract it may be
> enough. Anyway, it is your call.
> 

Well, we do have a lot of cases where a foo() takes a "struct bar *" and
the expectation is that the pointer stays valid throughout foo(), so there
may be other users.

But sure, we can start by using always_alive() in the locking primitives
only at first.

Regards,
Boqun

> Cheers,
> Longman
> 
