Message-ID: <20250618160333.PdGB89yt@linutronix.de>
Date: Wed, 18 Jun 2025 18:03:33 +0200
From: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To: Calvin Owens <calvin@...nvd.org>
Cc: linux-kernel@...r.kernel.org, linux-tip-commits@...r.kernel.org,
	"Lai, Yi" <yi1.lai@...ux.intel.com>,
	"Peter Zijlstra (Intel)" <peterz@...radead.org>, x86@...nel.org
Subject: Re: [tip: locking/urgent] futex: Allow to resize the private local
 hash

On 2025-06-17 09:11:06 [-0700], Calvin Owens wrote:
> Actually got an oops this time:
> 
>     Oops: general protection fault, probably for non-canonical address 0xfdd92c90843cf111: 0000 [#1] SMP
>     CPU: 3 UID: 1000 PID: 323127 Comm: cargo Not tainted 6.16.0-rc2-lto-00024-g9afe652958c3 #1 PREEMPT 
>     Hardware name: ASRock B850 Pro-A/B850 Pro-A, BIOS 3.11 11/12/2024
>     RIP: 0010:queued_spin_lock_slowpath+0x12a/0x1d0
…
>     Call Trace:
>      <TASK>
>      futex_unqueue+0x2e/0x110
>      __futex_wait+0xc5/0x130
>      futex_wait+0xee/0x180
>      do_futex+0x86/0x120
>      __se_sys_futex+0x16d/0x1e0
>      do_syscall_64+0x47/0x170
>      entry_SYSCALL_64_after_hwframe+0x4b/0x53
>     RIP: 0033:0x7f086e918779

The lock_ptr is pointing to invalid memory. It explodes within
queued_spin_lock_slowpath(), which looks like decode_tail() returned a
wrong pointer/offset.
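
For reference, a minimal sketch of why the address in the oops is
reported as non-canonical. It assumes 48-bit virtual addresses (4-level
paging), which I have not confirmed for this machine; is_canonical_48()
is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: 48-bit virtual addresses. An address is canonical when
 * bits 63..47 are all copies of bit 47, i.e. all zero or all one.
 * Hypothetical helper for illustration only.
 */
static bool is_canonical_48(uint64_t addr)
{
	uint64_t top = addr >> 47;	/* the 17 bits that must agree */

	return top == 0 || top == 0x1ffff;
}
```

The faulting address 0xfdd92c90843cf111 fails this check, while the
user space RIP from the trace, 0x7f086e918779, passes it.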

futex_queue() adds a local futex_q to the list and its lock_ptr points
to the hb lock. Then we do schedule(). After a successful wake the
lock_ptr is NULL, otherwise it still points to the
futex_hash_bucket::lock.

Since futex_unqueue() attempts to acquire the lock, there was no
wakeup; a timeout or a signal ended the wait. The lock_ptr can
change during resize.
During the resize futex_rehash_private() moves the futex_q members from
the old queue to the new one. The lock is accessed within RCU and the
lock_ptr value is compared against the old value after locking. That
means it is accessed either before the rehash moved it to the new hash
bucket or afterwards.
I don't see how this pointer can become invalid. RCU protects against
cleanup and the pointer compare ensures that it is the "current"
pointer.
I've been looking at clang's assembly of futex_unqueue() and it looks
correct. And futex_rehash_private() iterates over all slots.
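
To illustrate the pattern described above, here is a rough user space
sketch of the "read lock_ptr, lock, compare, retry" loop. It is not the
kernel code: struct bucket, struct waiter and all waiter_*() functions
are hypothetical stand-ins, pthread mutexes replace the spinlocks, and
RCU is not modelled at all (the buckets are simply assumed to stay
alive). That the rehash holds both bucket locks while moving the entry
is an assumption of the sketch as well.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for futex_hash_bucket: only the lock matters. */
struct bucket {
	pthread_mutex_t lock;
};

/* Hypothetical stand-in for futex_q. */
struct waiter {
	_Atomic(pthread_mutex_t *) lock_ptr;	/* NULL once woken */
	int queued;
};

static void bucket_init(struct bucket *b)
{
	pthread_mutex_init(&b->lock, NULL);
}

static void waiter_queue(struct waiter *w, struct bucket *b)
{
	pthread_mutex_lock(&b->lock);
	w->queued = 1;
	atomic_store(&w->lock_ptr, &b->lock);
	pthread_mutex_unlock(&b->lock);
}

/* Waker: dequeue under the bucket lock, then publish NULL. */
static void waiter_wake(struct waiter *w, struct bucket *b)
{
	pthread_mutex_lock(&b->lock);
	w->queued = 0;
	atomic_store(&w->lock_ptr, NULL);
	pthread_mutex_unlock(&b->lock);
}

/* Resize (assumed locking): retarget lock_ptr with both locks held. */
static void waiter_rehash(struct waiter *w, struct bucket *old_hb,
			  struct bucket *new_hb)
{
	pthread_mutex_lock(&old_hb->lock);
	pthread_mutex_lock(&new_hb->lock);
	if (atomic_load(&w->lock_ptr) == &old_hb->lock)
		atomic_store(&w->lock_ptr, &new_hb->lock);
	pthread_mutex_unlock(&new_hb->lock);
	pthread_mutex_unlock(&old_hb->lock);
}

/* Returns 1 if we dequeued ourselves (timeout/signal), 0 if woken. */
static int waiter_unqueue(struct waiter *w)
{
	for (;;) {
		pthread_mutex_t *lock = atomic_load(&w->lock_ptr);

		if (lock == NULL)
			return 0;	/* a waker already removed us */

		pthread_mutex_lock(lock);
		if (lock != atomic_load(&w->lock_ptr)) {
			/* Woken or rehashed in between: drop and retry. */
			pthread_mutex_unlock(lock);
			continue;
		}
		w->queued = 0;
		pthread_mutex_unlock(lock);
		return 1;
	}
}
```

In this sketch the compare after locking is what guarantees that the
lock taken is the one currently protecting the entry, regardless of
whether the load happened before or after the rehash.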

> This is a giant Yocto build, but the comm is always cargo, so hopefully
> I can run those bits in isolation and hit it more quickly.

If it still explodes without LTO, would you mind trying gcc?

> Thanks,
> Calvin

Sebastian
