linux-kernel - Re: [tip: locking/urgent] futex: Allow to resize the private local hash


Message-ID: <aFGTmn_CFkuTbP4i@mozart.vkv.me>
Date: Tue, 17 Jun 2025 09:11:06 -0700
From: Calvin Owens <calvin@...nvd.org>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: linux-kernel@...r.kernel.org, linux-tip-commits@...r.kernel.org,
	"Lai, Yi" <yi1.lai@...ux.intel.com>,
	"Peter Zijlstra (Intel)" <peterz@...radead.org>, x86@...nel.org
Subject: Re: [tip: locking/urgent] futex: Allow to resize the private local
 hash

On Tuesday 06/17 at 11:50 +0200, Sebastian Andrzej Siewior wrote:
> On 2025-06-17 02:23:08 [-0700], Calvin Owens wrote:
> > Ugh, I'm sorry, I was in too much of a hurry this morning... cargo is
> > obviously not calling PR_FUTEX_HASH which is new in 6.16 :/
> No worries.
> 
> > > This is with LTO enabled.
> > 
> > Full lto with llvm-20.1.7.
> > 
> …
> > Nothing showed up in the logs but the RCU stalls on CPU16, always in
> > queued_spin_lock_slowpath().
> > 
> > I'll run the build it was doing when it happened in a loop overnight and
> > see if I can trigger it again.

Actually got an oops this time:

    Oops: general protection fault, probably for non-canonical address 0xfdd92c90843cf111: 0000 [#1] SMP
    CPU: 3 UID: 1000 PID: 323127 Comm: cargo Not tainted 6.16.0-rc2-lto-00024-g9afe652958c3 #1 PREEMPT 
    Hardware name: ASRock B850 Pro-A/B850 Pro-A, BIOS 3.11 11/12/2024
    RIP: 0010:queued_spin_lock_slowpath+0x12a/0x1d0
    Code: c8 c1 e8 10 66 87 47 02 66 85 c0 74 48 0f b7 c0 49 c7 c0 f8 ff ff ff 89 c6 c1 ee 02 83 e0 03 49 8b b4 f0 00 21 67 83 c1 e0 04 <48> 89 94 30 00 f1 4a 84 83 7a 08 00 75 10 0f 1f 84 00 00 00 00 00
    RSP: 0018:ffffc9002c953d20 EFLAGS: 00010256
    RAX: 0000000000000000 RBX: ffff88814e78be40 RCX: 0000000000100000
    RDX: ffff88901fce5100 RSI: fdd92c90fff20011 RDI: ffff8881c2ae9384
    RBP: 000000000000002b R08: fffffffffffffff8 R09: 00000000002ab900
    R10: 000000000000b823 R11: 0000000000000c00 R12: ffff88814e78be40
    R13: ffffc9002c953d48 R14: ffffc9002c953d48 R15: ffff8881c2ae9384
    FS:  00007f086efb6600(0000) GS:ffff88909b836000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 000055ced9c42650 CR3: 000000034b88e000 CR4: 0000000000750ef0
    PKRU: 55555554
    Call Trace:
     <TASK>
     futex_unqueue+0x2e/0x110
     __futex_wait+0xc5/0x130
     ? __futex_wake_mark+0xc0/0xc0
     futex_wait+0xee/0x180
     ? hrtimer_setup_sleeper_on_stack+0xe0/0xe0
     do_futex+0x86/0x120
     __se_sys_futex+0x16d/0x1e0
     ? __x64_sys_write+0xba/0xc0
     do_syscall_64+0x47/0x170
     entry_SYSCALL_64_after_hwframe+0x4b/0x53
    RIP: 0033:0x7f086e918779
    Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 4f 86 0d 00 f7 d8 64 89 01 48
    RSP: 002b:00007ffc5815f678 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
    RAX: ffffffffffffffda RBX: 00007f086e918760 RCX: 00007f086e918779
    RDX: 000000000000002b RSI: 0000000000000089 RDI: 00005636f9fb60d0
    RBP: 00007ffc5815f6d0 R08: 0000000000000000 R09: 00007ffcffffffff
    R10: 00007ffc5815f690 R11: 0000000000000246 R12: 000000001dcd6401
    R13: 00007f086e833fd0 R14: 00005636f9fb60d0 R15: 000000000000002b
     </TASK>
    ---[ end trace 0000000000000000 ]---
    RIP: 0010:queued_spin_lock_slowpath+0x12a/0x1d0
    Code: c8 c1 e8 10 66 87 47 02 66 85 c0 74 48 0f b7 c0 49 c7 c0 f8 ff ff ff 89 c6 c1 ee 02 83 e0 03 49 8b b4 f0 00 21 67 83 c1 e0 04 <48> 89 94 30 00 f1 4a 84 83 7a 08 00 75 10 0f 1f 84 00 00 00 00 00
    RSP: 0018:ffffc9002c953d20 EFLAGS: 00010256
    RAX: 0000000000000000 RBX: ffff88814e78be40 RCX: 0000000000100000
    RDX: ffff88901fce5100 RSI: fdd92c90fff20011 RDI: ffff8881c2ae9384
    RBP: 000000000000002b R08: fffffffffffffff8 R09: 00000000002ab900
    R10: 000000000000b823 R11: 0000000000000c00 R12: ffff88814e78be40
    R13: ffffc9002c953d48 R14: ffffc9002c953d48 R15: ffff8881c2ae9384
    FS:  00007f086efb6600(0000) GS:ffff88909b836000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 000055ced9c42650 CR3: 000000034b88e000 CR4: 0000000000750ef0
    PKRU: 55555554
    Kernel panic - not syncing: Fatal exception
    Kernel Offset: disabled
    ---[ end Kernel panic - not syncing: Fatal exception ]---
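
For anyone cross-checking the dump: the bytes at the `<48>` marker in the Code line (48 89 94 30 00 f1 4a 84) look like a `mov %rdx, disp32(%rax,%rsi,1)`, if the ModRM/SIB bytes are read correctly. The sketch below redoes that address arithmetic with the RAX and RSI values from the dump and checks the result against the canonical-address rule. It assumes 48-bit virtual addresses (CR4.LA57, bit 12, is clear in CR4=0x750ef0), and the helper names are made up for illustration.

```python
# Sketch: rebuild the faulting address from the register dump above.
# Assumption: 4-level paging, so virtual addresses are 48 bits wide.
MASK64 = (1 << 64) - 1


def is_canonical(addr, va_bits=48):
    """True if bits 63..(va_bits - 1) are all zero or all one."""
    top = addr >> (va_bits - 1)
    return top == 0 or top == (1 << (64 - va_bits + 1)) - 1


def sign_extend32(value):
    """Interpret a 32-bit value as a signed displacement."""
    return value - (1 << 32) if value & 0x80000000 else value


rax = 0x0000000000000000  # base register, from the dump
rsi = 0xFDD92C90FFF20011  # index register, from the dump
# disp32 is stored little-endian in the instruction bytes: 00 f1 4a 84
disp = sign_extend32(int.from_bytes(bytes.fromhex("00f14a84"), "little"))

fault = (rax + rsi + disp) & MASK64
print(hex(fault), is_canonical(fault))  # 0xfdd92c90843cf111 False
```

The result matches the address in the Oops line, so the bad value is the one sitting in RSI at the time of the store. Where that value came from is not something the dump alone settles.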

This is a giant Yocto build, but the comm is always cargo, so hopefully
I can run those bits in isolation and hit it more quickly.
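
The user-mode registers in the trace also say which futex operation cargo was in, which may help when cutting this down to a smaller reproducer. A rough sketch, assuming the usual x86-64 syscall argument order (rdi, rsi, rdx, r10, r8, r9) and the op constants from the uapi linux/futex.h header; `decode_futex_op` is a made-up helper name:

```python
# Sketch: decode the futex() arguments from the user-mode register dump.
# Constants as defined in the uapi linux/futex.h header.
FUTEX_CMD_NAMES = {
    0: "FUTEX_WAIT",
    1: "FUTEX_WAKE",
    9: "FUTEX_WAIT_BITSET",
    10: "FUTEX_WAKE_BITSET",
}
FUTEX_PRIVATE_FLAG = 128
FUTEX_CLOCK_REALTIME = 256
FUTEX_BITSET_MATCH_ANY = 0xFFFFFFFF


def decode_futex_op(op):
    """Split a futex op word into (command name, private?, realtime clock?)."""
    cmd = op & ~(FUTEX_PRIVATE_FLAG | FUTEX_CLOCK_REALTIME)
    name = FUTEX_CMD_NAMES.get(cmd, "cmd %d" % cmd)
    return name, bool(op & FUTEX_PRIVATE_FLAG), bool(op & FUTEX_CLOCK_REALTIME)


orig_rax = 0xCA             # syscall number; 202 is futex on x86-64
rsi = 0x89                  # second argument: futex op
rdx = 0x2B                  # third argument: expected value
r9 = 0x00007FFCFFFFFFFF     # sixth argument: val3, a u32 in the syscall

val3 = r9 & 0xFFFFFFFF      # only the low 32 bits are used
print(decode_futex_op(rsi))              # ('FUTEX_WAIT_BITSET', True, False)
print(rdx, val3 == FUTEX_BITSET_MATCH_ANY)  # 43 True
```

So this looks like a process-private FUTEX_WAIT_BITSET with a match-any bitset and a timeout pointer in R10, expecting the futex word to hold 43.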

> Please check if you can reproduce it and if so if it also happens
> without lto.
> I have no idea why one spinlock_t remains locked. It is either locked or
> some stray memory.
> Oh. Lockdep adds quite some overhead but it should complain that a
> spinlock_t is still locked while returning to userland.

I'll report back when I've tried :)

I'll also try some of the mm debug configs.

Thanks,
Calvin