Message-ID: <aFIhSYmDvzRgShIy@mozart.vkv.me>
Date: Tue, 17 Jun 2025 19:15:37 -0700
From: Calvin Owens <calvin@...nvd.org>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: linux-kernel@...r.kernel.org, linux-tip-commits@...r.kernel.org,
"Lai, Yi" <yi1.lai@...ux.intel.com>,
"Peter Zijlstra (Intel)" <peterz@...radead.org>, x86@...nel.org
Subject: Re: [tip: locking/urgent] futex: Allow to resize the private local
hash
On Tuesday 06/17 at 09:11 -0700, Calvin Owens wrote:
> On Tuesday 06/17 at 11:50 +0200, Sebastian Andrzej Siewior wrote:
> > On 2025-06-17 02:23:08 [-0700], Calvin Owens wrote:
> > > Ugh, I'm sorry, I was in too much of a hurry this morning... cargo is
> > > obviously not calling PR_FUTEX_HASH, which is new in 6.16 :/
> > No worries.
> >
> > > > This is with LTO enabled.
> > >
> > > Full lto with llvm-20.1.7.
> > >
> > …
> > > Nothing showed up in the logs but the RCU stalls on CPU16, always in
> > > queued_spin_lock_slowpath().
> > >
> > > I'll run the build it was doing when it happened in a loop overnight and
> > > see if I can trigger it again.
>
> Actually got an oops this time:
>
> <snip>
>
> This is a giant Yocto build, but the comm is always cargo, so hopefully
> I can run those bits in isolation and hit it more quickly.
>
> > Please check if you can reproduce it and if so if it also happens
> > without lto.
It takes longer to reproduce with LTO disabled, but I'm still seeing
crashes. First, this WARN:
------------[ cut here ]------------
WARNING: CPU: 2 PID: 1866190 at mm/slub.c:4753 free_large_kmalloc+0xa5/0xc0
CPU: 2 UID: 1000 PID: 1866190 Comm: python3 Not tainted 6.16.0-rc2-nolto-00024-g9afe652958c3 #1 PREEMPT
Hardware name: ASRock B850 Pro-A/B850 Pro-A, BIOS 3.11 11/12/2024
RIP: 0010:free_large_kmalloc+0xa5/0xc0
Code: 02 00 00 74 01 fb 83 7b 30 ff 74 07 c7 43 30 ff ff ff ff f0 ff 4b 34 75 08 48 89 df e8 84 dd f9 ff 48 83 c4 08 5b 41 5e 5d c3 <0f> 0b 48 89 df 48 c7 c6 46 92 f5 82 48 83 c4 08 5b 41 5e 5d e9 42
RSP: 0018:ffffc90024d67ce8 EFLAGS: 00010206
RAX: 00000000ff000000 RBX: ffffea00051d5700 RCX: ffffea00042f2208
RDX: 0000000000053a55 RSI: ffff88814755c000 RDI: ffffea00051d5700
RBP: 0000000000000000 R08: fffffffffffdfce5 R09: ffffffff83d52928
R10: ffffea00047ae080 R11: 0000000000000003 R12: ffff8882cae5cd00
R13: ffff88819bb19c08 R14: ffff88819bb194c0 R15: ffff8883a24df900
FS: 0000000000000000(0000) GS:ffff88909bf54000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055842ea1e3f0 CR3: 0000000d82b9d000 CR4: 0000000000750ef0
PKRU: 55555554
Call Trace:
<TASK>
futex_hash_free+0x10/0x40
__mmput+0xb4/0xd0
exec_mmap+0x1e2/0x210
begin_new_exec+0x491/0x6c0
load_elf_binary+0x25d/0x1050
? load_misc_binary+0x19a/0x2d0
bprm_execve+0x1d5/0x370
do_execveat_common+0x29e/0x300
__x64_sys_execve+0x33/0x40
do_syscall_64+0x48/0xfb0
entry_SYSCALL_64_after_hwframe+0x4b/0x53
RIP: 0033:0x7fd8ec8e7dd7
Code: Unable to access opcode bytes at 0x7fd8ec8e7dad.
RSP: 002b:00007fd8adff9e88 EFLAGS: 00000206 ORIG_RAX: 000000000000003b
RAX: ffffffffffffffda RBX: 00007fd8adffb6c0 RCX: 00007fd8ec8e7dd7
RDX: 000055842ed3ce60 RSI: 00007fd8eaea3870 RDI: 00007fd8eae87940
RBP: 00007fd8adff9e90 R08: 00000000ffffffff R09: 0000000000000000
R10: 0000000000000008 R11: 0000000000000206 R12: 00007fd8ed12da28
R13: 00007fd8eae87940 R14: 00007fd8eaea3870 R15: 0000000000000001
</TASK>
---[ end trace 0000000000000000 ]---
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x13d1507b pfn:0x14755c
flags: 0x2000000000000000(node=0|zone=1)
raw: 2000000000000000 ffffea00042f2208 ffff88901fd66b00 0000000000000000
raw: 0000000013d1507b 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: Not a kmalloc allocation
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x13d1507b pfn:0x14755c
flags: 0x2000000000000000(node=0|zone=1)
raw: 2000000000000000 ffffea00042f2208 ffff88901fd66b00 0000000000000000
raw: 0000000013d1507b 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: Not a kmalloc allocation
...and then it oopsed (same stack as my last mail) about twenty minutes
later when I hit Ctrl+C to stop the build:
BUG: unable to handle page fault for address: 00000008849281a9
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
PGD 0 P4D 0
Oops: Oops: 0002 [#1] SMP
CPU: 13 UID: 1000 PID: 1864338 Comm: python3 Tainted: G W 6.16.0-rc2-nolto-00024-g9afe652958c3 #1 PREEMPT
Tainted: [W]=WARN
Hardware name: ASRock B850 Pro-A/B850 Pro-A, BIOS 3.11 11/12/2024
RIP: 0010:queued_spin_lock_slowpath+0x112/0x1a0
Code: c8 c1 e8 10 66 87 47 02 66 85 c0 74 40 0f b7 c0 49 c7 c0 f8 ff ff ff 89 c6 c1 ee 02 83 e0 03 49 8b b4 f0 40 8b 06 83 c1 e0 04 <48> 89 94 30 00 12 d5 83 83 7a 08 00 75 08 f3 90 83 7a 08 00 74 f8
RSP: 0018:ffffc9002b35fd20 EFLAGS: 00010212
RAX: 0000000000000020 RBX: ffffc9002b35fd50 RCX: 0000000000380000
RDX: ffff88901fde5200 RSI: 0000000900bd6f89 RDI: ffff88814755d204
RBP: 0000000000000000 R08: fffffffffffffff8 R09: 00000000002ab900
R10: 0000000000000065 R11: 0000000000001000 R12: ffff88906c343e40
R13: ffffc9002b35fd50 R14: ffff88814755d204 R15: 00007fd8eb6feac0
FS: 00007fd8eb6ff6c0(0000) GS:ffff88909c094000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000008849281a9 CR3: 0000001fcf611000 CR4: 0000000000750ef0
PKRU: 55555554
Call Trace:
<TASK>
futex_unqueue+0x21/0x90
__futex_wait+0xb7/0x120
? __futex_wake_mark+0x40/0x40
futex_wait+0x5b/0xd0
do_futex+0x86/0x120
__se_sys_futex+0x10d/0x180
do_syscall_64+0x48/0xfb0
entry_SYSCALL_64_after_hwframe+0x4b/0x53
RIP: 0033:0x7fd8ec8a49ee
Code: 08 0f 85 f5 4b ff ff 49 89 fb 48 89 f0 48 89 d7 48 89 ce 4c 89 c2 4d 89 ca 4c 8b 44 24 08 4c 8b 4c 24 10 4c 89 5c 24 08 0f 05 <c3> 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 80 00 00 00 00 48 83 ec 08
RSP: 002b:00007fd8eb6fe9b8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: ffffffffffffffda RBX: 00007fd8eb6ff6c0 RCX: 00007fd8ec8a49ee
RDX: 0000000000000000 RSI: 0000000000000189 RDI: 00007fd8eb6feac0
RBP: 0000000000000000 R08: 0000000000000000 R09: 00000000ffffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fd8eb6fea00
R13: 0000000000001de0 R14: 00007fd8ececa240 R15: 00000000000000ef
</TASK>
CR2: 00000008849281a9
---[ end trace 0000000000000000 ]---
I enabled lockdep and I've got it running again.
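Concretely, by "enabled lockdep" I mean something like the following (a
sketch using scripts/config from a configured kernel source tree; the exact
option set is my assumption, not copied from the kconfigs in the repo below):

```shell
# Sketch: turn on lockdep and spinlock debugging in an existing .config.
# Run from the top of a kernel source tree that already has a .config.
./scripts/config --enable PROVE_LOCKING \
                 --enable DEBUG_SPINLOCK \
                 --enable DEBUG_ATOMIC_SLEEP
make olddefconfig   # resolve newly-selected dependencies (e.g. LOCKDEP)
```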
I set up a little git repo with a copy of all the traces so far, and the
kconfigs I'm running:
https://github.com/jcalvinowens/lkml-debug-616
...and I pushed the actual vmlinux binaries here:
https://github.com/jcalvinowens/lkml-debug-616/releases/tag/20250617
There were some block warnings on another machine running the same
workload, but of course they aren't necessarily related.
> > I have no idea why one spinlock_t remains locked. It is either genuinely
> > locked, or it's stray memory.
> >
> > Oh. Lockdep adds quite a bit of overhead, but it should complain if a
> > spinlock_t is still locked while returning to userland.
>
> I'll report back when I've tried :)
>
> I'll also try some of the mm debug configs.
>
> Thanks,
> Calvin