Message-ID: <aFIhSYmDvzRgShIy@mozart.vkv.me>
Date: Tue, 17 Jun 2025 19:15:37 -0700
From: Calvin Owens <calvin@...nvd.org>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: linux-kernel@...r.kernel.org, linux-tip-commits@...r.kernel.org,
"Lai, Yi" <yi1.lai@...ux.intel.com>,
"Peter Zijlstra (Intel)" <peterz@...radead.org>, x86@...nel.org
Subject: Re: [tip: locking/urgent] futex: Allow to resize the private local
hash
On Tuesday 06/17 at 09:11 -0700, Calvin Owens wrote:
> On Tuesday 06/17 at 11:50 +0200, Sebastian Andrzej Siewior wrote:
> > On 2025-06-17 02:23:08 [-0700], Calvin Owens wrote:
> > > Ugh, I'm sorry, I was in too much of a hurry this morning... cargo is
> > > obviously not calling PR_FUTEX_HASH, which is new in 6.16 :/
> > No worries.
> >
> > > > This is with LTO enabled.
> > >
> > > Full lto with llvm-20.1.7.
> > >
> > …
> > > Nothing showed up in the logs but the RCU stalls on CPU16, always in
> > > queued_spin_lock_slowpath().
> > >
> > > I'll run the build it was doing when it happened in a loop overnight and
> > > see if I can trigger it again.
>
> Actually got an oops this time:
>
> <snip>
>
> This is a giant Yocto build, but the comm is always cargo, so hopefully
> I can run those bits in isolation and hit it more quickly.
>
> > Please check if you can reproduce it and if so if it also happens
> > without lto.
It takes longer to reproduce with LTO disabled, but I'm still seeing
crashes. First, this WARN:
------------[ cut here ]------------
WARNING: CPU: 2 PID: 1866190 at mm/slub.c:4753 free_large_kmalloc+0xa5/0xc0
CPU: 2 UID: 1000 PID: 1866190 Comm: python3 Not tainted 6.16.0-rc2-nolto-00024-g9afe652958c3 #1 PREEMPT
Hardware name: ASRock B850 Pro-A/B850 Pro-A, BIOS 3.11 11/12/2024
RIP: 0010:free_large_kmalloc+0xa5/0xc0
Code: 02 00 00 74 01 fb 83 7b 30 ff 74 07 c7 43 30 ff ff ff ff f0 ff 4b 34 75 08 48 89 df e8 84 dd f9 ff 48 83 c4 08 5b 41 5e 5d c3 <0f> 0b 48 89 df 48 c7 c6 46 92 f5 82 48 83 c4 08 5b 41 5e 5d e9 42
RSP: 0018:ffffc90024d67ce8 EFLAGS: 00010206
RAX: 00000000ff000000 RBX: ffffea00051d5700 RCX: ffffea00042f2208
RDX: 0000000000053a55 RSI: ffff88814755c000 RDI: ffffea00051d5700
RBP: 0000000000000000 R08: fffffffffffdfce5 R09: ffffffff83d52928
R10: ffffea00047ae080 R11: 0000000000000003 R12: ffff8882cae5cd00
R13: ffff88819bb19c08 R14: ffff88819bb194c0 R15: ffff8883a24df900
FS: 0000000000000000(0000) GS:ffff88909bf54000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055842ea1e3f0 CR3: 0000000d82b9d000 CR4: 0000000000750ef0
PKRU: 55555554
Call Trace:
<TASK>
futex_hash_free+0x10/0x40
__mmput+0xb4/0xd0
exec_mmap+0x1e2/0x210
begin_new_exec+0x491/0x6c0
load_elf_binary+0x25d/0x1050
? load_misc_binary+0x19a/0x2d0
bprm_execve+0x1d5/0x370
do_execveat_common+0x29e/0x300
__x64_sys_execve+0x33/0x40
do_syscall_64+0x48/0xfb0
entry_SYSCALL_64_after_hwframe+0x4b/0x53
RIP: 0033:0x7fd8ec8e7dd7
Code: Unable to access opcode bytes at 0x7fd8ec8e7dad.
RSP: 002b:00007fd8adff9e88 EFLAGS: 00000206 ORIG_RAX: 000000000000003b
RAX: ffffffffffffffda RBX: 00007fd8adffb6c0 RCX: 00007fd8ec8e7dd7
RDX: 000055842ed3ce60 RSI: 00007fd8eaea3870 RDI: 00007fd8eae87940
RBP: 00007fd8adff9e90 R08: 00000000ffffffff R09: 0000000000000000
R10: 0000000000000008 R11: 0000000000000206 R12: 00007fd8ed12da28
R13: 00007fd8eae87940 R14: 00007fd8eaea3870 R15: 0000000000000001
</TASK>
---[ end trace 0000000000000000 ]---
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x13d1507b pfn:0x14755c
flags: 0x2000000000000000(node=0|zone=1)
raw: 2000000000000000 ffffea00042f2208 ffff88901fd66b00 0000000000000000
raw: 0000000013d1507b 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: Not a kmalloc allocation
page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x13d1507b pfn:0x14755c
flags: 0x2000000000000000(node=0|zone=1)
raw: 2000000000000000 ffffea00042f2208 ffff88901fd66b00 0000000000000000
raw: 0000000013d1507b 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: Not a kmalloc allocation
...and then it oopsed (same stack as my last mail) about twenty minutes
later when I hit Ctrl+C to stop the build:
BUG: unable to handle page fault for address: 00000008849281a9
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
PGD 0 P4D 0
Oops: Oops: 0002 [#1] SMP
CPU: 13 UID: 1000 PID: 1864338 Comm: python3 Tainted: G W 6.16.0-rc2-nolto-00024-g9afe652958c3 #1 PREEMPT
Tainted: [W]=WARN
Hardware name: ASRock B850 Pro-A/B850 Pro-A, BIOS 3.11 11/12/2024
RIP: 0010:queued_spin_lock_slowpath+0x112/0x1a0
Code: c8 c1 e8 10 66 87 47 02 66 85 c0 74 40 0f b7 c0 49 c7 c0 f8 ff ff ff 89 c6 c1 ee 02 83 e0 03 49 8b b4 f0 40 8b 06 83 c1 e0 04 <48> 89 94 30 00 12 d5 83 83 7a 08 00 75 08 f3 90 83 7a 08 00 74 f8
RSP: 0018:ffffc9002b35fd20 EFLAGS: 00010212
RAX: 0000000000000020 RBX: ffffc9002b35fd50 RCX: 0000000000380000
RDX: ffff88901fde5200 RSI: 0000000900bd6f89 RDI: ffff88814755d204
RBP: 0000000000000000 R08: fffffffffffffff8 R09: 00000000002ab900
R10: 0000000000000065 R11: 0000000000001000 R12: ffff88906c343e40
R13: ffffc9002b35fd50 R14: ffff88814755d204 R15: 00007fd8eb6feac0
FS: 00007fd8eb6ff6c0(0000) GS:ffff88909c094000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000008849281a9 CR3: 0000001fcf611000 CR4: 0000000000750ef0
PKRU: 55555554
Call Trace:
<TASK>
futex_unqueue+0x21/0x90
__futex_wait+0xb7/0x120
? __futex_wake_mark+0x40/0x40
futex_wait+0x5b/0xd0
do_futex+0x86/0x120
__se_sys_futex+0x10d/0x180
do_syscall_64+0x48/0xfb0
entry_SYSCALL_64_after_hwframe+0x4b/0x53
RIP: 0033:0x7fd8ec8a49ee
Code: 08 0f 85 f5 4b ff ff 49 89 fb 48 89 f0 48 89 d7 48 89 ce 4c 89 c2 4d 89 ca 4c 8b 44 24 08 4c 8b 4c 24 10 4c 89 5c 24 08 0f 05 <c3> 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 80 00 00 00 00 48 83 ec 08
RSP: 002b:00007fd8eb6fe9b8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: ffffffffffffffda RBX: 00007fd8eb6ff6c0 RCX: 00007fd8ec8a49ee
RDX: 0000000000000000 RSI: 0000000000000189 RDI: 00007fd8eb6feac0
RBP: 0000000000000000 R08: 0000000000000000 R09: 00000000ffffffff
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fd8eb6fea00
R13: 0000000000001de0 R14: 00007fd8ececa240 R15: 00000000000000ef
</TASK>
CR2: 00000008849281a9
---[ end trace 0000000000000000 ]---
I enabled lockdep and I've got it running again.
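Concretely, by "enabled lockdep" I mean something like the following (a
sketch using scripts/config from a configured kernel source tree; the exact
option set is my assumption, not copied from the kconfigs in the repo below):

```shell
# Sketch: turn on lockdep and spinlock debugging in an existing .config.
# Run from the top of a kernel source tree that already has a .config.
./scripts/config --enable PROVE_LOCKING \
                 --enable DEBUG_SPINLOCK \
                 --enable DEBUG_ATOMIC_SLEEP
make olddefconfig   # resolve newly-selected dependencies (e.g. LOCKDEP)
```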
I set up a little git repo with a copy of all the traces so far, and the
kconfigs I'm running:
https://github.com/jcalvinowens/lkml-debug-616
...and I pushed the actual vmlinux binaries here:
https://github.com/jcalvinowens/lkml-debug-616/releases/tag/20250617
There were some block warnings on another machine running the same
workload, but of course they aren't necessarily related.
> > I have no idea why one spinlock_t remains locked. It is either genuinely
> > locked, or it's stray memory.
> >
> > Oh. Lockdep adds quite a bit of overhead, but it should complain if a
> > spinlock_t is still locked while returning to userland.
>
> I'll report back when I've tried :)
>
> I'll also try some of the mm debug configs.
>
> Thanks,
> Calvin