[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <aKkLEtoDXKxAAWju@google.com>
Date: Fri, 22 Aug 2025 17:28:02 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>, Borislav Petkov <bp@...en8.de>,
Thomas Gleixner <tglx@...utronix.de>, Peter Zijlstra <peterz@...radead.org>, x86-ml <x86@...nel.org>,
lkml <linux-kernel@...r.kernel.org>
Subject: Re: [GIT PULL] locking/urgent for v6.17-rc1
On Fri, Aug 22, 2025, Sebastian Andrzej Siewior wrote:
> On 2025-08-21 12:45:52 [-0700], Sean Christopherson wrote:
> > Piggybacking the futex private hashing attention, the new fanciness is causing
> > crashes in my testing. The crashes are 100% reproducible, but my reproducer is
> > simply running a variety of tests in parallel, i.e. isn't very debug-friendly,
> > and the code itself is black magic to me, so all I've done is bisect.
> >
> > I reported the issue on the original thread, but haven't seen any follow-up.
> >
> > https://lore.kernel.org/all/aJ_vEP2EHj6l0xRT@google.com
>
> I somehow missed it. Can you try rc2 with the patch I just sent?
No dice, fails with the same signature.
I got a trimmed down reproduer. Load KVM, run this in the background (in a loop)
to constantly trigger try_to_wake_up() on relevant tasks (needs to be run as root):
echo Y > /sys/module/kvm/parameters/nx_huge_pages
echo N > /sys/module/kvm/parameters/nx_huge_pages
sleep .2
and then run the hardware_disable_test KVM selftest (from
tools/testing/selftests/kvm/hardware_disable_test.c).
Strace on hardware_disable_test spewed a whole pile of these
wait4(32861, 0x7ffc66475dec, WNOHANG, NULL) = 0
futex(0x7fb735c43000, FUTEX_WAIT_BITSET|FUTEX_CLOCK_REALTIME, 0, {tv_sec=1, tv_nsec=0}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out)
immediately before the crash. I assume it corresponds to this:
/* Child is still running, keep waiting. */
if (pid != waitpid(pid, &status, WNOHANG))
continue;
I also got a new splat on the "WARN_ON_ONCE(ret < 0);" at the end of __futex_ref_atomic_end().
This happened during boot; AFAICT our userspace was setting up cgroups. In this
case, the system hung and I had to reboot.
------------[ cut here ]------------
WARNING: CPU: 45 PID: 0 at kernel/futex/core.c:1604 futex_ref_rcu+0xbf/0xf0
Modules linked in: vfat fat i2c_mux_pca954x i2c_mux spidev cdc_acm xhci_pci xhci_hcd gq(O) sha3_generic
CPU: 45 UID: 0 PID: 0 Comm: swapper/45 Tainted: G S O 6.17.0-smp--1278d576b27d-futex #886 NONE
Tainted: [S]=CPU_OUT_OF_SPEC, [O]=OOT_MODULE
Hardware name: Google LLC Indus/Indus_QC_03, BIOS 30.110.0 09/13/2024
RIP: 0010:futex_ref_rcu+0xbf/0xf0
Code: c7 04 0a 00 00 00 00 48 ff c0 eb c2 65 ff 01 89 e8 4c 01 f0 48 ff c0 48 89 c1 f0 48 0f c1 8b 48 01 00 00 48 01 c1 74 06 79 0c <0f> 0b eb 08 48 89 df e8 55 0a f9 ff 48 89 df 5b 41 5e 5d e9 f9 5c
RSP: 0018:ffffa43c8d440ec8 EFLAGS: 00010286
RAX: 8000000000000000 RBX: ffff933782245080 RCX: ffffffffffffffff
RDX: 0000000000000060 RSI: 0000000000000060 RDI: ffffffffac840520
RBP: 0000000000000000 R08: ffff933680044d00 R09: ffffffff00000000
R10: ffff9365c9b59e00 R11: ffff9365c9b59e00 R12: ffffffffab77ac10
R13: ffff9337822451b8 R14: 7fffffffffffffff R15: ffff9365c749de00
FS: 0000000000000000(0000) GS:ffff9395514f2000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fad8a21cf38 CR3: 00000055c062b002 CR4: 00000000007706f0
PKRU: 55555554
Call Trace:
<IRQ>
rcu_do_batch+0x250/0x7e0
rcu_core+0x12f/0x230
handle_softirqs+0xc8/0x280
__irq_exit_rcu+0x48/0x100
sysvec_apic_timer_interrupt+0x74/0x80
</IRQ>
<TASK>
asm_sysvec_apic_timer_interrupt+0x1a/0x20
RIP: 0010:cpuidle_enter_state+0xfb/0x290
Code: bb f6 ff ff 49 89 c4 8b 73 04 bf ff ff ff ff e8 9b 68 d8 ff 31 ff e8 f4 32 48 ff 80 7c 24 04 00 74 05 e8 c8 68 d8 ff fb 85 ed <0f> 88 ba 00 00 00 89 e9 48 6b f9 68 4c 8b 44 24 08 49 8b 54 38 30
RSP: 0018:ffffa43c803d3e80 EFLAGS: 00000206
RAX: ffff9395514f2000 RBX: ffff9394ff776548 RCX: 000000000000001f
RDX: 000000000018ec50 RSI: 000000000000002d RDI: 0000000000000000
RBP: 0000000000000003 R08: 0000000000000002 R09: 0000000000000002
R10: 00000000000003dc R11: 0000000000000389 R12: 00000010fb32644d
R13: 00000010fb2333f7 R14: ffffffffad276f68 R15: 0000000000000003
cpuidle_enter+0x2c/0x40
do_idle+0x1ac/0x250
cpu_startup_entry+0x2a/0x30
start_secondary+0x80/0x80
common_startup_64+0x13e/0x140
</TASK>
---[ end trace 0000000000000000 ]---
Heh, and two more when booting a different system. Guess it's my lucky day.
This time whatever went sideways didn't appear to be fatal as the system booted
and I could ssh in. One is the same WARN as above, and the second WARN on the
system hit the
WARN_ON_ONCE(atomic_long_read(&mm->futex_atomic) != 0);
in futex_hash_allocate().
------------[ cut here ]------------
WARNING: CPU: 120 PID: 11779 at kernel/futex/core.c:1553 futex_hash_allocate+0x436/0x450
Modules linked in: vfat fat ccp k10temp i2c_piix4 cdc_acm xhci_pci xhci_hcd gq(O) sha3_generic
CPU: 120 UID: 0 PID: 11779 Comm: borglet Tainted: G U W O 6.17.0-smp--1278d576b27d-futex #886 NONE
Tainted: [U]=USER, [W]=WARN, [O]=OOT_MODULE
Hardware name: Google, Inc. Arcadia_IT_80/Arcadia_IT_80, BIOS 34.64.2-0 12/26/2024
RIP: 0010:futex_hash_allocate+0x436/0x450
Code: 31 ff 65 48 8b 05 ba bc ae 02 48 3b 44 24 48 75 20 44 89 f8 48 83 c4 50 5b 41 5c 41 5d 41 5e 41 5f 5d c3 0f 0b e9 9d fe ff ff <0f> 0b e9 c0 fe ff ff e8 ce 99 af 00 66 66 66 66 66 2e 0f 1f 84 00
RSP: 0018:ffffbbc0f1237d10 EFLAGS: 00010286
RAX: 0000000000000001 RBX: 0000000000000000 RCX: ffffa25747532180
RDX: 0000000000000400 RSI: 000000000000ffc0 RDI: 00000000000039b8
RBP: ffffa296a2620000 R08: 00000000004029c0 R09: 00000000ffffffff
R10: 00000000ffffffff R11: 0000000000010040 R12: ffffa2571336b700
R13: ffffa2571336b600 R14: ffffa2571336b600 R15: ffffa296b9270000
FS: 00007f6bbd3297c0(0000) GS:ffffa2d5a31b2000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f6bae810f38 CR3: 00000001330a4001 CR4: 0000000000770ef0
PKRU: 55555554
Call Trace:
<TASK>
? cgroup_can_fork+0x258/0x420
copy_process+0xae3/0xff0
kernel_clone+0x99/0x320
__x64_sys_clone+0xc8/0xf0
do_syscall_64+0x6f/0x1f0
? arch_exit_to_user_mode_prepare+0x9/0x50
entry_SYSCALL_64_after_hwframe+0x4b/0x53
RIP: 0033:0x7f6bbd466051
Code: 48 85 ff 74 3d 48 85 f6 74 38 48 83 ee 10 48 89 4e 08 48 89 3e 48 89 d7 4c 89 c2 4d 89 c8 4c 8b 54 24 08 b8 38 00 00 00 0f 05 <48> 85 c0 7c 13 74 01 c3 31 ed 58 5f ff d0 48 89 c7 b8 3c 00 00 00
RSP: 002b:00007fffff2eda98 EFLAGS: 00000206 ORIG_RAX: 0000000000000038
RAX: ffffffffffffffda RBX: 00007f6bae812700 RCX: 00007f6bbd466051
RDX: 00007f6bae8129d0 RSI: 00007f6bae810f30 RDI: 00000000003d0f00
RBP: 00007fffff2edad0 R08: 00007f6bae812700 R09: 00007f6bae812700
R10: 00007f6bae8129d0 R11: 0000000000000206 R12: 00007f6bae8129d0
R13: 00007fffff2edb66 R14: 00007fffff2edbd0 R15: 00007f6bae810f40
</TASK>
---[ end trace 0000000000000000 ]---
Powered by blists - more mailing lists