Message-ID: <758991c1.13f67.197f9cccf9b.Coremail.baishuoran@hrbeu.edu.cn>
Date: Fri, 11 Jul 2025 22:03:58 +0800 (GMT+08:00)
From: 白烁冉 <baishuoran@...eu.edu.cn>
To: "Peter Zijlstra" <peterz@...radead.org>,
"Thomas Gleixner" <tglx@...utronix.de>
Cc: "Kun Hu" <huk23@...udan.edu.cn>, "Jiaji Qin" <jjtan24@...udan.edu.cn>,
linux-kernel@...r.kernel.org
Subject: possible deadlock in smp_call_function_many_cond
Dear Maintainers,
While fuzzing the latest Linux kernel with our customized Syzkaller, we triggered the following crash (the 122nd).
HEAD commit: 6537cfb395f352782918d8ee7b7f10ba2cc3cbf2
git tree: upstream
Output: https://github.com/pghk13/Kernel-Bug/blob/main/0702_6.14/INFO%3A%20rcu%20detected%20stall%20in%20sys_select/122report.txt
Kernel config: https://github.com/pghk13/Kernel-Bug/blob/main/0305_6.14rc3/config.txt
C reproducer: https://github.com/pghk13/Kernel-Bug/blob/main/0702_6.14/INFO%3A%20rcu%20detected%20stall%20in%20sys_select/122repro.c
Syzlang reproducer: https://github.com/pghk13/Kernel-Bug/blob/main/0702_6.14/INFO%3A%20rcu%20detected%20stall%20in%20sys_select/122repro.txt
Our reproducer mounts a constructed filesystem image.
The stall occurs around line 880 of the code, in the call to csd_lock_wait() made from smp_call_function_many_cond(). On CPU 1, where the RCU GP kthread last ran, the task syz.1.110 is executing the perf_event_open system call, which must update tracepoint calls on all CPUs. smp_call_function_many_cond() is stuck waiting for CPU 2 to respond to the IPI.
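To illustrate the wait pattern described above, here is a minimal user-space sketch in Python. It is only a toy model under our own assumptions, not kernel code: the names Csd, target_cpu, csd_lock_wait_model and call_many_model are hypothetical, threads stand in for CPUs, and the timeout exists only so that the model terminates. The sender marks one "csd" per target as locked, each responsive target clears its lock, and the sender spins until every lock is released. A target that never handles the request leaves the sender waiting, which is the situation CPU 1 appears to be in with respect to CPU 2.

```python
import threading
import time


class Csd:
    """Toy stand-in for per-CPU call data: 'locked' until the target handles the call."""

    def __init__(self):
        self.locked = threading.Event()


def target_cpu(csd, responsive):
    # A responsive "CPU" handles the request and releases the lock.
    # An unresponsive one (assumed here: never services the IPI) does nothing.
    if responsive:
        time.sleep(0.01)
        csd.locked.clear()


def csd_lock_wait_model(csd, timeout_s):
    """Spin until the target clears the lock; return False if timeout_s elapses first.

    The timeout is an addition of this model so that it always terminates.
    """
    deadline = time.monotonic() + timeout_s
    while csd.locked.is_set():
        if time.monotonic() >= deadline:
            return False
        time.sleep(0.001)
    return True


def call_many_model(responsive_by_cpu, timeout_s=0.5):
    """Send a 'call' to every CPU, wait for each, and return the CPUs that never answered."""
    csds = {}
    for cpu, responsive in responsive_by_cpu.items():
        csd = Csd()
        csd.locked.set()  # lock the csd before "sending the IPI"
        csds[cpu] = csd
        threading.Thread(target=target_cpu, args=(csd, responsive), daemon=True).start()
    return [cpu for cpu, csd in csds.items() if not csd_lock_wait_model(csd, timeout_s)]


if __name__ == "__main__":
    # CPU 2 is modelled as unresponsive, mirroring the report.
    print(call_many_model({0: True, 2: False, 3: True}))
```

In this model the sender reports CPU 2 as the one that never released its lock. Without the timeout, the sender would spin on that lock indefinitely.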
We have reproduced this issue several times on 6.14.
If you fix this issue, please add the following tags to the commit:
Reported-by: Kun Hu <huk23@...udan.edu.cn>
Reported-by: Jiaji Qin <jjtan24@...udan.edu.cn>
Reported-by: Shuoran Bai <baishuoran@...eu.edu.cn>
rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
rcu: 2-...!: (3 GPs behind) idle=b834/1/0x4000000000000000 softirq=23574/23574 fqs=5
rcu: (detected by 1, t=10502 jiffies, g=19957, q=594 ncpus=4)
Sending NMI from CPU 1 to CPUs 2:
NMI backtrace for cpu 2
CPU: 2 UID: 0 PID: 9461 Comm: sshd Not tainted 6.14.0 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
RIP: 0010:__lock_acquire+0x106/0x46b0
Code: ff df 4c 89 ea 48 c1 ea 03 80 3c 02 00 0f 85 ec 35 00 00 49 8b 45 00 48 3d a0 c7 8a 93 0f 84 29 0f 00 00 44 8b 05 2a dc 74 0c <45> 85 c0 0f 84 ad 06 00 00 48 3d e0 c7 8a 93 0f 84 a1 06 00 00 41
RSP: 0018:ffffc90000568ac8 EFLAGS: 00000002
RAX: ffffffff9aab9a20 RBX: 0000000000000000 RCX: 1ffff920000ad16c
RDX: 1ffffffff35692cf RSI: 0000000000000000 RDI: ffffffff9ab49678
RBP: ffff8880201aa480 R08: 0000000000000001 R09: 0000000000000001
R10: 0000000000000001 R11: ffffffff90617d17 R12: 0000000000000000
R13: ffffffff9ab49678 R14: 0000000000000000 R15: 0000000000000000
FS: 00007fa644657900(0000) GS:ffff88802b900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f0fa92178a9 CR3: 0000000000e90000 CR4: 0000000000750ef0
PKRU: 55555554
Call Trace:
<NMI>
</NMI>
<IRQ>
lock_acquire+0x1b6/0x570
_raw_spin_lock_irqsave+0x3d/0x60
debug_object_deactivate+0x139/0x390
__hrtimer_run_queues+0x416/0xc30
hrtimer_interrupt+0x398/0x890
__sysvec_apic_timer_interrupt+0x114/0x400
sysvec_apic_timer_interrupt+0xa3/0xc0
</IRQ>
<TASK>
asm_sysvec_apic_timer_interrupt+0x1a/0x20
RIP: 0010:lock_acquire+0x1f7/0x570
Code: c1 05 95 c0 6b 7e 83 f8 01 0f 85 fd 02 00 00 9c 58 f6 c4 02 0f 85 e8 02 00 00 4d 85 e4 74 01 fb 48 b8 00 00 00 00 00 fc ff df <48> 01 c3 48 c7 03 00 00 00 00 48 c7 43 08 00 00 00 00 48 8b 84 24
RSP: 0018:ffffc900091dfad8 EFLAGS: 00000206
RAX: dffffc0000000000 RBX: 1ffff9200123bf5e RCX: ffffffff819598fe
RDX: 1ffff110040355ed RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000001 R08: 0000000000000001 R09: fffffbfff2de79a0
R10: fffffbfff2de799f R11: ffffffff96f3ccff R12: 0000000000000200
R13: ffff88804b37d1e0 R14: 0000000000000000 R15: 0000000000000000
__might_fault+0x118/0x190
core_sys_select+0x82c/0xa90
kern_select+0x140/0x1c0
__x64_sys_select+0xbe/0x160
do_syscall_64+0xcf/0x250
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fa644b1de76
Code: 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 11 b8 17 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 62 c3 90 48 83 ec 38 4c 89 44 24 28 48 89 54
RSP: 002b:00007ffe372cf848 EFLAGS: 00000246 ORIG_RAX: 0000000000000017
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fa644b1de76
RDX: 000055cceaf08240 RSI: 000055cceaf0da70 RDI: 000000000000000f
RBP: 000055cceaef8290 R08: 0000000000000000 R09: 00000000002f0000
R10: 0000000000000000 R11: 0000000000000246 R12: 000055ccc2d99768
R13: 0000000000000000 R14: 0000000000000004 R15: 000055ccc2d4eac0
</TASK>
rcu: rcu_preempt kthread starved for 10492 jiffies! g19957 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=1
rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
rcu: RCU grace-period kthread stack dump:
task:rcu_preempt state:R running task stack:26792 pid:18 tgid:18 ppid:2 task_flags:0x208040 flags:0x00004000
Call Trace:
<TASK>
__schedule+0x1074/0x4d30
schedule+0xd4/0x210
schedule_timeout+0x11b/0x280
rcu_gp_fqs_loop+0x624/0xa30
rcu_gp_kthread+0x258/0x360
kthread+0x42a/0x880
ret_from_fork+0x48/0x80
ret_from_fork_asm+0x1a/0x30
</TASK>
rcu: Stack dump where RCU GP kthread last ran:
CPU: 1 UID: 0 PID: 16674 Comm: syz.1.110 Not tainted 6.14.0 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
RIP: 0010:smp_call_function_many_cond+0x65d/0x19d0
Code: e6 e8 07 29 0c 00 45 85 e4 74 48 48 8b 04 24 49 89 c5 83 e0 07 49 c1 ed 03 49 89 c4 4d 01 f5 41 83 c4 03 e8 d5 26 0c 00 f3 90 <41> 0f b6 45 00 41 38 c4 7c 08 84 c0 0f 85 af 10 00 00 8b 45 08 31
RSP: 0018:ffffc9000a9af610 EFLAGS: 00000246
RAX: 0000000000080000 RBX: 0000000000000002 RCX: 0000000000080000
RDX: ffffc900213a1000 RSI: ffff888071aba480 RDI: 0000000000000002
RBP: ffff88802b9469c0 R08: 0000000000000001 R09: fffffbfff2de7999
R10: fffffbfff2de7998 R11: 0000000000000001 R12: 0000000000000003
R13: ffffed1005728d39 R14: dffffc0000000000 R15: 0000000000000001
FS: 00007faad2189700(0000) GS:ffff88807ee00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007faad1560380 CR3: 000000002a4ba000 CR4: 0000000000750ef0
PKRU: 80000000
Call Trace:
<IRQ>
</IRQ>
<TASK>
on_each_cpu_cond_mask+0x5a/0xa0
text_poke_bp_batch+0x216/0x730
text_poke_bp+0x91/0xc0
__static_call_transform+0x334/0x720
arch_static_call_transform+0x5d/0xb0
__static_call_update+0xd5/0x610
tracepoint_update_call+0xc0/0x120
tracepoint_add_func+0x950/0xca0
tracepoint_probe_register_prio+0xa5/0xf0
trace_event_reg+0x297/0x350
perf_trace_event_init+0x53f/0xac0
perf_trace_init+0x1a4/0x2f0
perf_tp_event_init+0xa6/0x120
perf_try_init_event+0x13a/0xcb0
perf_event_alloc+0x1056/0x3e80
__do_sys_perf_event_open+0x5c7/0x29e0
do_syscall_64+0xcf/0x250
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7faad13acadd
Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007faad2188ba8 EFLAGS: 00000246 ORIG_RAX: 000000000000012a
RAX: ffffffffffffffda RBX: 00007faad15a6080 RCX: 00007faad13acadd
RDX: ffffffffffffffff RSI: 0000000000000000 RDI: 0000000020000040
RBP: 00007faad142ab8f R08: 0000000000000000 R09: 0000000000000000
R10: ffffffffffffffff R11: 0000000000000246 R12: 0000000000000000
R13: 00007faad15a608c R14: 00007faad15a6118 R15: 00007faad2188d40
</TASK>
thanks,
Kun Hu