Message-ID: <CAOU40uCe07E+jSONsnFXWfdPHPQjcvEoFX-QdJ2eAw2DqXZ=sg@mail.gmail.com>
Date: Wed, 16 Jul 2025 17:23:29 +0800
From: Xianying Wang <wangxianying546@...il.com>
To: kuniyu@...gle.com
Cc: davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org,
pabeni@...hat.com, horms@...nel.org, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: [BUG] INFO: rcu detected stall in unix_stream_connect
Hi,
While fuzzing with syzkaller, I hit a kernel panic reported as
"INFO: rcu detected stall". The issue reproduces on kernel version
6.16.0-rc5.
From the dmesg log, RCU detects a stall on CPU 0. The NMI backtrace,
which shows what the CPU was actually doing, reveals it was stuck in a
tight loop within the timer interrupt handler. The CPU appears to be
spinning in functions like lapic_next_deadline
(arch/x86/kernel/apic/apic.c:429) while processing a timer softirq in
run_timer_softirq (kernel/time/timer.c:2403).
Meanwhile, the task that was running on CPU 0 before it got stuck in
the interrupt is blocked in the unix_stream_connect function
(net/unix/af_unix.c:1683). The syzkaller reproducer appears to create
a deadlock scenario by having a listening UNIX socket attempt to
connect to its own endpoint.
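For illustration only, here is a minimal sketch of the self-connect pattern described above. It is not the syzkaller reproducer linked below and is not expected to trigger the stall on its own. The helper name self_connect_errno and the abstract socket name are hypothetical. The socket is created non-blocking, so the sketch itself should not hang; it only reports what connect() returns.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Hypothetical helper: bind a UNIX stream socket to an abstract address,
 * put it into the listening state, then connect() it to its own address.
 *
 * Returns -1 if setup (socket/bind/listen) fails, 0 if connect() succeeds,
 * otherwise the errno value reported by connect().
 */
static int self_connect_errno(const char *name)
{
	struct sockaddr_un addr;
	socklen_t len;
	size_t n = strlen(name);
	int fd, ret;

	/* Leave room for the leading NUL byte of an abstract address. */
	if (n == 0 || n > sizeof(addr.sun_path) - 1)
		return -1;

	/* Non-blocking, so connect() cannot sleep waiting for backlog space. */
	fd = socket(AF_UNIX, SOCK_STREAM | SOCK_NONBLOCK, 0);
	if (fd < 0)
		return -1;

	/* Abstract namespace (Linux-specific): sun_path[0] stays '\0'. */
	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	memcpy(addr.sun_path + 1, name, n);
	len = (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + n);

	if (bind(fd, (struct sockaddr *)&addr, len) < 0 || listen(fd, 1) < 0) {
		close(fd);
		return -1;
	}

	/* The listening socket now targets its own endpoint. */
	ret = connect(fd, (struct sockaddr *)&addr, len) == 0 ? 0 : errno;

	close(fd);
	return ret;
}
```

Calling self_connect_errno("self-connect-sketch") from a small test program shows how a given kernel answers the self-connect. The real reproducer additionally runs POSIX timers concurrently, which this sketch deliberately leaves out.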
I suspect this is a complex race condition or deadlock within the
kernel's core timer subsystem. The stress and unusual blocking state
induced by the UNIX socket operations, combined with concurrent POSIX
timer usage, likely exposes a latent bug in the hrtimer or tick
management. This causes the CPU to spin with interrupts disabled,
which in turn triggers the RCU stall.
This can be reproduced on:
HEAD commit: d7b8f8e20813f0179d8ef519541a3527e7661d3a
report: https://pastebin.com/raw/N3GD5hL7
console output: https://pastebin.com/raw/RCZfTKCb
kernel config: https://pastebin.com/raw/xAVw5DnH
C reproducer: https://pastebin.com/raw/Z1B1ray5
Let me know if you need more details or testing.
Best regards,
Xianying