
Message-ID: <CAOU40uCe07E+jSONsnFXWfdPHPQjcvEoFX-QdJ2eAw2DqXZ=sg@mail.gmail.com>
Date: Wed, 16 Jul 2025 17:23:29 +0800
From: Xianying Wang <wangxianying546@...il.com>
To: kuniyu@...gle.com
Cc: davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org, 
	pabeni@...hat.com, horms@...nel.org, netdev@...r.kernel.org, 
	linux-kernel@...r.kernel.org
Subject: [BUG] INFO: rcu detected stall in unix_stream_connect

Hi,

I found a kernel issue while fuzzing with syzkaller, reported as
"INFO: rcu detected stall". It reproduces on kernel version
6.16.0-rc5.

From the dmesg log, RCU detects a stall on CPU 0. The NMI backtrace,
which shows what the CPU was actually doing, reveals it was stuck in a
tight loop within the timer interrupt handler. The CPU appears to be
spinning in functions like lapic_next_deadline
(arch/x86/kernel/apic/apic.c:429) while processing a timer softirq in
run_timer_softirq (kernel/time/timer.c:2403).

Meanwhile, the task that was running on CPU 0 before it got stuck in
the interrupt is blocked in the unix_stream_connect function
(net/unix/af_unix.c:1683). The syzkaller reproducer appears to create
a deadlock scenario by having a listening UNIX socket attempt to
connect to its own endpoint.
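
To make the self-connect pattern concrete, here is a minimal userspace
sketch. It is not the syzkaller reproducer (that one is in C and also
uses POSIX timers), and the function name and temporary socket path are
made up for illustration. The socket is set non-blocking so that the
sketch returns instead of sleeping in connect(); the reproducer appears
to hit the blocking path instead.

```python
import os
import socket
import tempfile


def self_connect_errno():
    """Bind and listen on a UNIX stream socket, then connect it to itself.

    Returns the errno of the failed connect(), or 0 if connect() succeeded.
    The name and the temporary path are illustrative only.
    """
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, "self.sock")
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    bound = False
    try:
        sock.bind(path)
        bound = True
        sock.listen(1)
        # Non-blocking: a full backlog produces an error rather than
        # putting the caller to sleep inside connect().
        sock.setblocking(False)
        try:
            sock.connect(path)  # the listener connects to its own address
        except OSError as exc:
            return exc.errno
        return 0
    finally:
        sock.close()
        if bound:
            os.unlink(path)
        os.rmdir(tmpdir)


if __name__ == "__main__":
    print("connect() errno:", self_connect_errno())
```

With an empty accept queue I would expect connect() to fail right away
rather than succeed, since the calling socket is already listening. The
exact errno may differ between kernels.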

I suspect this is a complex race condition or deadlock within the
kernel's core timer subsystem. The stress and unusual blocking state
induced by the UNIX socket operations, combined with concurrent POSIX
timer usage, likely exposes a latent bug in the hrtimer or tick
management. This causes the CPU to spin with interrupts disabled,
which in turn triggers the RCU stall.

This can be reproduced on:

HEAD commit: d7b8f8e20813f0179d8ef519541a3527e7661d3a

report: https://pastebin.com/raw/N3GD5hL7

console output: https://pastebin.com/raw/RCZfTKCb

kernel config: https://pastebin.com/raw/xAVw5DnH

C reproducer: https://pastebin.com/raw/Z1B1ray5

Let me know if you need more details or testing.

Best regards,

Xianying