linux-kernel - [BUG] rcu: stall detected in sys_mmap with PREEMPT(full) involving timer softirq and DRM vblank disable


Message-ID: <6b7e2fac.a60e.19b97034116.Coremail.23009200614@stu.xidian.edu.cn>
Date: Wed, 7 Jan 2026 13:52:11 +0800 (GMT+08:00)
From: 王志 <23009200614@....xidian.edu.cn>
To: linux-kernel@...r.kernel.org
Cc: rcu@...r.kernel.org, linux-mm@...ck.org, dri-devel@...ts.freedesktop.org,
	x86@...nel.org, paulmck@...nel.org, tglx@...utronix.de,
	peterz@...radead.org, akpm@...ux-foundation.org,
	syzkaller-bugs@...glegroups.com
Subject: [BUG] rcu: stall detected in sys_mmap with PREEMPT(full) involving
 timer softirq and DRM vblank disable

Hello,

I am reporting an RCU stall detected during syzkaller-style fuzz testing.
The stall is reported while executing sys_mmap(), with the
rcu_preempt grace-period kthread starved for over 10 seconds. The
observed stacks involve memory fault handling, timer softirq processing,
and DRM vblank disable paths. With PREEMPT(full) enabled, the RCU grace
period fails to complete.

=== Summary ===
The kernel reports:
    INFO: rcu detected stall in sys_mmap
    rcu_preempt kthread starved for over 10000 jiffies
RCU reports that all quiescent states have been seen, yet the
grace-period kthread does not receive sufficient CPU time to advance
the grace period.

=== Environment ===
Kernel: 6.18.0 (locally built)
Config: PREEMPT(full)
Arch: x86_64
Hardware: QEMU Standard PC (i440FX + PIIX)
Workload: syz-executor (syzkaller-style fuzzing)

=== Triggering context ===
The stall is detected while a syzkaller executor issues sys_mmap()
calls. The task is in the mmap populate path (__mm_populate), faulting
in pages and allocating memory:
    sys_mmap
    vm_mmap_pgoff
    __mm_populate
    populate_vma_page_range
    __get_user_pages
    handle_mm_fault
    do_pte_missing
    get_page_from_freelist
On the same CPU, timer softirq processing executes the DRM vblank
disable logic.
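
For readers unfamiliar with the populate path: vm_mmap_pgoff() reaches
__mm_populate() when the mapping asks for its pages to be faulted in
up front, for example with MAP_POPULATE or a locked mapping. The sketch
below shows that kind of call from userspace. It is not a reproducer
and is not taken from the fuzzer program; the flags the executor used
are not known, and populate_anon_mapping() is a hypothetical helper
name chosen for illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for MAP_ANONYMOUS and MAP_POPULATE */
#endif
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical helper, for illustration only.
 * Maps `len` bytes of private anonymous memory with MAP_POPULATE, so the
 * kernel faults the range in during mmap() itself, then writes to it and
 * unmaps it. Returns 0 on success, -1 on failure.
 */
static int populate_anon_mapping(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    /* MAP_POPULATE is best effort; the pages are normally resident here. */
    memset(p, 0xa5, len);

    return munmap(p, len);
}
```

Calling populate_anon_mapping() with a large length (for instance
64UL << 20) keeps the task in the populate loop longer, which is the
context in which the stall was reported.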

=== Warning details ===
RCU reports:
    INFO: rcu_preempt detected stalls on CPUs/tasks
    rcu_preempt kthread timer wakeup didn't happen for ~10498 jiffies
    rcu_preempt kthread starved for ~10498 jiffies
    Possible timer handling issue on cpu=3
    Unless rcu_preempt kthread gets sufficient CPU time,
    OOM is now expected behavior.
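
The warning gives the starvation time in jiffies, while the rest of
this report speaks of seconds. The two agree only for a tick rate of
CONFIG_HZ=1000, which is an assumption here since the configured HZ is
not stated above. A small sketch of the conversion, with
stall_jiffies_to_ms() as a hypothetical helper name:

```c
/*
 * Hypothetical helper, for illustration only.
 * Converts a jiffies count to milliseconds for a given tick rate.
 * `hz` stands for the kernel's CONFIG_HZ and must be non-zero.
 */
static unsigned long long stall_jiffies_to_ms(unsigned long long jiffies,
                                              unsigned int hz)
{
    return jiffies * 1000ULL / hz;
}
```

With hz = 1000, 10498 jiffies is 10498 ms, or about 10.5 seconds. With
hz = 250 the same count would be about 42 seconds.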

=== Call trace ===
CPU 3 (timer softirq / IRQ context):
    __lock_acquire
    _raw_spin_lock_irqsave
    hrtimer_try_to_cancel
    hrtimer_cancel
    drm_vblank_disable_and_save
    vblank_disable_fn
    call_timer_fn
    run_timer_softirq
    __irq_exit_rcu
    sysvec_apic_timer_interrupt

CPU 3 (task context):
    lock_release
    __set_page_owner
    post_alloc_hook
    get_page_from_freelist
    do_pte_missing
    handle_mm_fault
    __get_user_pages
    populate_vma_page_range
    __mm_populate
    vm_mmap_pgoff
    __x64_sys_mmap

RCU GP kthread:
    rcu_gp_fqs_loop
    rcu_gp_kthread

=== Observations ===
The stall appears to involve an interaction between:
  - sys_mmap() page fault and memory allocation paths
  - Timer softirq processing
  - DRM vblank disable logic acquiring spinlocks
  - PREEMPT(full) scheduling and lockdep instrumentation
The RCU grace-period kthread reports that all quiescent states have been
observed, but remains starved of CPU time for over 10 seconds, suggesting
system-wide scheduling or interrupt/softirq interference rather than a
single blocked CPU.

=== Reproducer ===
No standalone reproducer is currently available.
The issue was observed during syzkaller-style fuzz testing.

=== Expected behavior ===
Memory management operations such as sys_mmap() should not lead to
prolonged RCU stalls, even under adversarial or malformed userspace
workloads.

=== Actual behavior ===
RCU reports prolonged stalls, the rcu_preempt grace-period kthread is
starved for over 10 seconds, and the kernel warns that OOM behavior may
occur.

=== Notes ===
This issue has been observed repeatedly under fuzzing workloads.
Additional logs, kernel configuration, or further traces can be provided
upon request.

Reported-by: Zhi Wang