Message-ID: <6964c137.050a0220.eaf7.0097.GAE@google.com>
Date: Mon, 12 Jan 2026 01:39:03 -0800
From: syzbot <syzbot+d8d4c31d40f868eaea30@...kaller.appspotmail.com>
To: linux-kernel@...r.kernel.org, syzkaller-bugs@...glegroups.com
Subject: Forwarded: Private message regarding: [syzbot] [mm?] INFO: rcu detected stall in purge_vmap_node
For archival purposes, forwarding an incoming command email to
linux-kernel@...r.kernel.org, syzkaller-bugs@...glegroups.com.
***
Subject: Private message regarding: [syzbot] [mm?] INFO: rcu detected stall in purge_vmap_node
Author: kapoorarnav43@...il.com
From: Arnav Kapoor <kapoorarnav43@...il.com>
Date: Mon, 12 Jan 2026 15:30:00 +0000
Subject: [PATCH] mm/kasan: add cond_resched() in shadow page table walk
#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
Syzbot reported RCU stalls during vmalloc cleanup:
rcu: INFO: rcu_preempt detected stalls on CPUs/tasks
task:kworker/0:17 state:R running task
purge_vmap_node+0x1ba/0xad0 mm/vmalloc.c:2299
When CONFIG_PAGE_OWNER is enabled, freeing KASAN shadow pages during
vmalloc cleanup triggers expensive stack unwinding via save_stack() ->
unwind_next_frame(), which acquires RCU read locks. Processing large
vmalloc regions can free thousands of shadow pages without yielding,
causing the worker to monopolize the CPU for 10+ seconds and leading
to RCU stalls and potential OOM.
The issue occurs in this call chain:

  purge_vmap_node()
    -> kasan_release_vmalloc_node()
      -> kasan_release_vmalloc()             [for each vmap_area]
        -> __kasan_release_vmalloc()
          -> apply_to_existing_page_range()
            -> kasan_depopulate_vmalloc_pte()  [for each PTE]
              -> __free_page()
                -> __reset_page_owner()      [CONFIG_PAGE_OWNER]
                  -> save_stack()
                    -> unwind_next_frame()   [RCU read lock held]
Each shadow page free triggers stack unwinding under RCU lock. A single
large vmalloc region can have thousands of shadow pages, creating an
unbounded RCU critical section.
The previous attempt to fix this added a cond_resched() between
vmap_areas in kasan_release_vmalloc_node(), but that is insufficient
because a single vmap_area can still cover many shadow pages.
Fix this by adding cond_resched() in the page table walk callback
kasan_depopulate_vmalloc_pte() after every 32 pages. This ensures
regular scheduling points during large shadow region depopulation while
minimizing overhead for typical cases.
The batch size of 32 is chosen to:
- Amortize cond_resched() overhead (typically ~100ns) over multiple pages
- Limit worst-case non-preemptible time to ~3ms on typical hardware
(32 pages × ~100μs per stack unwind)
- Match common TLB and cache behavior
Note: We can't use need_resched() alone because under light CPU load,
need_resched() may remain false while RCU grace periods starve. The
batch count provides a guaranteed upper bound.
Reported-by: syzbot+d8d4c31d40f868eaea30@...kaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=d8d4c31d40f868eaea30
Signed-off-by: Arnav Kapoor <kapoorarnav43@...il.com>
---
mm/kasan/shadow.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/mm/kasan/shadow.c b/mm/kasan/shadow.c
index 000000000000..111111111111 100644
--- a/mm/kasan/shadow.c
+++ b/mm/kasan/shadow.c
@@ -468,9 +468,23 @@ static int kasan_depopulate_vmalloc_pte(pte_t *ptep, unsigned long addr,
 					void *unused)
 {
 	pte_t pte;
 	int none;
+	static DEFINE_PER_CPU(unsigned int, depopulate_batch_count);
+	unsigned int *batch = this_cpu_ptr(&depopulate_batch_count);
 
 	arch_leave_lazy_mmu_mode();
+
+	/*
+	 * With CONFIG_PAGE_OWNER, each page free triggers expensive stack
+	 * unwinding under RCU lock. Yield periodically to prevent RCU stalls
+	 * when processing large vmalloc regions with thousands of shadow pages.
+	 */
+	if (++(*batch) >= 32) {
+		*batch = 0;
+		cond_resched();
+		arch_enter_lazy_mmu_mode();
+	}
+
 	spin_lock(&init_mm.page_table_lock);
 	pte = ptep_get(ptep);
 	none = pte_none(pte);
On Monday, 12 January 2026 at 14:10:07 UTC+5:30 syzbot wrote:
Hello,
syzbot has tested the proposed patch but the reproducer is still triggering
an issue:
INFO: rcu detected stall in unwind_next_frame
rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
rcu: Tasks blocked on level-0 rcu_node (CPUs 0-1): P6892/1:b..l
P6893/3:b..l P6746/1:b..l
rcu: (detected by 1, t=10502 jiffies, g=16737, q=586 ncpus=2)
task:kworker/u8:18 state:R running task stack:24088 pid:6746 tgid:6746
ppid:2 task_flags:0x4208060 flags:0x00080000
Workqueue: kvfree_rcu_reclaim kfree_rcu_monitor
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5256 [inline]
__schedule+0x1139/0x6150 kernel/sched/core.c:6863
preempt_schedule_irq+0x51/0x90 kernel/sched/core.c:7190
irqentry_exit+0x1d8/0x8c0 kernel/entry/common.c:216
asm_sysvec_apic_timer_interrupt+0x1a/0x20
arch/x86/include/asm/idtentry.h:697
RIP: 0010:lock_acquire+0x62/0x330 kernel/locking/lockdep.c:5872
Code: b4 18 12 83 f8 07 0f 87 a2 02 00 00 89 c0 48 0f a3 05 e2 c1 ee 0e 0f
82 74 02 00 00 8b 35 7a f2 ee 0e 85 f6 0f 85 8d 00 00 00 <48> 8b 44 24 30
65 48 2b 05 f9 b3 18 12 0f 85 ad 02 00 00 48 83 c4
RSP: 0018:ffffc90003fbf5b8 EFLAGS: 00000206
RAX: 0000000000000046 RBX: ffffffff8e3c96a0 RCX: 00000000993b8195
RDX: 0000000000000000 RSI: ffffffff8daa8a1d RDI: ffffffff8bf2b400
RBP: 0000000000000002 R08: 00000000e61a05bb R09: 00000000be61a05b
R10: 0000000000000002 R11: ffff888029058b30 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
rcu_lock_acquire include/linux/rcupdate.h:331 [inline]
rcu_read_lock include/linux/rcupdate.h:867 [inline]
class_rcu_constructor include/linux/rcupdate.h:1195 [inline]
unwind_next_frame+0xd1/0x20b0 arch/x86/kernel/unwind_orc.c:495
arch_stack_walk+0x94/0x100 arch/x86/kernel/stacktrace.c:25
stack_trace_save+0x8e/0xc0 kernel/stacktrace.c:122
kasan_save_stack+0x33/0x60 mm/kasan/common.c:57
kasan_save_track+0x14/0x30 mm/kasan/common.c:78
kasan_save_free_info+0x3b/0x60 mm/kasan/generic.c:584
poison_slab_object mm/kasan/common.c:253 [inline]
__kasan_slab_free+0x5f/0x80 mm/kasan/common.c:285
kasan_slab_free include/linux/kasan.h:235 [inline]
slab_free_hook mm/slub.c:2540 [inline]
slab_free_freelist_hook mm/slub.c:2569 [inline]
slab_free_bulk mm/slub.c:6703 [inline]
kmem_cache_free_bulk mm/slub.c:7390 [inline]
kmem_cache_free_bulk+0x2bf/0x680 mm/slub.c:7369
kfree_bulk include/linux/slab.h:830 [inline]
kvfree_rcu_bulk+0x1b7/0x1e0 mm/slab_common.c:1523
kvfree_rcu_drain_ready mm/slab_common.c:1728 [inline]
kfree_rcu_monitor+0x1d0/0x2f0 mm/slab_common.c:1801
process_one_work+0x9ba/0x1b20 kernel/workqueue.c:3257
process_scheduled_works kernel/workqueue.c:3340 [inline]
worker_thread+0x6c8/0xf10 kernel/workqueue.c:3421
kthread+0x3c5/0x780 kernel/kthread.c:463
ret_from_fork+0x983/0xb10 arch/x86/kernel/process.c:158
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246
</TASK>
task:sed state:R running task stack:25800 pid:6893 tgid:6893 ppid:6890
task_flags:0x400000 flags:0x00080000
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5256 [inline]
__schedule+0x1139/0x6150 kernel/sched/core.c:6863
preempt_schedule_common+0x44/0xc0 kernel/sched/core.c:7047
preempt_schedule_thunk+0x16/0x30 arch/x86/entry/thunk.S:12
__raw_spin_unlock include/linux/spinlock_api_smp.h:143 [inline]
_raw_spin_unlock+0x3e/0x50 kernel/locking/spinlock.c:186
spin_unlock include/linux/spinlock.h:391 [inline]
filemap_map_pages+0x1194/0x1e00 mm/filemap.c:3931
do_fault_around mm/memory.c:5713 [inline]
do_read_fault mm/memory.c:5746 [inline]
do_fault+0x9cd/0x1ad0 mm/memory.c:5889
do_pte_missing mm/memory.c:4401 [inline]
handle_pte_fault mm/memory.c:6273 [inline]
__handle_mm_fault+0x1919/0x2bb0 mm/memory.c:6411
handle_mm_fault+0x3fe/0xad0 mm/memory.c:6580
do_user_addr_fault+0x60c/0x1370 arch/x86/mm/fault.c:1336
handle_page_fault arch/x86/mm/fault.c:1476 [inline]
exc_page_fault+0x64/0xc0 arch/x86/mm/fault.c:1532
asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:618
RIP: 0033:0x7fc6d7657c50
RSP: 002b:00007ffe6008c528 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 00007fc6d768e490 RCX: 00007ffe6008c560
RDX: 00007fc6d7689d63 RSI: 00007fc6d7689d36 RDI: 00007ffe6008c748
RBP: 0000000000000041 R08: 00007ffe6008c550 R09: 00007ffe6008c558
R10: 0000000000000004 R11: 0000000000000246 R12: 00007ffe6008c748
R13: 00007ffe6008c550 R14: 00007fc6d76ce000 R15: 00005602504c1d98
</TASK>
task:udevd state:R running task stack:28152 pid:6892 tgid:6892 ppid:5186
task_flags:0x400140 flags:0x00080000
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5256 [inline]
__schedule+0x1139/0x6150 kernel/sched/core.c:6863
preempt_schedule_common+0x44/0xc0 kernel/sched/core.c:7047
preempt_schedule_thunk+0x16/0x30 arch/x86/entry/thunk.S:12
__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:152 [inline]
_raw_spin_unlock_irqrestore+0x61/0x80 kernel/locking/spinlock.c:194
sock_def_readable+0x15b/0x5d0 net/core/sock.c:3611
unix_dgram_sendmsg+0xcbd/0x1830 net/unix/af_unix.c:2286
sock_sendmsg_nosec net/socket.c:727 [inline]
__sock_sendmsg net/socket.c:742 [inline]
sock_write_iter+0x566/0x610 net/socket.c:1195
new_sync_write fs/read_write.c:593 [inline]
vfs_write+0x7d3/0x11d0 fs/read_write.c:686
ksys_write+0x1f8/0x250 fs/read_write.c:738
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xcd/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f6502bdf407
RSP: 002b:00007ffc4b535850 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 00007f6502b53880 RCX: 00007f6502bdf407
RDX: 0000000000000000 RSI: 00007ffc4b5358f7 RDI: 000000000000000a
RBP: 000000000000000a R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000202 R12: 00007f6502b536e8
R13: 0000000000000000 R14: 0000000000000000 R15: 000055685a0a1150
</TASK>
rcu: rcu_preempt kthread starved for 10572 jiffies! g16737 f0x0
RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now
expected behavior.
rcu: RCU grace-period kthread stack dump:
task:rcu_preempt state:R running task stack:28120 pid:16 tgid:16 ppid:2
task_flags:0x208040 flags:0x00080000
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5256 [inline]
__schedule+0x1139/0x6150 kernel/sched/core.c:6863
__schedule_loop kernel/sched/core.c:6945 [inline]
schedule+0xe7/0x3a0 kernel/sched/core.c:6960
schedule_timeout+0x123/0x290 kernel/time/sleep_timeout.c:99
rcu_gp_fqs_loop+0x1ea/0xaf0 kernel/rcu/tree.c:2083
rcu_gp_kthread+0x26d/0x380 kernel/rcu/tree.c:2285
kthread+0x3c5/0x780 kernel/kthread.c:463
ret_from_fork+0x983/0xb10 arch/x86/kernel/process.c:158
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:246
</TASK>
rcu: Stack dump where RCU GP kthread last ran:
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0
CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 10/25/2025
RIP: 0010:pv_native_safe_halt+0xf/0x20 arch/x86/kernel/paravirt.c:82
Code: a6 5f 02 c3 cc cc cc cc 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90
90 90 90 90 f3 0f 1e fa 66 90 0f 00 2d 13 19 12 00 fb f4 <e9> cc 35 03 00
66 2e 0f 1f 84 00 00 00 00 00 66 90 90 90 90 90 90
RSP: 0018:ffffffff8e007df8 EFLAGS: 000002c6
RAX: 0000000000186859 RBX: 0000000000000000 RCX: ffffffff8b7846d9
RDX: 0000000000000000 RSI: ffffffff8daceab2 RDI: ffffffff8bf2b400
RBP: fffffbfff1c12f68 R08: 0000000000000001 R09: ffffed101708673d
R10: ffff8880b84339eb R11: ffffffff8e098670 R12: 0000000000000000
R13: ffffffff8e097b40 R14: ffffffff9088bdd0 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff8881248f5000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055555eaef588 CR3: 00000000522b3000 CR4: 00000000003526f0
Call Trace:
<TASK>
arch_safe_halt arch/x86/include/asm/paravirt.h:107 [inline]
default_idle+0x13/0x20 arch/x86/kernel/process.c:767
default_idle_call+0x6c/0xb0 kernel/sched/idle.c:122
cpuidle_idle_call kernel/sched/idle.c:191 [inline]
do_idle+0x38d/0x510 kernel/sched/idle.c:332
cpu_startup_entry+0x4f/0x60 kernel/sched/idle.c:430
rest_init+0x16b/0x2b0 init/main.c:757
start_kernel+0x3ef/0x4d0 init/main.c:1206
x86_64_start_reservations+0x18/0x30 arch/x86/kernel/head64.c:310
x86_64_start_kernel+0x130/0x190 arch/x86/kernel/head64.c:291
common_startup_64+0x13e/0x148
</TASK>
Tested on:
commit: 0f61b186 Linux 6.19-rc5
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1397199a580000
kernel config: https://syzkaller.appspot.com/x/.config?x=1859476832863c41
dashboard link: https://syzkaller.appspot.com/bug?extid=d8d4c31d40f868eaea30
compiler: gcc (Debian 12.2.0-14+deb12u1) 12.2.0, GNU ld (GNU Binutils for
Debian) 2.40
patch: https://syzkaller.appspot.com/x/patch.diff?x=1704399a580000