Message-ID: <696d6477.a70a0220.34546f.0352.GAE@google.com>
Date: Sun, 18 Jan 2026 14:53:43 -0800
From: syzbot <syzbot+8bb3e2bee8a429cc76dd@...kaller.appspotmail.com>
To: linux-kernel@...r.kernel.org, syzkaller-bugs@...glegroups.com
Subject: Forwarded: Private message regarding: [syzbot] [mm?] INFO: rcu
detected stall in sys_execve (6)
For archival purposes, forwarding an incoming command email to
linux-kernel@...r.kernel.org, syzkaller-bugs@...glegroups.com.
***
Subject: Private message regarding: [syzbot] [mm?] INFO: rcu detected stall in sys_execve (6)
Author: kapoorarnav43@...il.com
#syz test
From 533b3d1bb14517adf13a2a99aedb60ecf9fb8402 Mon Sep 17 00:00:00 2001
From: Arnav Kapoor <kapoorarnav43@...il.com>
Date: Mon, 19 Jan 2026 04:22:49 +0530
Subject: [PATCH] netfilter: nf_conntrack: limit buckets processed per
gc_worker call
gc_worker() may process a large number of hash buckets in a single
invocation, leading to long execution times and workqueue lockups. Limit
the number of buckets processed per invocation to 10 so the worker
completes promptly and reschedules itself.

This complements the existing time-based limit and the cond_resched()
calls that already guard against stalls.
Reported-by: syzbot+8bb3e2bee8a429cc76dd@...kaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=8bb3e2bee8a429cc76dd
---
net/netfilter/nf_conntrack_core.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index a3ef8eae7..8a2cdd172 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -1517,6 +1517,7 @@ static void gc_worker(struct work_struct *work)
 	struct conntrack_gc_work *gc_work;
 	unsigned int expired_count = 0;
 	unsigned long next_run;
+	unsigned int bucket_count = 0;
 	s32 delta_time;
 	long count;
@@ -1617,6 +1618,7 @@ static void gc_worker(struct work_struct *work)
 		 */
 		rcu_read_unlock();
 		cond_resched();
+		bucket_count++;
 		i++;
 
 		delta_time = nfct_time_stamp - end_time;
@@ -1626,6 +1628,10 @@ static void gc_worker(struct work_struct *work)
 			gc_work->next_bucket = i;
 			next_run = 0;
 			goto early_exit;
 		}
+		if (bucket_count > 10) {
+			gc_work->next_bucket = i;
+			goto early_exit;
+		}
 	} while (i < hashsz);
--
2.43.0
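
For readers skimming the thread, the shape of the change is the usual
bounded-work, self-rescheduling worker pattern: do at most a fixed amount of
work per invocation, remember where you stopped, and requeue yourself. Below
is a minimal user-space sketch of that pattern; the names (gc_state,
scan_one_bucket, MAX_BUCKETS_PER_RUN, gc_worker_once) are illustrative
stand-ins and are not part of the nf_conntrack sources.

/*
 * Minimal sketch of a worker that caps how many buckets it scans per
 * invocation and records where to resume.  Illustrative only.
 */
#include <stdbool.h>
#include <stdio.h>

#define HASH_SIZE		64
#define MAX_BUCKETS_PER_RUN	10

struct gc_state {
	unsigned int next_bucket;	/* resume point for the next run */
};

/* Stand-in for the per-bucket eviction work done in gc_worker(). */
static void scan_one_bucket(unsigned int bucket)
{
	printf("scanning bucket %u\n", bucket);
}

/* Returns true if the worker must be rescheduled to finish the table. */
static bool gc_worker_once(struct gc_state *gc)
{
	unsigned int i = gc->next_bucket;
	unsigned int done = 0;

	while (i < HASH_SIZE) {
		scan_one_bucket(i);
		i++;

		if (++done >= MAX_BUCKETS_PER_RUN && i < HASH_SIZE) {
			gc->next_bucket = i;	/* early exit, resume later */
			return true;
		}
	}

	gc->next_bucket = 0;			/* full pass finished */
	return false;
}

int main(void)
{
	struct gc_state gc = { .next_bucket = 0 };

	/* Each loop iteration stands in for one delayed-work invocation. */
	while (gc_worker_once(&gc))
		puts("-- rescheduling gc worker --");

	return 0;
}

In the kernel, the requeueing happens via queue_delayed_work() on the
early-exit path of gc_worker(), so a cap like this only bounds how long a
single invocation can keep its worker pool busy; it does not reduce the
total amount of scanning.
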
On Monday, 19 January 2026 at 04:19:03 UTC+5:30 syzbot wrote:
Hello,
syzbot has tested the proposed patch but the reproducer is still triggering
an issue:
BUG: workqueue lockup
BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=-20 stuck for 141s!
Showing busy workqueues and worker pools:
workqueue events: flags=0x100
pwq 2: cpus=0 node=0 flags=0x0 nice=0 active=6 refcnt=7
pending: 3*nsim_dev_hwstats_traffic_work, psi_avgs_work, vmstat_shepherd,
ovs_dp_masks_rebalance
pwq 6: cpus=1 node=0 flags=0x2 nice=0 active=4 refcnt=5
in-flight: 5940:nsim_fib_event_work nsim_fib_event_work
,39:nsim_fib_event_work nsim_fib_event_work
workqueue events_long: flags=0x100
pwq 2: cpus=0 node=0 flags=0x0 nice=0 active=4 refcnt=5
pending: 4*defense_work_handler
workqueue events_unbound: flags=0x2
pwq 8: cpus=0-1 flags=0x6 nice=0 active=2 refcnt=3
in-flight: 3887:toggle_allocation_gate
pending: flush_memcg_stats_dwork
workqueue events_unbound: flags=0x2
pwq 8: cpus=0-1 flags=0x6 nice=0 active=8 refcnt=9
in-flight: 60:cfg80211_wiphy_work ,3910:nsim_dev_trap_report_work
,1136:nsim_dev_trap_report_work ,4325:nsim_dev_trap_report_work
,3517:cfg80211_wiphy_work ,1101:nsim_dev_trap_report_work ,3469:crng_reseed
pending: nsim_dev_trap_report_work
workqueue events_freezable: flags=0x104
pwq 2: cpus=0 node=0 flags=0x0 nice=0 active=1 refcnt=2
pending: update_balloon_stats_func
workqueue events_power_efficient: flags=0x180
pwq 2: cpus=0 node=0 flags=0x0 nice=0 active=8 refcnt=9
in-flight: 794:reg_check_chans_work
pending: neigh_managed_work, neigh_periodic_work, 2*check_lifetime,
do_cache_clean, 2*check_lifetime
pwq 6: cpus=1 node=0 flags=0x2 nice=0 active=2 refcnt=3
in-flight: 5865:neigh_periodic_work ,24:gc_worker
workqueue kvfree_rcu_reclaim: flags=0xa
pwq 8: cpus=0-1 flags=0x6 nice=0 active=2 refcnt=3
in-flight: 1013:kfree_rcu_monitor
pending: kfree_rcu_monitor
pwq 8: cpus=0-1 flags=0x6 nice=0 active=1 refcnt=2
in-flight: 1141:kfree_rcu_monitor
workqueue mm_percpu_wq: flags=0x8
pwq 2: cpus=0 node=0 flags=0x0 nice=0 active=1 refcnt=2
pending: vmstat_update
workqueue writeback: flags=0x4a
pwq 8: cpus=0-1 flags=0x6 nice=0 active=1 refcnt=2
in-flight: 4346:wb_workfn
workqueue kblockd: flags=0x18
pwq 3: cpus=0 node=0 flags=0x0 nice=-20 active=1 refcnt=2
pending: blk_mq_run_work_fn
pwq 7: cpus=1 node=0 flags=0x0 nice=-20 active=2 refcnt=3
pending: blk_mq_timeout_work, blk_mq_requeue_work
workqueue ipv6_addrconf: flags=0x6000a
pwq 8: cpus=0-1 flags=0x6 nice=0 active=1 refcnt=231
in-flight: 340:addrconf_dad_work
inactive: 221*addrconf_dad_work, addrconf_verify_work, addrconf_dad_work,
4*addrconf_verify_work
workqueue krxrpcd: flags=0x2001a
pwq 9: cpus=0-1 node=0 flags=0x4 nice=-20 active=1 refcnt=9
pending: rxrpc_peer_keepalive_worker
inactive: 5*rxrpc_peer_keepalive_worker
workqueue bat_events: flags=0x6000a
pwq 8: cpus=0-1 flags=0x6 nice=0 active=1 refcnt=40
pending: batadv_mcast_mla_update
inactive: 4*batadv_mcast_mla_update,
7*batadv_iv_send_outstanding_bat_ogm_packet, 5*batadv_purge_orig,
5*batadv_iv_send_outstanding_bat_ogm_packet, 5*batadv_tt_purge,
batadv_dat_purge, 2*batadv_bla_periodic_work, batadv_dat_purge,
batadv_bla_periodic_work, batadv_dat_purge, batadv_bla_periodic_work,
batadv_dat_purge, batadv_bla_periodic_work, batadv_dat_purge
workqueue hci0: flags=0x20012
pwq 9: cpus=0-1 node=0 flags=0x4 nice=-20 active=1 refcnt=4
pending: hci_conn_timeout
workqueue hci2: flags=0x20012
pwq 9: cpus=0-1 node=0 flags=0x4 nice=-20 active=1 refcnt=4
pending: hci_conn_timeout
workqueue wg-kex-wg0: flags=0x124
pwq 6: cpus=1 node=0 flags=0x2 nice=0 active=1 refcnt=2
pending: wg_packet_handshake_receive_worker
workqueue wg-kex-wg0: flags=0x6
pwq 8: cpus=0-1 flags=0x6 nice=0 active=1 refcnt=2
pending: wg_packet_handshake_send_worker
workqueue wg-crypt-wg0: flags=0x128
pwq 6: cpus=1 node=0 flags=0x2 nice=0 active=1 refcnt=2
pending: wg_packet_encrypt_worker
workqueue wg-crypt-wg1: flags=0x128
pwq 2: cpus=0 node=0 flags=0x0 nice=0 active=1 refcnt=2
in-flight: 9:wg_packet_tx_worker
workqueue wg-kex-wg2: flags=0x6
pwq 8: cpus=0-1 flags=0x6 nice=0 active=1 refcnt=2
pending: wg_packet_handshake_send_worker
workqueue wg-crypt-wg2: flags=0x128
pwq 2: cpus=0 node=0 flags=0x0 nice=0 active=2 refcnt=3
in-flight: 5963:wg_packet_tx_worker
pending: wg_packet_encrypt_worker
pwq 6: cpus=1 node=0 flags=0x2 nice=0 active=5 refcnt=6
in-flight: 6465:wg_packet_encrypt_worker wg_packet_encrypt_worker
,5964:wg_packet_tx_worker wg_packet_tx_worker
pending: wg_packet_decrypt_worker
workqueue wg-kex-wg0: flags=0x6
pwq 8: cpus=0-1 flags=0x6 nice=0 active=3 refcnt=4
in-flight: 1045:wg_packet_handshake_send_worker
,13:wg_packet_handshake_send_worker wg_packet_handshake_send_worker
workqueue wg-crypt-wg1: flags=0x128
pwq 6: cpus=1 node=0 flags=0x2 nice=0 active=2 refcnt=3
pending: wg_packet_tx_worker, wg_packet_encrypt_worker
pool 2: cpus=0 node=0 flags=0x0 nice=0 hung=64s workers=6 idle: 5889 5941 10
pool 6: cpus=1 node=0 flags=0x2 nice=0 hung=65s workers=7 manager: 128
pool 8: cpus=0-1 flags=0x6 nice=0 hung=65s workers=18 manager: 36 idle: 12 1341 50
Showing backtraces of running workers in stalled CPU-bound worker pools:
Tested on:
commit: f40ddcc0 Revert "nfc/nci: Add the inconsistency check ..
git tree: net
console output: https://syzkaller.appspot.com/x/log.txt?x=15a7db9a580000
kernel config: https://syzkaller.appspot.com/x/.config?x=323fe5bdde2384a5
dashboard link: https://syzkaller.appspot.com/bug?extid=8bb3e2bee8a429cc76dd
compiler: Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
patch: https://syzkaller.appspot.com/x/patch.diff?x=143ff522580000