Message-ID: <696d6477.a70a0220.34546f.0352.GAE@google.com>
Date: Sun, 18 Jan 2026 14:53:43 -0800
From: syzbot <syzbot+8bb3e2bee8a429cc76dd@...kaller.appspotmail.com>
To: linux-kernel@...r.kernel.org, syzkaller-bugs@...glegroups.com
Subject: Forwarded: Private message regarding: [syzbot] [mm?] INFO: rcu
detected stall in sys_execve (6)
For archival purposes, forwarding an incoming command email to
linux-kernel@...r.kernel.org, syzkaller-bugs@...glegroups.com.
***
Subject: Private message regarding: [syzbot] [mm?] INFO: rcu detected stall in sys_execve (6)
Author: kapoorarnav43@...il.com
#syz test
From 533b3d1bb14517adf13a2a99aedb60ecf9fb8402 Mon Sep 17 00:00:00 2001
From: Arnav Kapoor <kapoorarnav43@...il.com>
Date: Mon, 19 Jan 2026 04:22:49 +0530
Subject: [PATCH] netfilter: nf_conntrack: limit buckets processed per
gc_worker call
gc_worker() may process a large number of hash buckets in a single
invocation, leading to long execution times and workqueue lockups. Limit
the number of buckets processed per invocation to 10 so the worker
completes promptly and reschedules itself.

This complements the existing time-based limit and the cond_resched()
calls that already guard against stalls.
Reported-by: syzbot+8bb3e2bee8a429cc76dd@...kaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=8bb3e2bee8a429cc76dd
---
net/netfilter/nf_conntrack_core.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index a3ef8eae7..8a2cdd172 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -1517,6 +1517,7 @@ static void gc_worker(struct work_struct *work)
 	struct conntrack_gc_work *gc_work;
 	unsigned int expired_count = 0;
 	unsigned long next_run;
+	unsigned int bucket_count = 0;
 	s32 delta_time;
 	long count;
@@ -1617,6 +1618,7 @@ static void gc_worker(struct work_struct *work)
 		 */
 		rcu_read_unlock();
 		cond_resched();
+		bucket_count++;
 		i++;
 
 		delta_time = nfct_time_stamp - end_time;
@@ -1626,6 +1628,10 @@ static void gc_worker(struct work_struct *work)
 			gc_work->next_bucket = i;
 			next_run = 0;
 			goto early_exit;
 		}
+		if (bucket_count > 10) {
+			gc_work->next_bucket = i;
+			goto early_exit;
+		}
 	} while (i < hashsz);
--
2.43.0
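
For readers skimming the thread, the shape of the change is the usual
bounded-work, self-rescheduling worker pattern: do at most a fixed amount of
work per invocation, remember where you stopped, and requeue yourself. Below
is a minimal user-space sketch of that pattern; the names (gc_state,
scan_one_bucket, MAX_BUCKETS_PER_RUN, gc_worker_once) are illustrative
stand-ins and are not part of the nf_conntrack sources.

/*
 * Minimal sketch of a worker that caps how many buckets it scans per
 * invocation and records where to resume.  Illustrative only.
 */
#include <stdbool.h>
#include <stdio.h>

#define HASH_SIZE		64
#define MAX_BUCKETS_PER_RUN	10

struct gc_state {
	unsigned int next_bucket;	/* resume point for the next run */
};

/* Stand-in for the per-bucket eviction work done in gc_worker(). */
static void scan_one_bucket(unsigned int bucket)
{
	printf("scanning bucket %u\n", bucket);
}

/* Returns true if the worker must be rescheduled to finish the table. */
static bool gc_worker_once(struct gc_state *gc)
{
	unsigned int i = gc->next_bucket;
	unsigned int done = 0;

	while (i < HASH_SIZE) {
		scan_one_bucket(i);
		i++;

		if (++done >= MAX_BUCKETS_PER_RUN && i < HASH_SIZE) {
			gc->next_bucket = i;	/* early exit, resume later */
			return true;
		}
	}

	gc->next_bucket = 0;			/* full pass finished */
	return false;
}

int main(void)
{
	struct gc_state gc = { .next_bucket = 0 };

	/* Each loop iteration stands in for one delayed-work invocation. */
	while (gc_worker_once(&gc))
		puts("-- rescheduling gc worker --");

	return 0;
}

In the kernel, the requeueing happens via queue_delayed_work() on the
early-exit path of gc_worker(), so a cap like this only bounds how long a
single invocation can keep its worker pool busy; it does not reduce the
total amount of scanning.
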
On Monday, 19 January 2026 at 04:19:03 UTC+5:30 syzbot wrote:
Hello,
syzbot has tested the proposed patch but the reproducer is still triggering
an issue:
BUG: workqueue lockup
BUG: workqueue lockup - pool cpus=1 node=0 flags=0x0 nice=-20 stuck for 141s!
Showing busy workqueues and worker pools:
workqueue events: flags=0x100
pwq 2: cpus=0 node=0 flags=0x0 nice=0 active=6 refcnt=7
pending: 3*nsim_dev_hwstats_traffic_work, psi_avgs_work, vmstat_shepherd,
ovs_dp_masks_rebalance
pwq 6: cpus=1 node=0 flags=0x2 nice=0 active=4 refcnt=5
in-flight: 5940:nsim_fib_event_work nsim_fib_event_work
,39:nsim_fib_event_work nsim_fib_event_work
workqueue events_long: flags=0x100
pwq 2: cpus=0 node=0 flags=0x0 nice=0 active=4 refcnt=5
pending: 4*defense_work_handler
workqueue events_unbound: flags=0x2
pwq 8: cpus=0-1 flags=0x6 nice=0 active=2 refcnt=3
in-flight: 3887:toggle_allocation_gate
pending: flush_memcg_stats_dwork
workqueue events_unbound: flags=0x2
pwq 8: cpus=0-1 flags=0x6 nice=0 active=8 refcnt=9
in-flight: 60:cfg80211_wiphy_work ,3910:nsim_dev_trap_report_work
,1136:nsim_dev_trap_report_work ,4325:nsim_dev_trap_report_work
,3517:cfg80211_wiphy_work ,1101:nsim_dev_trap_report_work ,3469:crng_reseed
pending: nsim_dev_trap_report_work
workqueue events_freezable: flags=0x104
pwq 2: cpus=0 node=0 flags=0x0 nice=0 active=1 refcnt=2
pending: update_balloon_stats_func
workqueue events_power_efficient: flags=0x180
pwq 2: cpus=0 node=0 flags=0x0 nice=0 active=8 refcnt=9
in-flight: 794:reg_check_chans_work
pending: neigh_managed_work, neigh_periodic_work, 2*check_lifetime,
do_cache_clean, 2*check_lifetime
pwq 6: cpus=1 node=0 flags=0x2 nice=0 active=2 refcnt=3
in-flight: 5865:neigh_periodic_work ,24:gc_worker
workqueue kvfree_rcu_reclaim: flags=0xa
pwq 8: cpus=0-1 flags=0x6 nice=0 active=2 refcnt=3
in-flight: 1013:kfree_rcu_monitor
pending: kfree_rcu_monitor
pwq 8: cpus=0-1 flags=0x6 nice=0 active=1 refcnt=2
in-flight: 1141:kfree_rcu_monitor
workqueue mm_percpu_wq: flags=0x8
pwq 2: cpus=0 node=0 flags=0x0 nice=0 active=1 refcnt=2
pending: vmstat_update
workqueue writeback: flags=0x4a
pwq 8: cpus=0-1 flags=0x6 nice=0 active=1 refcnt=2
in-flight: 4346:wb_workfn
workqueue kblockd: flags=0x18
pwq 3: cpus=0 node=0 flags=0x0 nice=-20 active=1 refcnt=2
pending: blk_mq_run_work_fn
pwq 7: cpus=1 node=0 flags=0x0 nice=-20 active=2 refcnt=3
pending: blk_mq_timeout_work, blk_mq_requeue_work
workqueue ipv6_addrconf: flags=0x6000a
pwq 8: cpus=0-1 flags=0x6 nice=0 active=1 refcnt=231
in-flight: 340:addrconf_dad_work
inactive: 221*addrconf_dad_work, addrconf_verify_work, addrconf_dad_work,
4*addrconf_verify_work
workqueue krxrpcd: flags=0x2001a
pwq 9: cpus=0-1 node=0 flags=0x4 nice=-20 active=1 refcnt=9
pending: rxrpc_peer_keepalive_worker
inactive: 5*rxrpc_peer_keepalive_worker
workqueue bat_events: flags=0x6000a
pwq 8: cpus=0-1 flags=0x6 nice=0 active=1 refcnt=40
pending: batadv_mcast_mla_update
inactive: 4*batadv_mcast_mla_update,
7*batadv_iv_send_outstanding_bat_ogm_packet, 5*batadv_purge_orig,
5*batadv_iv_send_outstanding_bat_ogm_packet, 5*batadv_tt_purge,
batadv_dat_purge, 2*batadv_bla_periodic_work, batadv_dat_purge,
batadv_bla_periodic_work, batadv_dat_purge, batadv_bla_periodic_work,
batadv_dat_purge, batadv_bla_periodic_work, batadv_dat_purge
workqueue hci0: flags=0x20012
pwq 9: cpus=0-1 node=0 flags=0x4 nice=-20 active=1 refcnt=4
pending: hci_conn_timeout
workqueue hci2: flags=0x20012
pwq 9: cpus=0-1 node=0 flags=0x4 nice=-20 active=1 refcnt=4
pending: hci_conn_timeout
workqueue wg-kex-wg0: flags=0x124
pwq 6: cpus=1 node=0 flags=0x2 nice=0 active=1 refcnt=2
pending: wg_packet_handshake_receive_worker
workqueue wg-kex-wg0: flags=0x6
pwq 8: cpus=0-1 flags=0x6 nice=0 active=1 refcnt=2
pending: wg_packet_handshake_send_worker
workqueue wg-crypt-wg0: flags=0x128
pwq 6: cpus=1 node=0 flags=0x2 nice=0 active=1 refcnt=2
pending: wg_packet_encrypt_worker
workqueue wg-crypt-wg1: flags=0x128
pwq 2: cpus=0 node=0 flags=0x0 nice=0 active=1 refcnt=2
in-flight: 9:wg_packet_tx_worker
workqueue wg-kex-wg2: flags=0x6
pwq 8: cpus=0-1 flags=0x6 nice=0 active=1 refcnt=2
pending: wg_packet_handshake_send_worker
workqueue wg-crypt-wg2: flags=0x128
pwq 2: cpus=0 node=0 flags=0x0 nice=0 active=2 refcnt=3
in-flight: 5963:wg_packet_tx_worker
pending: wg_packet_encrypt_worker
pwq 6: cpus=1 node=0 flags=0x2 nice=0 active=5 refcnt=6
in-flight: 6465:wg_packet_encrypt_worker wg_packet_encrypt_worker
,5964:wg_packet_tx_worker wg_packet_tx_worker
pending: wg_packet_decrypt_worker
workqueue wg-kex-wg0: flags=0x6
pwq 8: cpus=0-1 flags=0x6 nice=0 active=3 refcnt=4
in-flight: 1045:wg_packet_handshake_send_worker
,13:wg_packet_handshake_send_worker wg_packet_handshake_send_worker
workqueue wg-crypt-wg1: flags=0x128
pwq 6: cpus=1 node=0 flags=0x2 nice=0 active=2 refcnt=3
pending: wg_packet_tx_worker, wg_packet_encrypt_worker
pool 2: cpus=0 node=0 flags=0x0 nice=0 hung=64s workers=6 idle: 5889 5941 10
pool 6: cpus=1 node=0 flags=0x2 nice=0 hung=65s workers=7 manager: 128
pool 8: cpus=0-1 flags=0x6 nice=0 hung=65s workers=18 manager: 36 idle: 12 1341 50
Showing backtraces of running workers in stalled CPU-bound worker pools:
Tested on:
commit: f40ddcc0 Revert "nfc/nci: Add the inconsistency check ..
git tree: net
console output: https://syzkaller.appspot.com/x/log.txt?x=15a7db9a580000
kernel config: https://syzkaller.appspot.com/x/.config?x=323fe5bdde2384a5
dashboard link: https://syzkaller.appspot.com/bug?extid=8bb3e2bee8a429cc76dd
compiler: Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
patch: https://syzkaller.appspot.com/x/patch.diff?x=143ff522580000