Message-ID: <aJozN9LVgaPFX9dX@slm.duckdns.org>
Date: Mon, 11 Aug 2025 08:15:19 -1000
From: Tejun Heo <tj@...nel.org>
To: Qu Wenruo <wqu@...e.com>
Cc: syzbot <syzbot+ead9101689c4ca30dbe8@...kaller.appspotmail.com>,
	anna-maria@...utronix.de, clm@...com, dsterba@...e.com,
	frederic@...nel.org, josef@...icpanda.com,
	linux-btrfs@...r.kernel.org, linux-kernel@...r.kernel.org,
	syzkaller-bugs@...glegroups.com, tglx@...utronix.de,
	jiangshanlai@...il.com
Subject: Re: [syzbot] [btrfs?] INFO: task hung in __alloc_workqueue (2)

Hello,

On Mon, Aug 11, 2025 at 08:02:40AM +0930, Qu Wenruo wrote:
...
> > Call Trace:
> >   <TASK>
> >   context_switch kernel/sched/core.c:5357 [inline]
> >   __schedule+0x16f3/0x4c20 kernel/sched/core.c:6961
> >   __schedule_loop kernel/sched/core.c:7043 [inline]
> >   schedule+0x165/0x360 kernel/sched/core.c:7058
> >   schedule_timeout+0x9a/0x270 kernel/time/sleep_timeout.c:75
> >   do_wait_for_common kernel/sched/completion.c:100 [inline]
> >   __wait_for_common kernel/sched/completion.c:121 [inline]
> >   wait_for_common kernel/sched/completion.c:132 [inline]
> >   wait_for_completion+0x2bf/0x5d0 kernel/sched/completion.c:153
> >   kthread_flush_worker+0x1c6/0x240 kernel/kthread.c:1563
> 
> This is flushing pwq_release_worker during error handling, and I didn't see
> anything btrfs specific, except that btrfs is allocating an ordered
> workqueue, which utilizes the WQ_UNBOUND flag.
> 
> And that WQ_UNBOUND flag is pretty widely used among other filesystems, so
> maybe it's just that btrfs has too many workqueues and that triggers this?
> 
> Adding workqueue maintainers.
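
Side note on the WQ_UNBOUND bit: ordered workqueues are unbound by
construction. On the kernel versions I'm aware of, alloc_ordered_workqueue()
boils down to alloc_workqueue() with WQ_UNBOUND | __WQ_ORDERED and
max_active == 1, so callers never pass WQ_UNBOUND by hand. A purely
illustrative sketch of that allocation pattern (the names below are made up,
this is not the actual btrfs code):

  #include <linux/workqueue.h>

  /* Illustrative only: an ordered workqueue is an unbound workqueue
   * limited to one in-flight work item, so WQ_UNBOUND is implied by
   * the alloc_ordered_workqueue() wrapper rather than passed by hand. */
  static struct workqueue_struct *example_wq;     /* hypothetical name */

  static int example_setup(void)
  {
          /* Per the quoted trace, a failure in here is what ends up
           * flushing pwq_release_worker, inside __alloc_workqueue's
           * error handling. */
          example_wq = alloc_ordered_workqueue("example-ordered",
                                               WQ_MEM_RECLAIM);
          if (!example_wq)
                  return -ENOMEM;
          return 0;
  }

  static void example_teardown(void)
  {
          destroy_workqueue(example_wq);
  }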

That flush stall is likely a secondary effect of some other failure. The
kthread worker is already spun up and running at that point, and the work
function, pwq_release_workfn(), grabs several locks, so maybe someone else is
sitting on those, or the kthread is somehow not being allowed to run?
Continued below.
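
To make the "secondary effect" part concrete, here is a userspace analogue
in plain pthreads (nothing below is kernel code, the names are invented):
the thread that waits for the worker is the one that looks hung whenever
somebody else sits on the lock the work function needs, even though the
worker itself is alive and runnable the whole time.

  /* build: cc -pthread flush-analogue.c */
  #include <pthread.h>
  #include <stdio.h>
  #include <unistd.h>

  static pthread_mutex_t shared_lock = PTHREAD_MUTEX_INITIALIZER;

  /* stands in for pwq_release_workfn(): does its work under a lock */
  static void *worker(void *arg)
  {
          pthread_mutex_lock(&shared_lock);
          puts("worker: got the lock, doing the release work");
          pthread_mutex_unlock(&shared_lock);
          return NULL;
  }

  /* stands in for whoever is really holding the locks */
  static void *hog(void *arg)
  {
          pthread_mutex_lock(&shared_lock);
          sleep(30);      /* long enough to look like a hang */
          pthread_mutex_unlock(&shared_lock);
          return NULL;
  }

  int main(void)
  {
          pthread_t h, w;

          pthread_create(&h, NULL, hog, NULL);
          sleep(1);       /* make sure hog() owns the lock first */
          pthread_create(&w, NULL, worker, NULL);

          /* the "flush": this join stalls for ~30s even though the
           * worker thread is runnable the whole time */
          pthread_join(w, NULL);
          pthread_join(h, NULL);
          return 0;
  }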

> > Showing all locks held in the system:
> > 1 lock held by khungtaskd/38:
> >   #0: ffffffff8d9a8b80 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:331 [inline]
> >   #0: ffffffff8d9a8b80 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:841 [inline]
> >   #0: ffffffff8d9a8b80 (rcu_read_lock){....}-{1:3}, at: debug_show_all_locks+0x2e/0x180 kernel/locking/lockdep.c:6775
> > 1 lock held by udevd/5207:
> >   #0: ffff8880358bfa18 (&ep->lock){++++}-{3:3}, at: write_lock_irq include/linux/rwlock_rt.h:104 [inline]
> >   #0: ffff8880358bfa18 (&ep->lock){++++}-{3:3}, at: ep_poll fs/eventpoll.c:2127 [inline]
> >   #0: ffff8880358bfa18 (&ep->lock){++++}-{3:3}, at: do_epoll_wait+0x84d/0xbb0 fs/eventpoll.c:2560
> > 2 locks held by getty/5598:
> >   #0: ffff88823bfae8a0 (&tty->ldisc_sem){++++}-{0:0}, at: tty_ldisc_ref_wait+0x25/0x70 drivers/tty/tty_ldisc.c:243
> >   #1: ffffc90003e8b2e0 (&ldata->atomic_read_lock){+.+.}-{4:4}, at: n_tty_read+0x444/0x1410 drivers/tty/n_tty.c:2222
> > 3 locks held by kworker/u8:3/5911:
> > 3 locks held by kworker/u8:7/5942:
> > 6 locks held by udevd/6060:
> > 1 lock held by udevd/6069:
> > 1 lock held by udevd/6190:
> > 6 locks held by udevd/6237:
~~~trimmed~~~

That's a lot of locks being held, so something is definitely not going right.

> > Sending NMI from CPU 1 to CPUs 0:
> > NMI backtrace for cpu 0
> > CPU: 0 UID: 0 PID: 5911 Comm: kworker/u8:3 Tainted: G        W           6.16.0-syzkaller-11852-g479058002c32 #0 PREEMPT_{RT,(full)}
> > Tainted: [W]=WARN
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/12/2025
> > Workqueue: bat_events batadv_iv_send_outstanding_bat_ogm_packet
> > RIP: 0010:get_timer_this_cpu_base kernel/time/timer.c:939 [inline]
> > RIP: 0010:__mod_timer+0x81c/0xf60 kernel/time/timer.c:1101
> > Code: 01 00 00 00 48 8b 5c 24 20 41 0f b6 44 2d 00 84 c0 0f 85 72 06 00 00 8b 2b e8 f0 bb 49 09 41 89 c5 89 c3 bf 08 00 00 00 89 c6 <e8> 0f c1 12 00 41 83 fd 07 44 89 34 24 0f 87 69 06 00 00 e8 4c bc
> > RSP: 0018:ffffc90004fff680 EFLAGS: 00000082
> > RAX: 0000000000000000 RBX: 0000000000000000 RCX: f9fab87ca5ec6a00
> > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000008
> > RBP: 0000000000200000 R08: 0000000000000000 R09: 0000000000000000
> > R10: dffffc0000000000 R11: fffff520009ffeac R12: ffff8880b8825a80
> > R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000802
> > FS:  0000000000000000(0000) GS:ffff8881268cd000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 00007f46b6524000 CR3: 000000003afb2000 CR4: 00000000003526f0
> > Call Trace:
> >   <TASK>
> >   queue_delayed_work_on+0x18b/0x280 kernel/workqueue.c:2559
> >   queue_delayed_work include/linux/workqueue.h:684 [inline]
> >   batadv_forw_packet_queue+0x239/0x2a0 net/batman-adv/send.c:691
> >   batadv_iv_ogm_schedule_buff net/batman-adv/bat_iv_ogm.c:842 [inline]
> >   batadv_iv_ogm_schedule+0x892/0xf00 net/batman-adv/bat_iv_ogm.c:874
> >   batadv_iv_send_outstanding_bat_ogm_packet+0x6c6/0x7e0 net/batman-adv/bat_iv_ogm.c:1714
> >   process_one_work kernel/workqueue.c:3236 [inline]
> >   process_scheduled_works+0xae1/0x17b0 kernel/workqueue.c:3319
> >   worker_thread+0x8a0/0xda0 kernel/workqueue.c:3400
> >   kthread+0x711/0x8a0 kernel/kthread.c:463
> >   ret_from_fork+0x3f9/0x770 arch/x86/kernel/process.c:148
> >   ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
> >   </TASK>

So, the task hung watchdog triggered and is making all CPUs dump their
backtraces, and there's only the one other CPU. Is this a two-CPU setup in a
live lockup? I don't see anything apparent, and these are time based
conditions which can also be triggered by severely overloading the system.
If you can reproduce the conditions, it may be useful to see how the system
is doing in general and whether it can unwind itself after the workload is
killed.
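
To help judge the overload angle, even something as crude as sampling the
load average against the CPU count while the reproducer runs, plus the hung
task threshold itself, would be informative. A throwaway userspace sketch,
assuming the kernel has CONFIG_DETECT_HUNG_TASK so
/proc/sys/kernel/hung_task_timeout_secs exists:

  /* print online CPU count, load average and the hung task threshold */
  #include <stdio.h>
  #include <unistd.h>

  static void cat(const char *path)
  {
          char buf[256];
          FILE *f = fopen(path, "r");

          if (f && fgets(buf, sizeof(buf), f))
                  printf("%s: %s", path, buf);
          if (f)
                  fclose(f);
  }

  int main(void)
  {
          printf("online CPUs: %ld\n", sysconf(_SC_NPROCESSORS_ONLN));
          cat("/proc/loadavg");
          cat("/proc/sys/kernel/hung_task_timeout_secs");
          return 0;
  }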

Thanks.

-- 
tejun
