linux-kernel - Re: [REGRESSION] 6.12: Workqueue lockups in inode_switch_wbs_work

Message-ID: <xyivos2a76rpgmyp6kvvpskmuhheo2wtaqs5s4qvvbn6p3f3lb@3sc7xufujt57>
Date: Tue, 13 Jan 2026 13:02:53 +0100
From: Jan Kara <jack@...e.cz>
To: Matt Fleming <matt@...dmodwrite.com>
Cc: Jan Kara <jack@...e.cz>, cgroups@...r.kernel.org, 
	linux-kernel@...r.kernel.org, Tejun Heo <tj@...nel.org>, 
	Christian Brauner <brauner@...nel.org>, linux-fsdevel@...r.kernel.org, kernel-team@...udflare.com
Subject: Re: [REGRESSION] 6.12: Workqueue lockups in inode_switch_wbs_work_fn
 (suspect commit 66c14dccd810)

On Tue 13-01-26 11:46:35, Matt Fleming wrote:
> On Mon, Jan 12, 2026 at 06:04:50PM +0100, Jan Kara wrote:
> > 
> > I agree we are CPU bound in inode_switch_wbs_work_fn() but I don't think we
> > are really hogging the CPU. The backtrace below indicates the worker just
> > got rescheduled in cond_resched() to give other tasks a chance to run. Is
> > the machine dying completely or does it eventually finish the cgroup
> > teardown?
>  
> Yeah you're right, the CPU isn't hogged but the interaction with the
> workqueue subsystem leads to the machine choking. I've seen 150+
> instances of inode_switch_wbs_work_fn() queued up in the workqueue
> subsystem:
> 
>   [1437017.446174][    C0]     in-flight: 3139338:inode_switch_wbs_work_fn ,2420392:inode_switch_wbs_work_fn ,2914179:inode_switch_wbs_work_fn
>   [1437017.446181][    C0]     pending: 11*inode_switch_wbs_work_fn
>   [1437017.446185][    C0]   pwq 6: cpus=1 node=0 flags=0x2 nice=0 active=23 refcnt=24
>   [1437017.446186][    C0]     in-flight: 2723771:inode_switch_wbs_work_fn ,1710617:inode_switch_wbs_work_fn ,3228683:inode_switch_wbs_work_fn ,3149692:inode_switch_wbs_work_fn ,3224195:inode_switch_wbs_work_fn
>   [1437017.446193][    C0]     pending: 18*inode_switch_wbs_work_fn
>   [1437017.446195][    C0]   pwq 10: cpus=2 node=0 flags=0x2 nice=0 active=17 refcnt=18
>   [1437017.446196][    C0]     in-flight: 3224135:inode_switch_wbs_work_fn ,3193118:inode_switch_wbs_work_fn ,3224106:inode_switch_wbs_work_fn ,3228725:inode_switch_wbs_work_fn ,3087195:inode_switch_wbs_work_fn ,1853835:inode_switch_wbs_work_fn
>   [1437017.446204][    C0]     pending: 11*inode_switch_wbs_work_fn
> 
> It sometimes finishes the cgroup teardown and sometimes hard locks up.
> When workqueue items aren't completing things get really bad :) 
> 
> > Well, these changes were introduced because some services are switching
> > over 1m inodes on their exit and they were softlocking up the machine :).
> > So there's some commonality, just something in that setup behaves
> > differently from your setup. Are the inodes clean, dirty, or only with
> > dirty timestamps?
> 
> Good question. I don't know but I'll get back to you.
> 
> > Also since you mention 6.12 kernel but this series was
> > only merged in 6.18, do you carry full series ending with merge commit
> > 9426414f0d42f?
>  
> We always run the latest 6.12 LTS release and it looks like only these
> two commits got backported:
> 
>   9a6ebbdbd412 ("writeback: Avoid excessively long inode switching times")
>   66c14dccd810 ("writeback: Avoid softlockup when switching many inodes")

Ah, OK. Then you're missing e1b849cfa6b61f ("writeback: Avoid contention on
wb->list_lock when switching inodes"), which might explain why my system
behaves differently from yours. That commit *heavily* reduces contention on
wb->list_lock when switching inodes, and it also avoids tying up multiple
workers with switching works when only one of them can proceed at a time
(the others just spin on the list_lock). So I'd suggest you backport that
commit and see whether it fixes your issues.
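
To compare the backlog before and after such a backport, the workqueue dump
quoted above can be tallied mechanically. The following is only a rough
sketch in Python: the function name tally_workqueue_dump() is hypothetical,
and the parser assumes the dump lines look exactly like the "in-flight:" and
"pending:" lines in this thread (comma-separated "PID:function" and
"N*function" entries). Other dump variants may need a different pattern.

```python
import re
from collections import Counter

# One "PID:function" entry on an in-flight line (format assumed from the
# dump quoted in this thread).
IN_FLIGHT_ENTRY = re.compile(r"(\d+):(\w+)")
# One "N*function" or bare "function" entry on a pending line.
PENDING_ENTRY = re.compile(r"(?:(\d+)\*)?(\w+)")

# Four lines copied from the quoted dump (first two pools shown).
SAMPLE = """\
>   [1437017.446174][    C0]     in-flight: 3139338:inode_switch_wbs_work_fn ,2420392:inode_switch_wbs_work_fn ,2914179:inode_switch_wbs_work_fn
>   [1437017.446181][    C0]     pending: 11*inode_switch_wbs_work_fn
>   [1437017.446186][    C0]     in-flight: 2723771:inode_switch_wbs_work_fn ,1710617:inode_switch_wbs_work_fn ,3228683:inode_switch_wbs_work_fn ,3149692:inode_switch_wbs_work_fn ,3224195:inode_switch_wbs_work_fn
>   [1437017.446193][    C0]     pending: 18*inode_switch_wbs_work_fn
"""


def tally_workqueue_dump(text):
    """Return (in_flight, pending) Counters keyed by work function name."""
    in_flight = Counter()
    pending = Counter()
    for line in text.splitlines():
        if "in-flight:" in line:
            # Everything after the label is a list of PID:function pairs.
            body = line.split("in-flight:", 1)[1]
            for _pid, func in IN_FLIGHT_ENTRY.findall(body):
                in_flight[func] += 1
        elif "pending:" in line:
            # Entries are "N*function"; a bare "function" counts as one.
            body = line.split("pending:", 1)[1]
            for entry in body.split(","):
                match = PENDING_ENTRY.fullmatch(entry.strip())
                if match:
                    pending[match.group(2)] += int(match.group(1) or 1)
    return in_flight, pending


if __name__ == "__main__":
    in_flight, pending = tally_workqueue_dump(SAMPLE)
    for func in sorted(set(in_flight) | set(pending)):
        print(f"{func}: in-flight={in_flight[func]} pending={pending[func]}")
```

Feeding it the complete dump rather than the four sample lines gives
per-function totals across all pools, which makes it easier to tell whether
the number of queued inode_switch_wbs_work_fn items actually drops once the
extra commit is applied.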

								Honza
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR