Message-ID: <eiilrap7jcpk7bneqvovbrqu6hdtzo2xra5tgqbg3wje2emzha@q3may6rqs5zl>
Date: Tue, 13 Jan 2026 11:46:35 +0000
From: Matt Fleming <matt@...dmodwrite.com>
To: Jan Kara <jack@...e.cz>
Cc: cgroups@...r.kernel.org, linux-kernel@...r.kernel.org,
Tejun Heo <tj@...nel.org>, Christian Brauner <brauner@...nel.org>,
linux-fsdevel@...r.kernel.org, kernel-team@...udflare.com
Subject: Re: [REGRESSION] 6.12: Workqueue lockups in inode_switch_wbs_work_fn
(suspect commit 66c14dccd810)
On Mon, Jan 12, 2026 at 06:04:50PM +0100, Jan Kara wrote:
>
> I agree we are CPU bound in inode_switch_wbs_work_fn() but I don't think we
> are really hogging the CPU. The backtrace below indicates the worker just
> got rescheduled in cond_resched() to give other tasks a chance to run. Is
> the machine dying completely or does it eventually finish the cgroup
> teardown?
Yeah you're right, the CPU isn't hogged but the interaction with the
workqueue subsystem leads to the machine choking. I've seen 150+
instances of inode_switch_wbs_work_fn() queued up in the workqueue
subsystem:
[1437017.446174][ C0] in-flight: 3139338:inode_switch_wbs_work_fn ,2420392:inode_switch_wbs_work_fn ,2914179:inode_switch_wbs_work_fn
[1437017.446181][ C0] pending: 11*inode_switch_wbs_work_fn
[1437017.446185][ C0] pwq 6: cpus=1 node=0 flags=0x2 nice=0 active=23 refcnt=24
[1437017.446186][ C0] in-flight: 2723771:inode_switch_wbs_work_fn ,1710617:inode_switch_wbs_work_fn ,3228683:inode_switch_wbs_work_fn ,3149692:inode_switch_wbs_work_fn ,3224195:inode_switch_wbs_work_fn
[1437017.446193][ C0] pending: 18*inode_switch_wbs_work_fn
[1437017.446195][ C0] pwq 10: cpus=2 node=0 flags=0x2 nice=0 active=17 refcnt=18
[1437017.446196][ C0] in-flight: 3224135:inode_switch_wbs_work_fn ,3193118:inode_switch_wbs_work_fn ,3224106:inode_switch_wbs_work_fn ,3228725:inode_switch_wbs_work_fn ,3087195:inode_switch_wbs_work_fn ,1853835:inode_switch_wbs_work_fn
[1437017.446204][ C0] pending: 11*inode_switch_wbs_work_fn
The machine sometimes finishes the cgroup teardown and sometimes hard
locks up. When workqueue items aren't completing, things get really bad :)
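For anyone wanting to tally a dump like the one above, here is a minimal Python sketch. It is not an existing tool: the helper name count_work_items is hypothetical, and it assumes in-flight entries look like "<pid>:<function>" and pending entries look like "<N>*<function>" or a bare "<function>", as in the excerpt. Entries with module suffixes or other decorations would need a looser pattern.

```python
import re
from collections import Counter


def count_work_items(dump: str) -> Counter:
    """Tally work items per (state, function) from a workqueue dump.

    Hypothetical helper: the line formats are assumed from the excerpt
    in this thread, not taken from any kernel documentation.
    """
    counts = Counter()
    for line in dump.splitlines():
        if "in-flight:" in line:
            # Drop the timestamp/CPU prefix, then match "<pid>:<function>".
            body = line.split("in-flight:", 1)[1]
            for m in re.finditer(r"\d+:(\w+)", body):
                counts[("in-flight", m.group(1))] += 1
        elif "pending:" in line:
            # Match "<N>*<function>"; a bare "<function>" counts as one.
            body = line.split("pending:", 1)[1]
            for m in re.finditer(r"(?:(\d+)\*)?([A-Za-z_]\w*)", body):
                counts[("pending", m.group(2))] += int(m.group(1) or 1)
    return counts


# Two lines copied verbatim from the dump above.
SAMPLE = """\
[1437017.446174][ C0] in-flight: 3139338:inode_switch_wbs_work_fn ,2420392:inode_switch_wbs_work_fn ,2914179:inode_switch_wbs_work_fn
[1437017.446181][ C0] pending: 11*inode_switch_wbs_work_fn
"""

if __name__ == "__main__":
    for (state, func), n in sorted(count_work_items(SAMPLE).items()):
        print(f"{state:9} {func}: {n}")
```

Feeding it the whole console log instead of SAMPLE gives a total per state across all pwqs, which makes it easier to compare the backlog between runs.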
> Well, these changes were introduced because some services are switching
> over 1m inodes on their exit and they were softlocking up the machine :).
> So there's some commonality, just something in that setup behaves
> differently from your setup. Are the inodes clean, dirty, or only with
> dirty timestamps?
Good question. I don't know but I'll get back to you.
> Also since you mention 6.12 kernel but this series was
> only merged in 6.18, do you carry full series ending with merge commit
> 9426414f0d42f?
We always run the latest 6.12 LTS release and it looks like only these
two commits got backported:
9a6ebbdbd412 ("writeback: Avoid excessively long inode switching times")
66c14dccd810 ("writeback: Avoid softlockup when switching many inodes")