lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <isa6ohzad6b6l55kbdqa35r5fsp4wnifpncx3kit6m35266d7z@463ckwplt5w3>
Date: Mon, 12 Jan 2026 18:04:50 +0100
From: Jan Kara <jack@...e.cz>
To: Matt Fleming <matt@...dmodwrite.com>
Cc: Jan Kara <jack@...e.cz>, cgroups@...r.kernel.org, 
	linux-kernel@...r.kernel.org, Tejun Heo <tj@...nel.org>, 
	Christian Brauner <brauner@...nel.org>, linux-fsdevel@...r.kernel.org, kernel-team@...udflare.com
Subject: Re: [REGRESSION] 6.12: Workqueue lockups in inode_switch_wbs_work_fn
 (suspect commit 66c14dccd810)

Hi Matt!

On Mon 12-01-26 11:18:04, Matt Fleming wrote:
> I’m writing to report a regression we are observing in our production
> environment running kernel 6.12. We are seeing severe workqueue lockups
> that appear to be triggered by high-volume cgroup destruction. We have
> isolated the issue to 66c14dccd810 ("writeback: Avoid softlockup when
> switching many inodes").
> 
> We're seeing stalled tasks in the inode_switch_wbs workqueue. The worker
> appears to be CPU-bound within inode_switch_wbs_work_fn, leading to RCU
> stalls and eventual system lockups.

I agree we are CPU bound in inode_switch_wbs_work_fn() but I don't think we
are really hogging the CPU. The backtrace below indicates the worker just
got rescheduled in cond_resched() to give other tasks a chance to run. Is
the machine dying completely or does it eventually finish the cgroup
teardown?

> Here is a representative trace from a stalled CPU-bound worker pool:
> 
> [1437023.584832][    C0] Showing backtraces of running workers in stalled CPU-bound worker pools:
> [1437023.733923][    C0] pool 358:
> [1437023.733924][    C0] task:kworker/89:0    state:R  running task     stack:0     pid:3136989 tgid:3136989 ppid:2      task_flags:0x4208060 flags:0x00004000
> [1437023.733929][    C0] Workqueue: inode_switch_wbs inode_switch_wbs_work_fn
> [1437023.733933][    C0] Call Trace:
> [1437023.733934][    C0]  <TASK>
> [1437023.733937][    C0]  __schedule+0x4fb/0xbf0
> [1437023.733942][    C0]  __cond_resched+0x33/0x60
> [1437023.733944][    C0]  inode_switch_wbs_work_fn+0x481/0x710
> [1437023.733948][    C0]  process_one_work+0x17b/0x330
> [1437023.733950][    C0]  worker_thread+0x2ce/0x3f0
> 
> Our environment makes heavy use of cgroup-based services. When these
> services -- specifically our caching layer -- are shut down, they can
> trigger the offlining of a massive number of inodes (approx. 200k-250k+
> inodes per service).

Well, these changes were introduced because some services are switching
over 1m inodes on their exit and they were softlocking up the machine :).
So there's some commonality, just something in that setup behaves
differently from your setup. Are the inodes clean, dirty, or only with
dirty timestamps? Also since you mention 6.12 kernel but this series was
only merged in 6.18, do you carry full series ending with merge commit
9426414f0d42f?

> We have verified that reverting 66c14dccd810 completely eliminates these
> lockups in our production environment.
> 
> I am currently working on creating a synthetic reproduction case in the
> lab to replicate the inode/cgroup density required to trigger this on
> demand. In the meantime, I wanted to share these findings to see if you
> have any insights.

Yes, having the reproducer would certainly simplify debugging what exactly
is going on that your system is locking up. Because I was able to tear down
a cgroup doing switching of millions of inodes in couple of seconds without
any issue in my testing...

								Honza
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ