linux-kernel - Re: [External] Re: [PATCH v2 2/2] memcg: Don't trigger hung task warnings when memcg is releasing resources.

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20250924082823.GV3245006@noisy.programming.kicks-ass.net>
Date: Wed, 24 Sep 2025 10:28:23 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Julian Sun <sunjunchao@...edance.com>
Cc: cgroups@...r.kernel.org, linux-kernel@...r.kernel.org,
	akpm@...ux-foundation.org, lance.yang@...ux.dev,
	mhiramat@...nel.org, yangyicong@...ilicon.com, will@...nel.org,
	dianders@...omium.org, mingo@...nel.org, lihuafei1@...wei.com,
	hannes@...xchg.org, mhocko@...nel.org, roman.gushchin@...ux.dev,
	shakeel.butt@...ux.dev, muchun.song@...ux.dev, tj@...nel.org
Subject: Re: [External] Re: [PATCH v2 2/2] memcg: Don't trigger hung task
 warnings when memcg is releasing resources.

On Wed, Sep 24, 2025 at 03:50:41PM +0800, Julian Sun wrote:
> On 9/24/25 2:32 PM, Peter Zijlstra wrote:
> > On Wed, Sep 24, 2025 at 11:41:00AM +0800, Julian Sun wrote:
> > > Hung task warning in mem_cgroup_css_free() is undesirable and
> > > unnecessary since the behavior of waiting for a long time is
> > > expected.
> > > 
> > > Use touch_hung_task_detector() to eliminate the possible
> > > hung task warning.
> > > 
> > > Signed-off-by: Julian Sun <sunjunchao@...edance.com>
> > 
> > Still hate this. It is not tied to progress. If progress really stalls,
> > no warning will be given.
> 
> Hi, peter
> 
> Thanks for your review and comments.
> 
> I did take a look at your solution provided yesterday, and get your point.
> However AFAICS it can't resolve the unexpected warnings here. Because it
> only works after we reach the finish_writeback_work(), and the key point
> here is, it *already* takes a long time before we reach
> finish_writeback_work(), and there is true progress before finish the
> writeback work that hung task detector still can not know.

But wb_split_bdi_pages() should already split things into smaller chunks
if there is low bandwidth, right? And we call finish_writeback_work()
for each chunk.

If a chunk is still taking too long, surely the solution is to use
smaller chunks?

> If we want to make the hung task detector to known the progress of writeback
> work, we need to add some code within do_writepages(): after each finish of
> a_ops->writepages(), we need to make detector to known there's progress.
> Something like this:
> 
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index 3e248d1c3969..49572a83c47b 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -2635,6 +2635,10 @@ int do_writepages(struct address_space *mapping,
> struct writeback_control *wbc)
>                 else
>                         /* deal with chardevs and other special files */
>                         ret = 0;
> +               /* Make hung task detector to known there's progress. */
> +               if (force_wake)
> +                       wake_up_all(waitq);
> +
>                 if (ret != -ENOMEM || wbc->sync_mode != WB_SYNC_ALL)
>                         break;
> 
> which has a big impact on current code - I don't want to introduce this.

You sure? It looks to me like the next level down is wb_writeback() and
writeback_sb_inodes(), and those already have time based breaks in and
still have access to wb_writeback_work::done, while do_writepages() no
longer has that context.

> Yes, the behavior in this patch does have the possibility to paper cover the
> real warnings, and what I want to argue is that the essence of this patch is
> the same as the current touch_nmi_watchdog() and touch_softlockup_watchdog()
> - these functions are used only in specific scenarios we known and only
> affect a single event. And there seems no report that
> touch_nmi/softlockup_watchdog() will paper cover the real warnings (do we?).
> 
> Correct me if there's anything I'm missing or misunderstanding.

The thing with touch_nmi_watchdog() is that you need to keep doing it.
The moment you stop calling touch_nmi_watchdog(), you will cause it to
fire.

That is very much in line with the thing I proposed, and rather unlike
your proposal that blanket kill reporting for the task, irrespective of
how long it sits there waiting.