Message-ID: <9425363e-944f-4f37-bc5b-2586e44a5c5d@linux.dev>
Date: Mon, 22 Sep 2025 21:12:44 +0800
From: Lance Yang <lance.yang@...ux.dev>
To: Julian Sun <sunjunchao@...edance.com>, mhiramat@...nel.org
Cc: viro@...iv.linux.org.uk, brauner@...nel.org, jack@...e.cz,
mingo@...hat.com, peterz@...radead.org, juri.lelli@...hat.com,
vincent.guittot@...aro.org, dietmar.eggemann@....com, rostedt@...dmis.org,
bsegall@...gle.com, mgorman@...e.de, vschneid@...hat.com,
akpm@...ux-foundation.org, agruenba@...hat.com, hannes@...xchg.org,
mhocko@...nel.org, roman.gushchin@...ux.dev, shakeel.butt@...ux.dev,
muchun.song@...ux.dev, linux-kernel@...r.kernel.org,
cgroups@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH 0/3] Suppress undesirable hung task warnings.
On 2025/9/22 20:40, Julian Sun wrote:
> On 9/22/25 7:38 PM, Lance Yang wrote:
>
> Hi, Lance
>
> Thanks for your review and comments.
>
>> Hi Julian
>>
>> Thanks for the patch series!
>>
>> On 2025/9/22 17:41, Julian Sun wrote:
>>> As suggested by Andrew Morton in [1], we need a general mechanism
>>> that allows the hung task detector to ignore unnecessary hung
>>
>> Yep, I understand the goal is to suppress what can be a benign hung task
>> warning during memcg teardown.
>>
>>> tasks. This patch set implements this functionality.
>>>
>>> Patch 1 introduces a PF_DONT_HUNG flag. The hung task detector will
>>> ignore all tasks that have the PF_DONT_HUNG flag set.
>>
>> However, I'm concerned that the PF_DONT_HUNG flag is a bit too powerful
>> and might mask real, underlying hangs.
>
> The flag takes effect only when wait_event_no_hung() or
> wb_wait_for_completion_no_hung() is called, and its effect is limited to
> a single wait event, without affecting subsequent wait events. So AFAICS
> it will not mask real hang warnings.
>
Emm... the risk of future misuse is what worries me. I would rather have
call sites actively "pet the watchdog" by periodically calling a helper
like touch_hung_task_detector(), instead of passively ignoring the detector.
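
To make the "pet the watchdog" idea concrete, here is a minimal user-space
model, not kernel code. Everything in it is an assumption for illustration:
struct model_task, detector_scan() and the body of touch_hung_task_detector()
are hypothetical, and the real detector's bookkeeping differs. The point is
only that a waiter which keeps reporting progress is never flagged, while one
that goes silent still is.

```c
#include <stdbool.h>

/* Hypothetical model of a waiting task as seen by a hung task detector. */
struct model_task {
	unsigned long progress;      /* bumped each time the task pets the detector */
	unsigned long last_progress; /* value the detector saw on its previous scan */
	unsigned int stalled_scans;  /* consecutive scans without any progress */
};

/* Hypothetical helper named after the suggestion above; not a kernel API. */
static void touch_hung_task_detector(struct model_task *t)
{
	t->progress++;
}

/*
 * One detector scan. Returns true when the task has shown no progress for
 * max_stalled_scans consecutive scans and should be reported as hung.
 */
static bool detector_scan(struct model_task *t, unsigned int max_stalled_scans)
{
	if (t->progress != t->last_progress) {
		/* The task moved since the last scan: reset the stall counter. */
		t->last_progress = t->progress;
		t->stalled_scans = 0;
		return false;
	}
	return ++t->stalled_scans >= max_stalled_scans;
}
```

Unlike a flag that makes the detector skip the task, this keeps the detector
armed: if the call site stops petting, the warning comes back.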
>>>
>>> Patch 2 introduces wait_event_no_hung() and
>>> wb_wait_for_completion_no_hung(),
>>> which enable the hung task detector to ignore hung tasks caused by these
>>> wait events.
>>
>> Instead of making the detector ignore the task, what if we just change
>> the waiting mechanism? Looking at wb_wait_for_completion(), we could
>> introduce a new helper that internally uses wait_event_timeout() in a
>> loop.
>>
>> Something simple like this:
>>
>> void wb_wait_for_completion_no_hung(struct wb_completion *done)
>> {
>>         atomic_dec(&done->cnt);
>>         while (atomic_read(&done->cnt))
>>                 wait_event_timeout(*done->waitq,
>>                                    !atomic_read(&done->cnt), timeout);
>> }
>>
>> The periodic wake-ups from wait_event_timeout() would naturally prevent
>> the detector from complaining about slow but eventually completing
>> writeback.
>
> Yeah, this could definitely eliminate the hung task warning reported
> here.
> However what I aim to provide is a general mechanism for waiting on
> events. Of course, we could use code similar to the following, but this
> would introduce additional overhead from waking tasks and multiple
> operations on wq_head—something I don't want to introduce.
Yeah, I agree there's some overhead with a polling approach, but
mem_cgroup_css_free() should be an infrequent operation. So, I think it's
an acceptable trade-off :)
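
For a rough sense of that overhead, here is a back-of-the-envelope helper in
plain C. It is a sketch under stated assumptions: extra_wakeups() is a
hypothetical name, times are whole seconds rather than jiffies, and the only
wake-ups counted are timeout expiries that fire strictly before the condition
becomes true.

```c
/*
 * Number of timeout-driven wake-ups a timed wait loop incurs when the
 * awaited condition becomes true wait_secs after the wait starts and each
 * slice lasts timeout_secs. Timeouts fire at timeout_secs,
 * 2 * timeout_secs, ... and only those strictly before wait_secs count.
 */
static unsigned long extra_wakeups(unsigned long wait_secs,
				   unsigned long timeout_secs)
{
	/* Nothing to count for an immediate completion or an invalid slice. */
	if (wait_secs == 0 || timeout_secs == 0)
		return 0;
	return (wait_secs - 1) / timeout_secs;
}
```

With an assumed 60-second slice (half the default 120-second hung task
timeout), a writeback that takes ten minutes to drain costs
extra_wakeups(600, 60) = 9 spurious wake-ups over the whole wait.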
>
> +#define wait_event_no_hung(wq_head, condition)                  \
> +do {                                                            \
> +        while (!(condition))                                    \
> +                wait_event_timeout(wq_head, condition, timeout); \
> +} while (0)
>
> But I can try this approach, or not introduce wait_event_no_hung() at
> all, if you want.
>
Well, let's see what other folks think ;)
Cheers,
Lance
>>>
>>> Patch 3 uses wb_wait_for_completion_no_hung() in the final phase of
>>> memcg
>>> teardown to eliminate the hung task warning.
>>>
>>> Julian Sun (3):
>>> sched: Introduce a new flag PF_DONT_HUNG.
>>> writeback: Introduce wb_wait_for_completion_no_hung().
>>> memcg: Don't trigger hung task when memcg is releasing.
>>>
>>> fs/fs-writeback.c | 15 +++++++++++++++
>>> include/linux/backing-dev.h | 1 +
>>> include/linux/sched.h | 12 +++++++++++-
>>> include/linux/wait.h | 15 +++++++++++++++
>>> kernel/hung_task.c | 6 ++++++
>>> mm/memcontrol.c | 2 +-
>>> 6 files changed, 49 insertions(+), 2 deletions(-)
>>>
>>
>
> Thanks,