Message-ID: <aW9O5ocB1XdzVgr0@pathway.suse.cz>
Date: Tue, 20 Jan 2026 10:46:14 +0100
From: Petr Mladek <pmladek@...e.com>
To: Lance Yang <lance.yang@...ux.dev>
Cc: Aaron Tomlin <atomlin@...mlin.com>, sean@...e.io,
linux-kernel@...r.kernel.org, gregkh@...uxfoundation.org,
akpm@...ux-foundation.org, joel.granados@...nel.org,
mhiramat@...nel.org
Subject: Re: [v6 PATCH 0/2] hung_task: Provide runtime reset interface for
hung task detector
On Fri 2026-01-16 10:22:34, Lance Yang wrote:
>
>
> On 2026/1/16 02:18, Aaron Tomlin wrote:
> > On Thu, Jan 15, 2026 at 11:24:13AM +0800, Lance Yang wrote:
> > > IIUC, we should just do:
> > > - Patch 1: Full cmpxchg-based counting (Petr's POC), sysctl read-only
> > > - Patch 2: Add write handler for userspace reset
> > >
> > > That way Patch 1 is the real logic change, and Patch 2 is just adding
> > > the userspace interface.
> >
> > Hi Lance,
> >
> > Thank you for your feedback.
> > If I am not mistaken, Joel suggested the following structure [1]:
> >
> > 1. Create a preparatory patch to change the data type to atomic_long_t
> > 2. Introduce the required functionality to support a reset to "0"
> >
> > [1]: https://lore.kernel.org/lkml/d4vx6k7d4tlagesq55yrbma26i2nyt4tpijkme6ckioeyfqfec@txrs27niaj2m/
> >
>
> Yeah, either way works :)
>
> But that way (changing to atomic with the old logic first, then
> rewriting to the new logic) seems like it creates more churn
> and makes review harder.

I agree that adding the atomic type while keeping the old logic is not
a good idea. I would prefer to split it into two patches in the
following way:

  1. Reshuffle the code so that "sysctl_hung_task_detect_count"
     gets incremented in check_hung_uninterruptible_tasks()
     and hung_task_info() just gets "this_round_count".

     Plus convert "sysctl_hung_task_detect_count" to atomic.

     This is the change that I suggested at
     https://lore.kernel.org/lkml/aWTzhLSWQRIGt8Xu@pathway.suse.cz/

     This way, it would be clear why the reshuffling was done.
     And the atomic operations would get the right acquire/release
     semantics right away.

  2. Add support for resetting the counter to "0".

     It should be quite a simple patch that is easy to review.

I think that this is how Joel meant it. We could even have 3 patches:

  1. Move the "sysctl_hung_task_detect_count" increment to
     check_hung_uninterruptible_tasks().

  2. Convert the counter to atomic operations.

  3. Add support for resetting the counter to "0".

But I think that two patches might be good enough.
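
For illustration only, here is a rough userspace sketch of the shape of
that split, using C11 atomics instead of the kernel's atomic_long_t API.
The helpers check_round(), report_hung(), read_count() and reset_count()
are hypothetical stand-ins and not the real kernel functions. The sketch
also uses a plain fetch-add with release ordering rather than the
cmpxchg-based counting from the POC, so please treat it as an
approximation of the idea and not as the proposed patch:

```c
#include <stdatomic.h>
#include <stdio.h>

/* Userspace stand-in for the global counter (atomic_long_t in the kernel). */
static atomic_long hung_task_detect_count;

/* Hypothetical stand-in for hung_task_info(): it only reports and
 * receives the per-round count instead of touching the global counter. */
static void report_hung(int task, long this_round_count)
{
	printf("task %d looks hung (%ld so far in this round)\n",
	       task, this_round_count);
}

/* Hypothetical stand-in for check_hung_uninterruptible_tasks():
 * count locally during the scan, publish once at the end. */
static long check_round(const int *hung, int ntasks)
{
	long this_round_count = 0;

	for (int i = 0; i < ntasks; i++) {
		if (!hung[i])
			continue;
		this_round_count++;
		report_hung(i, this_round_count);
	}

	/* Simplification: fetch-add with release ordering, not cmpxchg. */
	if (this_round_count)
		atomic_fetch_add_explicit(&hung_task_detect_count,
					  this_round_count,
					  memory_order_release);
	return this_round_count;
}

/* Read side, e.g. what a read-only sysctl handler would report. */
static long read_count(void)
{
	return atomic_load_explicit(&hung_task_detect_count,
				    memory_order_acquire);
}

/* What the second patch would add: a reset to "0" on request. */
static void reset_count(void)
{
	atomic_store_explicit(&hung_task_detect_count, 0,
			      memory_order_release);
}
```

In this sketch, the reset helper is the only new piece that the second
patch would need, which is why that patch should stay small.
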
Best Regards,
Petr