Message-ID: <3783efcc-f5f3-433d-a812-dc95030c87f1@linux.dev>
Date: Thu, 15 Jan 2026 11:24:13 +0800
From: Lance Yang <lance.yang@...ux.dev>
To: Aaron Tomlin <atomlin@...mlin.com>
Cc: sean@...e.io, linux-kernel@...r.kernel.org, pmladek@...e.com,
gregkh@...uxfoundation.org, akpm@...ux-foundation.org,
joel.granados@...nel.org, mhiramat@...nel.org
Subject: Re: [v6 PATCH 0/2] hung_task: Provide runtime reset interface for
hung task detector

On 2026/1/15 10:32, Aaron Tomlin wrote:
> Hi Lance, Greg, Petr, Joel, Andrew,
>
> This series introduces the ability to reset
> /proc/sys/kernel/hung_task_detect_count.
>
> Writing a "0" value to this file atomically resets the counter of detected
> hung tasks. This functionality provides system administrators with the
> means to clear the cumulative diagnostic history following incident
> resolution, thereby simplifying subsequent monitoring without necessitating
> a system restart.
>
> The updated logic ensures that the long-running scan (which is inherently
> preemptible and subject to rcu_lock_break()) does not become desynchronised
> from the global state. By treating the initial read as a "version snapshot"
> the kernel can guarantee that the cumulative count only updates if the
> underlying state remained stable throughout the duration of the
> scan.
>
> Please let me know your thoughts.

There is a mismatch here with what Joel and Petr suggested ...

IIUC, we should just do:

- Patch 1: Full cmpxchg-based counting (Petr's POC), sysctl read-only
- Patch 2: Add write handler for userspace reset

That way Patch 1 is the real logic change, and Patch 2 is just adding
the userspace interface.
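
To illustrate the split, here is a minimal userspace sketch of the
"version snapshot" counting described in the cover letter. It is not the
actual patch or Petr's POC: it uses C11 atomics in place of the kernel's
cmpxchg helpers, and the names detect_count, commit_scan() and
reset_count() are hypothetical. commit_scan() stands in for the Patch 1
logic, reset_count() for the Patch 2 write handler.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the cumulative hung task counter. */
static atomic_ulong detect_count;

/* Stand-in for userspace writing "0" to the sysctl (the Patch 2 side). */
static void reset_count(void)
{
	atomic_store(&detect_count, 0);
}

/*
 * Fold the result of one scan into the counter (the Patch 1 side).
 *
 * 'snapshot' is the counter value read before the scan started and
 * 'found' is the number of hung tasks seen during that scan. The
 * compare-and-exchange only succeeds if the counter still equals the
 * snapshot, so a reset that raced with the scan causes the stale
 * result to be dropped instead of being added on top of the new value.
 */
static bool commit_scan(unsigned long snapshot, unsigned long found)
{
	return atomic_compare_exchange_strong(&detect_count, &snapshot,
					      snapshot + found);
}
```

A caller would read detect_count with atomic_load() before the scan and
pass that value to commit_scan() afterwards. If reset_count() ran in
between and changed the value, commit_scan() returns false and the
counter keeps the reset value. Note that a plain value comparison cannot
tell a reset apart when the counter was already 0 at snapshot time; in
this sketch that case simply commits as normal.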
Thanks,
Lance