Message-ID: <4db98cdc-132b-4034-a6f1-25a3df6d0d01@linux.dev>
Date: Mon, 26 Jan 2026 13:23:01 +0800
From: Lance Yang <lance.yang@...ux.dev>
To: Aaron Tomlin <atomlin@...mlin.com>
Cc: neelx@...e.com, sean@...e.io, pmladek@...e.com, mhiramat@...nel.org,
akpm@...ux-foundation.org, joel.granados@...nel.org, mproche@...il.com,
chjohnst@...il.com, nick.lange@...il.com, gregkh@...uxfoundation.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] hung_task: Skip scan on idle systems

Hi Aaron,

Please keep only one patch or series under review at a time, especially
within the same subsystem.

Maintainers and reviewers have limited bandwidth and can focus better
on one thing at a time.

Please be patient! Wait for the current one to be merged or rejected
before sending the next.

On 2026/1/26 11:45, Aaron Tomlin wrote:
> At present, the hung task detector behaves in an unoptimised manner: it
> wakes up periodically (every check_interval_secs, defaulting to 120
> seconds) and performs an O(N) scan of the entire process list,
> regardless of the system's actual state. On idle embedded devices,
> virtual machines, or large servers with no activity, this behaviour
> unnecessarily consumes CPU cycles and memory bandwidth, hindering
> power-saving states.
>
> To rectify this, this patch introduces an adaptive "green" polling
> mechanism. The detector will now verify whether the system is
> effectively idle before committing to a full process scan.
>
> To implement this, we utilise the standard get_avenrun() API to verify
> the global system load. Tasks in the TASK_UNINTERRUPTIBLE (D) state
> explicitly contribute to the system load average; consequently, if the
> 1-minute load average is zero, we can confidently infer that no tasks
> are currently hung, allowing us to bypass the expensive process scan.
>
> Crucially, we invoke get_avenrun(load, 0, 0) with both the offset and
> shift parameters set to zero. This configuration is deliberate and
> necessary for safety:
>
> 1. Zero Offset: Prevents the application of any artificial
> rounding bias usually intended for human-readable display.
>
> 2. Zero Shift: Retrieves the raw fixed-point value (where 1.0
> load = 2048) rather than shifting it down to an integer.
>
> This ensures maximum sensitivity: even a microscopic fractional load
> (e.g., a single task entering D state momentarily) will register as a
> non-zero raw value. This guarantees that we never encounter a false
> negative where a valid hung task is ignored due to integer truncation or
> rounding errors.
>
> This heuristic significantly minimises the detector's footprint on
> healthy systems whilst maintaining robust reliability for genuine hangs.
>
> Signed-off-by: Aaron Tomlin <atomlin@...mlin.com>
> ---
> kernel/hung_task.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
> index d2254c91450b..7b9f5c1bd35e 100644
> --- a/kernel/hung_task.c
> +++ b/kernel/hung_task.c
> @@ -17,6 +17,7 @@
>  #include <linux/export.h>
>  #include <linux/panic_notifier.h>
>  #include <linux/sysctl.h>
> +#include <linux/sched/loadavg.h>
>  #include <linux/suspend.h>
>  #include <linux/utsname.h>
>  #include <linux/sched/signal.h>
> @@ -503,6 +504,7 @@ static int watchdog(void *dummy)
>  	for ( ; ; ) {
>  		unsigned long timeout = sysctl_hung_task_timeout_secs;
>  		unsigned long interval = sysctl_hung_task_check_interval_secs;
> +		unsigned long load[3];
>  		long t;
>
>  		if (interval == 0)
> @@ -511,8 +513,12 @@ static int watchdog(void *dummy)
>  		t = hung_timeout_jiffies(hung_last_checked, interval);
>  		if (t <= 0) {
>  			if (!atomic_xchg(&reset_hung_task, 0) &&
> -			    !hung_detector_suspended)
> -				check_hung_uninterruptible_tasks(timeout);
> +			    !hung_detector_suspended) {
> +				/* Check 1-min load to detect idle system */
> +				get_avenrun(load, 0, 0);
> +				if (load[0] > 0)
> +					check_hung_uninterruptible_tasks(timeout);

The optimization is not worth the trouble.

I don't think the assumption that "load[0] == 0 means no hung tasks" is
100% correct.

Whenever that assumption fails, the detector would miss actual hung
tasks. That is a false negative, which is worse than the "wasted scan"
you're trying to avoid.

Also, I don't *really* care about optimizing something that runs once
every 120 seconds :)
Nacked-by: Lance Yang <lance.yang@...ux.dev>