linux-kernel - Re: [PATCH] hung_task: Skip scan on idle systems

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <4db98cdc-132b-4034-a6f1-25a3df6d0d01@linux.dev>
Date: Mon, 26 Jan 2026 13:23:01 +0800
From: Lance Yang <lance.yang@...ux.dev>
To: Aaron Tomlin <atomlin@...mlin.com>
Cc: neelx@...e.com, sean@...e.io, pmladek@...e.com, mhiramat@...nel.org,
 akpm@...ux-foundation.org, joel.granados@...nel.org, mproche@...il.com,
 chjohnst@...il.com, nick.lange@...il.com, gregkh@...uxfoundation.org,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH] hung_task: Skip scan on idle systems

Hi Aaron,

Keep one patch or series under review at a time, especially in the
same subsystem ...

Maintainers/Reviewers have limited bandwidth and can focus better
on one thing at a time.

Please, be patient! Just wait for it to be merged or rejected before
sending the next.

On 2026/1/26 11:45, Aaron Tomlin wrote:
> At present, the hung task detector behaves in an unoptimised manner: it
> wakes up periodically (every check_interval_secs, defaulting to 120
> seconds) and performs an O(N) scan of the entire process list,
> regardless of the system's actual state. On idle embedded devices,
> virtual machines, or large servers with no activity, this behaviour
> unnecessarily consumes CPU cycles and memory bandwidth, hindering
> power-saving states.
> 
> To rectify this, this patch introduces an adaptive "green" polling
> mechanism. The detector will now verify whether the system is
> effectively idle before committing to a full process scan.
> 
> To implement this, we utilise the standard get_avenrun() API to verify
> the global system load. Tasks in the TASK_UNINTERRUPTIBLE (D) state
> explicitly contribute to the system load average; consequently, if the
> 1-minute load average is zero, we can confidently infer that no tasks
> are currently hung, allowing us to bypass the expensive process scan.
> 
> Crucially, we invoke get_avenrun(load, 0, 0) with both the offset and
> shift parameters set to zero. This configuration is deliberate and
> necessary for safety:
> 
>          1. Zero Offset: Prevents the application of any artificial
>             rounding bias usually intended for human-readable display.
> 
>          2. Zero Shift: Retrieves the raw fixed-point value (where 1.0
>             load = 2048) rather than shifting it down to an integer.
> 
> This ensures maximum sensitivity: even a microscopic fractional load
> (e.g., a single task entering D state momentarily) will register as a
> non-zero raw value. This guarantees that we never encounter a false
> negative where a valid hung task is ignored due to integer truncation or
> rounding errors.
> 
> This heuristic significantly minimises the detector's footprint on
> healthy systems whilst maintaining robust reliability for genuine hangs.
> 
> Signed-off-by: Aaron Tomlin <atomlin@...mlin.com>
> ---
>   kernel/hung_task.c | 10 ++++++++--
>   1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/hung_task.c b/kernel/hung_task.c
> index d2254c91450b..7b9f5c1bd35e 100644
> --- a/kernel/hung_task.c
> +++ b/kernel/hung_task.c
> @@ -17,6 +17,7 @@
>   #include <linux/export.h>
>   #include <linux/panic_notifier.h>
>   #include <linux/sysctl.h>
> +#include <linux/sched/loadavg.h>
>   #include <linux/suspend.h>
>   #include <linux/utsname.h>
>   #include <linux/sched/signal.h>
> @@ -503,6 +504,7 @@ static int watchdog(void *dummy)
>   	for ( ; ; ) {
>   		unsigned long timeout = sysctl_hung_task_timeout_secs;
>   		unsigned long interval = sysctl_hung_task_check_interval_secs;
> +		unsigned long load[3];
>   		long t;
>   
>   		if (interval == 0)
> @@ -511,8 +513,12 @@ static int watchdog(void *dummy)
>   		t = hung_timeout_jiffies(hung_last_checked, interval);
>   		if (t <= 0) {
>   			if (!atomic_xchg(&reset_hung_task, 0) &&
> -			    !hung_detector_suspended)
> -				check_hung_uninterruptible_tasks(timeout);
> +			    !hung_detector_suspended) {
> +				/* Check 1-min load to detect idle system */
> +				get_avenrun(load, 0, 0);
> +				if (load[0] > 0)
> +					check_hung_uninterruptible_tasks(timeout);

The optimization is not worth the trouble.

I don't think the assumption that "load[0] == 0 means no hung tasks" is
100% correct.

So that would miss actual hung tasks - a false negative, which is worse
than the "wasted scan" you're trying to avoid.

Also, I don't *really* care about optimizing something that runs once
every 120 seconds :)

Nacked-by: Lance Yang <lance.yang@...ux.dev>