Message-ID: <20260126034539.3407903-1-atomlin@atomlin.com>
Date: Sun, 25 Jan 2026 22:45:39 -0500
From: Aaron Tomlin <atomlin@...mlin.com>
To: akpm@...ux-foundation.org,
lance.yang@...ux.dev,
mhiramat@...nel.org,
gregkh@...uxfoundation.org,
pmladek@...e.com,
joel.granados@...nel.org
Cc: neelx@...e.com,
sean@...e.io,
mproche@...il.com,
chjohnst@...il.com,
nick.lange@...il.com,
linux-kernel@...r.kernel.org
Subject: [PATCH] hung_task: Skip scan on idle systems

At present, the hung task detector does not adapt to system state: it
wakes up periodically (every hung_task_check_interval_secs, effectively
120 seconds by default) and performs an O(N) scan of the entire process
list, regardless of whether anything is running. On idle embedded
devices, virtual machines, or large servers with no activity, this scan
needlessly consumes CPU cycles and memory bandwidth and interferes with
power-saving states.

To rectify this, this patch introduces an adaptive "green" polling
mechanism. The detector will now verify whether the system is
effectively idle before committing to a full process scan.

To implement this, we utilise the standard get_avenrun() API to verify
the global system load. Tasks in the TASK_UNINTERRUPTIBLE (D) state
explicitly contribute to the system load average; consequently, if the
1-minute load average is zero, we can confidently infer that no tasks
are currently hung, allowing us to bypass the expensive process scan.
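
As a rough illustration of the reasoning above, the following userspace
sketch models a single 1-minute load-average update. The constants are
assumed to mirror include/linux/sched/loadavg.h, and the names
model_calc_load() and model_should_scan() are hypothetical helpers for
this example, not kernel functions.

```c
#include <stdbool.h>

/* Assumed to mirror include/linux/sched/loadavg.h. */
#define FSHIFT   11                 /* bits of fractional precision */
#define FIXED_1  (1UL << FSHIFT)    /* 1.0 in fixed point (2048) */
#define EXP_1    1884UL             /* 1-minute decay factor, fixed point */

/*
 * Simplified model of one 1-minute load-average update.
 * nr_active counts runnable plus uninterruptible (D state) tasks.
 */
static unsigned long model_calc_load(unsigned long load,
				     unsigned long nr_active)
{
	unsigned long active = nr_active * FIXED_1;
	unsigned long newload = load * EXP_1 + active * (FIXED_1 - EXP_1);

	/* Round up while the load is rising or steady. */
	if (active >= load)
		newload += FIXED_1 - 1;

	return newload / FIXED_1;
}

/* Hypothetical helper: the idle test applied to a raw 1-minute value. */
static bool model_should_scan(unsigned long raw_load_1min)
{
	return raw_load_1min > 0;
}
```

In this model, one task counted as active for a single sample lifts the
raw value from zero, so model_should_scan() returns true, whereas a
system with no active tasks stays at zero and the scan is skipped.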

Crucially, we invoke get_avenrun(load, 0, 0) with both the offset and
shift parameters set to zero. This configuration is deliberate and
necessary for safety:

  1. Zero Offset: Prevents the application of any artificial
     rounding bias usually intended for human-readable display.
  2. Zero Shift: Retrieves the raw fixed-point value (where 1.0
     load = 2048) without applying any additional scaling.

This ensures maximum sensitivity: even a microscopic fractional load
(e.g., a single task entering D state momentarily) will register as a
non-zero raw value. This guarantees that we never encounter a false
negative where a valid hung task is ignored due to integer truncation or
rounding errors.
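
To make the sensitivity argument concrete, here is a small userspace
sketch comparing a raw fixed-point value with the two-decimal form that
/proc/loadavg would show. The FSHIFT and FIXED_1 values and the
FIXED_1/200 display offset are assumptions based on the kernel sources;
displays_as_zero() is a hypothetical helper for this example only.

```c
#define FSHIFT   11
#define FIXED_1  (1UL << FSHIFT)    /* 1.0 in fixed point (2048) */

/* Integer part of a fixed-point load value. */
static unsigned long load_int(unsigned long x)
{
	return x >> FSHIFT;
}

/* Two-digit fractional part of a fixed-point load value. */
static unsigned long load_frac(unsigned long x)
{
	return load_int((x & (FIXED_1 - 1)) * 100);
}

/* Hypothetical helper: would this raw value be printed as "0.00"? */
static int displays_as_zero(unsigned long raw)
{
	/* Rounding offset assumed to match the human-readable path. */
	unsigned long v = raw + FIXED_1 / 200;

	return load_int(v) == 0 && load_frac(v) == 0;
}
```

A raw value of 5 would be printed as "0.00" in this model, yet it is
non-zero, so a test against the raw value still triggers the scan.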

This heuristic reduces the detector's footprint on idle systems whilst
preserving detection of genuine hangs.

Signed-off-by: Aaron Tomlin <atomlin@...mlin.com>
---
 kernel/hung_task.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index d2254c91450b..7b9f5c1bd35e 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -17,6 +17,7 @@
 #include <linux/export.h>
 #include <linux/panic_notifier.h>
 #include <linux/sysctl.h>
+#include <linux/sched/loadavg.h>
 #include <linux/suspend.h>
 #include <linux/utsname.h>
 #include <linux/sched/signal.h>
@@ -503,6 +504,7 @@ static int watchdog(void *dummy)
 	for ( ; ; ) {
 		unsigned long timeout = sysctl_hung_task_timeout_secs;
 		unsigned long interval = sysctl_hung_task_check_interval_secs;
+		unsigned long load[3];
 		long t;
 
 		if (interval == 0)
@@ -511,8 +513,12 @@ static int watchdog(void *dummy)
 		t = hung_timeout_jiffies(hung_last_checked, interval);
 		if (t <= 0) {
 			if (!atomic_xchg(&reset_hung_task, 0) &&
-			    !hung_detector_suspended)
-				check_hung_uninterruptible_tasks(timeout);
+			    !hung_detector_suspended) {
+				/* Check 1-min load to detect idle system */
+				get_avenrun(load, 0, 0);
+				if (load[0] > 0)
+					check_hung_uninterruptible_tasks(timeout);
+			}
 			hung_last_checked = jiffies;
 			continue;
 		}
--
2.51.0