Message-ID: <20250925060605.2659-1-lirongqing@baidu.com>
Date: Thu, 25 Sep 2025 14:06:05 +0800
From: lirongqing <lirongqing@...du.com>
To: <corbet@....net>, <akpm@...ux-foundation.org>, <lance.yang@...ux.dev>,
	<mhiramat@...nel.org>, <paulmck@...nel.org>,
	<pawan.kumar.gupta@...ux.intel.com>, <mingo@...nel.org>,
	<dave.hansen@...ux.intel.com>, <rostedt@...dmis.org>, <kees@...nel.org>,
	<arnd@...db.de>, <lirongqing@...du.com>, <feng.tang@...ux.alibaba.com>,
	<pauld@...hat.com>, <joel.granados@...nel.org>, <linux-doc@...r.kernel.org>,
	<linux-kernel@...r.kernel.org>
Subject: [PATCH] hung_task: Panic after fixed number of hung tasks

From: Li RongQing <lirongqing@...du.com>

Currently, when hung_task_panic is enabled, the kernel panics immediately
upon detecting the first hung task. However, some hung tasks are transient
and the system recovers fully, while others are unrecoverable and trigger
consecutive hung task reports, for which a panic is the expected response.

This commit adds a new sysctl parameter, hung_task_count_to_panic, that
allows specifying the number of consecutive hung tasks that must be
detected before a kernel panic is triggered. This provides finer control
for environments where transient hangs may happen but persistent hangs
should still be fatal.

Signed-off-by: Li RongQing <lirongqing@...du.com>
---
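Note for readers: the condition added to check_hung_task() below can be
modelled in userspace roughly as follows. This is only a sketch, not kernel
code. The function name hung_task_should_panic() and its parameter names are
hypothetical, and the unsigned long type for the detect counter is an
assumption made for illustration. As written in the hunk, the threshold is
compared against sysctl_hung_task_detect_count, and this patch adds no code
that resets that counter.

```c
#include <stdbool.h>

/*
 * Userspace model of the panic condition in the check_hung_task() hunk.
 * Hypothetical helper: the parameters stand in for the sysctl variables
 * sysctl_hung_task_panic, sysctl_hung_task_count_to_panic and
 * sysctl_hung_task_detect_count (counter type assumed for illustration).
 */
bool hung_task_should_panic(unsigned int panic_enabled,
			    unsigned int count_to_panic,
			    unsigned long detect_count)
{
	/* Existing behaviour: hung_task_panic=1 panics on the first report. */
	if (panic_enabled)
		return true;

	/* New behaviour: 0 disables the threshold; otherwise panic once the
	 * detect counter has reached it. */
	return count_to_panic != 0 && detect_count >= count_to_panic;
}
```

With hung_task_panic left at 0 and a threshold of 3, the model returns false
while the counter is 2 and true once it is 3 or higher.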
 Documentation/admin-guide/sysctl/kernel.rst |  6 ++++++
 kernel/hung_task.c                          | 14 +++++++++++++-
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
index 8b49eab..4240e7b 100644
--- a/Documentation/admin-guide/sysctl/kernel.rst
+++ b/Documentation/admin-guide/sysctl/kernel.rst
@@ -405,6 +405,12 @@ This file shows up if ``CONFIG_DETECT_HUNG_TASK`` is enabled.
 1 Panic immediately.
 = =================================================
 
+hung_task_count_to_panic
+========================
+
+When set to a non-zero value, the kernel triggers a panic once this many
+consecutive hung tasks have been detected. A value of 0 disables the check.
+
 
 hung_task_check_count
 =====================
diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index 8708a12..87a6421 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -83,6 +83,8 @@ static unsigned int __read_mostly sysctl_hung_task_all_cpu_backtrace;
 static unsigned int __read_mostly sysctl_hung_task_panic =
 	IS_ENABLED(CONFIG_BOOTPARAM_HUNG_TASK_PANIC);
 
+static unsigned int __read_mostly sysctl_hung_task_count_to_panic;
+
 static int
 hung_task_panic(struct notifier_block *this, unsigned long event, void *ptr)
 {
@@ -219,7 +221,9 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
 
 	trace_sched_process_hang(t);
 
-	if (sysctl_hung_task_panic) {
+	if (sysctl_hung_task_panic ||
+	    (sysctl_hung_task_count_to_panic &&
+	     (sysctl_hung_task_detect_count >= sysctl_hung_task_count_to_panic))) {
 		console_verbose();
 		hung_task_show_lock = true;
 		hung_task_call_panic = true;
@@ -388,6 +392,14 @@ static const struct ctl_table hung_task_sysctls[] = {
 		.extra2		= SYSCTL_ONE,
 	},
 	{
+		.procname	= "hung_task_count_to_panic",
+		.data		= &sysctl_hung_task_count_to_panic,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= SYSCTL_ZERO,
+	},
+	{
 		.procname	= "hung_task_check_count",
 		.data		= &sysctl_hung_task_check_count,
 		.maxlen		= sizeof(int),
-- 
2.9.4
