Message-ID: <20250923033740.2696-1-lirongqing@baidu.com>
Date: Tue, 23 Sep 2025 11:37:40 +0800
From: lirongqing <lirongqing@...du.com>
To: <corbet@....net>, <akpm@...ux-foundation.org>, <lance.yang@...ux.dev>,
	<mhiramat@...nel.org>, <paulmck@...nel.org>,
	<pawan.kumar.gupta@...ux.intel.com>, <mingo@...nel.org>,
	<dave.hansen@...ux.intel.com>, <rostedt@...dmis.org>, <kees@...nel.org>,
	<arnd@...db.de>, <lirongqing@...du.com>, <feng.tang@...ux.alibaba.com>,
	<pauld@...hat.com>, <joel.granados@...nel.org>, <linux-doc@...r.kernel.org>,
	<linux-kernel@...r.kernel.org>
Subject: [PATCH][RFC] hung_task: Support panicking when the maximum number of hung task warnings is reached

From: Li RongQing <lirongqing@...du.com>

Currently, when a hung task is detected, the hung task detector can
either panic immediately or continue operation. However, there are
scenarios where neither extreme is desirable:

- We don't want the system to panic immediately when only a few hung
  tasks are detected, as the system may be able to recover
- We also don't want the system to stall indefinitely with multiple
  hung tasks

Introduce a new mode (value 2) for hung_task_panic. When set to 2, the
kernel panics only once the hung task warning budget is exhausted, i.e.
once hung_task_warnings has been decreased to 0.

This provides a middle ground between an immediate panic and a
potentially indefinite stall, and allows a vmcore to be generated
automatically after a bounded number of hung task reports.

Signed-off-by: Li RongQing <lirongqing@...du.com>
---
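For reviewers, the intended behavior of the three modes can be modeled in
user space. This is only a sketch, not kernel code: the two static
variables are hypothetical stand-ins for the sysctls, should_panic()
mirrors the condition added to check_hung_task() below, and the helpers
report_hung_task() and detections_until_panic() are made up for
illustration. It also assumes that the warning counter is decremented
after the panic check and only while it is positive, which is not visible
in this diff.

```c
#include <stdbool.h>

/* Hypothetical user-space stand-ins for the sysctls; not kernel code. */
static int hung_task_panic;
static int hung_task_warnings;

/* Mirrors the condition this patch adds to check_hung_task(). */
static bool should_panic(void)
{
	return hung_task_panic == 1 ||
	       (!hung_task_warnings && hung_task_panic == 2);
}

/*
 * Model one hung task detection. Assumption: the warning budget is
 * decremented after the panic check, and only while it is positive.
 * Returns true if this detection would trigger a panic.
 */
static bool report_hung_task(void)
{
	if (should_panic())
		return true;
	if (hung_task_warnings > 0)
		hung_task_warnings--;
	return false;
}

/*
 * Return the 1-based index of the detection that would panic, or 0 if
 * no panic happens within 'limit' detections.
 */
static int detections_until_panic(int panic_mode, int warnings, int limit)
{
	int n;

	hung_task_panic = panic_mode;
	hung_task_warnings = warnings;
	for (n = 1; n <= limit; n++)
		if (report_hung_task())
			return n;
	return 0;
}
```

Under these assumptions, mode 1 panics on the first detection, while mode 2
panics on the first detection made after the warning budget has reached 0.
A negative hung_task_warnings value never reaches 0 in this model, so mode 2
would not panic in that case.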
 Documentation/admin-guide/kernel-parameters.txt | 15 ++++++++-------
 Documentation/admin-guide/sysctl/kernel.rst     |  1 +
 kernel/hung_task.c                              |  5 +++--
 lib/Kconfig.debug                               |  4 ++--
 4 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 5a7a83c..f2a9876 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1993,13 +1993,14 @@
 
 	hung_task_panic=
 			[KNL] Should the hung task detector generate panics.
-			Format: 0 | 1
-
-			A value of 1 instructs the kernel to panic when a
-			hung task is detected. The default value is controlled
-			by the CONFIG_BOOTPARAM_HUNG_TASK_PANIC build-time
-			option. The value selected by this boot parameter can
-			be changed later by the kernel.hung_task_panic sysctl.
+			Format: 0 | 1 | 2
+
+			A value of 1 instructs the kernel to panic when a hung task is detected.
+			A value of 2 instructs the kernel to panic when hung_task_warnings is
+			decreased to 0.  The default value is controlled by the
+			CONFIG_BOOTPARAM_HUNG_TASK_PANIC build-time option. The value selected
+			by this boot parameter can be changed later by the kernel.hung_task_panic
+			sysctl.
 
 	hvc_iucv=	[S390]	Number of z/VM IUCV hypervisor console (HVC)
 				terminal devices. Valid values: 0..8
diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
index 8b49eab..6f77241 100644
--- a/Documentation/admin-guide/sysctl/kernel.rst
+++ b/Documentation/admin-guide/sysctl/kernel.rst
@@ -403,6 +403,7 @@ This file shows up if ``CONFIG_DETECT_HUNG_TASK`` is enabled.
 = =================================================
 0 Continue operation. This is the default behavior.
 1 Panic immediately.
+2 Panic when hung_task_warnings is decreased to 0.
 = =================================================
 
 
diff --git a/kernel/hung_task.c b/kernel/hung_task.c
index 8708a12..b052ec7 100644
--- a/kernel/hung_task.c
+++ b/kernel/hung_task.c
@@ -219,7 +219,8 @@ static void check_hung_task(struct task_struct *t, unsigned long timeout)
 
 	trace_sched_process_hang(t);
 
-	if (sysctl_hung_task_panic) {
+	if ((sysctl_hung_task_panic == 1) ||
+		(!sysctl_hung_task_warnings && sysctl_hung_task_panic == 2)) {
 		console_verbose();
 		hung_task_show_lock = true;
 		hung_task_call_panic = true;
@@ -385,7 +386,7 @@ static const struct ctl_table hung_task_sysctls[] = {
 		.mode		= 0644,
 		.proc_handler	= proc_dointvec_minmax,
 		.extra1		= SYSCTL_ZERO,
-		.extra2		= SYSCTL_ONE,
+		.extra2		= SYSCTL_TWO,
 	},
 	{
 		.procname	= "hung_task_check_count",
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index dc0e0c6..e7cf166 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1264,10 +1264,10 @@ config DEFAULT_HUNG_TASK_TIMEOUT
 	  Keeping the default should be fine in most cases.
 
 config BOOTPARAM_HUNG_TASK_PANIC
-	bool "Panic (Reboot) On Hung Tasks"
+	int "Panic (Reboot) On Hung Tasks"
 	depends on DETECT_HUNG_TASK
 	help
-	  Say Y here to enable the kernel to panic on "hung tasks",
+	  Say 1|2 here to enable the kernel to panic on "hung tasks",
 	  which are bugs that cause the kernel to leave a task stuck
 	  in uninterruptible "D" state.
 
-- 
2.9.4

