Message-ID: <tencent_B7DDA306E3B6A8CDF3195E4C083C9BD37E07@qq.com>
Date: Mon, 17 Nov 2025 12:57:43 +0800
From: cx19970 <cx19970@...com>
To: mingo@...hat.com,
	peterz@...radead.org,
	juri.lelli@...hat.com,
	vincent.guittot@...aro.org
Cc: dietmar.eggemann@....com,
	rostedt@...dmis.org,
	bsegall@...gle.com,
	mgorman@...e.de,
	vschneid@...hat.com,
	linux-kernel@...r.kernel.org,
	cx19970 <cx19970@...com>
Subject: [PATCH] BUG: sysrq: Reset the watchdog during the polling of sysrq-t processing tasks

When the system has an extremely large number of processes, issuing
"sysrq-t" to print the task list over the serial console causes a
kernel soft lockup. To reproduce, enter "sysrq-t" manually on the
serial console, that is: the "BREAK" key followed by the "t" key.
On an x86 device with an 8250 serial port running at 115200 baud,
handling "sysrq-t" takes more than 30 minutes.
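
As a rough illustration of why the serial port dominates here, the
arithmetic can be sketched in C. This is a back-of-the-envelope estimate,
not a measurement: it assumes 8N1 framing (10 bits on the wire per byte),
and the helper names are hypothetical, not kernel functions.

```c
#include <stdint.h>

/* Assumption: 8N1 framing, i.e. 10 bits on the wire for every byte. */
#define BITS_PER_BYTE_ON_WIRE 10u

/* Bytes per second a UART can carry at the given baud rate. */
static uint64_t uart_bytes_per_sec(uint64_t baud)
{
	return baud / BITS_PER_BYTE_ON_WIRE;
}

/* Upper bound on the bytes that can be printed within `secs` seconds. */
static uint64_t bytes_before_timeout(uint64_t baud, uint64_t secs)
{
	return uart_bytes_per_sec(baud) * secs;
}

/* Whole seconds needed to drain `bytes` of output, rounded up. */
static uint64_t drain_seconds(uint64_t baud, uint64_t bytes)
{
	uint64_t rate = uart_bytes_per_sec(baud);

	if (rate == 0)
		return UINT64_MAX;	/* link too slow to carry a whole byte per second */
	return (bytes + rate - 1) / rate;
}
```

Under these assumptions a 115200 baud link carries 11520 bytes per second,
so at most about 230400 bytes fit into a 20 second window, while 30 minutes
of continuous output corresponds to roughly 20 MB of text.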

The key factors contributing to the soft lockup:
1. There are too many processes.
2. Serial port printing is too slow.
3. The default soft lockup timeout of the kernel watchdog is 20 seconds,
   which is short compared with the time needed to print.
4. The kernel task list "task" is too long.
5. The kernel scheduling queue "cfs_rq" is too long.
6. sysrq-t occupies one CPU core exclusively while it runs.
7. sysrq-t has too much content to print.
8. Reading cpuinfo requires inter-core communication.

Solution: Reset the watchdogs while sysrq-t prints the task list "task"
and the scheduling queues "cfs_rq" and "rt_rq".
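
The idea can be sketched in user space as follows. This is a toy model
under stated assumptions, not kernel code: struct fake_watchdog,
fake_touch(), fake_expired() and print_items() are hypothetical names
that stand in for the real watchdog and the print loops, and time is
counted in abstract ticks.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy watchdog: fires if more than `threshold` ticks pass without a touch. */
struct fake_watchdog {
	uint64_t last_touch;	/* tick of the most recent touch */
	uint64_t threshold;	/* ticks allowed between touches */
};

static void fake_touch(struct fake_watchdog *wd, uint64_t now)
{
	wd->last_touch = now;
}

static bool fake_expired(const struct fake_watchdog *wd, uint64_t now)
{
	return now - wd->last_touch > wd->threshold;
}

/*
 * Simulate printing `nitems` entries that each take `cost` ticks.
 * With `touch_each` set, the watchdog is touched before every entry.
 * Returns true if the watchdog would have fired during the loop.
 */
static bool print_items(struct fake_watchdog *wd, unsigned int nitems,
			uint64_t cost, bool touch_each)
{
	uint64_t now = wd->last_touch;
	unsigned int i;

	for (i = 0; i < nitems; i++) {
		if (touch_each)
			fake_touch(wd, now);
		now += cost;	/* time spent printing this entry */
		if (fake_expired(wd, now))
			return true;
	}
	return false;
}
```

In this model a long loop of cheap entries trips the watchdog unless it is
touched on every iteration. Touching per entry does not help when a single
entry takes longer than the threshold by itself.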

Signed-off-by: cx19970 <cx19970@...com>
---
 kernel/sched/debug.c | 2 ++
 kernel/sched/fair.c  | 5 ++++-
 kernel/sched/rt.c    | 5 ++++-
 3 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
index 02e16b70a790..f3d8df4f8490 100644
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -789,6 +789,8 @@ static void print_rq(struct seq_file *m, struct rq *rq, int rq_cpu)
 		if (task_cpu(p) != rq_cpu)
 			continue;
 
+		touch_nmi_watchdog();
+		touch_all_softlockup_watchdogs();
 		print_task(m, rq, p);
 	}
 	rcu_read_unlock();
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 5b752324270b..0e93054d2c46 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -13672,8 +13672,11 @@ void print_cfs_stats(struct seq_file *m, int cpu)
 	struct cfs_rq *cfs_rq, *pos;
 
 	rcu_read_lock();
-	for_each_leaf_cfs_rq_safe(cpu_rq(cpu), cfs_rq, pos)
+	for_each_leaf_cfs_rq_safe(cpu_rq(cpu), cfs_rq, pos) {
+		touch_nmi_watchdog();
+		touch_all_softlockup_watchdogs();
 		print_cfs_rq(m, cpu, cfs_rq);
+	}
 	rcu_read_unlock();
 }
 
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 7936d4333731..46ac5ddc5071 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -2932,7 +2932,10 @@ void print_rt_stats(struct seq_file *m, int cpu)
 	struct rt_rq *rt_rq;
 
 	rcu_read_lock();
-	for_each_rt_rq(rt_rq, iter, cpu_rq(cpu))
+	for_each_rt_rq(rt_rq, iter, cpu_rq(cpu)) {
+		touch_nmi_watchdog();
+		touch_all_softlockup_watchdogs();
 		print_rt_rq(m, cpu, rt_rq);
+	}
 	rcu_read_unlock();
 }
-- 
2.36.1.windows.1


