Message-ID: <tencent_B7DDA306E3B6A8CDF3195E4C083C9BD37E07@qq.com>
Date: Mon, 17 Nov 2025 12:57:43 +0800
From: cx19970 <cx19970@...com>
To: mingo@...hat.com,
peterz@...radead.org,
juri.lelli@...hat.com,
vincent.guittot@...aro.org
Cc: dietmar.eggemann@....com,
rostedt@...dmis.org,
bsegall@...gle.com,
mgorman@...e.de,
vschneid@...hat.com,
linux-kernel@...r.kernel.org,
cx19970 <cx19970@...com>
Subject: [PATCH] sched/debug: Touch watchdogs while sysrq-t prints tasks and runqueues

When the system contains an extremely large number of processes, manually
triggering sysrq-t on a serial console (the "BREAK" key followed by the
"t" key) to print the task list causes a soft lockup in the kernel. On an
x86 machine with an 8250 UART at 115200 baud, the sysrq-t output can take
more than 30 minutes to complete.
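As a rough back-of-envelope estimate (assuming ~200 bytes of console
output per task, a guess rather than a measurement): at 115200 baud with
8N1 framing the UART moves about 11,520 characters per second, so dumping
100,000 tasks emits roughly 20 MB of text and keeps the console busy for
about 29 minutes, which matches the observed duration.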
Key factors behind the soft lockup:
1. There are too many processes.
2. Serial-console printing is very slow.
3. The kernel watchdog's default timeout of 20 seconds is too short (see
   the note after this list).
4. The kernel task list is very long.
5. The CFS runqueues ("cfs_rq") are very long.
6. sysrq-t monopolizes one CPU core while it runs.
7. sysrq-t produces a very large volume of output.
8. Reading cpuinfo requires inter-core communication.
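For reference on factor 3: the softlockup detector fires after twice the
kernel.watchdog_thresh interval (default 10 seconds), which is where the
20-second figure comes from. A simplified excerpt of the relevant helper
in kernel/watchdog.c (current mainline; the exact location may differ by
kernel version):

	static int get_softlockup_thresh(void)
	{
		return watchdog_thresh * 2;
	}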
Solution: touch the NMI and softlockup watchdogs on each iteration while
sysrq-t prints the task list and walks the "cfs_rq" and "rt_rq" runqueues.
Signed-off-by: cx19970 <cx19970@...com>
---
 kernel/sched/debug.c | 2 ++
 kernel/sched/fair.c  | 5 ++++-
 kernel/sched/rt.c    | 5 ++++-
 3 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
index 02e16b70a790..f3d8df4f8490 100644
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -789,6 +789,8 @@ static void print_rq(struct seq_file *m, struct rq *rq, int rq_cpu)
 		if (task_cpu(p) != rq_cpu)
 			continue;
 
+		touch_nmi_watchdog();
+		touch_all_softlockup_watchdogs();
 		print_task(m, rq, p);
 	}
 	rcu_read_unlock();
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 5b752324270b..0e93054d2c46 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -13672,8 +13672,11 @@ void print_cfs_stats(struct seq_file *m, int cpu)
 	struct cfs_rq *cfs_rq, *pos;
 
 	rcu_read_lock();
-	for_each_leaf_cfs_rq_safe(cpu_rq(cpu), cfs_rq, pos)
+	for_each_leaf_cfs_rq_safe(cpu_rq(cpu), cfs_rq, pos) {
+		touch_nmi_watchdog();
+		touch_all_softlockup_watchdogs();
 		print_cfs_rq(m, cpu, cfs_rq);
+	}
 	rcu_read_unlock();
 }
 
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 7936d4333731..46ac5ddc5071 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -2932,7 +2932,10 @@ void print_rt_stats(struct seq_file *m, int cpu)
 	struct rt_rq *rt_rq;
 
 	rcu_read_lock();
-	for_each_rt_rq(rt_rq, iter, cpu_rq(cpu))
+	for_each_rt_rq(rt_rq, iter, cpu_rq(cpu)) {
+		touch_nmi_watchdog();
+		touch_all_softlockup_watchdogs();
 		print_rt_rq(m, cpu, rt_rq);
+	}
 	rcu_read_unlock();
 }
--
2.36.1.windows.1