Message-Id: <20251113111039.22701-1-feng.tang@linux.alibaba.com>
Date: Thu, 13 Nov 2025 19:10:35 +0800
From: Feng Tang <feng.tang@...ux.alibaba.com>
To: Andrew Morton <akpm@...ux-foundation.org>,
Petr Mladek <pmladek@...e.com>,
Lance Yang <ioworker0@...il.com>,
Jonathan Corbet <corbet@....net>,
paulmck@...nel.org,
Steven Rostedt <rostedt@...dmis.org>,
linux-kernel@...r.kernel.org
Cc: Feng Tang <feng.tang@...ux.alibaba.com>
Subject: [PATCH v2 0/4] Enable hung_task and lockup cases to dump system info on demand
When working on kernel stability issues, panics, hung tasks and soft/hard
lockups are frequently met. To debug them, the user may need a lot of
system information from that moment, such as task call stacks, lock info,
memory info, ftrace dump, etc.
The panic case already uses sys_info() for this purpose, and has a
'panic_sys_info' sysctl interface (which can also be set from the cmdline)
that takes a human-readable string like "tasks,mem,timers,locks,ftrace,..."
to control what kinds of information are dumped. The same information is
also helpful for debugging task-hung and lockup cases.
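To illustrate how such a string could map onto a dump mask, here is a minimal
userspace sketch. It is not the kernel's lib/sys_info.c code: the DEMO_INFO_*
bit values, the demo_names table and demo_parse_sys_info() are hypothetical
names made up for this example, and unknown tokens are simply ignored here,
which may differ from what the kernel does.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative bit values only; NOT taken from the kernel headers. */
#define DEMO_INFO_TASKS   (1UL << 0)
#define DEMO_INFO_MEM     (1UL << 1)
#define DEMO_INFO_TIMERS  (1UL << 2)
#define DEMO_INFO_LOCKS   (1UL << 3)
#define DEMO_INFO_FTRACE  (1UL << 4)

struct demo_name {
	const char *name;
	unsigned long bit;
};

/* Hypothetical name table mapping each token to one bit. */
static const struct demo_name demo_names[] = {
	{ "tasks",  DEMO_INFO_TASKS },
	{ "mem",    DEMO_INFO_MEM },
	{ "timers", DEMO_INFO_TIMERS },
	{ "locks",  DEMO_INFO_LOCKS },
	{ "ftrace", DEMO_INFO_FTRACE },
};

/* Turn a comma-separated list such as "tasks,mem" into a bitmask. */
unsigned long demo_parse_sys_info(const char *str)
{
	unsigned long mask = 0;

	while (*str) {
		/* Length of the current token, up to the next ',' or end. */
		size_t len = strcspn(str, ",");
		size_t i;

		for (i = 0; i < sizeof(demo_names) / sizeof(demo_names[0]); i++) {
			if (strlen(demo_names[i].name) == len &&
			    !strncmp(str, demo_names[i].name, len))
				mask |= demo_names[i].bit;
		}

		str += len;
		if (*str == ',')
			str++;	/* skip the separator */
	}

	return mask;
}
```

With these illustrative values, demo_parse_sys_info("tasks,mem") yields a mask
with the two lowest bits set, and an empty string yields 0.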
So this patchset introduces a similar sys_info sysctl interface for the
task-hung and lockup cases.
Please note that this is mainly for debugging, and the info dumping can
be intrusive: for example, dumping the call stacks of all tasks when the
system has a huge number of tasks, and similarly for the ftrace dump (we
may add tracing_stop() and tracing_start() around it).
Locally, these patches have been used in our bug chasing for stability
issues and were helpful.
As Andrew suggested, add a configurable global 'kernel_sys_info' knob.
When an error scenario like panic/hung-task/lockup doesn't set up its own
sys_info knob and calls sys_info() with parameter "0", this global knob
takes effect. It could also be used for other kernel cases like OOM,
which may not need a dedicated sys_info knob.
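The fallback rule described above can be sketched in userspace C as follows.
This is only an illustration of the selection logic, not the code in patch
0004: demo_kernel_sys_info, demo_set_kernel_sys_info() and
demo_effective_mask() are hypothetical names, and the default value is
arbitrary.

```c
/* Hypothetical stand-in for the global 'kernel_sys_info' knob. */
static unsigned long demo_kernel_sys_info = 0x1UL;	/* arbitrary default */

/* Stand-in for a write to the global knob (e.g. through sysctl). */
void demo_set_kernel_sys_info(unsigned long mask)
{
	demo_kernel_sys_info = mask;
}

/*
 * Pick the mask that is actually used for dumping: a caller that passes
 * its own non-zero mask keeps it, a caller that passes 0 falls back to
 * the global default.
 */
unsigned long demo_effective_mask(unsigned long caller_mask)
{
	return caller_mask ? caller_mask : demo_kernel_sys_info;
}
```

In this sketch, a caller without its own knob (an OOM path, say) would pass 0
and pick up whatever the global knob currently holds.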
Codewise, these 4 patches are independent of each other and could be
applied separately.
Please help to review, thanks!
- Feng
Changelog:
v2:
* Add 0004 patch to add the default kernel sys_info knob (Andrew)
* Simplify the code for hung_sys_info (Petr)
* Use separate sys_info interface for hardlockup and softlockup (Petr)
* Consider the ALL_CPU_BT handling for hardlockup case (Petr)
* Collect Reviewed-by tags.
* Put soft/hard sys_info knob into correct kernel config domain.
Feng Tang (4):
  docs: panic: correct some sys_ifo names in sysctl doc
  hung_task: Add hung_task_sys_info sysctl to dump sys info on task-hung
  watchdog: add sys_info sysctls to dump sys info on system lockup
  sys_info: add a default kernel sys_info mask
 Documentation/admin-guide/sysctl/kernel.rst | 23 +++++++-
 kernel/hung_task.c                          | 62 +++++++++++++--------
 kernel/watchdog.c                           | 44 ++++++++++++++-
 lib/sys_info.c                              | 31 ++++++++++-
 4 files changed, 130 insertions(+), 30 deletions(-)
--
2.43.5