Message-Id: <20230210203510.1734835-1-gpiccoli@igalia.com>
Date: Fri, 10 Feb 2023 17:35:10 -0300
From: "Guilherme G. Piccoli" <gpiccoli@...lia.com>
To: akpm@...ux-foundation.org, bhe@...hat.com, pmladek@...e.com
Cc: linux-kernel@...r.kernel.org, kexec@...ts.infradead.org,
dyoung@...hat.com, d.hatayama@...fujitsu.com, feng.tang@...el.com,
hidehiro.kawai.ez@...achi.com, keescook@...omium.org,
mikelley@...rosoft.com, vgoyal@...hat.com, kernel-dev@...lia.com,
kernel@...ccoli.net, "Guilherme G. Piccoli" <gpiccoli@...lia.com>,
stable@...r.kernel.org
Subject: [PATCH v4] panic: Fixes the panic_print NMI backtrace setting
Commit 8d470a45d1a6 ("panic: add option to dump all CPUs backtraces in panic_print")
introduced a setting for the "panic_print" kernel parameter to allow
users to request an NMI backtrace on panic. The problem is that the
panic_print handling happens after the secondary CPUs are already
disabled, hence this option ended up being effectively a no-op: the
kernel skips the NMI trace in idling CPUs, which is the case for
offline CPUs.
Fix it by checking the NMI backtrace bit in panic_print prior to
calling the CPU disabling function.
Fixes: 8d470a45d1a6 ("panic: add option to dump all CPUs backtraces in panic_print")
Cc: stable@...r.kernel.org
Signed-off-by: Guilherme G. Piccoli <gpiccoli@...lia.com>
---
V4:
- Sent as standalone patch, rebased against v6.2-rc7.
V2 / V3:
- New patch, there was no V1 of this one.
Link for V3: https://lore.kernel.org/lkml/20220819221731.480795-12-gpiccoli@igalia.com/
Hi folks, thanks in advance for reviews/comments.
Notice that while at it, I got rid of the "crash_kexec_post_notifiers"
local copy in panic(). This was introduced by commit b26e27ddfd2a
("kexec: use core_param for crash_kexec_post_notifiers boot option"),
but it is not clear from comments or commit message why this local copy
is required.
My understanding is that it's a mechanism to prevent a concurrency
issue, in case some other CPU modifies this variable while panic() is
running. I find that very unlikely, hence I removed it - but if people
consider this copy needed, I can respin this patch and keep it, adding
a comment in order to be explicit about why it is needed.
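To make the concern above concrete, here is a minimal userspace sketch
(not kernel code) of what a local snapshot would guard against. The
names post_notifiers_flag, run_with_snapshot() and
run_without_snapshot() are hypothetical stand-ins, and the "flip" models
another CPU writing the parameter between the two checks; it assumes
nothing about how likely that is in practice.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the global boot parameter. */
static volatile bool post_notifiers_flag;

/*
 * Both helpers return a bitmask of the call sites that would run:
 * bit 0 = early crash-kexec site, bit 1 = late (post-notifier) site.
 * flip_midway simulates a concurrent write between the two checks.
 */
static int run_with_snapshot(bool flip_midway)
{
	bool snap = post_notifiers_flag;	/* read once, up front */
	int sites = 0;

	if (!snap)
		sites |= 1;
	if (flip_midway)
		post_notifiers_flag = !post_notifiers_flag;
	if (snap)
		sites |= 2;

	return sites;
}

static int run_without_snapshot(bool flip_midway)
{
	int sites = 0;

	if (!post_notifiers_flag)
		sites |= 1;
	if (flip_midway)
		post_notifiers_flag = !post_notifiers_flag;
	if (post_notifiers_flag)	/* re-read: may disagree with the first check */
		sites |= 2;

	return sites;
}
```

With the snapshot, exactly one site is always selected; without it, a
mid-way flip can select both sites or neither. Whether that window
matters during a real panic is exactly the open question.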
Let me know your thoughts!
Cheers,
Guilherme
kernel/panic.c | 47 +++++++++++++++++++++++++++--------------------
1 file changed, 27 insertions(+), 20 deletions(-)
diff --git a/kernel/panic.c b/kernel/panic.c
index 463c9295bc28..f45ee88be8a2 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -211,9 +211,6 @@ static void panic_print_sys_info(bool console_flush)
return;
}
- if (panic_print & PANIC_PRINT_ALL_CPU_BT)
- trigger_all_cpu_backtrace();
-
if (panic_print & PANIC_PRINT_TASK_INFO)
show_state();
@@ -243,6 +240,30 @@ void check_panic_on_warn(const char *origin)
origin, limit);
}
+/*
+ * Helper that triggers the NMI backtrace (if set in panic_print)
+ * and then performs the secondary CPUs shutdown - we cannot have
+ * the NMI backtrace after the CPUs are off!
+ */
+static void panic_other_cpus_shutdown(void)
+{
+ if (panic_print & PANIC_PRINT_ALL_CPU_BT)
+ trigger_all_cpu_backtrace();
+
+ /*
+ * Note that smp_send_stop() is the usual SMP shutdown function,
+ * which unfortunately may not be hardened to work in a panic
+ * situation. If we want to do crash dump after notifier calls
+ * and kmsg_dump, we will need architecture dependent extra
+ * bits in addition to stopping other CPUs, hence we rely on
+ * crash_smp_send_stop() for that.
+ */
+ if (!crash_kexec_post_notifiers)
+ smp_send_stop();
+ else
+ crash_smp_send_stop();
+}
+
/**
* panic - halt the system
* @fmt: The text string to print
@@ -258,7 +279,6 @@ void panic(const char *fmt, ...)
long i, i_next = 0, len;
int state = 0;
int old_cpu, this_cpu;
- bool _crash_kexec_post_notifiers = crash_kexec_post_notifiers;
if (panic_on_warn) {
/*
@@ -333,23 +353,10 @@ void panic(const char *fmt, ...)
*
* Bypass the panic_cpu check and call __crash_kexec directly.
*/
- if (!_crash_kexec_post_notifiers) {
+ if (!crash_kexec_post_notifiers)
__crash_kexec(NULL);
- /*
- * Note smp_send_stop is the usual smp shutdown function, which
- * unfortunately means it may not be hardened to work in a
- * panic situation.
- */
- smp_send_stop();
- } else {
- /*
- * If we want to do crash dump after notifier calls and
- * kmsg_dump, we will need architecture dependent extra
- * works in addition to stopping other CPUs.
- */
- crash_smp_send_stop();
- }
+ panic_other_cpus_shutdown();
/*
* Run any panic handlers, including those that might need to
@@ -370,7 +377,7 @@ void panic(const char *fmt, ...)
*
* Bypass the panic_cpu check and call __crash_kexec directly.
*/
- if (_crash_kexec_post_notifiers)
+ if (crash_kexec_post_notifiers)
__crash_kexec(NULL);
console_unblank();
--
2.39.1
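The ordering problem described in the commit message can be modelled
with a small userspace sketch. This is not kernel code: struct
sim_machine and the sim_*() helpers are hypothetical, and the rule that
a stopped CPU contributes no trace is a simplification of the behaviour
the commit message describes.

```c
#include <assert.h>
#include <stdbool.h>

#define SIM_NCPUS 4

/* Toy model: which CPUs are still running, and how many traces we got. */
struct sim_machine {
	bool online[SIM_NCPUS];
	int traces;
};

static void sim_init(struct sim_machine *m)
{
	for (int cpu = 0; cpu < SIM_NCPUS; cpu++)
		m->online[cpu] = true;
	m->traces = 0;
}

/* Models the backtrace request: only still-running secondary CPUs respond. */
static void sim_backtrace_others(struct sim_machine *m, int self)
{
	assert(self >= 0 && self < SIM_NCPUS);
	for (int cpu = 0; cpu < SIM_NCPUS; cpu++)
		if (cpu != self && m->online[cpu])
			m->traces++;
}

/* Models the secondary CPU shutdown step. */
static void sim_stop_others(struct sim_machine *m, int self)
{
	assert(self >= 0 && self < SIM_NCPUS);
	for (int cpu = 0; cpu < SIM_NCPUS; cpu++)
		if (cpu != self)
			m->online[cpu] = false;
}

/* Old ordering: stop first, then ask for backtraces. */
static int traces_stop_then_backtrace(void)
{
	struct sim_machine m;

	sim_init(&m);
	sim_stop_others(&m, 0);
	sim_backtrace_others(&m, 0);
	return m.traces;
}

/* New ordering: ask for backtraces, then stop. */
static int traces_backtrace_then_stop(void)
{
	struct sim_machine m;

	sim_init(&m);
	sim_backtrace_others(&m, 0);
	sim_stop_others(&m, 0);
	return m.traces;
}
```

In this model the old ordering collects no secondary traces, while the
new ordering collects one per secondary CPU, which is the reason the
backtrace check is placed ahead of the shutdown call in the new helper.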