Message-ID: <aW-Jjsy5s4HySYy0@pathway.suse.cz>
Date: Tue, 20 Jan 2026 14:56:30 +0100
From: Petr Mladek <pmladek@...e.com>
To: Pnina Feder <pnina.feder@...ileye.com>
Cc: akpm@...ux-foundation.org, bhe@...hat.com, linux-kernel@...r.kernel.org,
lkp@...el.com, mgorman@...e.de, mingo@...hat.com,
peterz@...radead.org, rostedt@...dmis.org, senozhatsky@...omium.org,
tglx@...utronix.de, vkondra@...ileye.com
Subject: Re: [PATCH v7] panic: add panic_force_cpu= parameter to redirect
panic to a specific CPU
On Thu 2026-01-15 15:05:52, Pnina Feder wrote:
> Some platforms require panic handling to execute on a specific CPU for
> crash dump to work reliably. This can be due to firmware limitations,
> interrupt routing constraints, or platform-specific requirements where
> only a single CPU is able to safely enter the crash kernel.
>
> Add the panic_force_cpu= kernel command-line parameter to redirect panic
> execution to a designated CPU. When the parameter is provided, the CPU
> that initially triggers panic forwards the panic context to the target
> CPU via IPI, which then proceeds with the normal panic and kexec flow.
>
> The IPI delivery is implemented as a weak function (panic_smp_redirect_cpu)
> so architectures with NMI support can override it for more reliable delivery.
>
> If the specified CPU is invalid, offline, or a panic is already in
> progress on another CPU, the redirection is skipped and panic continues
> on the current CPU.
>
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -299,6 +301,128 @@ void __weak crash_smp_send_stop(void)
> }
>
> atomic_t panic_cpu = ATOMIC_INIT(PANIC_CPU_INVALID);
> +atomic_t panic_redirect_cpu = ATOMIC_INIT(PANIC_CPU_INVALID);
> +
> +#if defined(CONFIG_SMP) && defined(CONFIG_CRASH_DUMP)
> +static int __init panic_force_cpu_setup(char *str)
> +{
> + int cpu;
> +
> + if (!str)
> + return -EINVAL;
> +
> + if (kstrtoint(str, 0, &cpu) || cpu < 0) {
It should also fail when (cpu >= nr_cpu_ids).
IMHO, cpu_online(panic_force_cpu) might cause an invalid access otherwise.
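A minimal sketch of the extra bound check, written as a userspace
analogue so that it can be compiled on its own. parse_force_cpu() and
the fixed nr_cpu_ids value are hypothetical stand-ins; the real code
would keep kstrtoint() and compare against the kernel's nr_cpu_ids:

```c
#include <errno.h>
#include <stdlib.h>

/* Stand-in for the kernel's nr_cpu_ids; the value 8 is an assumption. */
static unsigned int nr_cpu_ids = 8;

/* Hypothetical userspace analogue of panic_force_cpu_setup(). */
static int parse_force_cpu(const char *str, int *out)
{
	char *end;
	long val;

	if (!str || !*str)
		return -EINVAL;

	/* Base 0 accepts decimal, octal and hex, as kstrtoint(str, 0, ...) does. */
	errno = 0;
	val = strtol(str, &end, 0);
	if (errno || *end != '\0')
		return -EINVAL;

	/* Reject negative values and anything at or beyond nr_cpu_ids. */
	if (val < 0 || val >= (long)nr_cpu_ids)
		return -EINVAL;

	*out = (int)val;
	return 0;
}
```

With the upper bound in place, a later cpu_online(panic_force_cpu)
lookup is never asked about a CPU number outside the valid range.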
> + pr_warn("panic_force_cpu: invalid value '%s'\n", str);
> + return -EINVAL;
> + }
> +
> + panic_force_cpu = cpu;
> + return 0;
> +}
> +early_param("panic_force_cpu", panic_force_cpu_setup);
[...]
> +/**
> + * panic_try_force_cpu - Redirect panic to a specific CPU for crash kernel
> + * @buf: buffer to format the panic message into
> + * @buf_size: size of the buffer
> + * @fmt: panic message format string
> + * @args: arguments for format string
> + *
> + * Some platforms require panic handling to occur on a specific CPU
> + * for the crash kernel to function correctly. This function redirects
> + * panic handling to the CPU specified via the panic_force_cpu= boot parameter.
> + *
> + * Returns false if panic should proceed on current CPU.
> + * Returns true if panic was redirected.
> + */
> +__printf(3, 0)
> +static bool panic_try_force_cpu(char *buf, int buf_size, const char *fmt, va_list args)
> +{
> + int this_cpu = raw_smp_processor_id();
> + int old_cpu = PANIC_CPU_INVALID;
> +
> + /* Feature not enabled via boot parameter */
> + if (panic_force_cpu < 0)
> + return false;
> +
> + /* Already on target CPU - proceed normally */
> + if (this_cpu == panic_force_cpu)
> + return false;
> +
> + /* Target CPU is offline, can't redirect */
> + if (!cpu_online(panic_force_cpu))
> + return false;
> +
> + /* Another panic already in progress */
> + if (panic_in_progress()) {
> + return false;
> + }
Note that the preferred CPU (panic_force_cpu) could enter panic() right
now and use the shared buffer in parallel.
> + /*
> + * Only one CPU can do the redirect. Use atomic cmpxchg to ensure
> + * we don't race with another CPU also trying to redirect.
> + */
> + if (!atomic_try_cmpxchg(&panic_redirect_cpu, &old_cpu, this_cpu))
> + return false;
> +
> + vsnprintf(buf, buf_size, fmt, args);
I am afraid that we can't share the buffer with the panic() function
in a safe way here.
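One possible direction, only as a sketch and not a tested proposal:
format the message into a buffer owned by the redirect path, written
only by the CPU that wins the cmpxchg. The snippet below is a userspace
analogue using C11 atomics; panic_redirect_buf, its size, and
redirect_format_msg() are hypothetical names, not part of the patch:

```c
#include <stdarg.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

#define PANIC_CPU_INVALID (-1)

static atomic_int panic_redirect_cpu = PANIC_CPU_INVALID;

/* Hypothetical dedicated buffer, separate from the one panic() uses. */
static char panic_redirect_buf[1024];

static bool redirect_format_msg(int this_cpu, const char *fmt, ...)
{
	int old_cpu = PANIC_CPU_INVALID;
	va_list args;

	/* Only the CPU that wins the cmpxchg may write the buffer. */
	if (!atomic_compare_exchange_strong(&panic_redirect_cpu, &old_cpu, this_cpu))
		return false;

	va_start(args, fmt);
	vsnprintf(panic_redirect_buf, sizeof(panic_redirect_buf), fmt, args);
	va_end(args);
	return true;
}
```

A CPU that enters panic() directly would then keep using its own
buffer and never race with the formatting done here.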
> + console_verbose();
> + bust_spinlocks(1);
> +
> + pr_emerg("panic: Redirecting from CPU %d to CPU %d for crash kernel.\n",
> + this_cpu, panic_force_cpu);
> +
> + /* Dump original CPU before redirecting */
> + if (!test_taint(TAINT_DIE) &&
> + oops_in_progress <= 1 &&
> + IS_ENABLED(CONFIG_DEBUG_BUGVERBOSE)) {
> + dump_stack();
> + }
> +
> + printk_legacy_allow_panic_sync();
> + console_flush_on_panic(CONSOLE_FLUSH_PENDING);
Please remove the two function calls above. They are not safe here:
they increase the risk that __crash_kexec() won't be called.
They should be called only by panic(), and only when __crash_kexec()
was not called...
> + if (panic_smp_redirect_cpu(panic_force_cpu, buf) != 0) {
> + atomic_set(&panic_redirect_cpu, PANIC_CPU_INVALID);
> + return false;
If we return "false" here then panic() will continue on this CPU.
This CPU will likely acquire "panic_cpu" and __crash_kexec() will
likely fail because this is not the preferred CPU.
We might want to warn about it. Or we should at least
add a comment here.
And maybe we should not even call __crash_kexec() in panic() when
panic_redirect_cpu != PANIC_CPU_INVALID &&
panic_redirect_cpu != panic_cpu
> + }
> +
> + /* IPI/NMI sent, this CPU should stop */
> + return true;
> +}
> +#else
> +__printf(3, 0)
> +static inline bool panic_try_force_cpu(char *buf, int buf_size, const char *fmt, va_list args)
> +{
> + return false;
> +}
> +#endif /* CONFIG_SMP && CONFIG_CRASH_DUMP */
>
> bool panic_try_start(void)
> {
Best Regards,
Petr