Message-ID:
 <MRWPR09MB80226C714E225A8617DB3BE58F97A@MRWPR09MB8022.eurprd09.prod.outlook.com>
Date: Thu, 22 Jan 2026 10:42:10 +0000
From: Pnina Feder <PNINA.FEDER@...ileye.com>
To: Petr Mladek <pmladek@...e.com>
CC: "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>, "bhe@...hat.com"
	<bhe@...hat.com>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>, "lkp@...el.com" <lkp@...el.com>,
	"mgorman@...e.de" <mgorman@...e.de>, "mingo@...hat.com" <mingo@...hat.com>,
	"peterz@...radead.org" <peterz@...radead.org>, "rostedt@...dmis.org"
	<rostedt@...dmis.org>, "senozhatsky@...omium.org" <senozhatsky@...omium.org>,
	"tglx@...utronix.de" <tglx@...utronix.de>, Vladimir Kondratiev
	<Vladimir.Kondratiev@...ileye.com>
Subject: RE: [PATCH v7] panic: add panic_force_cpu= parameter to redirect
 panic to a specific CPU

Hi Petr,

Thank you for the review.

> On Thu 2026-01-15 15:05:52, Pnina Feder wrote:
> > Some platforms require panic handling to execute on a specific CPU for 
> > crash dump to work reliably. This can be due to firmware limitations, 
> > interrupt routing constraints, or platform-specific requirements where 
> > only a single CPU is able to safely enter the crash kernel.
> > 
> > Add the panic_force_cpu= kernel command-line parameter to redirect 
> > panic execution to a designated CPU. When the parameter is provided, 
> > the CPU that initially triggers panic forwards the panic context to 
> > the target CPU via IPI, which then proceeds with the normal panic and kexec flow.
> > 
> > The IPI delivery is implemented as a weak function 
> > (panic_smp_redirect_cpu) so architectures with NMI support can override it for more reliable delivery.
> > 
> > If the specified CPU is invalid, offline, or a panic is already in 
> > progress on another CPU, the redirection is skipped and panic 
> > continues on the current CPU.
> > 
> > --- a/kernel/panic.c
> > +++ b/kernel/panic.c
> > @@ -299,6 +301,128 @@ void __weak crash_smp_send_stop(void)
> >  }
> >  
> >  atomic_t panic_cpu = ATOMIC_INIT(PANIC_CPU_INVALID);
> > +atomic_t panic_redirect_cpu = ATOMIC_INIT(PANIC_CPU_INVALID);
> > +
> > +#if defined(CONFIG_SMP) && defined(CONFIG_CRASH_DUMP)
> > +static int __init panic_force_cpu_setup(char *str)
> > +{
> > +	int cpu;
> > +
> > +	if (!str)
> > +		return -EINVAL;
> > +
> > +	if (kstrtoint(str, 0, &cpu) || cpu < 0) {
> 
> It should also fail when (cpu >= nr_cpu_ids).
> 
> IMHO, cpu_online(panic_force_cpu) might cause an invalid access otherwise.

Fixed.
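
For illustration only, here is a userspace sketch of the kind of range check discussed above. It is not the patch code: parse_force_cpu() is a hypothetical name, nr_cpu_ids is a plain variable standing in for the kernel symbol, and strtol() stands in for kstrtoint() (which is stricter, e.g. about leading whitespace).

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

#define PANIC_CPU_INVALID (-1)

/* Assumption: userspace stand-in for the kernel's nr_cpu_ids. */
static unsigned int nr_cpu_ids = 8;

/*
 * Hypothetical helper: parse a panic_force_cpu= style value.
 * Returns the CPU number, or PANIC_CPU_INVALID when the string is not
 * a number in the range [0, nr_cpu_ids).
 */
int parse_force_cpu(const char *str)
{
	char *end;
	long val;

	/* Keeps the cast to int below safe. */
	assert(nr_cpu_ids > 0 && nr_cpu_ids <= INT_MAX);

	if (!str || *str == '\0')
		return PANIC_CPU_INVALID;

	errno = 0;
	val = strtol(str, &end, 0);	/* base 0: accepts decimal, 0x..., 0... */
	if (errno || *end != '\0')
		return PANIC_CPU_INVALID;

	/* Reject negative values and anything at or above nr_cpu_ids. */
	if (val < 0 || (unsigned long)val >= nr_cpu_ids)
		return PANIC_CPU_INVALID;

	return (int)val;
}
```

With the upper bound in place, a later cpu_online() style lookup is never handed an out-of-range index.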

> > +		pr_warn("panic_force_cpu: invalid value '%s'\n", str);
> > +		return -EINVAL;
> > +	}
> > +
> > +	panic_force_cpu = cpu;
> > +	return 0;
> > +}
> > +early_param("panic_force_cpu", panic_force_cpu_setup);
> 
> [...]
> 
> > +/**
> > + * panic_try_force_cpu - Redirect panic to a specific CPU for crash kernel
> > + * @buf: buffer to format the panic message into
> > + * @buf_size: size of the buffer
> > + * @fmt: panic message format string
> > + * @args: arguments for format string
> > + *
> > + * Some platforms require panic handling to occur on a specific CPU
> > + * for the crash kernel to function correctly. This function redirects
> > + * panic handling to the CPU specified via the panic_force_cpu= boot parameter.
> > + *
> > + * Returns false if panic should proceed on current CPU.
> > + * Returns true if panic was redirected.
> > + */
> > +__printf(3, 0)
> > +static bool panic_try_force_cpu(char *buf, int buf_size, const char *fmt, va_list args)
> > +{
> > +	int this_cpu = raw_smp_processor_id();
> > +	int old_cpu = PANIC_CPU_INVALID;
> > +
> > +	/* Feature not enabled via boot parameter */
> > +	if (panic_force_cpu < 0)
> > +		return false;
> > +
> > +	/* Already on target CPU - proceed normally */
> > +	if (this_cpu == panic_force_cpu)
> > +		return false;
> > +
> > +	/* Target CPU is offline, can't redirect */
> > +	if (!cpu_online(panic_force_cpu))
> > +		return false;
> > +
> > +	/* Another panic already in progress */
> > +	if (panic_in_progress()) {
> > +		return false;
> > +	}
> 
> Note that the preferred CPU (panic_force_cpu) could enter panic() right now and use the shared buffer in parallel.
> 
> > +	/*
> > +	 * Only one CPU can do the redirect. Use atomic cmpxchg to ensure
> > +	 * we don't race with another CPU also trying to redirect.
> > +	 */
> > +	if (!atomic_try_cmpxchg(&panic_redirect_cpu, &old_cpu, this_cpu))
> > +		return false;
> > +
> > +	vsnprintf(buf, buf_size, fmt, args);
> 
> I am afraid that we can't share the buffer with the panic() function in a safe way here.

Fixed by using a separate buffer, allocated with kmalloc() from a late_initcall.
The code falls back to a static message for early-boot panics or if kmalloc() fails.
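
As a rough userspace sketch of that pattern (not the patch code): malloc() stands in for kmalloc(), redirect_buf_init() stands in for the late_initcall, and all names, the buffer size, and the fallback text are made up for illustration.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

#define REDIRECT_BUF_SIZE 1024	/* assumption: size chosen for the sketch */

/* NULL until the init hook has run, or if the allocation failed. */
static char *redirect_buf;

/* Static message used when no dedicated buffer is available. */
const char redirect_fallback[] = "Redirected panic (message unavailable)";

/* Stand-in for the late_initcall that allocates the dedicated buffer. */
int redirect_buf_init(void)
{
	redirect_buf = malloc(REDIRECT_BUF_SIZE);
	return redirect_buf ? 0 : -1;
}

/*
 * Format the message into the dedicated buffer, never into the buffer
 * that panic() itself uses. Falls back to the static string when the
 * dedicated buffer does not exist (early boot, or allocation failure).
 */
const char *redirect_format_msg(const char *fmt, ...)
{
	va_list args;

	assert(fmt != NULL);

	if (!redirect_buf)
		return redirect_fallback;

	va_start(args, fmt);
	vsnprintf(redirect_buf, REDIRECT_BUF_SIZE, fmt, args);
	va_end(args);

	return redirect_buf;
}
```

The point of the separate allocation is that the target CPU can enter panic() on its own at any time without both CPUs writing to the same buffer.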

> > +	console_verbose();
> > +	bust_spinlocks(1);
> > +
> > +	pr_emerg("panic: Redirecting from CPU %d to CPU %d for crash kernel.\n",
> > +		 this_cpu, panic_force_cpu);
> > +
> > +	/* Dump original CPU before redirecting */
> > +	if (!test_taint(TAINT_DIE) &&
> > +	    oops_in_progress <= 1 &&
> > +	    IS_ENABLED(CONFIG_DEBUG_BUGVERBOSE)) {
> > +		dump_stack();
> > +	}
> > +
> > +	printk_legacy_allow_panic_sync();
> > +	console_flush_on_panic(CONSOLE_FLUSH_PENDING);
> 
> Please, remove the two above functions. They are not safe. They increase the risk that __crash_kexec() won't be called.
> They should be called only by panic() when __crash_kexec() was not called...

Removed.

> > +	if (panic_smp_redirect_cpu(panic_force_cpu, buf) != 0) {
> > +		atomic_set(&panic_redirect_cpu, PANIC_CPU_INVALID);
> > +		return false;
> 
> If we return "false" here then panic() will continue on this cpu.
> This CPU will likely acquire "panic_cpu" and crash_exec() will likely fail because this is not the preferred CPU.
> 
> We might want to warn about it. Or we should at least add a comment here.
> 
> And maybe we even should not call crash_exec() in panic() when
> 
>      panic_redirect_cpu != PANIC_CPU_INVALID &&
>      panic_redirect_cpu != panic_cpu
> 

Added pr_warn for offline CPU, redirect failure, and invalid parameter.
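
For the redirect-failure case, the intended behaviour is roughly what this userspace sketch shows. It is only an illustration: C11 atomics replace the kernel atomic_t helpers, fprintf() replaces pr_warn(), and try_redirect(), redirect_send_fn, send_ok() and send_fail() are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

#define PANIC_CPU_INVALID (-1)

static atomic_int panic_redirect_cpu = PANIC_CPU_INVALID;

/* Hypothetical IPI sender: returns 0 on success, non-zero on failure. */
typedef int (*redirect_send_fn)(int target_cpu);

/*
 * Returns true if the panic was handed over to target_cpu.
 * Returns false if the caller must continue the panic itself.
 */
bool try_redirect(int this_cpu, int target_cpu, redirect_send_fn send)
{
	int expected = PANIC_CPU_INVALID;

	assert(send != NULL);

	/* Only the first CPU to get here claims the redirect slot. */
	if (!atomic_compare_exchange_strong(&panic_redirect_cpu,
					    &expected, this_cpu))
		return false;

	if (send(target_cpu) != 0) {
		/* Warn: the panic now continues on a non-preferred CPU. */
		fprintf(stderr,
			"panic: redirect to CPU %d failed, continuing on CPU %d\n",
			target_cpu, this_cpu);
		atomic_store(&panic_redirect_cpu, PANIC_CPU_INVALID);
		return false;
	}

	return true;
}

/* Test doubles for the sender. */
int send_ok(int cpu) { (void)cpu; return 0; }
int send_fail(int cpu) { (void)cpu; return -1; }
```

Releasing the slot on failure mirrors the quoted atomic_set() call; the warning makes it visible that the crash kernel may be entered from a CPU other than the requested one.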

Thanks,
Pnina
