Message-ID: <20240823104552.GB31866@willie-the-truck>
Date: Fri, 23 Aug 2024 11:45:53 +0100
From: Will Deacon <will@...nel.org>
To: Douglas Anderson <dianders@...omium.org>
Cc: Catalin Marinas <catalin.marinas@....com>, Yu Zhao <yuzhao@...gle.com>,
Mark Rutland <mark.rutland@....com>,
Misono Tomohiro <misono.tomohiro@...itsu.com>,
Marc Zyngier <maz@...nel.org>, Sumit Garg <sumit.garg@...aro.org>,
Chen-Yu Tsai <wens@...e.org>,
Daniel Thompson <daniel.thompson@...aro.org>,
Stephen Boyd <swboyd@...omium.org>,
Frederic Weisbecker <frederic@...nel.org>,
"Guilherme G. Piccoli" <gpiccoli@...lia.com>,
James Morse <james.morse@....com>,
Jonathan Cameron <Jonathan.Cameron@...wei.com>,
Kees Cook <kees@...nel.org>, Puranjay Mohan <puranjay@...nel.org>,
Tony Luck <tony.luck@...el.com>,
linux-arm-kernel@...ts.infradead.org,
linux-hardening@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3] arm64: smp: smp_send_stop() and crash_smp_send_stop()
should try non-NMI first
Hi Doug,
On Wed, Aug 21, 2024 at 02:53:57PM -0700, Douglas Anderson wrote:
> When testing hard lockup handling on my sc7180-trogdor-lazor device
> with pseudo-NMI enabled, with serial console enabled and with kgdb
> disabled, I found that the stack crawls printed to the serial console
> ended up as a jumbled mess. After rebooting, the pstore-based console
> looked fine though. Also, enabling kgdb to trap the panic made the
> console look fine and avoided the mess.
Just a small nit:
>  	while (num_other_online_cpus() && timeout--)
>  		udelay(1);
>
> -	if (num_other_online_cpus())
> +	/*
> +	 * If CPUs are still online, try an NMI. There's no excuse for this to
> +	 * be slow, so we only give them an extra 10 ms to respond.
> +	 */
> +	if (num_other_online_cpus() && ipi_should_be_nmi(IPI_CPU_STOP_NMI)) {
We probably want an smp_rmb() here...
> +		cpumask_copy(&mask, cpu_online_mask);
> +		cpumask_clear_cpu(smp_processor_id(), &mask);
> +
> +		pr_info("SMP: retry stop with NMI for CPUs %*pbl\n",
> +			cpumask_pr_args(&mask));
> +
> +		smp_cross_call(&mask, IPI_CPU_STOP_NMI);
> +		timeout = USEC_PER_MSEC * 10;
> +		while (num_other_online_cpus() && timeout--)
> +			udelay(1);
> +	}
> +
> +	if (num_other_online_cpus()) {
... and again here, just to make sure that the re-read of cpu_online_mask
is ordered after the read of __num_online_cpus in num_other_online_cpus().
I can add those when applying.
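
To illustrate the ordering concern, here is a rough userspace sketch using
C11 atomics rather than kernel primitives. The names online_mask, num_online,
mark_offline() and other_online_snapshot() are hypothetical stand-ins, and the
writer side (clear the mask bit, then drop the count) is an assumption made
for the example. The acquire fence roughly plays the role of smp_rmb(): it
keeps the mask load from being satisfied before the count load.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins for cpu_online_mask and __num_online_cpus. */
static _Atomic uint64_t online_mask = 0xf;	/* CPUs 0-3 start online */
static atomic_int num_online = 4;

/* Writer side (assumed ordering): clear the mask bit, then drop the count. */
static void mark_offline(unsigned int cpu)
{
	uint64_t bit = UINT64_C(1) << cpu;
	uint64_t old = atomic_fetch_and_explicit(&online_mask, ~bit,
						 memory_order_relaxed);

	if (old & bit) {
		/* Publish the mask update before the count update. */
		atomic_thread_fence(memory_order_release);
		atomic_fetch_sub_explicit(&num_online, 1, memory_order_relaxed);
	}
}

/* Reader side: read the count, issue a read barrier, then read the mask. */
static uint64_t other_online_snapshot(unsigned int self)
{
	int others = atomic_load_explicit(&num_online,
					  memory_order_relaxed) - 1;

	if (others == 0)
		return 0;

	/* Analogue of smp_rmb(): order the mask load after the count load. */
	atomic_thread_fence(memory_order_acquire);

	return atomic_load_explicit(&online_mask, memory_order_relaxed) &
	       ~(UINT64_C(1) << self);
}
```

Without the fence, the reader could in principle pair a fresh count with a
stale mask; with it, the mask is at least as new as the count that was read.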
Will