Message-ID: <20240301114211.GC5795@aspen.lan>
Date: Fri, 1 Mar 2024 11:42:11 +0000
From: Daniel Thompson <daniel.thompson@...aro.org>
To: Doug Anderson <dianders@...omium.org>
Cc: Catalin Marinas <catalin.marinas@....com>,
Will Deacon <will@...nel.org>, Mark Rutland <mark.rutland@....com>,
Marc Zyngier <maz@...nel.org>,
Misono Tomohiro <misono.tomohiro@...itsu.com>,
Chen-Yu Tsai <wens@...e.org>, Stephen Boyd <swboyd@...omium.org>,
Sumit Garg <sumit.garg@...aro.org>,
Frederic Weisbecker <frederic@...nel.org>,
"Guilherme G. Piccoli" <gpiccoli@...lia.com>,
Josh Poimboeuf <jpoimboe@...nel.org>,
Kees Cook <keescook@...omium.org>,
Peter Zijlstra <peterz@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>,
Tony Luck <tony.luck@...el.com>,
Valentin Schneider <vschneid@...hat.com>,
linux-arm-kernel@...ts.infradead.org,
linux-hardening@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] arm64: smp: smp_send_stop() and crash_smp_send_stop() should try non-NMI first

On Thu, Feb 29, 2024 at 10:34:26AM -0800, Doug Anderson wrote:
> Hi,
>
> On Wed, Feb 28, 2024 at 5:11 AM Daniel Thompson
> <daniel.thompson@...aro.org> wrote:
> >
> > > I'm still hoping to get some sort of feedback here. If people think
> > > this is a terrible idea then I'll shut up now and leave well enough
> > > alone, but it would be nice to actively decide and get the patch out
> > > of limbo.
> >
> > > I've read the patch through a couple of times and was generally convinced by
> > the "do what x86 does" argument.
> >
> > > However until now I've always held my counsel since I wasn't familiar
> > with these code paths and I figured it was OK for me to have no opinion
> > because the first line of the description says that kgdb/kdb is 100% not
> > involved in causing the problem ;-) .
> >
> > However today I also took a look at the HAVE_NMI architectures and there
> > is no consensus between them about how to implement this: PowerPC uses
> > > NMI and most of the others use IRQ only; s390 special-cases the
> > > panic code path and acts differently compared to a normal SMP shutdown.
> >
> > <snip>
> >
> > > However, if we are talking ourselves into copying x86 then perhaps we should
> > > more accurately copy x86! Assuming I read the x86 code correctly then
> > > crash_smp_send_stop() will (mostly) go straight to NMI rather
> > than trialling an IRQ first! That is not what is currently implemented
> > in the patch for arm64.
>
> Sure, I'm happy to change the patch to work that way, though I might
> wait to get some confirmation from a maintainer that they think this
> idea is worth pursuing before spending more time on it.

100%. Don't respin on my account.

> I don't think it would be hard to have the "crash stop" code jump
> straight to NMI if that's what people want. Matching x86 here seems
> reasonable, though I'd also say that my gut still says that even for
> crash stop we should try to stop things cleanly before jumping to NMI.
> I guess I could imagine that the code we're kexec-ing to generate the
> core file might be more likely to find the hardware in a funny state
> if we stopped CPUs w/ NMI vs IRQ.

In terms of the "right thing to do" for kdump, reviewing the s390 code
might be a good idea. Unfortunately it's a bit different to the other
arches and I can't offer a 95% answer about what that arch does.

Daniel.