Message-ID: <CAD=FV=Xzynqn8H8G175OodJgiCoV4TjTQQQuzOcbSAzcLZcyhw@mail.gmail.com>
Date: Tue, 25 Jun 2024 16:08:13 -0700
From: Doug Anderson <dianders@...omium.org>
To: Will Deacon <will@...nel.org>
Cc: Catalin Marinas <catalin.marinas@....com>, Mark Rutland <mark.rutland@....com>,
Marc Zyngier <maz@...nel.org>, Misono Tomohiro <misono.tomohiro@...itsu.com>,
Chen-Yu Tsai <wens@...e.org>, Stephen Boyd <swboyd@...omium.org>,
Daniel Thompson <daniel.thompson@...aro.org>, Sumit Garg <sumit.garg@...aro.org>,
Frederic Weisbecker <frederic@...nel.org>, "Guilherme G. Piccoli" <gpiccoli@...lia.com>,
Josh Poimboeuf <jpoimboe@...nel.org>, Kees Cook <keescook@...omium.org>,
Peter Zijlstra <peterz@...radead.org>, Thomas Gleixner <tglx@...utronix.de>,
Tony Luck <tony.luck@...el.com>, Valentin Schneider <vschneid@...hat.com>,
linux-arm-kernel@...ts.infradead.org, linux-hardening@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] arm64: smp: smp_send_stop() and crash_smp_send_stop()
should try non-NMI first

Hi,

On Mon, Jun 24, 2024 at 6:55 AM Will Deacon <will@...nel.org> wrote:
>
> On Fri, May 17, 2024 at 01:01:58PM -0700, Doug Anderson wrote:
> > On Thu, Dec 7, 2023 at 5:03 PM Douglas Anderson <dianders@...omium.org> wrote:
> > > local_irq_disable();
> >
> > The above local_irq_disable() is not new for my patch but it seems
> > wonky for two reasons:
> >
> > 1. It feels like it should have been the first thing in the function.
> >
> > 2. It feels like it should be local_daif_mask() instead.
>
> Is that to ensure we don't take a pNMI? I think that makes sense, but
> let's please add a comment to say why local_irq_disable() is not
> sufficient.
Right, that was my thought. Mostly I realized it was right because the
normal (non-crash) stop case calls local_cpu_stop() which calls
local_daif_mask(). I was comparing the two and trying to figure out if
the difference was on purpose or an oversight. Looks like an oversight
to me.
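To illustrate the difference being discussed, here is a userspace toy model, not kernel code. It assumes pseudo-NMI is in use, so that local_irq_disable() only masks normal-priority interrupts while local_daif_mask() masks everything at the CPU. The `model_*` names and the struct are made up for this sketch.

```c
#include <stdbool.h>

/* Toy model only: two flags standing in for the CPU's interrupt masking state. */
struct model_cpu {
	bool prio_masked;	/* normal-priority IRQs masked (models a raised priority mask) */
	bool daif_masked;	/* all exceptions masked at the CPU (models PSTATE.DAIF set) */
};

/* Models local_irq_disable() with pseudo-NMI: only normal IRQs are blocked. */
static void model_local_irq_disable(struct model_cpu *c)
{
	c->prio_masked = true;
}

/* Models local_daif_mask(): nothing gets through, pseudo-NMIs included. */
static void model_local_daif_mask(struct model_cpu *c)
{
	c->prio_masked = true;
	c->daif_masked = true;
}

static bool model_can_take_irq(const struct model_cpu *c)
{
	return !c->prio_masked && !c->daif_masked;
}

static bool model_can_take_pnmi(const struct model_cpu *c)
{
	return !c->daif_masked;
}
```

In this model a CPU that has only called model_local_irq_disable() still reports model_can_take_pnmi() as true, which is the gap a comment in the real code would need to explain.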

Sure, I'll add a comment.

Ironically, looking at the code again I found _yet another_ corner
case I missed: panic_smp_self_stop(). If a CPU hits that case then we
could end up waiting for it when it's already stopped itself. I tried
to figure out how to solve that properly and it dawned on me that
maybe I should rethink part of my patch. Specifically, I had added a
new `stop_mask` in this patch because the panic case didn't update
`cpu_online_mask`. ...but that's easy enough to fix: just add a call
to `set_cpu_online(cpu, false)` in ipi_cpu_crash_stop(). ...so I'll do
that and avoid adding a new mask. If there's some reason why crash
stop shouldn't be marking a CPU offline then let me know and I'll go
back...
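A rough sketch of why reusing the online mask covers the self-stop corner case, written as a userspace toy model rather than kernel code. The bitmask stands in for `cpu_online_mask`, and every `model_*` function is a hypothetical name invented here, not an existing kernel helper.

```c
#include <stdbool.h>

/* Toy stand-in for cpu_online_mask: bit N set means CPU N is online. */
static unsigned int model_online_mask;

/* Models set_cpu_online(cpu, online). */
static void model_set_cpu_online(int cpu, bool online)
{
	if (online)
		model_online_mask |= 1u << cpu;
	else
		model_online_mask &= ~(1u << cpu);
}

/* Count the online CPUs other than the caller. */
static int model_num_other_online_cpus(int self)
{
	unsigned int others = model_online_mask & ~(1u << self);
	int n = 0;

	for (; others; others &= others - 1)
		n++;
	return n;
}

/*
 * Models what a CPU does when it stops, whether from the crash-stop
 * IPI or because it stopped itself: mark itself offline, then park.
 */
static void model_cpu_stop_self(int cpu)
{
	model_set_cpu_online(cpu, false);
}

/*
 * Models the stopping CPU: "deliver" a stop to every other CPU that is
 * still online, then report how many others remain online. A CPU that
 * already stopped itself is not in the mask, so it is never waited for.
 */
static int model_send_stop(int self)
{
	for (int cpu = 0; cpu < 32; cpu++) {
		if (cpu != self && (model_online_mask & (1u << cpu)))
			model_cpu_stop_self(cpu);
	}
	return model_num_other_online_cpus(self);
}
```

Because every stop path clears the same mask, the waiter needs no second mask to know who is left. The real code would also need a timeout and the IPI/NMI fallback, which this sketch leaves out.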

-Doug