Message-ID: <20240624134943.GA8616@willie-the-truck>
Date: Mon, 24 Jun 2024 14:49:43 +0100
From: Will Deacon <will@...nel.org>
To: Doug Anderson <dianders@...omium.org>
Cc: Catalin Marinas <catalin.marinas@....com>,
Mark Rutland <mark.rutland@....com>, Marc Zyngier <maz@...nel.org>,
Misono Tomohiro <misono.tomohiro@...itsu.com>,
Chen-Yu Tsai <wens@...e.org>, Stephen Boyd <swboyd@...omium.org>,
Daniel Thompson <daniel.thompson@...aro.org>,
Sumit Garg <sumit.garg@...aro.org>,
Frederic Weisbecker <frederic@...nel.org>,
"Guilherme G. Piccoli" <gpiccoli@...lia.com>,
Josh Poimboeuf <jpoimboe@...nel.org>,
Kees Cook <keescook@...omium.org>,
Peter Zijlstra <peterz@...radead.org>,
Thomas Gleixner <tglx@...utronix.de>,
Tony Luck <tony.luck@...el.com>,
Valentin Schneider <vschneid@...hat.com>,
linux-arm-kernel@...ts.infradead.org,
linux-hardening@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] arm64: smp: smp_send_stop() and crash_smp_send_stop()
should try non-NMI first
On Fri, May 17, 2024 at 01:01:51PM -0700, Doug Anderson wrote:
> On Fri, Apr 12, 2024 at 6:55 AM Will Deacon <will@...nel.org> wrote:
> > On Thu, Dec 07, 2023 at 05:02:56PM -0800, Douglas Anderson wrote:
> > > In order to do this we also need a slight change in the way we keep
> > > track of which CPUs still need to be stopped. We need to know
> > > specifically which CPUs haven't stopped yet when we fall back to NMI
> > > but in the "crash stop" case the "cpu_online_mask" isn't updated as
> > > CPUs go down. This is why that code path had an atomic of the number
> > > of CPUs left. We'll solve this by making the cpumask into a
> > > global. This has a potential memory implication--with NR_CPUS = 4096
> > > this is 4096/8 = 512 bytes of globals. On the upside in that same case
> > > we take 512 bytes off the stack which could potentially have made the
> > > stop code less reliable. It can be noted that the NMI backtrace code
> > > (lib/nmi_backtrace.c) uses the same approach and that use also
> > > confirms that updating the mask is safe from NMI.
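
[Editor's note: as a point of reference, the kind of global mask the commit message describes might be declared roughly as below. This is a sketch, not the actual patch hunk; the name stop_mask is taken from the diff further down, and the exact form of the declaration is an assumption, though the use of to_cpumask() in the hunk implies a plain unsigned long bitmap.]

	/*
	 * Sketch only: a global bitmap usable with to_cpumask(), as in the
	 * hunk below. With NR_CPUS = 4096 this is 4096 bits, i.e.
	 * 4096 / 8 = 512 bytes of static data, matching the 512 bytes that
	 * previously lived on the stack as a local cpumask_t.
	 */
	static unsigned long stop_mask[BITS_TO_LONGS(NR_CPUS)];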
> >
> > Updating the global masks without any synchronisation feels broken though:
> >
> > > @@ -1085,77 +1080,75 @@ void smp_send_stop(void)
> > >  {
> > >  	unsigned long timeout;
> > >  
> > > -	if (num_other_online_cpus()) {
> > > -		cpumask_t mask;
> > > +	/*
> > > +	 * If this cpu is the only one alive at this point in time, online or
> > > +	 * not, there are no stop messages to be sent around, so just back out.
> > > +	 */
> > > +	if (num_other_online_cpus() == 0)
> > > +		goto skip_ipi;
> > >  
> > > -		cpumask_copy(&mask, cpu_online_mask);
> > > -		cpumask_clear_cpu(smp_processor_id(), &mask);
> > > +	cpumask_copy(to_cpumask(stop_mask), cpu_online_mask);
> > > +	cpumask_clear_cpu(smp_processor_id(), to_cpumask(stop_mask));
> >
> > I don't see what prevents multiple CPUs getting in here concurrently and
> > tripping over the masks. x86 seems to avoid that with an atomic
> > 'stopping_cpu' variable in native_stop_other_cpus(). Do we need something
> > similar?
>
> Good point. nmi_trigger_cpumask_backtrace(), which my code was based
> on, has a test_and_set() for this and that seems simpler than the
> atomic_try_cmpxchg() from the x86 code.
>
> If we run into that case, what do you think we should do? I guess x86
> just does a "return", though it feels like at least a warning should
> be printed since we're not doing what the function asked us to do.
> When we return there will be other CPUs running.
>
> In theory, we could try to help the other processor along. I don't
> know how much complexity is worth handling here, and I imagine that
> testing some of the corner cases would be extremely hard. Something
> like the following might work, but maybe it's too complex?
>
> --
>
> void smp_send_stop(void)
> {
> 	unsigned long timeout;
> 	static unsigned long stop_in_progress;
>
> 	/*
> 	 * If this cpu is the only one alive at this point in time, online or
> 	 * not, there are no stop messages to be sent around, so just back out.
> 	 */
> 	if (num_other_online_cpus() == 0)
> 		goto skip_ipi;
>
> 	/*
> 	 * If another is already trying to stop and we're here then either the
> 	 * other CPU hasn't sent us the IPI yet or we have interrupts disabled.
> 	 * Let's help the other CPU by stopping ourselves.
> 	 */
> 	if (test_and_set_bit(0, &stop_in_progress)) {
> 		/* Wait until the other CPU inits stop_mask */
> 		while (!test_bit(1, &stop_in_progress)) {
> 			cpu_relax();
> 			smp_rmb();
> 		}
> 		do_handle_IPI(IPI_CPU_STOP);
> 	}
>
> 	cpumask_copy(to_cpumask(stop_mask), cpu_online_mask);
> 	cpumask_clear_cpu(smp_processor_id(), to_cpumask(stop_mask));
>
> 	/* Indicate that we've initted stop_mask */
> 	set_bit(1, &stop_in_progress);
> 	smp_wmb();
> 	...
> 	...
>
> --
>
> Opinions?
I think following the x86 behaviour is probably the best place to start:
the CPU that ends up doing the IPI will spin anyway, so I don't think
there's much to gain by being pro-active on the recipient.
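
[Editor's note: for reference, a minimal sketch of what that x86-style guard could look like on the arm64 side. The stop_in_progress flag and the placement of the early return are illustrative assumptions, not the posted patch; stop_mask is the global introduced by the patch above.]

static unsigned long stop_in_progress;

void smp_send_stop(void)
{
	/*
	 * Sketch only: mirror x86's native_stop_other_cpus() and let the
	 * first caller win. A later caller just returns; the winner's
	 * IPI (or NMI fallback) will stop this CPU anyway.
	 */
	if (test_and_set_bit(0, &stop_in_progress))
		return;

	/*
	 * If this cpu is the only one alive, there are no stop messages
	 * to be sent around, so just back out.
	 */
	if (num_other_online_cpus() == 0)
		return;

	cpumask_copy(to_cpumask(stop_mask), cpu_online_mask);
	cpumask_clear_cpu(smp_processor_id(), to_cpumask(stop_mask));

	/* ... send IPI_CPU_STOP, wait for the timeout, then fall back to NMI ... */
}

[The CPU that loses the race simply parks in the stop handler once the IPI arrives, which is the behaviour being suggested above.]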
> > Apart from that, I'm fine with the gist of the patch.
>
> Great. Ironically, as I reviewed this patch with fresh eyes and looked
> at the things you brought up, I also found a few issues. I'll reply
> to my own post so I have context to respond to.
Ok.
> One other question: what did you think about Daniel's suggestion to go
> straight to NMI for crash_stop? I don't feel like I have enough
> experience with crash_stop to have intuition here, but it feels like
> trying IRQ first is still better in that case. I'm happy to go
> either way.
I don't have a strong preference, but I think I'd prefer trying non-NMI
first, as things ought to be more robust that way and it should work
most of the time.
Will