Message-ID:
 <DS7PR12MB820131851A9056F9DF8C37809490A@DS7PR12MB8201.namprd12.prod.outlook.com>
Date: Tue, 27 Jan 2026 16:00:42 +0000
From: "Kaplan, David" <David.Kaplan@....com>
To: Borislav Petkov <bp@...en8.de>, "Chang S. Bae" <chang.seok.bae@...el.com>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"x86@...nel.org" <x86@...nel.org>, "tglx@...utronix.de" <tglx@...utronix.de>,
	"mingo@...hat.com" <mingo@...hat.com>, "dave.hansen@...ux.intel.com"
	<dave.hansen@...ux.intel.com>, "peterz@...radead.org" <peterz@...radead.org>
Subject: RE: [PATCH 1/7] stop_machine: Introduce stop_machine_nmi()

[AMD Official Use Only - AMD Internal Distribution Only]

> -----Original Message-----
> From: Borislav Petkov <bp@...en8.de>
> Sent: Tuesday, January 27, 2026 9:50 AM
> To: Chang S. Bae <chang.seok.bae@...el.com>
> Cc: linux-kernel@...r.kernel.org; x86@...nel.org; tglx@...utronix.de;
> mingo@...hat.com; dave.hansen@...ux.intel.com; peterz@...radead.org;
> Kaplan, David <David.Kaplan@....com>
> Subject: Re: [PATCH 1/7] stop_machine: Introduce stop_machine_nmi()
>
> On Sun, Jan 25, 2026 at 01:42:16AM +0000, Chang S. Bae wrote:
> > +/**
> > + * stop_machine_nmi: freeze the machine and run this function in NMI context
> > + * @fn: the function to run
> > + * @data: the data ptr for the @fn()
> > + * @cpus: the cpus to run the @fn() on (NULL = any online cpu)
>
> s/cpu/CPU/g in all text.

The existing stop_machine() kernel-doc in the same header uses lowercase "cpu".  The new comments were written to match it.

>
> > + *
> > + * Like stop_machine() but runs the function in NMI context to avoid any risk of
> > + * interruption due to NMIs.
> > + *
> > + * Protects against CPU hotplug.
> > + */
> > +int stop_machine_nmi(cpu_stop_fn_t fn, void *data, const struct cpumask *cpus);
> > +
> > +/**
> > + * stop_machine_cpuslocked_nmi: freeze and run this function in NMI context
> > + * @fn: the function to run
> > + * @data: the data ptr for the @fn()
> > + * @cpus: the cpus to run the @fn() on (NULL = any online cpu)
> > + *
> > + * Same as above. Must be called from within a cpus_read_lock() protected
> > + * region. Avoids nested calls to cpus_read_lock().
> > + */
> > +int stop_machine_cpuslocked_nmi(cpu_stop_fn_t fn, void *data, const struct cpumask *cpus);
> >  /**
> >   * stop_core_cpuslocked: - stop all threads on just one core
> >   * @cpu: any cpu in the targeted core
> > @@ -160,6 +183,14 @@ int stop_core_cpuslocked(unsigned int cpu, cpu_stop_fn_t fn, void *data);
> >
> >  int stop_machine_from_inactive_cpu(cpu_stop_fn_t fn, void *data,
> >                                  const struct cpumask *cpus);
> > +
> > +bool noinstr stop_machine_nmi_handler(void);
> > +DECLARE_STATIC_KEY_FALSE(stop_machine_nmi_handler_enable);
>
> Why is the static key in the header if you have an accessor below?
>
> > +static __always_inline bool stop_machine_nmi_handler_enabled(void)
> > +{
> > +     return static_branch_unlikely(&stop_machine_nmi_handler_enable);
> > +}
>
> Just make the accessor the only thing that external code calls.

I'm not entirely sure I follow the suggestion, but keep in mind that stop_machine.c is only built when CONFIG_SMP is set.  So I had to put some of this in the header to ensure that a non-SMP build would not fail.
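
To illustrate the build constraint, here is a rough user-space sketch of the usual header pattern, not the code from the patch.  All names are hypothetical: SKETCH_CONFIG_SMP stands in for CONFIG_SMP and the stub_* functions stand in for the real handler and accessor.  When the "SMP" symbol is undefined, the header itself supplies inline stubs, so callers still compile and link even though the .c file is never built.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for CONFIG_SMP.  Leave it undefined to model a
 * non-SMP build, where the .c file with the real code is not compiled.
 */
#ifdef SKETCH_CONFIG_SMP
/* "SMP" build: only declarations here, definitions live in the .c file. */
bool stub_nmi_handler(void);
bool stub_nmi_handler_enabled(void);
#else
/* "Non-SMP" build: the header provides no-op stubs so callers still link. */
static inline bool stub_nmi_handler(void)
{
	return false;
}

static inline bool stub_nmi_handler_enabled(void)
{
	return false;
}
#endif

/*
 * Hypothetical caller (think of arch NMI entry code).  It compiles
 * unchanged in both configurations because the header always provides
 * both symbols.
 */
static inline bool stub_arch_nmi_entry(void)
{
	if (stub_nmi_handler_enabled())
		return stub_nmi_handler();
	return false;
}
```

With the symbol undefined, stub_arch_nmi_entry() always returns false and the compiler can drop the call entirely.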

>
> ...
>
> > +DEFINE_STATIC_KEY_FALSE(stop_machine_nmi_handler_enable);
> > +static DEFINE_PER_CPU(struct stop_machine_nmi_ctrl, stop_machine_nmi_ctrl);
> > +
> > +static void enable_nmi_handler(struct multi_stop_data *msdata)
> > +{
> > +     this_cpu_write(stop_machine_nmi_ctrl.msdata, msdata);
> > +     this_cpu_write(stop_machine_nmi_ctrl.nmi_enabled, true);
> > +}
>
> Why do we have to enable the NMI handler?

This was in the existing CPU patching logic.  I believe it is intended to protect against the handler running multiple times.

In particular, there are some cases where NMIs can get unmasked early, so I'd think there is a risk of a second NMI coming in while the handler is running.  The boolean protects against that.  Maybe Chang knows more of the history here.
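
As a rough user-space sketch of that guard (an illustration of the idea, not the patch code): the per-CPU variable is modeled as a plain array indexed by a CPU number, and all sketch_* names plus the run counter are hypothetical.  The field names msdata and nmi_enabled follow the quoted hunk.  The handler only does work when the flag is armed and disarms it before doing anything else, so a second NMI on the same CPU finds the flag clear and bails out.  The disarm-first ordering is my assumption about how such a guard would be written.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NR_CPUS 4	/* hypothetical CPU count for the model */

/* Models the per-CPU control block; field names follow the quoted hunk. */
struct sketch_nmi_ctrl {
	void *msdata;
	bool nmi_enabled;
};

/* Stand-in for a DEFINE_PER_CPU variable: one slot per CPU. */
static struct sketch_nmi_ctrl sketch_ctrl[SKETCH_NR_CPUS];

/* Hypothetical counter: how many times each CPU really ran the work. */
static int sketch_runs[SKETCH_NR_CPUS];

/* Arm the handler on one CPU, mirroring enable_nmi_handler() above. */
static void sketch_enable_nmi_handler(int cpu, void *msdata)
{
	sketch_ctrl[cpu].msdata = msdata;
	sketch_ctrl[cpu].nmi_enabled = true;
}

/*
 * Model of the NMI-side check.  Returns true if this NMI was consumed
 * by the stop-machine work, false if it was not ours to handle.
 */
static bool sketch_nmi_handler(int cpu)
{
	struct sketch_nmi_ctrl *ctrl = &sketch_ctrl[cpu];

	if (!ctrl->nmi_enabled)
		return false;

	/* Disarm first (assumed ordering) so a nested NMI sees it clear. */
	ctrl->nmi_enabled = false;
	sketch_runs[cpu]++;
	return true;
}
```

Arming CPU 1 once and then delivering two NMIs to it runs the work once; the second call returns false, as does any NMI on a CPU that was never armed.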

That said, the whole point of stop_machine_nmi() is to avoid an NMI coming in, and for dynamic mitigations I explicitly made it mutually exclusive with DEBUG_ENTRY so this wouldn't happen...

--David Kaplan
