Message-ID: <20260127204957.GMaXkk9WRLzn89WP-n@fat_crate.local>
Date: Tue, 27 Jan 2026 21:49:57 +0100
From: Borislav Petkov <bp@...en8.de>
To: "Kaplan, David" <David.Kaplan@....com>
Cc: "Chang S. Bae" <chang.seok.bae@...el.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"x86@...nel.org" <x86@...nel.org>,
"tglx@...utronix.de" <tglx@...utronix.de>,
"mingo@...hat.com" <mingo@...hat.com>,
"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
"peterz@...radead.org" <peterz@...radead.org>
Subject: Re: [PATCH 1/7] stop_machine: Introduce stop_machine_nmi()
On Tue, Jan 27, 2026 at 04:00:42PM +0000, Kaplan, David wrote:
> The existing stop_machine() function description in the same header uses
> lowercase cpu. This was intended to match.
You can change yours to "CPU" and I can convert the rest. Or I can do them all
- not a biggie.
> Not entirely sure I follow the suggestion, but keep in mind that
> stop_machine.c is only included in the build if CONFIG_SMP. So I had to put
> some stuff in the header to ensure that a non-SMP build would not fail.
Something like this:
diff --git a/include/linux/stop_machine.h b/include/linux/stop_machine.h
index 86113084456a..933fe8ddbec9 100644
--- a/include/linux/stop_machine.h
+++ b/include/linux/stop_machine.h
@@ -184,13 +184,8 @@ int stop_core_cpuslocked(unsigned int cpu, cpu_stop_fn_t fn, void *data);
 int stop_machine_from_inactive_cpu(cpu_stop_fn_t fn, void *data,
 				   const struct cpumask *cpus);
 
+bool stop_machine_nmi_handler_enabled(void);
 bool noinstr stop_machine_nmi_handler(void);
-DECLARE_STATIC_KEY_FALSE(stop_machine_nmi_handler_enable);
-static __always_inline bool stop_machine_nmi_handler_enabled(void)
-{
-	return static_branch_unlikely(&stop_machine_nmi_handler_enable);
-}
-
 #else	/* CONFIG_SMP || CONFIG_HOTPLUG_CPU */
 
 static __always_inline int stop_machine_cpuslocked(cpu_stop_fn_t fn, void *data,
diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 55a350048d4c..0ecec98d52a6 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -205,7 +205,13 @@ struct stop_machine_nmi_ctrl {
 	int err;
 };
 
-DEFINE_STATIC_KEY_FALSE(stop_machine_nmi_handler_enable);
+static DEFINE_STATIC_KEY_FALSE(stop_machine_nmi_handler_enable);
+
+bool stop_machine_nmi_handler_enabled(void)
+{
+	return static_branch_unlikely(&stop_machine_nmi_handler_enable);
+}
+
 static DEFINE_PER_CPU(struct stop_machine_nmi_ctrl, stop_machine_nmi_ctrl);
 
 static void enable_nmi_handler(struct multi_stop_data *msdata)
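
For !CONFIG_SMP builds, the header's #else branch only needs trivial stubs so the build keeps working without stop_machine.c, along these lines (just a sketch; the patch may already carry equivalents):

static __always_inline bool stop_machine_nmi_handler_enabled(void)
{
	return false;
}

static __always_inline bool stop_machine_nmi_handler(void)
{
	return false;
}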
> This was in the existing CPU patching logic. I believe it is intended to
> protect against the handler running multiple times.
>
> In particular, there are some cases where NMIs can get unmasked early so
> there could be a risk I'd think of a second NMI coming in while the handler
> is running. The Boolean protects against that. Maybe Chang knows more
> history here.
>
> That said, the whole point of stop_machine_nmi() is to avoid an NMI coming
> in, and for dynamic mitigations I explicitly made it mutually exclusive with
> DEBUG_ENTRY so this wouldn't happen...
Except that you don't need any of that fun. You can do:
struct multi_stop_data {
	cpu_stop_fn_t		fn;
	void			*data;
	/* Like num_online_cpus(), but hotplug cpu uses us, so we need this. */
	unsigned int		num_threads;
	const struct cpumask	*active_cpus;
	enum multi_stop_state	state;
	atomic_t		thread_ack;
	bool			use_nmi;
	/* cpumask of CPUs which need to run an NMI handler */
	struct cpumask		nmi_cpumask;
				^^^^^^^^^^^
				add this thing
};
which you initialize to the cpumask you want NMIs to run on.
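
I.e., assuming stop_machine_nmi() takes a cpumask argument like stop_machine() does, the setup side boils down to something like this (sketch only; exact placement depends on the rest of the series):

	msdata.use_nmi = true;
	/* each CPU clears its own bit once it has run fn() in NMI context */
	cpumask_copy(&msdata.nmi_cpumask, cpus);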
stop_machine_nmi_handler() tests its CPU ID against that mask; if the bit is
set, it clears it and runs the function pointer.
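
Roughly like this (sketch only; stop_machine_nmi_msdata is a made-up pointer standing in for however the NMI handler gets at the current multi_stop_data):

/* sketch: set by multi_cpu_stop() before kicking the CPUs, cleared afterwards */
static struct multi_stop_data *stop_machine_nmi_msdata;

bool noinstr stop_machine_nmi_handler(void)
{
	struct multi_stop_data *msdata = READ_ONCE(stop_machine_nmi_msdata);
	unsigned int cpu = raw_smp_processor_id();

	/* bit set means this CPU still has to run fn() from NMI context */
	if (!msdata || !cpumask_test_and_clear_cpu(cpu, &msdata->nmi_cpumask))
		return false;

	raw_cpu_write(stop_machine_nmi_ctrl.err, msdata->fn(msdata->data));
	return true;
}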
You don't need to enable anything, as the handler is implicitly enabled
whenever a bit is set in that cpumask. And you don't need to test whether
anything is enabled either, because clearing the bit also guarantees the
handler runs only once on the respective CPU.
And then you can wrap all that logic:
		case MULTI_STOP_RUN:
			if (is_active) {
				if (msdata->use_nmi) {
					arch_send_self_nmi();
					err = raw_cpu_read(stop_machine_nmi_ctrl.err);
				} else {
					err = msdata->fn(msdata->data);
				}
			}
into a single function which multi_cpu_stop() calls, and in that function you
decide whether to send a self-NMI or do a normal function call.
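
Something like this (sketch; the helper name is made up):

static int multi_stop_run(struct multi_stop_data *msdata)
{
	if (!msdata->use_nmi)
		return msdata->fn(msdata->data);

	arch_send_self_nmi();
	return raw_cpu_read(stop_machine_nmi_ctrl.err);
}

and the MULTI_STOP_RUN case then collapses to "err = multi_stop_run(msdata);"
when is_active.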
Sounds good?
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette