Message-ID: <80c16fe8-471a-4e31-987c-15d377f8f165@intel.com>
Date: Thu, 5 Feb 2026 18:14:39 -0800
From: "Chang S. Bae" <chang.seok.bae@...el.com>
To: Borislav Petkov <bp@...en8.de>, "Kaplan, David" <David.Kaplan@....com>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"x86@...nel.org" <x86@...nel.org>, "tglx@...utronix.de" <tglx@...utronix.de>,
"mingo@...hat.com" <mingo@...hat.com>, "dave.hansen@...ux.intel.com"
<dave.hansen@...ux.intel.com>, "peterz@...radead.org" <peterz@...radead.org>
Subject: Re: [PATCH 1/7] stop_machine: Introduce stop_machine_nmi()
On 2/2/2026 2:54 AM, Borislav Petkov wrote:
...
> @@ -174,8 +174,26 @@ struct multi_stop_data {
>
> enum multi_stop_state state;
> atomic_t thread_ack;
> +
> + bool use_nmi;
> +
> + /*
> + * cpumasks of CPUs on which to raise an NMI; used in the NMI
> + * stomp_machine variant. nmi_cpus_done is used for tracking
> + * when the NMI handler has executed successfully.
> + */
> + struct cpumask nmi_cpus;
> + struct cpumask nmi_cpus_done;
> +
> +};
Looks like every stop_machine() variant will then spend stack space on
these masks. It seems they could be cpumask_var_t instead.
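Roughly along these lines, I think (just a sketch; the allocation-failure
cleanup and the matching free_cpumask_var() calls are omitted):

	struct multi_stop_data {
		...
		cpumask_var_t		nmi_cpus;
		cpumask_var_t		nmi_cpus_done;
	};

	if (use_nmi) {
		if (!alloc_cpumask_var(&msdata.nmi_cpus, GFP_KERNEL) ||
		    !zalloc_cpumask_var(&msdata.nmi_cpus_done, GFP_KERNEL))
			return -ENOMEM;
		cpumask_copy(msdata.nmi_cpus, cpus);
	}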
Alternatively, to simplify this further, a per-CPU variable could
achieve the same if I understand correctly:
	struct stop_machine_nmi_ctrl {
		...
		bool done;
	};
Then,
	for_each_cpu(cpu, cpus)
		per_cpu(stop_machine_nmi_ctrl.done, cpu) = false;
	...
	ret = stop_cpus(cpu_online_mask, multi_cpu_stop, &msdata);
	...
	for_each_cpu(cpu, cpus) {
		if (!per_cpu(stop_machine_nmi_ctrl.done, cpu)) {
			pr_err("Some CPUs didn't run ...\n");
			return -EINVAL;
		}
	}
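The NMI handler side would then only need to flag its own CPU once it
has run @fn, something like (just a sketch):

	this_cpu_write(stop_machine_nmi_ctrl.done, true);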
Or, I guess even the msdata pointer alone could do that too -- set it
per CPU beforehand and check afterwards whether the NMI handler has
cleared it to NULL.
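Roughly, with a made-up per-CPU pointer name just for illustration:

	static DEFINE_PER_CPU(struct multi_stop_data *, nmi_msdata);

	/* before stop_cpus(): */
	for_each_cpu(cpu, cpus)
		per_cpu(nmi_msdata, cpu) = &msdata;

	/* in the NMI handler, once it has run @fn: */
	this_cpu_write(nmi_msdata, NULL);

	/* afterwards, a non-NULL pointer means that CPU never got there: */
	for_each_cpu(cpu, cpus) {
		if (per_cpu(nmi_msdata, cpu))
			return -EINVAL;
	}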
> +static int __stop_machine_cpuslocked(cpu_stop_fn_t fn, void *data,
> + const struct cpumask *cpus, bool use_nmi)
> {
> struct multi_stop_data msdata = {
> .fn = fn,
> .data = data,
> .num_threads = num_online_cpus(),
> .active_cpus = cpus,
> + .use_nmi = use_nmi,
> };
> + int ret, cpu;
> +
> + if (use_nmi) {
> + cpumask_copy(&msdata.nmi_cpus, cpus);
> + cpumask_clear(&msdata.nmi_cpus_done);
> + }
>
> lockdep_assert_cpus_held();
>
> @@ -617,7 +677,32 @@ int stop_machine_cpuslocked(cpu_stop_fn_t fn, void *data,
>
> /* Set the initial state and stop all online cpus. */
> set_state(&msdata, MULTI_STOP_PREPARE);
> - return stop_cpus(cpu_online_mask, multi_cpu_stop, &msdata);
> + ret = stop_cpus(cpu_online_mask, multi_cpu_stop, &msdata);
> +
> + if (!use_nmi)
> + return ret;
> +
> + if (!cpumask_equal(cpus, &msdata.nmi_cpus_done)) {
> + pr_err("Some CPUs didn't run the stomp_machine NMI handler\n");
> + return -EINVAL;
> + } else {
> + for_each_cpu(cpu, cpus)
> + ret |= per_cpu(stop_machine_nmi_ctrl.err, cpu);
This error accumulation here makes sense to me, though.
Currently, stop_machine() documents:
/*
 ...
 * Return: 0 if all invocations of @fn return zero. Otherwise, the
 * value returned by an arbitrarily chosen member of the set of calls to
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 * @fn that returned non-zero.
 */
This behavior dates back to:
commit 1142d810298e694 ("cpu_stop: implement stop_cpu[s]()")
where cpu_stopper_thread() does:
	struct cpu_stop_done *done = work->done;
	...
	ret = fn(arg);
	if (ret)
		done->ret = ret;
So the return value is overwritten rather than accumulated, while struct
cpu_stop_done is shared, as its comment notes:
/*
 * Structure to determine completion condition and record errors. May
 * be shared by works on different cpus.
 */
I don't know whether that was an intentional design choice or not. But
at least the NMI variant might have a slightly different semantic in
this regard.