Message-ID: <664c73de-4ad5-4d39-b7aa-9d1a14559535@nvidia.com>
Date: Tue, 8 Jul 2025 10:00:31 -0400
From: Joel Fernandes <joelagnelf@...dia.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: linux-kernel@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>,
Andrea Righi <arighi@...dia.com>, "Paul E . McKenney" <paulmck@...nel.org>,
Frederic Weisbecker <frederic@...nel.org>, rcu@...r.kernel.org
Subject: Re: [PATCH v2] smp: Document preemption and stop_machine() mutual
exclusion
On 7/8/2025 3:21 AM, Peter Zijlstra wrote:
> On Mon, Jul 07, 2025 at 10:19:52AM -0400, Joel Fernandes wrote:
>
>> From: Joel Fernandes <joelagnelf@...dia.com>
>> Subject: [PATCH] smp: Document preemption and stop_machine() mutual exclusion
>>
>> Recently, while revising RCU's CPU online checks, there was some discussion
>> around how IPIs synchronize with hotplug.
>>
>> Add comments explaining how preemption disable creates mutual exclusion with
>> CPU hotplug's stop_machine mechanism. The key insight is that stop_machine()
>> atomically updates CPU masks and flushes IPIs with interrupts disabled, and
>> cannot proceed while any CPU (including the IPI sender) has preemption
>> disabled.
>>
>> Cc: Andrea Righi <arighi@...dia.com>
>> Cc: Paul E. McKenney <paulmck@...nel.org>
>> Cc: Frederic Weisbecker <frederic@...nel.org>
>> Cc: rcu@...r.kernel.org
>> Acked-by: Paul E. McKenney <paulmck@...nel.org>
>> Co-developed-by: Frederic Weisbecker <frederic@...nel.org>
>> Signed-off-by: Joel Fernandes <joelagnelf@...dia.com>
>> ---
>> I am leaving in Paul's Ack but Paul please let me know if there is a concern!
>>
>> kernel/smp.c | 13 +++++++++++--
>> 1 file changed, 11 insertions(+), 2 deletions(-)
>>
>> diff --git a/kernel/smp.c b/kernel/smp.c
>> index 974f3a3962e8..957959031063 100644
>> --- a/kernel/smp.c
>> +++ b/kernel/smp.c
>> @@ -93,6 +93,9 @@ int smpcfd_dying_cpu(unsigned int cpu)
>> * explicitly (without waiting for the IPIs to arrive), to
>> * ensure that the outgoing CPU doesn't go offline with work
>> * still pending.
>> + *
>> + * This runs with interrupts disabled inside the stopper task invoked
>> + * by stop_machine(), ensuring CPU offlining and IPI flushing are atomic.
>
> So below you use 'mutual exclusion', which I prefer over 'atomic' as
> used here.
Sure, will fix.
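
(Aside for anyone following the thread: the offlining side that the first
hunk above documents can be sketched roughly like this. Illustrative only,
not the exact kernel/cpu.c / kernel/smp.c call chains:

    stop_machine(take_cpu_down, ...)        // runs in the stopper task
      -> interrupts disabled on the dying CPU
      -> CPU_DYING callbacks, among them:
           smpcfd_dying_cpu()
             -> __flush_smp_call_function_queue(false)   // flush pending IPIs

The stopper has to preempt every CPU, so none of this can begin while any
CPU, the IPI sender included, still has preemption disabled.)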
>
>> */
>> __flush_smp_call_function_queue(false);
>> irq_work_run();
>> @@ -418,6 +421,10 @@ void __smp_call_single_queue(int cpu, struct llist_node *node)
>> */
>> static int generic_exec_single(int cpu, call_single_data_t *csd)
>> {
>> + /*
>> + * Preemption already disabled here so stopper cannot run on this CPU,
>> + * ensuring mutual exclusion with CPU offlining and last IPI flush.
>> + */
>> if (cpu == smp_processor_id()) {
>> smp_call_func_t func = csd->func;
>> void *info = csd->info;
>> @@ -638,8 +645,10 @@ int smp_call_function_single(int cpu, smp_call_func_t func, void *info,
>> int err;
>>
>> /*
>> - * prevent preemption and reschedule on another processor,
>> - * as well as CPU removal
>> + * Prevent preemption and reschedule on another processor, as well as
>> + * CPU removal.
>
>> Also preempt_disable() prevents stopper from running on
>> + * this CPU, thus providing atomicity between the cpu_online() check
>> + * and IPI sending, ensuring the IPI is not missed by a CPU going offline.
>
> That first sentence already covers this, no? 'prevents preemption' ->
> stopper task cannot run, 'CPU removal' -> no CPU_DYING (because no
> stopper).
Yeah, I understand that's "implied", but I'd like to specifically call that out
if that's OK :)
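
(Purely illustrative, the sender-side pattern this hunk tries to spell out;
send_csd_ipi() below is a hypothetical stand-in for the real csd-queueing
plus IPI path in smp_call_function_single():

    preempt_disable();          /* stopper cannot run on this CPU now      */
    if (cpu_online(cpu))        /* ...so this cannot flip to offline       */
        send_csd_ipi(cpu);      /* before the IPI has been queued and sent */
    preempt_enable();           /* stopper, and hence CPU_DYING plus the
                                 * final IPI flush, may run again          */

Missing the IPI would only be possible if the online check and the send
could be separated by the offline sequence, which the disabled-preemption
window rules out.)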
> Also that 'atomicity' vs 'mutual exclusion' thing.
Sure, will fix :)
Thanks!
- Joel