Message-ID: <20250708072321.GB1613376@noisy.programming.kicks-ass.net>
Date: Tue, 8 Jul 2025 09:23:21 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: "Paul E. McKenney" <paulmck@...nel.org>
Cc: Joel Fernandes <joelagnelf@...dia.com>, linux-kernel@...r.kernel.org,
Thomas Gleixner <tglx@...utronix.de>,
Andrea Righi <arighi@...dia.com>,
Frederic Weisbecker <frederic@...nel.org>, rcu@...r.kernel.org
Subject: Re: [PATCH v2] smp: Document preemption and stop_machine() mutual
exclusion
On Mon, Jul 07, 2025 at 08:56:04AM -0700, Paul E. McKenney wrote:
> On Mon, Jul 07, 2025 at 09:50:50AM +0200, Peter Zijlstra wrote:
> > On Sat, Jul 05, 2025 at 01:23:27PM -0400, Joel Fernandes wrote:
> > > Recently while revising RCU's cpu online checks, there was some discussion
> > > around how IPIs synchronize with hotplug.
> > >
> > > Add comments explaining how preemption disable creates mutual exclusion with
> > > CPU hotplug's stop_machine mechanism. The key insight is that stop_machine()
> > > atomically updates CPU masks and flushes IPIs with interrupts disabled, and
> > > cannot proceed while any CPU (including the IPI sender) has preemption
> > > disabled.
> >
> > I'm very conflicted on this. While the added comments aren't wrong,
> > they're not quite accurate either. Stop_machine doesn't wait for people
> > to enable preemption as such.
> >
> > Fundamentally there seems to be a misconception around what stop machine
> > is and how it works, and I don't feel these comments make things better.
> >
> > Basically, stop-machine (and stop_one_cpu(), stop_two_cpus()) use the
> > stopper task, a task running at the ultimate priority; if it is
> > runnable, it will run.
> >
> > Stop-machine simply wakes all the stopper tasks and co-ordinates them to
> > literally stop the machine. All CPUs have the stopper task scheduled and
> > then they go sit in a spin-loop driven state machine with IRQs disabled.
> >
> > There really isn't anything magical about any of this.
>
> There is the mechanism (which you have described above), and then there
> are the use cases. Those of us maintaining a given mechanism might
> argue that a detailed description of the mechanism suffices, but that
> argument does not always win the day.
>
> I do like the description in the stop_machine() kernel-doc header:
>
> * This can be thought of as a very heavy write lock, equivalent to
> * grabbing every spinlock in the kernel.
>
> Though doesn't this need to upgrade "spinlock" to "raw spinlock"
> now that PREEMPT_RT is in mainline?
>
> Also, this function is more powerful than grabbing every write lock
> in the kernel because it also excludes all regions of code that have
> preemption disabled, which is one thing that CPU hotplug is relying on.
> Any objection to calling out that additional semantic?
Best to just reformulate the entire comment, I think. State that it
provides exclusion vs. all non-preemptible regions in the kernel, at insane
cost, and that it should be avoided whenever humanly possible :-)
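
To make the stopper-task description quoted above concrete, here is a small
userspace toy model. It is only a sketch, not kernel code: every toy_* name
is invented for illustration, "scheduling" is a plain function call, and the
real state machine in kernel/stop_machine.c has more steps. The one thing it
tries to show is that nothing waits for preemption to be enabled as such; the
stopper simply cannot be scheduled on a CPU whose current task is
non-preemptible, and the callback runs only once every CPU is in its stopper.

```c
/* Userspace toy model of the stopper idea. All names are hypothetical. */
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 4

struct toy_cpu {
	int  preempt_count;   /* > 0: current task is non-preemptible */
	bool stopper_woken;   /* stopper task has been made runnable */
	bool stopper_running; /* stopper task owns this CPU */
};

struct toy_machine {
	struct toy_cpu cpu[TOY_NR_CPUS];
};

static inline void toy_preempt_disable(struct toy_machine *m, int c)
{
	assert(c >= 0 && c < TOY_NR_CPUS);
	m->cpu[c].preempt_count++;
}

static inline void toy_preempt_enable(struct toy_machine *m, int c)
{
	assert(c >= 0 && c < TOY_NR_CPUS);
	assert(m->cpu[c].preempt_count > 0);
	m->cpu[c].preempt_count--;
}

/* Wake the stopper on every CPU. Nothing is waited for here. */
static inline void toy_stop_machine_begin(struct toy_machine *m)
{
	for (int c = 0; c < TOY_NR_CPUS; c++)
		m->cpu[c].stopper_woken = true;
}

/*
 * One scheduling opportunity on CPU c. The stopper has the highest
 * priority, so it is picked as soon as the current task is preemptible.
 */
static inline void toy_schedule(struct toy_machine *m, int c)
{
	struct toy_cpu *cpu = &m->cpu[c];

	assert(c >= 0 && c < TOY_NR_CPUS);
	if (cpu->stopper_woken && cpu->preempt_count == 0)
		cpu->stopper_running = true;
}

/*
 * Run fn(arg) only if every CPU is inside its stopper, then release all
 * stoppers. Returns false (and does nothing) if any CPU is still busy.
 */
static inline bool toy_stop_machine_try_finish(struct toy_machine *m,
					       void (*fn)(void *), void *arg)
{
	for (int c = 0; c < TOY_NR_CPUS; c++)
		if (!m->cpu[c].stopper_running)
			return false;

	fn(arg);

	for (int c = 0; c < TOY_NR_CPUS; c++) {
		m->cpu[c].stopper_woken = false;
		m->cpu[c].stopper_running = false;
	}
	return true;
}

/* Example callback: counts how many times it has been run. */
static inline void toy_bump(void *arg)
{
	++*(int *)arg;
}
```

In this model, if CPU 2 calls toy_preempt_disable() before
toy_stop_machine_begin(), then scheduling every CPU still leaves
toy_stop_machine_try_finish() returning false; after toy_preempt_enable()
and one more toy_schedule() on CPU 2, it returns true and the callback has
run exactly once.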