[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87v8ir6j96.ffs@tglx>
Date: Fri, 24 Mar 2023 02:16:53 +0100
From: Thomas Gleixner <tglx@...utronix.de>
To: David Woodhouse <dwmw2@...radead.org>,
Usama Arif <usama.arif@...edance.com>, kim.phillips@....com,
brgerst@...il.com
Cc: piotrgorski@...hyos.org, oleksandr@...alenko.name,
arjan@...ux.intel.com, mingo@...hat.com, bp@...en8.de,
dave.hansen@...ux.intel.com, hpa@...or.com, x86@...nel.org,
pbonzini@...hat.com, paulmck@...nel.org,
linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
rcu@...r.kernel.org, mimoja@...oja.de, hewenliang4@...wei.com,
thomas.lendacky@....com, seanjc@...gle.com, pmenzel@...gen.mpg.de,
fam.zheng@...edance.com, punit.agrawal@...edance.com,
simon.evans@...edance.com, liangma@...ngbit.com,
gpiccoli@...lia.com
Subject: Re: [PATCH v16 3/8] cpu/hotplug: Add dynamic parallel bringup
states before CPUHP_BRINGUP_CPU
On Thu, Mar 23 2023 at 23:12, David Woodhouse wrote:
> On Fri, 2023-03-24 at 00:05 +0100, Thomas Gleixner wrote:
>> + if (ret && can_rollback_cpu(st))
>> + WARN_ON(cpuhp_invoke_callback_range(false, cpu, st, CPUHP_OFFLINE));
>> + }
>
> And I'll take doing this bit unconditionally (it's basically a no-op if
> they already got rolled all the way back to CPUHP_OFFLINE, right?).
>
> But the additional complexity of having multiple steps is fairly
> minimal, and I'm already planning to *use* another one even in x86, as
> discussed.
It's not about the "complexity". That's a general design question and
I'm not agreeing with your approach of putting AP specifics into the BP
state space.
The BP only phase ends at the point where the AP is woken up via
SIPI/INIT/whatever. Period.
And no, we are not making this special just because it's the easiest way
to get it done. I have _zero_ interest in this kind of hackery which
just slaps stuff into the code where its conveniant without even
thinking about proper separations
We went a great length to separate things clearly and it takes more than
"oh let's reserve a few special states" to keep this separation
intact. That's a matter of correctness, maintainability and taste.
That microcode thing on X86 has absolutely no business in the pre
bringup DYN states. It's an AP problem and it can be solved completely
without BP interaction.
And before you start drooling over further great parallelism, can you
please take a step back and come up with a sensible design for the
actual real world requirments?
The point is that after an AP comes out of lala land and starts
executing there are mandatory synchronization points which need to be
handled by design. The number of synchronization points is architecture
and platform specific and might be 0, but thats a detail.
So the proper thing is to split CPUHP_BRINGUP_CPU, which is the bridging
state between AP and BP today, into a set of synchronization points
between BP and AP.
But that's non-trivial because if you look at bringup_cpu() then you'll
notice that this state has an implicit protection against interrupt
allocation/free and quite some architectures rely on this for their
early bringup code and possibly their STARTING state callbacks.
That aside. Let's just look at x86 as of today from the BP side:
1) Wakeup AP
2) Wait until there is sign of life
3) Let AP proceed
5) Wait until AP is done with init
6) TSC synchronization
7) Wait until it is online
and on the AP side:
1) Do low level init
2) Report life
3) Wait until BP allows to proceed
4) Do init
5) Report done
6) TSC synchronization
7) Report online
So surely you could claim that up to #6 (TSC sync) nothing needs to
synchronize.
But that's simply not true because topology information is implicitely
serialized by CPU hotplug and your attempt to serialize that in patch
7/8 is not cutting it at all because AP #4 (STARTING) callbacks rely
implicitely on the immutability of the topology information and so do
some of the later (threaded) AP callbacks too.
As I said before 5 out of 10 callbacks I looked at are not ready for
this.
Just because it did not explode in your face yet, does not make any of
your wet dreams more correct.
Again: I'm not interested in this kind of "works for me" nonsense at
all. I wasted enough time already to make CPU hotplug reliable and
understandable. I have no intention to waste more time on it just
because.
Thanks,
tglx
Powered by blists - more mailing lists