Message-ID: <be85a49f-9867-6117-9c35-f7d7b8c0cdff@bytedance.com>
Date: Wed, 22 Feb 2023 11:11:25 +0000
From: Usama Arif <usama.arif@...edance.com>
To: David Woodhouse <dwmw2@...radead.org>, tglx@...utronix.de,
kim.phillips@....com, arjan@...ux.intel.com
Cc: mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com,
hpa@...or.com, x86@...nel.org, pbonzini@...hat.com,
paulmck@...nel.org, linux-kernel@...r.kernel.org,
kvm@...r.kernel.org, rcu@...r.kernel.org, mimoja@...oja.de,
hewenliang4@...wei.com, thomas.lendacky@....com, seanjc@...gle.com,
pmenzel@...gen.mpg.de, fam.zheng@...edance.com,
punit.agrawal@...edance.com, simon.evans@...edance.com,
liangma@...ngbit.com
Subject: Re: [External] Re: [PATCH v9 0/8] Parallel CPU bringup for x86_64

On 22/02/2023 10:11, David Woodhouse wrote:
> On Wed, 2023-02-15 at 14:54 +0000, Usama Arif wrote:
>> The main change over v8 is dropping the patch to avoid repeated saves of MTRR
>> at boot time. It didn't make a difference to smpboot time and is independent
>> of parallel CPU bringup, so if needed can be explored in a separate patchset.
>>
>> The patches have also been rebased to v6.2-rc8 and retested and the
>> improvement in boot time is the same as v8.
>
> Thanks for picking this up, Usama.
>
> So the next thing that might be worth looking at is allowing the APs
> all to be running their hotplug thread simultaneously, bringing
> themselves from CPUHP_BRINGUP_CPU to CPUHP_AP_ONLINE. This series eats
> the initial INIT/SIPI/SIPI latency, but if there's any significant time
> in the AP hotplug thread, that could be worth parallelising.
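
Just to make sure I follow, that would look roughly like the sketch
below: fan out the kick to all APs first, then let their hotplug
threads advance concurrently. The helper names are made up for
illustration, not the real cpuhp API.

/*
 * Conceptual sketch only; the helpers are hypothetical.  Phase 1 kicks
 * every AP up to CPUHP_BRINGUP_CPU, phase 2 lets each AP's hotplug
 * thread run from CPUHP_BRINGUP_CPU to CPUHP_AP_ONLINE concurrently
 * instead of one CPU at a time, and phase 3 waits for them all to
 * arrive.
 */
for_each_present_cpu(cpu)
        kick_ap_to_state(cpu, CPUHP_BRINGUP_CPU);   /* hypothetical */

for_each_present_cpu(cpu)
        let_ap_advance_to(cpu, CPUHP_AP_ONLINE);    /* hypothetical */

for_each_present_cpu(cpu)
        wait_for_ap_state(cpu, CPUHP_AP_ONLINE);    /* hypothetical */
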
>
> There may be further wins in the INIT/SIPI/SIPI too. Currently we
> process one CPU at a time, sending INIT, SIPI, waiting 10µs and
> sending another SIPI.
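
For reference, my mental model of the current per-CPU sequence is
roughly the pseudocode below; the send_*/wait_* helpers are
placeholders, not the actual smpboot.c functions.

/*
 * Simplified sketch of the existing serial wakeup: the BSP handles one
 * AP completely before moving on to the next (BSP itself skipped for
 * brevity).  Helper names are placeholders, not real kernel symbols.
 */
for_each_present_cpu(cpu) {
        send_init_ipi(cpu);                     /* INIT */
        send_startup_ipi(cpu, start_eip);       /* first SIPI */
        udelay(10);
        send_startup_ipi(cpu, start_eip);       /* second SIPI */
        wait_for_cpu_initialized(cpu);          /* AP sets its bit in
                                                   cpu_initialized_mask */
}
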
>
> What if we sent the first INIT+SIPI to all CPUs, then did another pass
> sending another SIPI only to those which hadn't already started running
> and set their bit in cpu_initialized_mask?
>
> Might not be worth it, and there's an added complexity that they all
> have to wait for each other (on the real mode trampoline lock) before
> they can take their turn and get as far as setting their bit in
> cpu_initialized_mask. So we'd probably end up sending the second SIPI
> to most of them *anyway*.
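
If I understand the two-pass idea right, it would be roughly the sketch
below (cpumask_test_cpu() and cpu_initialized_mask are the real names,
the send_* helpers are placeholders), with the caveat you mention that
the trampoline lock may serialise the APs anyway.

/*
 * Sketch of the proposed two-pass wakeup (BSP skipped for brevity).
 * Pass 1 sends INIT + the first SIPI to every AP; pass 2 sends the
 * second SIPI only to CPUs that have not yet set their bit in
 * cpu_initialized_mask.
 */
for_each_present_cpu(cpu) {
        send_init_ipi(cpu);                     /* INIT */
        send_startup_ipi(cpu, start_eip);       /* first SIPI */
}

udelay(10);

for_each_present_cpu(cpu) {
        if (!cpumask_test_cpu(cpu, cpu_initialized_mask))
                send_startup_ipi(cpu, start_eip); /* second SIPI only if
                                                     the AP isn't up yet */
}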

Thanks! I think I sent out v10 a bit too early, but hopefully it looks
like everyone agrees on the suspend code in it at the moment?

As a next step, I was thinking of reposting the reuse-timer-calibration
patch separately and starting a discussion on it. It's not part of
parallel smp, but in my testing it takes away ~70% (70ms) of the
remaining smpboot time after parallel bringup. With the machine and
kernel I am testing, the kexec reboot time after parallel smp is just
under a second, so this represents ~7% of the boot time, which is a
notable reduction in server downtime. Or maybe someone could reply to
this thread saying it's not a good idea to post it, as I remember there
were quite a few reservations about it? :)

Thanks,
Usama