Message-ID: <1d0ed92ab68409b62a14cd29d0021f92c6e2568a.camel@infradead.org>
Date: Wed, 01 Feb 2023 15:08:27 +0000
From: David Woodhouse <dwmw2@...radead.org>
To: Usama Arif <usama.arif@...edance.com>,
Thomas Gleixner <tglx@...utronix.de>
Cc: Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
"H . Peter Anvin" <hpa@...or.com>,
Paolo Bonzini <pbonzini@...hat.com>,
"Paul E . McKenney" <paulmck@...nel.org>,
linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
rcu@...r.kernel.org, mimoja@...oja.de, hewenliang4@...wei.com,
hushiyuan@...wei.com, luolongjun@...wei.com, hejingxian@...wei.com,
Tom Lendacky <thomas.lendacky@....com>,
Sean Christopherson <seanjc@...gle.com>,
Paul Menzel <pmenzel@...gen.mpg.de>,
Fam Zheng <fam.zheng@...edance.com>,
Punit Agrawal <punit.agrawal@...edance.com>,
simon.evans@...edance.com, liangma@...ngbit.com
Subject: Re: [PATCH v4 0/9] Parallel CPU bringup for x86_64
On Wed, 2023-02-01 at 14:40 +0000, Usama Arif wrote:
> On 01/02/2022 20:53, David Woodhouse wrote:
> > Doing the INIT/SIPI/SIPI in parallel for all APs and *then* waiting for
> > them shaves about 80% off the AP bringup time on a 96-thread 2-socket
> > Skylake box (EC2 c5.metal) — from about 500ms to 100ms.
> >
> > There are more wins to be had with further parallelisation, but this is
> > the simple part.
> >
>
> Hi,
>
> We are interested in reducing the boot time of servers (with kexec), and
> smpboot takes up a significant amount of time while booting. When
> testing the patch series (rebased to v6.1) on a server with 128 CPUs
> split across 2 NUMA nodes, it brought down the smpboot time from ~700ms
> to 100ms. Adding another cpuhp state for do_wait_cpu_initialized to make
> sure cpu_init is reached (as done in v1 of the series + using the
> cpu_finishup_mask) brought it down further to ~30ms.
>
> I just wanted to check what was needed to progress the patch series
> further for review? There weren't any comments on v4 of the patch so I
> couldn't figure out what more is needed. I think it's quite useful to
> have this working, so I would be really glad to help with anything
> needed to restart the review.

I believe the only thing holding it back was the fact that it broke on
some AMD CPUs.

We don't *think* there are any remaining software issues; we think it's
hardware. Either an actual hardware race in CPU or chipset, or perhaps
even something as simple as a voltage regulator which can't cope with
an increase in power draw from *all* the CPUs at the same time.

We have prodded AMD a few times to investigate, but so far to no avail.

Last time I actually spoke to Thomas in person, I think he agreed that
we should just merge it and disable the parallel mode for the affected
AMD CPUs.

If you've already rebased to a newer kernel and tested it, perhaps now
is the time to do just that.