Message-ID: <c83673d74bc161b8e5bfcc3049ccfecf5c9e96f5.camel@infradead.org>
Date: Tue, 01 Feb 2022 12:39:17 +0000
From: David Woodhouse <dwmw2@...radead.org>
To: Borislav Petkov <bp@...en8.de>
Cc: Tom Lendacky <thomas.lendacky@....com>,
Thomas Gleixner <tglx@...utronix.de>,
Sean Christopherson <seanjc@...gle.com>,
Ingo Molnar <mingo@...hat.com>,
Dave Hansen <dave.hansen@...ux.intel.com>,
"x86@...nel.org" <x86@...nel.org>,
"H . Peter Anvin" <hpa@...or.com>,
Paolo Bonzini <pbonzini@...hat.com>,
"Paul E . McKenney" <paulmck@...nel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"rcu@...r.kernel.org" <rcu@...r.kernel.org>,
"mimoja@...oja.de" <mimoja@...oja.de>,
"hewenliang4@...wei.com" <hewenliang4@...wei.com>,
"hushiyuan@...wei.com" <hushiyuan@...wei.com>,
"luolongjun@...wei.com" <luolongjun@...wei.com>,
"hejingxian@...wei.com" <hejingxian@...wei.com>,
Joerg Roedel <joro@...tes.org>,
Andrew Cooper <andrew.cooper3@...rix.com>
Subject: Re: [PATCH v3 6/9] x86/smpboot: Support parallel startup of
secondary CPUs
On Tue, 2022-02-01 at 11:56 +0100, Borislav Petkov wrote:
> On Tue, Feb 01, 2022 at 10:25:01AM +0000, David Woodhouse wrote:
> > Thanks. It looks like that is only invoked after boot, with a write to
> > /sys/devices/system/cpu/microcode/reload.
> >
> > My series is only parallelising the initial bringup at boot time, so it
> > shouldn't make any difference.
>
> No, I don't mean __reload_late() - I pointed you at that function to
> show the dance we must do when updating microcode late.
>
> The load_ucode_{ap,bsp}() routines are what is called when loading ucode
> early.
>
> So the question is, does the parallelizing change the order in which APs
> are brought up and can it happen that a SMT sibling of a two-SMT core
> executes *something* while the other SMT sibling is updating microcode.
>
> If so, that would be bad.
Right. So as you surmise, I haven't broken that... yet. At least not in
the patches I've posted :)
The call to ucode_cpu_init() is in cpu_init(), right after the call to
wait_for_master_cpu(), which sets this AP's bit in cpu_initialized_mask
and then waits for the BSP to set its bit in cpu_callout_mask.
That's a full synchronization point with do_wait_cpu_initialized() on
the BSP, which waits for the former and then sets the latter.
So... with the series I've posted, all APs end up waiting in
wait_for_master_cpu() until the final serialized bringup.
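
For illustration only, the handshake described above can be modelled in
user space with pthreads and C11 atomics. This is a rough sketch, not
kernel code: the two flag arrays stand in for cpu_initialized_mask and
cpu_callout_mask, and ap_thread(), serialized_bringup(), NR_APS, in_ucode
and max_in_ucode are hypothetical names invented for the example. It
assumes the BSP waits for each AP to finish before releasing the next,
which is modelled here with pthread_join().

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_APS 4 /* hypothetical number of secondary CPUs */

/* Stand-ins for cpu_initialized_mask (set by AP) and cpu_callout_mask (set by BSP). */
static atomic_bool cpu_initialized[NR_APS];
static atomic_bool cpu_callout[NR_APS];

/* How many APs are in the modelled microcode step now, and the most seen at once. */
static atomic_int in_ucode;
static atomic_int max_in_ucode;

/* AP side: models wait_for_master_cpu() followed by ucode_cpu_init(). */
static void *ap_thread(void *arg)
{
    int cpu = (int)(intptr_t)arg;

    /* Announce ourselves, then park until the BSP releases this CPU. */
    atomic_store(&cpu_initialized[cpu], true);
    while (!atomic_load(&cpu_callout[cpu]))
        sched_yield();

    /* Modelled microcode load: record the concurrency high-water mark. */
    int now = atomic_fetch_add(&in_ucode, 1) + 1;
    int prev = atomic_load(&max_in_ucode);
    while (now > prev &&
           !atomic_compare_exchange_weak(&max_in_ucode, &prev, now))
        ;
    atomic_fetch_sub(&in_ucode, 1);
    return NULL;
}

/* BSP side: start all APs in parallel, then release them one at a time. */
int serialized_bringup(void)
{
    pthread_t ap[NR_APS];

    atomic_store(&in_ucode, 0);
    atomic_store(&max_in_ucode, 0);
    for (int cpu = 0; cpu < NR_APS; cpu++) {
        atomic_store(&cpu_initialized[cpu], false);
        atomic_store(&cpu_callout[cpu], false);
    }

    /* Parallel part: every AP starts and parks in the handshake. */
    for (int cpu = 0; cpu < NR_APS; cpu++) {
        int err = pthread_create(&ap[cpu], NULL, ap_thread,
                                 (void *)(intptr_t)cpu);
        assert(err == 0);
        (void)err;
    }

    /* Serialized part: models do_wait_cpu_initialized() for each CPU in turn. */
    for (int cpu = 0; cpu < NR_APS; cpu++) {
        while (!atomic_load(&cpu_initialized[cpu]))
            sched_yield();
        atomic_store(&cpu_callout[cpu], true);
        pthread_join(ap[cpu], NULL); /* wait before releasing the next AP */
    }

    return atomic_load(&max_in_ucode);
}
```

Because each AP is joined before the next one is released, no two
modelled APs are ever in the microcode step together, so
serialized_bringup() returns 1. Dropping the pthread_join() from the
release loop would let the count rise above 1.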
In the top of my git tree, you can see a half-baked 'parallel part 2'
commit which introduces a new x86/cpu:wait-init cpuhp state that would
invoke do_wait_cpu_initialized() for each CPU in turn, which *would*
release them all into load_ucode_bsp() at the same time and have
precisely the problem you're describing.
I'll commit a FIXME comment now so that it doesn't slip my mind.
Thanks.
> > However... it does look like there's nothing preventing a sibling being
> > brought online *while* the dance you mention above is occurring.
>
> Bottom line is: of the two SMT siblings, one needs to be updating
> microcode while the other is idle. I.e., what __reload_late() does.
>
> > Shouldn't __reload_late() take the device_hotplug_lock to prevent that?
>
> See reload_store().
Hm, not sure I see how that's protecting itself from someone
simultaneously echoing 1 > /sys/devices/system/cpu/cpu${SIBLING}/online