[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <9e84cdea-754c-1864-4c54-66beeba7f921@intel.com>
Date: Wed, 30 Mar 2022 16:44:13 -0700
From: Dave Hansen <dave.hansen@...el.com>
To: "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
Cc: tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
luto@...nel.org, peterz@...radead.org,
sathyanarayanan.kuppuswamy@...ux.intel.com, aarcange@...hat.com,
ak@...ux.intel.com, dan.j.williams@...el.com, david@...hat.com,
hpa@...or.com, jgross@...e.com, jmattson@...gle.com,
joro@...tes.org, jpoimboe@...hat.com, knsathya@...nel.org,
pbonzini@...hat.com, sdeep@...are.com, seanjc@...gle.com,
tony.luck@...el.com, vkuznets@...hat.com, wanpengli@...cent.com,
thomas.lendacky@....com, brijesh.singh@....com, x86@...nel.org,
linux-kernel@...r.kernel.org,
Sean Christopherson <sean.j.christopherson@...el.com>,
"Rafael J . Wysocki" <rafael.j.wysocki@...el.com>
Subject: Re: [PATCHv7 21/30] x86/acpi, x86/boot: Add multiprocessor wake-up
support
On 3/30/22 16:16, Kirill A. Shutemov wrote:
> On Mon, Mar 28, 2022 at 12:17:35PM -0700, Dave Hansen wrote:
>> Is there anything in the trampoline code or start_secondary() that would
>> cause damage if it got run later than the kernel expects?
>
> I didn't find anything specific.
>
> But I tried to simulate similar scenario by returning -EIO just after
> writing wake up command in acpi_wakeup_cpu(). System booted, but there is
> a warning in the dmesg:
>
> WARNING: CPU: 0 PID: 1 at kernel/irq/chip.c:210 irq_startup+0x2a3/0x2b0
>
> System seems recovered fine, but I'm not sure what will happen if
> wake up is delayed by VMM.
That shouldn't be too hard to simulate. Just add a spin loop at the
beginning of start_secondary() that can spin for grotesque amounts of
time and have it get more grotesque with each CPU that boots.
If you're still booting CPUs when userspace comes up, you've done as
much damage as possible.
But, I do think there are two relatively discrete problems here:
1. How long do we wait for a secondary CPU to actually ack the MADT
wakeup? (the answer might be "don't wait").
2. How do we verify that a woken-up CPU actually booted all the way?
Even if #1 is a loooooong time, it might get stalled before it fully boots.
So, for this patch, let's just remove the timeout. If the boot CPU
hangs (spins forever) because the VMM is being naughty, I'm OK with it.
It's hard to do too much damage when you're spinning.
#2 seems like a separate issue to tackle. Maybe it's as simple as
waiting for the secondary CPU to mark itself as online.
The only question is whether #2 is a big enough deal that it has to be
tackled before starting to merge the TDX guest support.
>>> Patch below drops timeout handling completely. Any opinions?
>>>
>>> Other option would be to check in the trampoline code that initiated wake
>>> up is legitimate. But it should only be untrue if VMM acting weird (or
>>> virtual BIOS is buggy). I don't think it's right side to deal with the
>>> problem.
>>
>> Yeah, that would just be a band-aid. If we are worried about the
>> secondary CPU doing some damage at a random time, the host can just wait
>> until *after* the wakeup check.
>
> VMM cannot wake up secondary CPU on its own. Guest has to initiate it by
> writing into the mailbox. Reader of the mailbox command is also within
> trust boundary -- it loops inside virtual BIOS code. But VMM can
> *postpone* reacting to the command by not scheduling the secondary.
Right. I was just pointing out that if we add a check early in the boot
code, the VMM can postpone its naughtiness until after the check.
>>>>> /*
>>>>> * If the CPU wakeup process is successful, store the
>>>>> * status in apic_id_wakemap to prevent re-wakeup
>>>>> * requests.
>>>>> */
>>>>> physid_set(apicid, apic_id_wakemap);
>>>>>
>>>>> return 0;
>>>>> }
>>>>
>>>> If this goes wrong, won't the new wakeup just timeout? Do we really
>>>> need a dedicated mechanism to stop re-wakeups? How much of a problem is
>>>> this going to be?
>>>
>>> Well, it can provide a proper diagnostics and a distinct error code. If
>>> you think it is unneeded we can drop it.
>>
>> It looks like debugging that you add when you're bringing something up.
>> I'm not sure of its utility going forward. I'd just zap it for now and
>> bring it back later if it's really needed.
>
> Sathya pointed out that this protection was added based on request[1] by
> Rafael.
>
> TDX should be safe from re-wakeups as we forbid offlining CPUs, but the
> wake up method suppose to be generic, not limited to TDX.
>
> [1] https://lore.kernel.org/all/CAJZ5v0jonk2Pb2HUHMoe69vnCV4+EV9XZpo2LwKrnYPF3CxD_A@mail.gmail.com/
Fair enough. Please just add that reasoning somewhere.
Powered by blists - more mailing lists