[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <cbf171e1-a518-429b-fd07-3526a0b252bf@amd.com>
Date: Wed, 10 Apr 2024 11:14:25 -0500
From: Tom Lendacky <thomas.lendacky@....com>
To: Kai Huang <kai.huang@...el.com>, linux-kernel@...r.kernel.org
Cc: x86@...nel.org, dave.hansen@...el.com, bp@...en8.de,
kirill.shutemov@...ux.intel.com, tglx@...utronix.de, mingo@...hat.com,
hpa@...or.com, luto@...nel.org, peterz@...radead.org,
rick.p.edgecombe@...el.com, ashish.kalra@....com, chao.gao@...el.com,
bhe@...hat.com, nik.borisov@...e.com, pbonzini@...hat.com, seanjc@...gle.com
Subject: Re: [PATCH v3 1/5] x86/kexec: do unconditional WBINVD for bare-metal
in stop_this_cpu()
On 4/10/24 11:08, Tom Lendacky wrote:
> On 4/7/24 07:44, Kai Huang wrote:
>
>> diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
>> index b8441147eb5e..5ba8a9c1e47a 100644
>> --- a/arch/x86/kernel/process.c
>> +++ b/arch/x86/kernel/process.c
>> @@ -813,18 +813,16 @@ void __noreturn stop_this_cpu(void *dummy)
>> mcheck_cpu_clear(c);
>> /*
>> - * Use wbinvd on processors that support SME. This provides support
>> - * for performing a successful kexec when going from SME inactive
>> - * to SME active (or vice-versa). The cache must be cleared so that
>> - * if there are entries with the same physical address, both with
>> and
>> - * without the encryption bit, they don't race each other when
>> flushed
>> - * and potentially end up with the wrong entry being committed to
>> - * memory.
>> + * The kernel could leave caches in incoherent state on SME/TDX
>> + * capable platforms. Flush cache to avoid silent memory
>> + * corruption for these platforms.
>> *
>> - * Test the CPUID bit directly because the machine might've cleared
>> - * X86_FEATURE_SME due to cmdline options.
>> + * stop_this_cpu() is not a fast path, just do unconditional
>> + * WBINVD for simplicity. But only do WBINVD for bare-metal
>> + * as TDX guests and SEV-ES/SEV-SNP guests will get unexpected
>> + * (and unnecessary) #VE and may unable to handle.
>
> In addition to Kirill's comment on #VE...
>
> This last part of the comment reads a bit odd since you say
> unconditional and then say only do WBINVD for bare-metal. Maybe
> something like this makes it a bit clearer?:
>
> For TDX and SEV-ES/SEV-SNP guests, a WBINVD may cause an exception (#VE
> or #VC). However, all exception handling has been torn down at this
> point, so this would cause the guest to crash. Since memory within these
> types of guests is coherent only issue the WBINVD on bare-metal.
Hmmm... actually it was the other WBINVD in patch #2 that caused the
crash, so what I wrote above isn't accurate. You might want to re-word
as appropriate.
Thanks,
Tom
>
> And you can expand the comment block out to at least 80 characters to
> make it more compact.
>
> Thanks,
> Tom
>
>> */
>> - if (c->extended_cpuid_level >= 0x8000001f &&
>> (cpuid_eax(0x8000001f) & BIT(0)))
>> + if (!boot_cpu_has(X86_FEATURE_HYPERVISOR))
>> native_wbinvd();
>> /*
Powered by blists - more mailing lists