[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <f53ab4dc-20b6-4ada-83dc-89e2f7e8307a@intel.com>
Date: Fri, 12 Apr 2024 09:59:04 +1200
From: "Huang, Kai" <kai.huang@...el.com>
To: Tom Lendacky <thomas.lendacky@....com>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>
CC: "x86@...nel.org" <x86@...nel.org>, "Hansen, Dave" <dave.hansen@...el.com>,
"bp@...en8.de" <bp@...en8.de>, "kirill.shutemov@...ux.intel.com"
<kirill.shutemov@...ux.intel.com>, "tglx@...utronix.de" <tglx@...utronix.de>,
"mingo@...hat.com" <mingo@...hat.com>, "hpa@...or.com" <hpa@...or.com>,
"luto@...nel.org" <luto@...nel.org>, "peterz@...radead.org"
<peterz@...radead.org>, "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>,
"ashish.kalra@....com" <ashish.kalra@....com>, "Gao, Chao"
<chao.gao@...el.com>, "bhe@...hat.com" <bhe@...hat.com>,
"nik.borisov@...e.com" <nik.borisov@...e.com>, "pbonzini@...hat.com"
<pbonzini@...hat.com>, "seanjc@...gle.com" <seanjc@...gle.com>
Subject: Re: [PATCH v3 2/5] x86/kexec: do unconditional WBINVD for bare-metal
in relocate_kernel()
On 12/04/2024 2:25 am, Tom Lendacky wrote:
> On 4/10/24 17:55, Huang, Kai wrote:
>> On 11/04/2024 4:21 am, Tom Lendacky wrote:
>>> On 4/7/24 07:44, Kai Huang wrote:
>
>>>> @@ -160,9 +160,13 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
>>>> movq %r9, %cr3
>>>> /*
>>>> - * If SME is active, there could be old encrypted cache line
>>>> - * entries that will conflict with the now unencrypted memory
>>>> - * used by kexec. Flush the caches before copying the kernel.
>>>> + * The kernel could leave caches in incoherent state on SME/TDX
>>>> + * capable platforms. Just do unconditional WBINVD to avoid
>>>> + * silent memory corruption to the new kernel for these platforms.
>>>> + *
>>>> + * But only do WBINVD for bare-metal because TDX guests and
>>>> + * SEV-ES/SEV-SNP guests will get #VE which the kernel is unable
>>>> + * to handle at this stage.
>>>
>>> Similar comment here about doing an unconditional WBINVD, but then
>>> qualifying that as only on guests. This is where talking about how
>>> exception handling has been torn down would be good.
>>>
>>
>> OK.
>>
>> Thinking again, also it might be a good idea to not lose the existing
>> comment for SME, because it somehow justifies why we do WBINVD _HERE_ I
>> suppose?
>>
>> How about below?
>>
>> /*
>> * The kernel could leave caches in incoherent state on SME/TDX
>> * capable platforms. Flush cache to avoid silent memory
>> * corruption for these platforms.
>> *
>> * For SME, need to flush cache here before copying the kernel.
>> * When it is active, there could be old encrypted cache line
>> * entries that will conflict with the now unencrypted memory
>> * used by kexec.
>> *
>> * Do WBINVD for bare-metal to cover both SME and TDX, as it's
>> * not safe to do WBINVD for TDX and SEV-ES/SEV-SNP guests.
>> * WBINVD results in exception (#VE or #VC) for these guests, and
>> * at this stage kernel is not able to handle such exception any
>> * more because the kernel has torn down IDT.
>> */
>
> Similar to my comment in the other patch, just adding something that
> indicates that the WBINVD isn't necessary for TDX and SEV-ES/SEV-SNP
> (and maybe guests in general) would help. Maybe something like:
>
> * Do WBINVD for bare-metal only to cover both SME and TDX. It
> * isn't necessary to perform a WBINVD in a guest and performing
> * one could result in an exception (#VE or #VC) for a TDX or
> * SEV-ES/SEV-SNP guest that can crash the guest since, at this
> * stage, the kernel has torn down the IDT.
>
Ah sure. I will replace the 3rd paragraph in my original comment with
yours (assuming that is what you meant).
Thanks again for providing the feedback!
Powered by blists - more mailing lists