lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <533c2f6a-9cb2-6d25-c8a0-0643c6862d9b@amd.com>
Date: Thu, 11 Apr 2024 09:25:39 -0500
From: Tom Lendacky <thomas.lendacky@....com>
To: "Huang, Kai" <kai.huang@...el.com>, linux-kernel@...r.kernel.org
Cc: x86@...nel.org, dave.hansen@...el.com, bp@...en8.de,
 kirill.shutemov@...ux.intel.com, tglx@...utronix.de, mingo@...hat.com,
 hpa@...or.com, luto@...nel.org, peterz@...radead.org,
 rick.p.edgecombe@...el.com, ashish.kalra@....com, chao.gao@...el.com,
 bhe@...hat.com, nik.borisov@...e.com, pbonzini@...hat.com, seanjc@...gle.com
Subject: Re: [PATCH v3 2/5] x86/kexec: do unconditional WBINVD for bare-metal
 in relocate_kernel()

On 4/10/24 17:55, Huang, Kai wrote:
> On 11/04/2024 4:21 am, Tom Lendacky wrote:
>> On 4/7/24 07:44, Kai Huang wrote:

>>> @@ -160,9 +160,13 @@ SYM_CODE_START_LOCAL_NOALIGN(identity_mapped)
>>>       movq    %r9, %cr3
>>>       /*
>>> -     * If SME is active, there could be old encrypted cache line
>>> -     * entries that will conflict with the now unencrypted memory
>>> -     * used by kexec. Flush the caches before copying the kernel.
>>> +     * The kernel could leave caches in incoherent state on SME/TDX
>>> +     * capable platforms.  Just do unconditional WBINVD to avoid
>>> +     * silent memory corruption to the new kernel for these platforms.
>>> +     *
>>> +     * But only do WBINVD for bare-metal because TDX guests and
>>> +     * SEV-ES/SEV-SNP guests will get #VE which the kernel is unable
>>> +     * to handle at this stage.
>>
>> Similar comment here about doing an unconditional WBINVD, but then 
>> qualifying that as only on guests. This is where talking about how 
>> exception handling has been torn down would be good.
>>
> 
> OK.
> 
> Thinking again, also it might be a good idea to not lose the existing 
> comment for SME, because it somehow justifies why we do WBINVD _HERE_ I 
> suppose?
> 
> How about below?
> 
>      /*
>       * The kernel could leave caches in incoherent state on SME/TDX
>       * capable platforms.  Flush cache to avoid silent memory
>       * corruption for these platforms.
>       *
>       * For SME, need to flush cache here before copying the kernel.
>       * When it is active, there could be old encrypted cache line
>       * entries that will conflict with the now unencrypted memory
>       * used by kexec.
>       *
>       * Do WBINVD for bare-metal to cover both SME and TDX, as it's
>       * not safe to do WBINVD for TDX and SEV-ES/SEV-SNP guests.
>       * WBINVD results in exception (#VE or #VC) for these guests, and
>       * at this stage kernel is not able to handle such exception any
>       * more because the kernel has torn down IDT.
>       */

Similar to my comment in the other patch, just adding something that 
indicates that the WBINVD isn't necessary for TDX and SEV-ES/SEV-SNP 
(and maybe guests in general) would help. Maybe something like:

	 * Do WBINVD for bare-metal only to cover both SME and TDX. It
	 * isn't necessary to perform a WBINVD in a guest and performing
	 * one could result in an exception (#VE or #VC) for a TDX or
	 * SEV-ES/SEV-SNP guest that can crash the guest since, at this
	 * stage, the kernel has torn down the IDT.

This is all just my opinion, though, and others may read it differently. 
Probably not worth bike-shedding it.

Thanks,
Tom


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ