linux-kernel - Re: AMD Memory encryption vs. kexec


Message-ID: <cbc9c527-17e5-4a63-80fe-85451394cc7c@amd.com>
Date:   Wed, 29 Nov 2023 14:54:24 -0600
From:   Tom Lendacky <thomas.lendacky@....com>
To:     Dave Hansen <dave.hansen@...el.com>,
        Borislav Petkov <bp@...en8.de>,
        "Shutemov, Kirill" <kirill.shutemov@...el.com>,
        Ashish Kalra <ashish.kalra@....com>,
        Kai Huang <kai.huang@...el.com>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        the arch/x86 maintainers <x86@...nel.org>
Subject: Re: AMD Memory encryption vs. kexec

On 11/29/23 14:01, Dave Hansen wrote:
> On 11/28/23 06:03, Tom Lendacky wrote:
> ...
>>> By my reading, the CC_ATTR_HOST_MEM_ENCRYPT is basically a check for
>>> whether the current kernel has enabled SME but not SEV while the
>>> stop_this_cpu() site is driven purely by whether the hardware *supports*
>>> SME.
>>>
>>> The whole supposed reason stop_this_cpu() checks CPUID directly is that
>>> the current kernel SME/SEV enabling might not match the _next_ kernel's
>>> enabling choices.
>>
>> Correct.
>>
>>> So, why is a _current_ kernel check OK for relocate_kernel(), but not OK
>>> for stop_this_cpu()?
>>
>> The relocate_kernel() check provides an indication of whether SME is
>> actually active. The kexec kernel is placed in unencrypted memory to
>> match how the system was booted - where the kernel is loaded into
>> unencrypted memory and then encrypted in-place if SME is desired
>> (mem_encrypt=on). Since the kexec kernel will be unencrypted, the
>> cc_platform_has() call is used to indicate whether to perform a wbinvd
>> to remove encrypted cache line entries. If SME is not active, then there
>> is no need to flush caches prior to booting the kexec kernel.
> 
> Ahh, so that wbinvd is truly specific to kexec.  It protects the
> always-unencrypted kexec area from being zapped by encrypted lines.  It
> isn't necessary when the old kexec kernel is mem_encrypt=off because the
> unencrypted old kernel matches the always unencrypted kexec area.
> 
> What I was worried about was the _larger_ case.  Not the kexec area, the
> *rest* of memory.  But I think that's irrelevant because there's yet
> *another* wbinvd in __enc_copy() that will flush the rest of memory
> when going from mem_encrypt=off=>on.

Correct (I was actually sitting here before I got your email wondering if 
I should reply to my previous email with just that info).

> 
> I'd like to propose a simplification.  Let's add a
> CC_ATTR_HOST_MEM_INCOHERENT.  That bit gets set on all hardware that
> needs WBINVDs at kexec.  On AMD, it can use the stop_this_cpu() logic.
> This will cause an additional wbinvd in the case where a mem_encrypt=off
> kernel is kexec'ing.
> 
> We can also set it on any TDX-enabled Intel hardware.
> 
> That leads to very simple logic at kexec:
> 
> 	Could the old kernel leave incoherent caches
> 	around?  If so, do WBINVD.
> 
> That logic gets applied to all CPUs, both boot and secondary.  It
> applies to all the SME-only systems (currently CC_ATTR_HOST_MEM_ENCRYPT)
> and also all TDX systems.  It would not depend on the current kernel's
> SME enabling and it would allow both kexec-related sites to share the
> same logic.
> 
> I don't really like the idea of yet another CC_ATTR_HOST_MEM_INCOHERENT
> bit, but I do think it's better than adding some TDX-specific paths.

I'm good with that change. I think an additional WBINVD during kexec is 
acceptable to make everything less complicated in the code.
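
For illustration only, the difference between the two checks could be
modeled roughly as below. This is a user-space sketch, not kernel code:
struct platform_model, its fields, and the three helper functions are
hypothetical names made up for the example, and host_mem_encrypt() is a
simplification of what CC_ATTR_HOST_MEM_ENCRYPT reports.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a platform; none of these fields are kernel API. */
struct platform_model {
	bool hw_supports_sme;	/* hardware (CPUID) reports SME support */
	bool hw_tdx_enabled;	/* TDX-enabled Intel hardware */
	bool sme_active;	/* current kernel runs with mem_encrypt=on */
};

/*
 * Simplified stand-in for the existing relocate_kernel()-style check:
 * it only reflects whether the *current* kernel has SME active.
 */
bool host_mem_encrypt(const struct platform_model *p)
{
	assert(p != NULL);
	return p->sme_active;
}

/*
 * Stand-in for the proposed CC_ATTR_HOST_MEM_INCOHERENT: driven by what
 * the hardware supports, independent of the current kernel's enabling.
 */
bool host_mem_incoherent(const struct platform_model *p)
{
	assert(p != NULL);
	return p->hw_supports_sme || p->hw_tdx_enabled;
}

/*
 * "Could the old kernel leave incoherent caches around?  If so, do
 * WBINVD."  Applied to every CPU, boot and secondary alike.
 */
bool kexec_needs_wbinvd(const struct platform_model *p)
{
	return host_mem_incoherent(p);
}
```

In this model, an SME-capable system booted with mem_encrypt=off has
host_mem_encrypt() false but kexec_needs_wbinvd() true, which is the one
additional WBINVD mentioned above.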

Thanks,
Tom