linux-kernel - Re: [PATCH v4 1/5] x86/kexec: do unconditional WBINVD for bare-metal in stop_this

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <90ee763e-e059-12a4-d7c3-9a710a1c5db4@amd.com>
Date: Fri, 31 May 2024 15:45:09 -0500
From: Tom Lendacky <thomas.lendacky@....com>
To: "Huang, Kai" <kai.huang@...el.com>, linux-kernel@...r.kernel.org
Cc: x86@...nel.org, dave.hansen@...el.com, bp@...en8.de,
 kirill.shutemov@...ux.intel.com, tglx@...utronix.de, mingo@...hat.com,
 hpa@...or.com, luto@...nel.org, peterz@...radead.org,
 rick.p.edgecombe@...el.com, ashish.kalra@....com, chao.gao@...el.com,
 bhe@...hat.com, nik.borisov@...e.com, pbonzini@...hat.com, seanjc@...gle.com
Subject: Re: [PATCH v4 1/5] x86/kexec: do unconditional WBINVD for bare-metal
 in stop_this_cpu()

On 5/22/24 21:49, Huang, Kai wrote:
> On 18/04/2024 11:48 pm, Kai Huang wrote:
>> TL;DR:
>>
>> Change to do unconditional WBINVD in stop_this_cpu() for bare metal
>> to cover kexec support for both AMD SME and Intel TDX, despite there
>> _was_ some issue preventing from doing so but now has it got fixed.
>>
>> Long version:
>>
>> Both AMD SME and Intel TDX can leave caches in an incoherent state due
>> to memory encryption, which can lead to silent memory corruption during
>> kexec.  To address this issue, it is necessary to flush the caches
>> before jumping to the second kernel.
>>
>> Currently, the kernel only performs WBINVD in stop_this_cpu() when SME
>> is supported by hardware.  To support TDX, instead of adding one more
>> vendor-specific check, it is proposed to perform unconditional WBINVD.
>> Kexec() is a slow path, and the additional WBINVD is acceptable for the
>> sake of simplicity and maintainability.
>>
> 
> Hi Tom,
> 
> May I ask how does SME work with kdump in crash_kexec().  Looking at the 
> code, AFAICT the crash_kexec() path doesn't use stop_this_cpu() to stop 
> all other cpus.  Instead, kdump_nmi_shootdown_cpus() is called to send NMI 
> to remote cpus and crash_nmi_callback() is invoked to stop them.
> 
> But the crash_nmi_callback() doesn't invoke WBINVD for SME AFAICT.  It 
> does call the kdump_nmi_callback() callback where a WBINVD is performed 
> for the SNP host:
> 
> void kdump_sev_callback(void)
> {
>          /*
>           * Do wbinvd() on remote CPUs when SNP is enabled in order to
>           * safely do SNP_SHUTDOWN on the local CPU.
>           */
>          if (cc_platform_has(CC_ATTR_HOST_SEV_SNP))
>                  wbinvd();
> }
> 
> So if I read correctly, what's the reason the WBINVD is skipped for SME in 
> case of crash_kexec()?

The system is rebooted after a crash and doesn't continue directly on into 
a new kernel.

Thanks,
Tom