linux-kernel - Re: [PATCH v15 17/23] x86/kexec: Flush cache of TDX private memory

Message-ID: <e8fd4bff8244e9e709c997da309e73a932567959.camel@intel.com>
Date:   Mon, 27 Nov 2023 19:33:47 +0000
From:   "Huang, Kai" <kai.huang@...el.com>
To:     "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
        "Hansen, Dave" <dave.hansen@...el.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
CC:     "sathyanarayanan.kuppuswamy@...ux.intel.com" 
        <sathyanarayanan.kuppuswamy@...ux.intel.com>,
        "Luck, Tony" <tony.luck@...el.com>,
        "david@...hat.com" <david@...hat.com>,
        "bagasdotme@...il.com" <bagasdotme@...il.com>,
        "ak@...ux.intel.com" <ak@...ux.intel.com>,
        "kirill.shutemov@...ux.intel.com" <kirill.shutemov@...ux.intel.com>,
        "seanjc@...gle.com" <seanjc@...gle.com>,
        "mingo@...hat.com" <mingo@...hat.com>,
        "pbonzini@...hat.com" <pbonzini@...hat.com>,
        "tglx@...utronix.de" <tglx@...utronix.de>,
        "Yamahata, Isaku" <isaku.yamahata@...el.com>,
        "nik.borisov@...e.com" <nik.borisov@...e.com>,
        "hpa@...or.com" <hpa@...or.com>,
        "peterz@...radead.org" <peterz@...radead.org>,
        "sagis@...gle.com" <sagis@...gle.com>,
        "imammedo@...hat.com" <imammedo@...hat.com>,
        "bp@...en8.de" <bp@...en8.de>, "Gao, Chao" <chao.gao@...el.com>,
        "Brown, Len" <len.brown@...el.com>,
        "rafael@...nel.org" <rafael@...nel.org>,
        "Huang, Ying" <ying.huang@...el.com>,
        "Williams, Dan J" <dan.j.williams@...el.com>,
        "x86@...nel.org" <x86@...nel.org>
Subject: Re: [PATCH v15 17/23] x86/kexec: Flush cache of TDX private memory

On Mon, 2023-11-27 at 10:13 -0800, Hansen, Dave wrote:
> On 11/9/23 03:55, Kai Huang wrote:
> ...
> > --- a/arch/x86/kernel/reboot.c
> > +++ b/arch/x86/kernel/reboot.c
> > @@ -31,6 +31,7 @@
> >  #include <asm/realmode.h>
> >  #include <asm/x86_init.h>
> >  #include <asm/efi.h>
> > +#include <asm/tdx.h>
> >  
> >  /*
> >   * Power off function, if any
> > @@ -741,6 +742,20 @@ void native_machine_shutdown(void)
> >  	local_irq_disable();
> >  	stop_other_cpus();
> >  #endif
> > +	/*
> > +	 * stop_other_cpus() has flushed all dirty cachelines of TDX
> > +	 * private memory on remote cpus.  Unlike SME, which does the
> > +	 * cache flush on _this_ cpu in the relocate_kernel(), flush
> > +	 * the cache for _this_ cpu here.  This is because on the
> > +	 * platforms with "partial write machine check" erratum the
> > +	 * kernel needs to convert all TDX private pages back to normal
> > +	 * before booting to the new kernel in kexec(), and the cache
> > +	 * flush must be done before that.  If the kernel took SME's way,
> > +	 * it would have to muck with the relocate_kernel() assembly to
> > +	 * do memory conversion.
> > +	 */
> > +	if (platform_tdx_enabled())
> > +		native_wbinvd();
> 
> Why can't the TDX host code just set host_mem_enc_active=1?
> 
> Sure, you'll end up *using* the SME WBINVD support, but then you don't
> have two different WBINVD call sites.  You also don't have to mess with
> a single line of assembly.

I wanted to avoid changing the assembly.

Perhaps the comment isn't very clear.  When host_mem_enc_active=1, the cache
flush (on the CPU running kexec) is done in the relocate_kernel() assembly, at a
very late stage right before jumping to the new kernel.  For TDX, however, when
the platform has the erratum we need to convert TDX private pages back to
normal, and that must be done after flushing the cache.  If we reused
host_mem_enc_active=1, we would need to change the assembly code to do the
conversion there.
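
To make the ordering constraint concrete, here is a rough userspace model (not
kernel code).  Every name in it (the step enum, model_run(), the two step
arrays) is hypothetical and only illustrates the argument above: the conversion
must come after the flush, and on erratum platforms the conversion must happen
before the jump to the new kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the kexec steps discussed above; not kernel APIs. */
enum step { FLUSH_CACHE, CONVERT_PRIVATE_PAGES, JUMP_TO_NEW_KERNEL };

/*
 * Walk the steps in order and return true only if the sequence reaches
 * the new kernel without violating the constraints from this thread:
 *  - private pages may only be converted after the cache flush;
 *  - on erratum platforms the pages must be converted before the jump.
 */
static bool model_run(const enum step *steps, size_t n, bool has_erratum)
{
	bool flushed = false, converted = false;

	assert(steps != NULL);

	for (size_t i = 0; i < n; i++) {
		switch (steps[i]) {
		case FLUSH_CACHE:
			flushed = true;
			break;
		case CONVERT_PRIVATE_PAGES:
			if (!flushed)
				return false;	/* conversion before flush */
			converted = true;
			break;
		case JUMP_TO_NEW_KERNEL:
			return flushed && (converted || !has_erratum);
		}
	}
	return false;	/* never reached the new kernel */
}

/* This patch: flush in native_machine_shutdown(), convert, then jump. */
static const enum step flush_in_shutdown[] = {
	FLUSH_CACHE, CONVERT_PRIVATE_PAGES, JUMP_TO_NEW_KERNEL,
};

/*
 * SME-style: the flush sits next to the jump in relocate_kernel(), so
 * without touching the assembly there is no room for the conversion.
 */
static const enum step flush_in_relocate[] = {
	FLUSH_CACHE, JUMP_TO_NEW_KERNEL,
};
```

In this model the SME-style sequence is fine on platforms without the erratum,
but it fails on erratum platforms unless the conversion is added between the
flush and the jump, i.e. inside the assembly.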