Message-ID: <20230609101435.xmz3kgydseddrty7@box.shutemov.name>
Date: Fri, 9 Jun 2023 13:14:35 +0300
From: kirill.shutemov@...ux.intel.com
To: Kai Huang <kai.huang@...el.com>
Cc: linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
	linux-mm@...ck.org, dave.hansen@...el.com, tony.luck@...el.com,
	peterz@...radead.org, tglx@...utronix.de, seanjc@...gle.com,
	pbonzini@...hat.com, david@...hat.com, dan.j.williams@...el.com,
	rafael.j.wysocki@...el.com, ying.huang@...el.com,
	reinette.chatre@...el.com, len.brown@...el.com, ak@...ux.intel.com,
	isaku.yamahata@...el.com, chao.gao@...el.com,
	sathyanarayanan.kuppuswamy@...ux.intel.com, bagasdotme@...il.com,
	sagis@...gle.com, imammedo@...hat.com
Subject: Re: [PATCH v11 17/20] x86/kexec: Flush cache of TDX private memory

On Mon, Jun 05, 2023 at 02:27:30AM +1200, Kai Huang wrote:
> There are two problems with using kexec() to boot a new kernel when the
> old kernel has enabled TDX: 1) Some memory pages are still TDX private
> pages; 2) There might be dirty cachelines associated with TDX private
> pages.
>
> The first problem doesn't matter on platforms without the "partial write
> machine check" erratum. KeyID 0 doesn't have an integrity check. If the
> new kernel wants to use any non-zero KeyID, it needs to convert the
> memory to that KeyID, and such a conversion works from any KeyID.
>
> However the old kernel needs to guarantee there's no dirty cacheline
> left behind before booting to the new kernel to avoid silent corruption
> from later cacheline writeback (Intel hardware doesn't guarantee cache
> coherency across different KeyIDs).
>
> There are two things that the old kernel needs to do to achieve that:
>
> 1) Stop accessing TDX private memory mappings:
>    a. Stop making TDX module SEAMCALLs (TDX global KeyID);
>    b. Stop TDX guests from running (per-guest TDX KeyID).
> 2) Flush any cachelines from previous TDX private KeyID writes.
>
> For 2), use wbinvd() to flush the cache in stop_this_cpu(), following
> the SME support. This way 1) happens for free, as there's no TDX
> activity between wbinvd() and native_halt().
>
> Flushing the cache in stop_this_cpu() only covers remote cpus. On the
> cpu which does kexec(), unlike SME which does the cache flush in
> relocate_kernel(), do the cache flush right after stopping remote cpus
> in machine_shutdown(). This is because on platforms with the above
> erratum, the kernel needs to convert all TDX private pages back to
> normal before a fast warm reset reboot or booting to the new kernel in
> kexec(). Flushing the cache in relocate_kernel() only covers kexec()
> but not the fast warm reset reboot.
>
> Theoretically, the cache flush is only needed when the TDX module has
> been initialized. However, initializing the TDX module is done on demand
> at runtime, and reading the module status takes a mutex. Instead, just
> check whether TDX is enabled by the BIOS to decide whether to flush the
> cache.
>
> Signed-off-by: Kai Huang <kai.huang@...el.com>
> Reviewed-by: Isaku Yamahata <isaku.yamahata@...el.com>
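
For readers following the ordering described in the changelog, here is a
minimal userspace sketch of it. This is not the patch itself: every sim_*
name and the tdx_enabled_by_bios flag are hypothetical stand-ins, one
remote cpu is modelled by a plain function call, and the real wbinvd() and
native_halt() are replaced by stubs that only record that they ran.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_EVENTS 8

/* Ordered log of what the simulated cpus did. */
static const char *events[MAX_EVENTS];
static int nr_events;

/* Hypothetical stand-in for "TDX is enabled by the BIOS". */
static bool tdx_enabled_by_bios = true;

static void record(const char *what)
{
	assert(nr_events < MAX_EVENTS);
	events[nr_events++] = what;
}

/* Helper for checking the recorded order. */
bool sim_event_is(int idx, const char *what)
{
	return idx < nr_events && strcmp(events[idx], what) == 0;
}

/*
 * Remote cpu path, modelled on the stop_this_cpu() description:
 * flush the cache, then halt. Nothing runs between the two, so no
 * TDX private memory is touched after the flush.
 */
static void sim_stop_this_cpu(void)
{
	if (tdx_enabled_by_bios)
		record("remote cpu: wbinvd");
	record("remote cpu: halt");
}

/* Assumption: a single remote cpu, stopped synchronously. */
static void sim_stop_other_cpus(void)
{
	sim_stop_this_cpu();
}

/*
 * kexec cpu path, modelled on the machine_shutdown() description:
 * flush the local cache right after the remote cpus are stopped, so
 * the flush covers both kexec() and a fast warm reset reboot.
 */
void sim_machine_shutdown(void)
{
	sim_stop_other_cpus();
	if (tdx_enabled_by_bios)
		record("kexec cpu: wbinvd");
}
```

Calling sim_machine_shutdown() once and walking the log with
sim_event_is() shows the remote flush and halt happening before the flush
on the cpu doing the kexec(). The flag is checked without any lock, which
mirrors the reason given in the changelog for testing the BIOS setting
rather than the module status.
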
Reviewed-by: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>

--
Kiryl Shutsemau / Kirill A. Shutemov