Message-ID: <aIx5jcdZ32mjOfXg@intel.com>
Date: Fri, 1 Aug 2025 16:23:41 +0800
From: Chao Gao <chao.gao@...el.com>
To: Kai Huang <kai.huang@...el.com>
CC: <dave.hansen@...el.com>, <bp@...en8.de>, <tglx@...utronix.de>,
<peterz@...radead.org>, <mingo@...hat.com>, <hpa@...or.com>,
<thomas.lendacky@....com>, <x86@...nel.org>, <kas@...nel.org>,
<rick.p.edgecombe@...el.com>, <dwmw@...zon.co.uk>,
<linux-kernel@...r.kernel.org>, <pbonzini@...hat.com>, <seanjc@...gle.com>,
<kvm@...r.kernel.org>, <reinette.chatre@...el.com>,
<isaku.yamahata@...el.com>, <dan.j.williams@...el.com>,
<ashish.kalra@....com>, <nik.borisov@...e.com>, <sagis@...gle.com>, "Farrah
Chen" <farrah.chen@...el.com>
Subject: Re: [PATCH v5 3/7] x86/virt/tdx: Mark memory cache state incoherent
when making SEAMCALL
On Tue, Jul 29, 2025 at 12:28:37AM +1200, Kai Huang wrote:
>On TDX platforms, dirty cacheline aliases with and without encryption
>bits can coexist, and the CPU can flush them back to memory in random
>order. During kexec, the caches must be flushed before jumping to the
>new kernel, otherwise the dirty cachelines could silently corrupt the
>memory used by the new kernel due to their different encryption
>properties.
>
>A percpu boolean is used to mark whether the cache of a given CPU may
>be in an incoherent state, and kexec performs WBINVD on the CPUs with
>that boolean set.
>
>For TDX, only the TDX module or the TDX guests can generate dirty
>cachelines of TDX private memory, i.e., they are only generated when the
>kernel does a SEAMCALL.
>
>Set that boolean when the kernel does SEAMCALL so that kexec can flush
>the cache correctly.
>
>The kernel provides both the __seamcall*() assembly functions and the
>seamcall*() wrappers, which additionally handle the "running out of
>entropy" error in a retry loop. Most SEAMCALLs are made via
>seamcall*(); the exceptions are TDH.VP.ENTER and TDH.PHYMEM.PAGE.RDMD,
>which call the __seamcall*() variants directly.
>
>To cover the two special cases, add a new helper do_seamcall() which
>sets the percpu boolean and then calls __seamcall*(), and convert the
>special cases to use do_seamcall(). To cover all other SEAMCALLs,
>change seamcall*() to call do_seamcall().
>
>The SEAMCALLs invoked via seamcall*() can be made from both task
>context and IRQ-disabled context. Given that a SEAMCALL is just a
>lengthy instruction (e.g., thousands of cycles) from the kernel's point
>of view and preempt_{disable|enable}() is cheap by comparison, simply
>disable preemption unconditionally while setting the boolean and making
>the SEAMCALL.
>
>Signed-off-by: Kai Huang <kai.huang@...el.com>
>Tested-by: Farrah Chen <farrah.chen@...el.com>
Reviewed-by: Chao Gao <chao.gao@...el.com>
One small question below,
<snip>
>+static __always_inline u64 do_seamcall(sc_func_t func, u64 fn,
>+ struct tdx_module_args *args)
>+{
>+ lockdep_assert_preemption_disabled();
>+
>+ /*
>+ * SEAMCALLs are made to the TDX module and can generate dirty
>+ * cachelines of TDX private memory. Mark cache state incoherent
>+ * so that the cache can be flushed during kexec.
>+ *
>+ * This needs to be done before actually making the SEAMCALL,
...
>+ * because kexec-ing CPU could send NMI to stop remote CPUs,
>+ * in which case even disabling IRQ won't help here.
>+ */
>+ this_cpu_write(cache_state_incoherent, true);
I'm not sure this write is guaranteed to occur before the SEAMCALL, since I
don't see any explicit barrier preventing the compiler from reordering the
two. Do you think this is an issue?
>+
>+ return func(fn, args);
>+}
>+