Message-Id: <cd88e97e53c502f0a457d6a82a31d9e8e0f9fca7.1706698706.git.kai.huang@intel.com>
Date: Wed, 31 Jan 2024 11:31:53 +0000
From: "Huang, Kai" <kai.huang@...el.com>
To: linux-kernel@...r.kernel.org
Cc: x86@...nel.org,
dave.hansen@...el.com,
kirill.shutemov@...ux.intel.com,
tglx@...utronix.de,
bp@...en8.de,
mingo@...hat.com,
hpa@...or.com,
luto@...nel.org,
peterz@...radead.org,
thomas.lendacky@....com,
chao.gao@...el.com,
bhe@...hat.com,
nik.borisov@...e.com,
pbonzini@...hat.com
Subject: [PATCH 1/4] x86/coco: Add a new CC attribute to unify cache flush during kexec
From: Kai Huang <kai.huang@...el.com>

Currently, on AMD SME platforms caches are flushed manually during
kexec() before jumping to the new kernel, due to memory encryption.
Intel TDX needs to flush cachelines of TDX private memory before
jumping to the second kernel too, otherwise stale cachelines may
silently corrupt the new kernel.

Instead of sprinkling vendor-specific checks around, introduce a new
CC_ATTR_HOST_MEM_INCOHERENT attribute to unify Intel and AMD, and
simplify the logic to:

  Could the old kernel leave incoherent caches around?
  If so, do WBINVD.
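
As a rough sketch (assuming the attribute below lands as proposed),
both flush sites then reduce to the same check: stop_this_cpu() calls
it directly, and machine_kexec() passes the result down to
relocate_kernel():

  if (cc_platform_has(CC_ATTR_HOST_MEM_INCOHERENT))
          native_wbinvd();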

Convert AMD SME to use this new CC attribute. A later patch will use
it for Intel TDX too.

Specifically, AMD SME flushes caches in two places: 1) stop_this_cpu();
2) relocate_kernel(). stop_this_cpu() checks the CPUID directly before
doing WBINVD, because the current kernel's SME enabling status may not
match the new kernel's choice. relocate_kernel(), however, only does
the WBINVD when the current kernel has enabled SME, because the new
kernel is always placed in an "unencrypted" area.
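
For reference, the two existing checks look roughly like this
(simplified; the stop_this_cpu() one is quoted from the code being
removed below):

  /* stop_this_cpu(): raw CPUID check, independent of this kernel's choice */
  if (c->extended_cpuid_level >= 0x8000001f && (cpuid_eax(0x8000001f) & BIT(0)))
          native_wbinvd();

  /* machine_kexec(): tells relocate_kernel() whether to flush */
  cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT)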

To simplify the logic, change AMD SME to always use the approach taken
by stop_this_cpu(). This results in an additional WBINVD in
relocate_kernel() when the current kernel hasn't enabled SME (e.g., it
was disabled on the kernel command line), but that is acceptable for
the sake of less complicated code (see [1] for the relevant
discussion).

Note the kernel currently only advertises a CC vendor for AMD SME when
SME is actually enabled by the kernel. To always advertise the new
CC_ATTR_HOST_MEM_INCOHERENT regardless of the kernel's SME enabling
status, change to set the CC vendor as long as the hardware has SME
enabled.
Note "advertising CC_ATTR_HOST_MEM_INCOHERENT when the hardware has
enabled SME" is still different from "checking the CPUID" (the way that
is done in stop_this_cpu()), but technically the former also serves the
purpose and is actually more accurate.
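
For example, on an SME-capable host where SME was disabled on the
kernel command line, amd_cc_platform_has() as changed below would give:

  cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT);    /* false: sme_me_mask == 0 */
  cc_platform_has(CC_ATTR_HOST_MEM_INCOHERENT); /* true: SEV not active */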

This change allows sme_me_mask to be 0 while the CC vendor reports as
AMD. But it doesn't impact other CC attributes on AMD platforms, nor
does it impact cc_mkdec()/cc_mkenc().
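
To illustrate (a simplified sketch of the AMD paths of those helpers
in arch/x86/coco/core.c): with sme_me_mask == 0, cc_set_mask() is
never called, so cc_mask stays 0 and both helpers degenerate to
identity operations:

  u64 cc_mkenc(u64 val) { return val | cc_mask; }  /* val | 0 == val */
  u64 cc_mkdec(u64 val) { return val & ~cc_mask; } /* val & ~0 == val */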

[1] https://lore.kernel.org/lkml/cbc9c527-17e5-4a63-80fe-85451394cc7c@amd.com/

Suggested-by: Dave Hansen <dave.hansen@...el.com>
Signed-off-by: Kai Huang <kai.huang@...el.com>
---
arch/x86/coco/core.c | 13 +++++++++++++
arch/x86/kernel/machine_kexec_64.c | 2 +-
arch/x86/kernel/process.c | 14 +++-----------
arch/x86/mm/mem_encrypt_identity.c | 11 ++++++++++-
include/linux/cc_platform.h | 15 +++++++++++++++
5 files changed, 42 insertions(+), 13 deletions(-)
diff --git a/arch/x86/coco/core.c b/arch/x86/coco/core.c
index eeec9986570e..8d6d727e6e18 100644
--- a/arch/x86/coco/core.c
+++ b/arch/x86/coco/core.c
@@ -72,6 +72,19 @@ static bool noinstr amd_cc_platform_has(enum cc_attr attr)
 	case CC_ATTR_HOST_MEM_ENCRYPT:
 		return sme_me_mask && !(sev_status & MSR_AMD64_SEV_ENABLED);
 
+	case CC_ATTR_HOST_MEM_INCOHERENT:
+		/*
+		 * CC_ATTR_HOST_MEM_INCOHERENT represents whether SME is
+		 * enabled on the platform, regardless of whether the
+		 * kernel has actually activated it.
+		 */
+		return !(sev_status & MSR_AMD64_SEV_ENABLED);
+
+	/*
+	 * For all CC_ATTR_GUEST_* there's no need to check sme_me_mask
+	 * as it must be non-zero when any SEV enable bit is set in
+	 * sev_status.
+	 */
 	case CC_ATTR_GUEST_MEM_ENCRYPT:
 		return sev_status & MSR_AMD64_SEV_ENABLED;
 
diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index bc0a5348b4a6..c9c6974e2e9c 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -358,7 +358,7 @@ void machine_kexec(struct kimage *image)
 				       (unsigned long)page_list,
 				       image->start,
 				       image->preserve_context,
-				       cc_platform_has(CC_ATTR_HOST_MEM_ENCRYPT));
+				       cc_platform_has(CC_ATTR_HOST_MEM_INCOHERENT));
 
 #ifdef CONFIG_KEXEC_JUMP
 	if (image->preserve_context)
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index ab49ade31b0d..2c7e8d9889c0 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -813,18 +813,10 @@ void __noreturn stop_this_cpu(void *dummy)
 	mcheck_cpu_clear(c);
 
 	/*
-	 * Use wbinvd on processors that support SME. This provides support
-	 * for performing a successful kexec when going from SME inactive
-	 * to SME active (or vice-versa). The cache must be cleared so that
-	 * if there are entries with the same physical address, both with and
-	 * without the encryption bit, they don't race each other when flushed
-	 * and potentially end up with the wrong entry being committed to
-	 * memory.
-	 *
-	 * Test the CPUID bit directly because the machine might've cleared
-	 * X86_FEATURE_SME due to cmdline options.
+	 * Use wbinvd on processors where the first kernel *could* have
+	 * left incoherent cachelines behind.
 	 */
-	if (c->extended_cpuid_level >= 0x8000001f && (cpuid_eax(0x8000001f) & BIT(0)))
+	if (cc_platform_has(CC_ATTR_HOST_MEM_INCOHERENT))
 		native_wbinvd();
 
 	/*
diff --git a/arch/x86/mm/mem_encrypt_identity.c b/arch/x86/mm/mem_encrypt_identity.c
index 7f72472a34d6..87e4fddab770 100644
--- a/arch/x86/mm/mem_encrypt_identity.c
+++ b/arch/x86/mm/mem_encrypt_identity.c
@@ -570,9 +570,19 @@ void __init sme_enable(struct boot_params *bp)
 		msr = __rdmsr(MSR_AMD64_SYSCFG);
 		if (!(msr & MSR_AMD64_SYSCFG_MEM_ENCRYPT))
 			return;
+
+		/*
+		 * Always set the CC vendor when the platform has SME enabled,
+		 * regardless of whether the kernel actually activates SME.
+		 * This makes CC_ATTR_HOST_MEM_INCOHERENT report true as long
+		 * as the platform has SME enabled, so that stop_this_cpu()
+		 * can do the necessary WBINVD during kexec().
+		 */
+		cc_vendor = CC_VENDOR_AMD;
 	} else {
 		/* SEV state cannot be controlled by a command line option */
 		sme_me_mask = me_mask;
+		cc_vendor = CC_VENDOR_AMD;
 		goto out;
 	}
 
@@ -608,7 +618,6 @@ void __init sme_enable(struct boot_params *bp)
 out:
 	if (sme_me_mask) {
 		physical_mask &= ~sme_me_mask;
-		cc_vendor = CC_VENDOR_AMD;
 		cc_set_mask(sme_me_mask);
 	}
 }
diff --git a/include/linux/cc_platform.h b/include/linux/cc_platform.h
index cb0d6cd1c12f..2f7273596102 100644
--- a/include/linux/cc_platform.h
+++ b/include/linux/cc_platform.h
@@ -42,6 +42,21 @@ enum cc_attr {
 	 */
 	CC_ATTR_HOST_MEM_ENCRYPT,
 
+	/**
+	 * @CC_ATTR_HOST_MEM_INCOHERENT: Host memory encryption can be
+	 *				 incoherent
+	 *
+	 * The platform/OS is running as a bare-metal system or a hypervisor.
+	 * The memory encryption engine might have left non-cache-coherent
+	 * data in the caches that needs to be flushed.
+	 *
+	 * Use this in places where the cache coherency of the memory matters
+	 * but the encryption status does not.
+	 *
+	 * Includes all systems that set CC_ATTR_HOST_MEM_ENCRYPT.
+	 */
+	CC_ATTR_HOST_MEM_INCOHERENT,
+
 	/**
 	 * @CC_ATTR_GUEST_MEM_ENCRYPT: Guest memory encryption is active
 	 *
--
2.34.1