Message-Id: <20220411180131.5054-1-jon@nutanix.com>
Date: Mon, 11 Apr 2022 14:01:29 -0400
From: Jon Kohler <jon@...anix.com>
To: Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
"H. Peter Anvin" <hpa@...or.com>, Tony Luck <tony.luck@...el.com>,
Jon Kohler <jon@...anix.com>, Andi Kleen <ak@...ux.intel.com>,
Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>,
linux-kernel@...r.kernel.org
Cc: Borislav Petkov <bp@...e.de>,
Neelima Krishnan <neelima.krishnan@...el.com>,
"kvm @ vger . kernel . org" <kvm@...r.kernel.org>
Subject: [PATCH] x86/tsx: fix KVM guest live migration for tsx=on
Move automatic disablement for TSX microcode deprecation from tsx_init() to
x86_get_tsx_auto_mode(), such that systems with tsx=on will continue to
see the TSX CPU features (HLE, RTM) even on updated microcode.
KVM live migration could possibly be broken on kernels 5.14 and later by
commit 293649307ef9 ("x86/tsx: Clear CPUID bits when TSX always force
aborts"). Consider the following scenario:
1. KVM hosts clustered in a live migration capable setup.
2. KVM guests have TSX CPU features HLE and/or RTM presented.
3. One of the following three maintenance events occurs:
3a. An existing host in the pool running kernel >= 5.14 is updated with the
new microcode.
3b. A new host running kernel >= 5.14 is commissioned that already has the
microcode update preloaded.
3c. All hosts are running kernel < 5.14 with microcode update already
loaded and one existing host gets updated to kernel >= 5.14.
4. After the maintenance event, the impacted host will not have HLE and RTM
exposed, and live migration of guests with TSX features to that host
might fail.
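
The failure in step 4 comes down to a feature-subset check: a guest that was
shown HLE/RTM cannot land on a host that no longer offers them. The following
is a minimal sketch of that idea only. The flag values and the names
can_migrate() and migration_example() are hypothetical and are not taken from
KVM, qemu, or libvirt, whose real compatibility checks are more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values for illustration; not the kernel's X86_FEATURE_* numbering. */
#define FEAT_HLE (1u << 0)
#define FEAT_RTM (1u << 1)

/*
 * A guest can move to a destination host only if every CPU feature the
 * guest was shown is still offered by that host.
 */
static bool can_migrate(uint32_t guest_features, uint32_t host_features)
{
	return (guest_features & ~host_features) == 0;
}

/* Usage example mirroring steps 2 to 4 of the scenario. */
static void migration_example(void)
{
	uint32_t guest = FEAT_HLE | FEAT_RTM;		/* step 2: guest sees HLE and RTM */
	uint32_t untouched_host = FEAT_HLE | FEAT_RTM;	/* host before any maintenance */
	uint32_t impacted_host = 0;			/* step 4: HLE and RTM cleared */

	assert(can_migrate(guest, untouched_host));
	assert(!can_migrate(guest, impacted_host));
}
```

In this model the guest stays migratable between untouched hosts, but the
impacted host drops out of the set of valid destinations.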
Users using tsx=on or CONFIG_X86_INTEL_TSX_MODE_ON should always see
HLE and RTM on capable Intel SKUs, even if microcode has been clubbed to
prevent functionality.
Users using tsx=auto or CONFIG_X86_INTEL_TSX_MODE_AUTO get to roll the
dice with whatever the kernel believes the appropriate default is, which
includes the feature disappearing after a kernel and/or microcode update.
These users should consider masking HLE and RTM at a higher control plane
level, e.g. qemu or libvirt, such that guests on TSX enabled systems do not
see HLE/RTM and therefore do not enable TAA mitigation.
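
The intended policy for tsx=on versus tsx=auto described above can be
sketched as a small standalone model. This is a simplification, not the
kernel code: it assumes the TSX control MSR is supported, and the names
tsx_param, cpu_state, tsx_cpuid_visible() and tsx_policy_example() are
hypothetical. Only the three conditions it models (RTM_ALWAYS_ABORT,
TSX_FORCE_ABORT, X86_BUG_TAA) come from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration; the kernel uses its own enums and feature bits. */
enum tsx_param { TSX_PARAM_ON, TSX_PARAM_OFF, TSX_PARAM_AUTO };

struct cpu_state {
	bool rtm_always_abort;	/* models X86_FEATURE_RTM_ALWAYS_ABORT */
	bool tsx_force_abort;	/* models X86_FEATURE_TSX_FORCE_ABORT */
	bool bug_taa;		/* models X86_BUG_TAA */
};

/* Returns true if HLE and RTM stay enumerated on the host in this model. */
static bool tsx_cpuid_visible(enum tsx_param param, const struct cpu_state *cpu)
{
	switch (param) {
	case TSX_PARAM_ON:
		/* Explicit opt-in: keep the bits even if microcode force-aborts. */
		return true;
	case TSX_PARAM_OFF:
		return false;
	case TSX_PARAM_AUTO:
		/* With this patch, the microcode deprecation check lives in auto mode. */
		if (cpu->rtm_always_abort && cpu->tsx_force_abort)
			return false;
		return !cpu->bug_taa;
	}
	return false;
}

/* Usage example: a host that has received the TSX deprecation microcode. */
static void tsx_policy_example(void)
{
	const struct cpu_state updated_ucode = {
		.rtm_always_abort = true,
		.tsx_force_abort = true,
		.bug_taa = false,
	};

	assert(tsx_cpuid_visible(TSX_PARAM_ON, &updated_ucode));
	assert(!tsx_cpuid_visible(TSX_PARAM_AUTO, &updated_ucode));
}
```

The point of the model is that only the auto case depends on the microcode
state, so a tsx=on host keeps a stable feature set across updates.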
Fixes: 293649307ef9 ("x86/tsx: Clear CPUID bits when TSX always force aborts")
Signed-off-by: Jon Kohler <jon@...anix.com>
Cc: Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>
Cc: Borislav Petkov <bp@...e.de>
Cc: Thomas Gleixner <tglx@...utronix.de>
Cc: Andi Kleen <ak@...ux.intel.com>
Cc: Tony Luck <tony.luck@...el.com>
Cc: Neelima Krishnan <neelima.krishnan@...el.com>
Cc: kvm@...r.kernel.org <kvm@...r.kernel.org>
---
arch/x86/kernel/cpu/tsx.c | 29 ++++++++++++++---------------
1 file changed, 14 insertions(+), 15 deletions(-)
diff --git a/arch/x86/kernel/cpu/tsx.c b/arch/x86/kernel/cpu/tsx.c
index 9c7a5f049292..a24e5e471e3f 100644
--- a/arch/x86/kernel/cpu/tsx.c
+++ b/arch/x86/kernel/cpu/tsx.c
@@ -78,6 +78,20 @@ static bool __init tsx_ctrl_is_supported(void)
static enum tsx_ctrl_states x86_get_tsx_auto_mode(void)
{
+ /*
+ * Hardware will always abort a TSX transaction if both CPUID bits
+ * RTM_ALWAYS_ABORT and TSX_FORCE_ABORT are set. In this case, it is
+ * better not to enumerate CPUID.RTM and CPUID.HLE bits. Clear them
+ * here.
+ */
+ if (boot_cpu_has(X86_FEATURE_RTM_ALWAYS_ABORT) &&
+ boot_cpu_has(X86_FEATURE_TSX_FORCE_ABORT)) {
+ tsx_clear_cpuid();
+ setup_clear_cpu_cap(X86_FEATURE_RTM);
+ setup_clear_cpu_cap(X86_FEATURE_HLE);
+ return TSX_CTRL_RTM_ALWAYS_ABORT;
+ }
+
if (boot_cpu_has_bug(X86_BUG_TAA))
return TSX_CTRL_DISABLE;
@@ -105,21 +119,6 @@ void __init tsx_init(void)
char arg[5] = {};
int ret;
- /*
- * Hardware will always abort a TSX transaction if both CPUID bits
- * RTM_ALWAYS_ABORT and TSX_FORCE_ABORT are set. In this case, it is
- * better not to enumerate CPUID.RTM and CPUID.HLE bits. Clear them
- * here.
- */
- if (boot_cpu_has(X86_FEATURE_RTM_ALWAYS_ABORT) &&
- boot_cpu_has(X86_FEATURE_TSX_FORCE_ABORT)) {
- tsx_ctrl_state = TSX_CTRL_RTM_ALWAYS_ABORT;
- tsx_clear_cpuid();
- setup_clear_cpu_cap(X86_FEATURE_RTM);
- setup_clear_cpu_cap(X86_FEATURE_HLE);
- return;
- }
-
if (!tsx_ctrl_is_supported()) {
tsx_ctrl_state = TSX_CTRL_NOT_SUPPORTED;
return;
--
2.30.1 (Apple Git-130)