Message-Id: <20220411200703.48654-1-jon@nutanix.com>
Date:   Mon, 11 Apr 2022 16:07:01 -0400
From:   Jon Kohler <jon@...anix.com>
To:     Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
        "H. Peter Anvin" <hpa@...or.com>, Andi Kleen <ak@...ux.intel.com>,
        Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>,
        Jon Kohler <jon@...anix.com>, Tony Luck <tony.luck@...el.com>,
        linux-kernel@...r.kernel.org
Cc:     dave.hansen@...el.com, Borislav Petkov <bp@...e.de>,
        Neelima Krishnan <neelima.krishnan@...el.com>,
        "kvm @ vger . kernel . org" <kvm@...r.kernel.org>
Subject: [PATCH v2] x86/tsx: fix KVM guest live migration for tsx=on

Move automatic disablement for TSX microcode deprecation from tsx_init() to
x86_get_tsx_auto_mode(), such that systems with tsx=on will continue to
see the TSX CPU features (HLE, RTM) even on updated microcode.

KVM live migration can break on kernels 5.14 and later, which contain
commit 293649307ef9 ("x86/tsx: Clear CPUID bits when TSX always force
aborts"). Consider the following scenario:

1. KVM hosts clustered in a live migration capable setup.
2. KVM guests have TSX CPU features HLE and/or RTM presented.
3. One of the following three maintenance events occurs:
3a. An existing host running kernel >= 5.14 in the pool is updated with
    the new microcode.
3b. A new host running kernel >= 5.14 is commissioned that already has the
    microcode update preloaded.
3c. All hosts are running kernel < 5.14 with microcode update already
    loaded and one existing host gets updated to kernel >= 5.14.
4. After the maintenance event, the impacted host will not have HLE and
   RTM exposed, and live migration of guests that have TSX features
   might fail.
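
The failure in step 4 can be pictured as a feature subset check. The
following is a minimal sketch in plain C, not code from the kernel, qemu
or libvirt: FEAT_HLE, FEAT_RTM and can_migrate() are hypothetical names
used only for illustration, and real migration compatibility checks are
more involved.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature flags for this sketch; not kernel or qemu definitions. */
#define FEAT_HLE (1u << 0)
#define FEAT_RTM (1u << 1)

/*
 * Simplified model: a guest can only land on a host that offers every
 * CPU feature the guest was started with, i.e. guest_features must be
 * a subset of host_features.
 */
static bool can_migrate(uint32_t guest_features, uint32_t host_features)
{
	return (guest_features & ~host_features) == 0;
}
```

Under this model, a guest started with FEAT_HLE | FEAT_RTM cannot move to
a host whose feature mask no longer contains those bits, which is the
situation each of 3a, 3b and 3c produces.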

Users using tsx=on or CONFIG_X86_INTEL_TSX_MODE_ON should always see
HLE and RTM on capable Intel SKUs, even if microcode has been clubbed to
prevent functionality.

Users using tsx=auto or CONFIG_X86_INTEL_TSX_MODE_AUTO get to roll the
dice with whatever the kernel believes the appropriate default is, which
includes the feature disappearing after a kernel and/or microcode update.
These users should consider masking HLE and RTM at a higher control plane
level, e.g. qemu or libvirt, such that guests on TSX enabled systems do not
see HLE/RTM and therefore do not enable TAA mitigation.

Fixes: 293649307ef9 ("x86/tsx: Clear CPUID bits when TSX always force aborts")

Signed-off-by: Jon Kohler <jon@...anix.com>
Cc: Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>
Cc: Borislav Petkov <bp@...e.de>
Cc: Thomas Gleixner <tglx@...utronix.de>
Cc: Andi Kleen <ak@...ux.intel.com>
Cc: Tony Luck <tony.luck@...el.com>
Cc: Neelima Krishnan <neelima.krishnan@...el.com>
Cc: kvm@...r.kernel.org <kvm@...r.kernel.org>
---
v1 -> v2:
 - Addressed comments on approach from Dave.
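
For review purposes, the intended mode selection after this change can
be modeled in user space roughly as below. This is a sketch under stated
assumptions, not the kernel code: struct cpu_model, enum tsx_request and
the model_*() helpers are hypothetical stand-ins for the boot_cpu_has()
checks and the tsx= / Kconfig handling, and the tsx_ctrl_is_supported()
early return is left out.

```c
#include <stdbool.h>

enum tsx_ctrl_states {
	TSX_CTRL_ENABLE,
	TSX_CTRL_DISABLE,
	TSX_CTRL_RTM_ALWAYS_ABORT,
};

/* Hypothetical stand-in for the tsx= option or the Kconfig default. */
enum tsx_request { TSX_REQ_ON, TSX_REQ_OFF, TSX_REQ_AUTO };

/* Hypothetical stand-in for the boot_cpu_has()/boot_cpu_has_bug() checks. */
struct cpu_model {
	bool rtm_always_abort;
	bool tsx_force_abort;
	bool bug_taa;
};

/* Mirrors the order of checks in x86_get_tsx_auto_mode() with this patch. */
static enum tsx_ctrl_states model_auto_mode(const struct cpu_model *c)
{
	if (c->rtm_always_abort && c->tsx_force_abort)
		return TSX_CTRL_RTM_ALWAYS_ABORT;
	if (c->bug_taa)
		return TSX_CTRL_DISABLE;
	return TSX_CTRL_ENABLE;
}

/* Only the auto request consults the microcode state. */
static enum tsx_ctrl_states model_resolve(enum tsx_request req,
					  const struct cpu_model *c)
{
	switch (req) {
	case TSX_REQ_ON:
		return TSX_CTRL_ENABLE;
	case TSX_REQ_OFF:
		return TSX_CTRL_DISABLE;
	default:
		return model_auto_mode(c);
	}
}

/* In this model RTM and HLE stay enumerated only in the ENABLE state. */
static bool model_tsx_enumerated(enum tsx_ctrl_states state)
{
	return state == TSX_CTRL_ENABLE;
}
```

With both abort bits set, a TSX_REQ_ON request still resolves to
TSX_CTRL_ENABLE and keeps RTM and HLE visible, while TSX_REQ_AUTO
resolves to TSX_CTRL_RTM_ALWAYS_ABORT and hides them.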

 arch/x86/kernel/cpu/tsx.c | 30 +++++++++++++++---------------
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kernel/cpu/tsx.c b/arch/x86/kernel/cpu/tsx.c
index 9c7a5f049292..4b701fa64869 100644
--- a/arch/x86/kernel/cpu/tsx.c
+++ b/arch/x86/kernel/cpu/tsx.c
@@ -78,6 +78,10 @@ static bool __init tsx_ctrl_is_supported(void)

 static enum tsx_ctrl_states x86_get_tsx_auto_mode(void)
 {
+	if (boot_cpu_has(X86_FEATURE_RTM_ALWAYS_ABORT) &&
+	    boot_cpu_has(X86_FEATURE_TSX_FORCE_ABORT))
+		return TSX_CTRL_RTM_ALWAYS_ABORT;
+
 	if (boot_cpu_has_bug(X86_BUG_TAA))
 		return TSX_CTRL_DISABLE;

@@ -105,21 +109,6 @@ void __init tsx_init(void)
 	char arg[5] = {};
 	int ret;

-	/*
-	 * Hardware will always abort a TSX transaction if both CPUID bits
-	 * RTM_ALWAYS_ABORT and TSX_FORCE_ABORT are set. In this case, it is
-	 * better not to enumerate CPUID.RTM and CPUID.HLE bits. Clear them
-	 * here.
-	 */
-	if (boot_cpu_has(X86_FEATURE_RTM_ALWAYS_ABORT) &&
-	    boot_cpu_has(X86_FEATURE_TSX_FORCE_ABORT)) {
-		tsx_ctrl_state = TSX_CTRL_RTM_ALWAYS_ABORT;
-		tsx_clear_cpuid();
-		setup_clear_cpu_cap(X86_FEATURE_RTM);
-		setup_clear_cpu_cap(X86_FEATURE_HLE);
-		return;
-	}
-
 	if (!tsx_ctrl_is_supported()) {
 		tsx_ctrl_state = TSX_CTRL_NOT_SUPPORTED;
 		return;
@@ -173,5 +162,16 @@ void __init tsx_init(void)
 		 */
 		setup_force_cpu_cap(X86_FEATURE_RTM);
 		setup_force_cpu_cap(X86_FEATURE_HLE);
+	} else if (tsx_ctrl_state == TSX_CTRL_RTM_ALWAYS_ABORT) {
+
+		/*
+		 * Hardware will always abort a TSX transaction if both CPUID bits
+		 * RTM_ALWAYS_ABORT and TSX_FORCE_ABORT are set. In this case, it is
+		 * better not to enumerate CPUID.RTM and CPUID.HLE bits. Clear them
+		 * here.
+		 */
+		tsx_clear_cpuid();
+		setup_clear_cpu_cap(X86_FEATURE_RTM);
+		setup_clear_cpu_cap(X86_FEATURE_HLE);
 	}
 }
--
2.30.1 (Apple Git-130)
