Message-Id: <20200821025050.32573-1-sean.j.christopherson@intel.com>
Date: Thu, 20 Aug 2020 19:50:50 -0700
From: Sean Christopherson <sean.j.christopherson@...el.com>
To: Andy Lutomirski <luto@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
x86@...nel.org
Cc: "H. Peter Anvin" <hpa@...or.com>, linux-kernel@...r.kernel.org,
Dave Hansen <dave.hansen@...el.com>,
Chang Seok Bae <chang.seok.bae@...el.com>,
Peter Zijlstra <peterz@...radead.org>,
Sasha Levin <sashal@...nel.org>,
Paolo Bonzini <pbonzini@...hat.com>, kvm@...r.kernel.org,
Tom Lendacky <thomas.lendacky@....com>,
Sean Christopherson <sean.j.christopherson@...el.com>
Subject: [PATCH] x86/entry/64: Disallow RDPID in paranoid entry if KVM is enabled

Don't use RDPID in the paranoid entry flow if KVM is enabled, as doing so
can consume a KVM guest's MSR_TSC_AUX value if an NMI arrives in KVM's
run loop.

As a performance optimization, KVM loads the guest's TSC_AUX when a CPU
first enters its run loop, and on AMD's SVM doesn't restore the host's
value until the CPU exits the run loop. VMX is even more aggressive and
defers restoring the host's value until the CPU returns to userspace.
This optimization obviously relies on the kernel not consuming TSC_AUX,
which falls apart if an NMI arrives in the run loop.
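
To illustrate the window, a minimal C sketch of the run loop follows.  The
struct and helper names below are invented for illustration only and are
not actual KVM symbols; MSR_TSC_AUX and the rdmsrl()/wrmsrl() accessors
are the real kernel interfaces.

  struct vcpu_sketch { u64 guest_tsc_aux; /* ... */ };

  /*
   * Illustrative sketch only; vcpu_should_run() and
   * vmenter_and_handle_exit() are made-up names.  The point is the
   * window during which the guest's TSC_AUX is live in hardware.
   */
  static void run_loop_sketch(struct vcpu_sketch *vcpu)
  {
  	u64 host_tsc_aux;

  	rdmsrl(MSR_TSC_AUX, host_tsc_aux);		/* stash host value */
  	wrmsrl(MSR_TSC_AUX, vcpu->guest_tsc_aux);	/* guest value now live */

  	while (vcpu_should_run(vcpu)) {
  		vmenter_and_handle_exit(vcpu);
  		/*
  		 * An NMI here lands in paranoid_entry with the guest's
  		 * TSC_AUX still in hardware, so RDPID yields a bogus CPU #.
  		 */
  	}

  	/* SVM restores here; VMX defers this until return to userspace. */
  	wrmsrl(MSR_TSC_AUX, host_tsc_aux);
  }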

Removing KVM's optimization would be painful, as both SVM and VMX would
need to context switch the MSR on every VM-Enter (2x WRMSR + 1x RDMSR),
whereas using LSL instead of RDPID is a minor blip.
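
For reference, the LSL fallback amounts to roughly the C below, mirroring
what the LOAD_CPU_AND_NODE_SEG_LIMIT macro does: read the segment limit of
the per-CPU GDT_ENTRY_CPUNODE descriptor, which encodes the CPU (and node)
number and touches no MSR at all.  The helper name is made up for this
sketch; __CPUNODE_SEG and VDSO_CPUNODE_MASK are the real kernel constants.

  /* Rough C equivalent of the LSL path; cpunode_via_lsl() is a made-up name. */
  static inline unsigned int cpunode_via_lsl(void)
  {
  	unsigned int p;

  	/* Segment limit of the CPUNODE GDT entry holds (node << bits) | cpu. */
  	asm("lsl %[seg], %[p]"
  	    : [p] "=r" (p)
  	    : [seg] "r" ((unsigned int)__CPUNODE_SEG));

  	return p & VDSO_CPUNODE_MASK;	/* the CPU number, no MSR access needed */
  }
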
Fixes: eaad981291ee3 ("x86/entry/64: Introduce the FIND_PERCPU_BASE macro")
Cc: Dave Hansen <dave.hansen@...el.com>
Cc: Chang Seok Bae <chang.seok.bae@...el.com>
Cc: Peter Zijlstra <peterz@...radead.org>
Cc: Sasha Levin <sashal@...nel.org>
Cc: Paolo Bonzini <pbonzini@...hat.com>
Cc: kvm@...r.kernel.org
Reported-by: Tom Lendacky <thomas.lendacky@....com>
Debugged-by: Tom Lendacky <thomas.lendacky@....com>
Suggested-by: Andy Lutomirski <luto@...nel.org>
Signed-off-by: Sean Christopherson <sean.j.christopherson@...el.com>
---
Andy, I know you said "unconditionally", but it felt weird adding a
comment way down in GET_PERCPU_BASE without plumbing a param in to help
provide context.  But paranoid_entry is the only user, so adding a param
that is unconditional also felt weird.  That being said, I definitely
don't have a strong opinion one way or the other.

arch/x86/entry/calling.h | 10 +++++++---
arch/x86/entry/entry_64.S | 7 ++++++-
2 files changed, 13 insertions(+), 4 deletions(-)
diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
index 98e4d8886f11c..a925c0cf89c1a 100644
--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -342,9 +342,9 @@ For 32-bit we have the following conventions - kernel is built with
#endif
.endm
-.macro SAVE_AND_SET_GSBASE scratch_reg:req save_reg:req
+.macro SAVE_AND_SET_GSBASE scratch_reg:req save_reg:req no_rdpid=0
rdgsbase \save_reg
- GET_PERCPU_BASE \scratch_reg
+ GET_PERCPU_BASE \scratch_reg \no_rdpid
wrgsbase \scratch_reg
.endm
@@ -375,11 +375,15 @@ For 32-bit we have the following conventions - kernel is built with
* We normally use %gs for accessing per-CPU data, but we are setting up
* %gs here and obviously can not use %gs itself to access per-CPU data.
*/
-.macro GET_PERCPU_BASE reg:req
+.macro GET_PERCPU_BASE reg:req no_rdpid=0
+ .if \no_rdpid
+ LOAD_CPU_AND_NODE_SEG_LIMIT \reg
+ .else
ALTERNATIVE \
"LOAD_CPU_AND_NODE_SEG_LIMIT \reg", \
"RDPID \reg", \
X86_FEATURE_RDPID
+ .endif
andq $VDSO_CPUNODE_MASK, \reg
movq __per_cpu_offset(, \reg, 8), \reg
.endm
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 70dea93378162..fd915c46297c5 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -842,8 +842,13 @@ SYM_CODE_START_LOCAL(paranoid_entry)
*
* The MSR write ensures that no subsequent load is based on a
* mispredicted GSBASE. No extra FENCE required.
+ *
+ * Disallow RDPID if KVM is enabled as it may consume a guest's TSC_AUX
+ * if an NMI arrives in KVM's run loop. KVM loads guest's TSC_AUX on
+ * VM-Enter and may not restore the host's value until the CPU returns
+ * to userspace, i.e. KVM depends on the kernel not using TSC_AUX.
*/
- SAVE_AND_SET_GSBASE scratch_reg=%rax save_reg=%rbx
+ SAVE_AND_SET_GSBASE scratch_reg=%rax save_reg=%rbx no_rdpid=IS_ENABLED(CONFIG_KVM)
ret
.Lparanoid_entry_checkgs:
--
2.28.0