Message-Id: <20210831175025.27570-22-jiangshanlai@gmail.com>
Date: Wed, 1 Sep 2021 01:50:22 +0800
From: Lai Jiangshan <jiangshanlai@...il.com>
To: linux-kernel@...r.kernel.org
Cc: Lai Jiangshan <laijs@...ux.alibaba.com>,
Andy Lutomirski <luto@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
x86@...nel.org, "H. Peter Anvin" <hpa@...or.com>
Subject: [PATCH 21/24] x86/entry: Add the C version ist_switch_to_kernel_gsbase()
From: Lai Jiangshan <laijs@...ux.alibaba.com>
Implement in C the second half of paranoid_entry(), whose job is to
switch to the kernel GSBASE.
No functional difference intended.
Signed-off-by: Lai Jiangshan <laijs@...ux.alibaba.com>
---
arch/x86/entry/traps.c | 48 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 48 insertions(+)
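For readers following the return-value convention, here is a minimal
user-space sketch of the logic in the hunk below. It is only a model, not
kernel code: struct gs_model, model_swapgs(), model_switch_to_kernel_gsbase()
and model_restore_gsbase() are hypothetical names, the CPU state is plain
memory rather than MSRs, and the speculation fence is omitted. The restore
helper reflects how I understand the exit path to consume the value; it is
not part of this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the state involved; not a kernel structure. */
struct gs_model {
	bool has_fsgsbase;       /* stands in for static_cpu_has(X86_FEATURE_FSGSBASE) */
	uint64_t gs_base;        /* active GSBASE */
	uint64_t kernel_gs_base; /* inactive value that SWAPGS exchanges in */
	uint64_t percpu_base;    /* stands in for get_percpu_base() */
};

/* Model of SWAPGS: exchange the active and inactive GSBASE values. */
static void model_swapgs(struct gs_model *m)
{
	uint64_t tmp = m->gs_base;

	m->gs_base = m->kernel_gs_base;
	m->kernel_gs_base = tmp;
}

/*
 * Model of the entry side. With FSGSBASE the old GSBASE is returned and
 * the per-CPU base is written unconditionally. Without it, 1 means
 * "already on kernel GSBASE" and 0 means "SWAPGS was done, undo on exit".
 */
static uint64_t model_switch_to_kernel_gsbase(struct gs_model *m)
{
	uint64_t gsbase = m->gs_base;

	if (m->has_fsgsbase) {
		m->gs_base = m->percpu_base;
		return gsbase;
	}

	/* Negative values are kernel values by convention. */
	if ((int64_t)gsbase < 0)
		return 1;

	model_swapgs(m);
	return 0;
}

/* Assumed exit-side counterpart, shown only to illustrate the convention. */
static void model_restore_gsbase(struct gs_model *m, uint64_t saved)
{
	if (m->has_fsgsbase) {
		m->gs_base = saved;
		return;
	}

	if (saved == 0)
		model_swapgs(m);
}
```

Calling model_switch_to_kernel_gsbase() and then passing its result to
model_restore_gsbase() should leave gs_base as it was on entry in all
three cases (FSGSBASE, kernel GSBASE, user GSBASE).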
diff --git a/arch/x86/entry/traps.c b/arch/x86/entry/traps.c
index 9f4bc52410d0..b5c92b4e0cb5 100644
--- a/arch/x86/entry/traps.c
+++ b/arch/x86/entry/traps.c
@@ -981,6 +981,54 @@ static __always_inline unsigned long get_percpu_base(void)
return pcpu_unit_offsets;
}
#endif
+
+/*
+ * Handling of GSBASE depends on the availability of FSGSBASE.
+ *
+ * Without FSGSBASE the kernel enforces that negative GSBASE
+ * values indicate kernel GSBASE. With FSGSBASE no assumptions
+ * can be made about the GSBASE value when entering from user
+ * space.
+ */
+static __always_inline unsigned long ist_switch_to_kernel_gsbase(void)
+{
+ unsigned long gsbase;
+
+ if (static_cpu_has(X86_FEATURE_FSGSBASE)) {
+ /*
+ * Read the current GSBASE for return.
+ * Retrieve and set the current CPU's kernel GSBASE.
+ *
+ * The unconditional write to GS base below ensures that
+ * no subsequent loads based on a mispredicted GS base can
+ * happen, therefore no LFENCE is needed here.
+ */
+ gsbase = rdgsbase();
+ wrgsbase(get_percpu_base());
+ return gsbase;
+ }
+
+ gsbase = __rdmsr(MSR_GS_BASE);
+
+ /*
+ * The kernel-enforced convention is that a negative GSBASE indicates
+ * a kernel value. No SWAPGS is needed on entry or exit.
+ */
+ if ((long)gsbase < 0)
+ return 1;
+
+ native_swapgs();
+
+ /*
+ * The above ist_switch_to_kernel_cr3() doesn't do an unconditional
+ * CR3 write, even in the PTI case. So do an lfence to prevent GS
+ * speculation, regardless of whether PTI is enabled.
+ */
+ fence_swapgs_kernel_entry();
+
+ /* SWAPGS required on exit */
+ return 0;
+}
#endif
static bool is_sysenter_singlestep(struct pt_regs *regs)
--
2.19.1.6.gb485710b