[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20251007065119.148605-6-sohil.mehta@intel.com>
Date: Mon, 6 Oct 2025 23:51:09 -0700
From: Sohil Mehta <sohil.mehta@...el.com>
To: x86@...nel.org,
Dave Hansen <dave.hansen@...ux.intel.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>,
Borislav Petkov <bp@...en8.de>
Cc: Jonathan Corbet <corbet@....net>,
"H . Peter Anvin" <hpa@...or.com>,
Andy Lutomirski <luto@...nel.org>,
Josh Poimboeuf <jpoimboe@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Ard Biesheuvel <ardb@...nel.org>,
"Kirill A . Shutemov" <kas@...nel.org>,
Sohil Mehta <sohil.mehta@...el.com>,
Xin Li <xin@...or.com>,
David Woodhouse <dwmw@...zon.co.uk>,
Sean Christopherson <seanjc@...gle.com>,
Rick Edgecombe <rick.p.edgecombe@...el.com>,
Vegard Nossum <vegard.nossum@...cle.com>,
Andrew Cooper <andrew.cooper3@...rix.com>,
David Laight <david.laight.linux@...il.com>,
Randy Dunlap <rdunlap@...radead.org>,
Geert Uytterhoeven <geert@...ux-m68k.org>,
Kees Cook <kees@...nel.org>,
Tony Luck <tony.luck@...el.com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org,
linux-efi@...r.kernel.org
Subject: [PATCH v10 05/15] x86/cpu: Defer CR pinning enforcement until late_initcall()
Problem
-------
In order to map the EFI runtime services, set_virtual_address_map()
needs to be called, which resides in the lower half of the address
space. This means that LASS needs to be temporarily disabled around this
call.
Wrapping efi_enter_virtual_mode() with lass_stac()/clac() is not enough,
because the AC flag only gates data accesses, not instruction fetches.
Clearing the CR4.LASS bit is required to make this work.
However, pinned CR4 bits are not expected to be modified after
boot CPU init, resulting in a kernel warning.
Solution
--------
One option is to move the CR pinning setup immediately after the runtime
services have been mapped. However, that is a narrow fix that would
require revisiting if something else needs to modify a pinned CR bit.
CR pinning mainly prevents exploits from trivially modifying
security-sensitive CR bits. There is limited benefit to enabling CR
pinning before userspace comes up. Defer CR pinning enforcement until
late_initcall() to allow EFI and future users to modify the CR bits
without any concern for CR pinning.
Save the pinned bits while initializing the boot CPU because they are
needed later to program the value on APs when they come up.
Note
----
This introduces a small window between the boot CPU being initialized
and CR pinning being enforced, where any in-kernel clearing of the
pinned bits could go unnoticed. Later, when enforcement begins, a
warning is triggered as soon as any CR4 bit is modified, such as
X86_CR4_PGE during a TLB flush.
Currently, this is a purely theoretical concern. There are multiple ways
to resolve it [1] if it becomes a problem in practice.
Link: https://lore.kernel.org/lkml/c59aa7ac-62a6-45ec-b626-de518b25f7d9@intel.com/ [1]
Suggested-by: Dave Hansen <dave.hansen@...el.com>
Signed-off-by: Sohil Mehta <sohil.mehta@...el.com>
---
v10:
- Split recording pinned bits and enabling pinning into two functions.
- Defer pinning until userspace comes up.
This patch does not include any changes to harden the CR pinning
implementation, as that is beyond the scope of this series.
---
arch/x86/kernel/cpu/common.c | 19 +++++++++++++------
1 file changed, 13 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 61ab332eaf73..57d5824465b0 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -476,8 +476,8 @@ void cr4_init(void)
if (boot_cpu_has(X86_FEATURE_PCID))
cr4 |= X86_CR4_PCIDE;
- if (static_branch_likely(&cr_pinning))
- cr4 = (cr4 & ~cr4_pinned_mask) | cr4_pinned_bits;
+
+ cr4 = (cr4 & ~cr4_pinned_mask) | cr4_pinned_bits;
__write_cr4(cr4);
@@ -487,14 +487,21 @@ void cr4_init(void)
/*
* Once CPU feature detection is finished (and boot params have been
- * parsed), record any of the sensitive CR bits that are set, and
- * enable CR pinning.
+ * parsed), record any of the sensitive CR bits that are set.
*/
-static void __init setup_cr_pinning(void)
+static void __init record_cr_pinned_bits(void)
{
cr4_pinned_bits = this_cpu_read(cpu_tlbstate.cr4) & cr4_pinned_mask;
+}
+
+/* Enables enforcement of the CR pinned bits */
+static int __init enable_cr_pinning(void)
+{
static_key_enable(&cr_pinning.key);
+
+ return 0;
}
+late_initcall(enable_cr_pinning);
static __init int x86_nofsgsbase_setup(char *arg)
{
@@ -2119,7 +2126,7 @@ static __init void identify_boot_cpu(void)
enable_sep_cpu();
#endif
cpu_detect_tlb(&boot_cpu_data);
- setup_cr_pinning();
+ record_cr_pinned_bits();
tsx_init();
tdx_init();
--
2.43.0
Powered by blists - more mailing lists