Message-ID: <20251014231039.6d23008f@kf-m2g5>
Date: Tue, 14 Oct 2025 23:10:39 -0500
From: Aaron Rainbolt <arraybolt3@...il.com>
To: David Kaplan <david.kaplan@....com>
Cc: Thomas Gleixner <tglx@...utronix.de>, Borislav Petkov <bp@...en8.de>,
Peter Zijlstra <peterz@...radead.org>, Josh Poimboeuf
<jpoimboe@...nel.org>, Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>,
Ingo Molnar <mingo@...hat.com>, Dave Hansen <dave.hansen@...ux.intel.com>,
<x86@...nel.org>, "H . Peter Anvin" <hpa@...or.com>, Alexander Graf
<graf@...zon.com>, Boris Ostrovsky <boris.ostrovsky@...cle.com>,
<linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH 00/56] Dynamic mitigations
On Mon, 13 Oct 2025 09:33:48 -0500
David Kaplan <david.kaplan@....com> wrote:
> Dynamic mitigations enables changing the kernel CPU security
> mitigations at runtime without a reboot/kexec.
>
> Previously, mitigation choices had to be made on the kernel cmdline.
> With this feature an administrator can select new mitigation choices
> by writing a sysfs file, after which the kernel will re-patch itself
> based on the new mitigations.
>
> As the performance cost of CPU mitigations can be significant,
> selecting the right set of mitigations is important to achieve the
> correct balance of performance/security.
>
> Use
> ---
> As described in the supplied documentation file, new mitigations are
> selected by writing cmdline options to a new sysfs file. Only cmdline
> options related to mitigations are recognized via this interface. All
> previous mitigation-related cmdline options are ignored and
> selections are done based on the new options.
>
> Examples:
> echo "mitigations=off" > /sys/devices/system/cpu/mitigations
> echo "spectre_v2=retpoline tsa=off" > /sys/devices/system/cpu/mitigations
If `root` is capable of setting `mitigations=off` via this interface,
doesn't that somewhat defeat the purpose of denying `/proc/kcore`
access in lockdown confidentiality mode? Assuming one is running on a
CPU with some form of side-channel memory read vulnerability (which they
very likely are), they can turn off all mitigations, then read kernel
memory via one of those exploits.
There should be a one-way switch to allow denying all further writes to
this interface, so that once the system's mitigations are set properly,
any further attempts to change them until the next reboot can be
prevented.
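
As a rough illustration of the idea, a userspace model of such a one-way
latch might look like the sketch below. All names here
(`mitigations_locked`, `mitigations_write()`, `mitigations_lock()`) are
hypothetical and are not part of the posted series; the real thing would
presumably hang off the sysfs store handler.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical state; none of these names come from the patch series. */
static bool mitigations_locked;          /* one-way: false -> true only */
static char current_opts[128] = "auto";  /* stand-in for the active selection */

/* Models a write to the mitigations file; returns 0 or a negative errno. */
static int mitigations_write(const char *opts)
{
	size_t len;

	assert(opts != NULL);
	len = strlen(opts);

	if (mitigations_locked)
		return -EPERM;            /* latched: refuse until reboot */
	if (len >= sizeof(current_opts))
		return -EINVAL;
	memcpy(current_opts, opts, len + 1);
	return 0;
}

/* Models a write to a separate "lock" file; there is deliberately no unlock. */
static void mitigations_lock(void)
{
	mitigations_locked = true;
}
```

In this model, once `mitigations_lock()` has run, every later write fails
with -EPERM and there is no code path back to the unlocked state.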
--
Aaron
>
> There are several use cases that will benefit from dynamic
> mitigations:
>
> Use Cases
> ---------
> 1. Runtime Policy
>
> Some workflows rely on booting a generic kernel before customizing
> the system. cloud-init is a popular example of this where a VM is
> started typically with default settings and then is customized based
> on a customer-provided configuration file.
>
> As flows like this rely on configuring the system after boot, they
> currently cannot customize the mitigation policy. With dynamic
> mitigations, this configuration information can be augmented to
> include security policy information.
>
> For example, a cloud VM which runs only trusted workloads likely does
> not need any CPU security mitigations applied. But as this policy
> information is not known at boot time, the kernel will be booted with
> unnecessary mitigations enabled. With dynamic mitigations, these
> mitigations can be disabled during boot after policy information is
> retrieved, improving performance.
>
> 2. Mitigation Changes
>
> Sometimes mitigation settings need to change in light of new
> security findings. For example, AMD-SB-1036 advised of a security
> issue with a spectre v2 mitigation and recommended using a different
> one instead.
>
> With dynamic mitigations, such changes can be made without a
> reboot/kexec which minimizes disruption in environments which cannot
> easily tolerate such an event.
>
> 3. Mitigation Testing
>
> Being able to quickly change between different mitigation settings
> without having to restart applications is beneficial when conducting
> mitigation development and testing.
>
> Note that some bugs have multiple mitigation options, which may have
> varying performance impacts. Being able to quickly switch between
> them makes evaluating such options easier.
>
>
> Implementation Details
> ----------------------
> Re-patching the kernel is expected to be a very rare operation and is
> done under very big hammers. All tasks are put into the freezer and
> the re-patching is then done under the (new) stop_machine_nmi()
> routine.
>
> To re-patch the kernel, it is first reverted back to its compile-time
> state. The original bytes from alternatives, retpolines, etc. are
> saved during boot so they can later be used to restore the original
> kernel image. After that, the kernel is patched based on the new
> feature flags.
>
> This simplifies the re-patch process as restoring the original kernel
> image is relatively straightforward. In other words, instead of
> having to re-patch from mitigation A to mitigation B directly, we
> first restore the original image and then patch from that to
> mitigation B, similar to if the system had booted with mitigation B
> selected originally.
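
The restore-then-patch flow described above can be modelled in userspace
roughly as follows. This is only a sketch of the idea: `struct patch_site`
and the `site_*()` helpers are invented for illustration and do not
correspond to the data structures in the series, and a plain byte buffer
stands in for kernel text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SITE_MAX 16  /* arbitrary bound for this sketch */

/* Hypothetical patch site: where it lives and what was there at build time. */
struct patch_site {
	unsigned char *addr;           /* location in the (mock) text */
	size_t len;                    /* bytes covered by this site */
	unsigned char orig[SITE_MAX];  /* compile-time bytes, saved once */
	int saved;
};

/* Record the original bytes the first time a site is patched. */
static void site_save(struct patch_site *s)
{
	assert(s->len <= SITE_MAX);
	if (!s->saved) {
		memcpy(s->orig, s->addr, s->len);
		s->saved = 1;
	}
}

/* Put the compile-time bytes back. */
static void site_restore(struct patch_site *s)
{
	if (s->saved)
		memcpy(s->addr, s->orig, s->len);
}

/* Re-patch: always go original -> new, never old mitigation -> new. */
static void site_repatch(struct patch_site *s, const unsigned char *repl)
{
	site_save(s);
	site_restore(s);
	memcpy(s->addr, repl, s->len);
}
```

Because every re-patch starts from the saved original bytes, there is no
need for a separate transition routine per pair of mitigations.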
>
>
> Performance
> -----------
> Testing so far has demonstrated that re-patching takes ~50ms on an
> AMD EPYC 7713 running a typical Ubuntu kernel with around 100 modules
> loaded.
>
> Guide to Patch Series
> ---------------------
> As this series is rather lengthy, this may help with understanding it:
>
> Patches 3-18 focus on "resetting" mitigations. Every bug that may
> set feature flags, MSRs, static branches, etc. now has matching
> "reset" functions that will undo all these changes. This is used at
> the beginning of the re-patch flow.
>
> Patches 20-22 move various functions and values out of the .init
> section. Most of the existing mitigation logic was marked as __init
> and the mitigation settings as __ro_after_init but now these can be
> changed at runtime. The __ro_after_init marking functioned as a
> defense-in-depth measure but is arguably of limited meaningful
> security value as an attacker who can modify kernel data can do a lot
> worse than change some speculation settings. As re-patching requires
> being able to modify these settings, it was simplest to remove them
> from that section.
>
> Patches 23-27 involve linker and related modifications to keep
> alternative information around at runtime instead of freeing it
> after boot. This does result in slightly higher runtime memory
> consumption, which is one reason why this feature is behind a Kconfig
> option. On a typical kernel, this was measured at around 2MB of
> extra kernel memory usage.
>
> Patches 28-30 focus on the new stop_machine_nmi() which behaves like
> stop_machine() but runs the handler in NMI context, thus ensuring
> that even NMIs cannot interrupt the handler. As dynamic mitigations
> involves re-patching functions used by NMI entry code, this is
> required for safety.
>
> Patches 31-40 focus on support for restoring the kernel text at
> runtime. This involves saving the original kernel bytes the first
> time they are patched and adding support to restore them later.
>
> Patches 41-44 start building support for updating code, in particular
> module code at runtime.
>
> Patches 45-47 focus on support for the Indirect Target Selection
> mitigation which is particularly challenging because it requires
> runtime memory allocations and permission changes which are not
> possible in NMI context. As a result, ITS memory is pre-allocated
> before entering NMI context.
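
The pre-allocation pattern mentioned here (allocate while it is still
allowed, then only consume inside the restricted context) can be sketched
in userspace like this. The `prealloc_pool` type and its helpers are
hypothetical, with `malloc()` standing in for whatever allocator the ITS
code actually uses; nothing below is taken from the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical pool; the real ITS data structures are not shown here. */
struct prealloc_pool {
	unsigned char *buf;
	size_t size;
	size_t used;
};

/* Runs in normal context, where allocating is allowed. Returns 0 on success. */
static int pool_prepare(struct prealloc_pool *p, size_t bytes)
{
	p->buf = malloc(bytes);
	if (!p->buf)
		return -1;
	p->size = bytes;
	p->used = 0;
	return 0;
}

/* Stand-in for the restricted context: only carves from the pool. */
static void *pool_take(struct prealloc_pool *p, size_t bytes)
{
	void *ret;

	assert(p->buf != NULL);
	if (bytes > p->size - p->used)
		return NULL;  /* pool was sized too small; no fallback here */
	ret = p->buf + p->used;
	p->used += bytes;
	return ret;
}

/* Back in normal context: give the memory back. */
static void pool_release(struct prealloc_pool *p)
{
	free(p->buf);
	p->buf = NULL;
	p->size = p->used = 0;
}
```

The cost of this approach is that the pool has to be sized up front, since
a shortfall cannot be fixed once the restricted context has been entered.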
>
> Patch 50 adds the complete function for resetting and re-patching the
> kernel.
>
> Patches 51-53 build the sysfs interface for re-patching and support
> for parsing the new options provided.
>
> Patches 54-56 add debugfs interfaces to values which are important for
> mitigations. These are useful for userspace test utilities to be
> able to force a CPU to appear to be vulnerable or immune to certain
> bugs as well as being able to help verify if the kernel is correctly
> mitigating various vulnerabilities.
>
> David Kaplan (56):
> Documentation/admin-guide: Add documentation
> x86/Kconfig: Add CONFIG_DYNAMIC_MITIGATIONS
> cpu: Reset global mitigations
> x86/bugs: Reset spectre_v1 mitigations
> x86/bugs: Reset spectre_v2 mitigations
> x86/bugs: Reset retbleed mitigations
> x86/bugs: Reset spectre_v2_user mitigations
> x86/bugs: Reset SSB mitigations
> x86/bugs: Reset L1TF mitigations
> x86/bugs: Reset MDS mitigations
> x86/bugs: Reset MMIO mitigations
> x86/bugs: Reset SRBDS mitigations
> x86/bugs: Reset SRSO mitigations
> x86/bugs: Reset GDS mitigations
> x86/bugs: Reset BHI mitigations
> x86/bugs: Reset ITS mitigation
> x86/bugs: Reset TSA mitigations
> x86/bugs: Reset VMSCAPE mitigations
> x86/bugs: Define bugs_smt_disable()
> x86/bugs: Move bugs.c logic out of .init section
> x86/callthunks: Move logic out of .init
> cpu: Move mitigation logic out of .init
> x86/vmlinux.lds: Move alternative sections
> x86/vmlinux.lds: Move altinstr_aux conditionally
> x86/vmlinux.lds: Define __init_alt_end
> module: Save module ELF info
> x86/mm: Conditionally free alternative sections
> stop_machine: Add stop_machine_nmi()
> x86/apic: Add self-NMI support
> x86/nmi: Add support for stop_machine_nmi()
> x86/alternative: Prepend nops with retpolines
> x86/alternative: Add module param
> x86/alternative: Avoid re-patching init code
> x86/alternative: Save old bytes for alternatives
> x86/alternative: Save old bytes for retpolines
> x86/alternative: Do not recompute len on re-patch
> x86/alternative: Reset alternatives
> x86/callthunks: Reset callthunks
> x86/sync_core: Add sync_core_nmi_safe()
> x86/alternative: Use sync_core_nmi_safe()
> static_call: Add update_all_static_calls()
> module: Make memory writeable for re-patching
> module: Update alternatives
> x86/module: Update alternatives
> x86/alternative: Use boot_cpu_has in ITS code
> x86/alternative: Add ITS re-patching support
> x86/module: Add ITS re-patch support for modules
> x86/bugs: Move code for updating speculation MSRs
> x86/fpu: Qualify warning in os_xsave
> x86/alternative: Add re-patch support
> cpu: Parse string of mitigation options
> x86/bugs: Support parsing mitigation options
> drivers/cpu: Re-patch mitigations through sysfs
> x86/debug: Create debugfs interface to x86_capabilities
> x86/debug: Show return thunk in debugfs
> x86/debug: Show static branch config in debugfs
>
> .../ABI/testing/sysfs-devices-system-cpu | 8 +
> .../hw-vuln/dynamic_mitigations.rst | 75 ++
> Documentation/admin-guide/hw-vuln/index.rst | 1 +
> arch/x86/Kconfig | 12 +
> arch/x86/entry/vdso/vma.c | 2 +-
> arch/x86/include/asm/alternative.h | 51 +-
> arch/x86/include/asm/bugs.h | 4 +
> arch/x86/include/asm/module.h | 10 +
> arch/x86/include/asm/sync_core.h | 14 +
> arch/x86/kernel/alternative.c | 497 ++++++++++++-
> arch/x86/kernel/apic/ipi.c | 7 +
> arch/x86/kernel/callthunks.c | 85 ++-
> arch/x86/kernel/cpu/bugs.c | 686 +++++++++++++-----
> arch/x86/kernel/cpu/common.c | 65 +-
> arch/x86/kernel/cpu/cpu.h | 4 -
> arch/x86/kernel/fpu/xstate.h | 2 +-
> arch/x86/kernel/module.c | 96 ++-
> arch/x86/kernel/nmi.c | 4 +
> arch/x86/kernel/static_call.c | 3 +-
> arch/x86/kernel/vmlinux.lds.S | 110 +--
> arch/x86/mm/init.c | 12 +-
> arch/x86/mm/mm_internal.h | 2 +
> arch/x86/tools/relocs.c | 1 +
> drivers/base/cpu.c | 113 +++
> include/linux/cpu.h | 10 +
> include/linux/module.h | 11 +
> include/linux/static_call.h | 2 +
> include/linux/stop_machine.h | 32 +
> kernel/cpu.c | 62 +-
> kernel/module/main.c | 78 +-
> kernel/static_call_inline.c | 22 +
> kernel/stop_machine.c | 79 +-
> 32 files changed, 1876 insertions(+), 284 deletions(-)
> create mode 100644 Documentation/admin-guide/hw-vuln/dynamic_mitigations.rst
>
>
> base-commit: a5652f0f2a69fadcfb2f687a11a737a57f15b28e