linux-kernel - Re: [RFC PATCH 00/56] Dynamic mitigations

Message-ID: <20251024050058.stc2nthc2bklhyqv@desk>
Date: Thu, 23 Oct 2025 22:00:58 -0700
From: Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>
To: David Kaplan <david.kaplan@....com>
Cc: Thomas Gleixner <tglx@...utronix.de>, Borislav Petkov <bp@...en8.de>,
	Peter Zijlstra <peterz@...radead.org>,
	Josh Poimboeuf <jpoimboe@...nel.org>,
	Ingo Molnar <mingo@...hat.com>,
	Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
	"H . Peter Anvin" <hpa@...or.com>, Alexander Graf <graf@...zon.com>,
	Boris Ostrovsky <boris.ostrovsky@...cle.com>,
	linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 00/56] Dynamic mitigations

On Mon, Oct 13, 2025 at 09:33:48AM -0500, David Kaplan wrote:
> Dynamic mitigations enables changing the kernel CPU security mitigations at
> runtime without a reboot/kexec.
> 
> Previously, mitigation choices had to be made on the kernel cmdline.  With
> this feature an administrator can select new mitigation choices by writing
> a sysfs file, after which the kernel will re-patch itself based on the new
> mitigations.
> 
> As the performance cost of CPU mitigations can be significant, selecting
> the right set of mitigations is important to achieve the correct balance of
> performance/security.
> 
> Use
> ---
> As described in the supplied documentation file, new mitigations are
> selected by writing cmdline options to a new sysfs file.  Only cmdline
> options related to mitigations are recognized via this interface.  All
> previous mitigation-related cmdline options are ignored and selections are
> done based on the new options.
> 
> Examples:
>    echo "mitigations=off" > /sys/devices/system/cpu/mitigations
>    echo "spectre_v2=retpoline tsa=off" > /sys/devices/system/cpu/mitigations
> 
> 
> There are several use cases that will benefit from dynamic mitigations:
> 
> Use Cases
> ---------
> 1. Runtime Policy
> 
> Some workflows rely on booting a generic kernel before customizing the system.
> cloud-init is a popular example of this where a VM is started typically with
> default settings and then is customized based on a customer-provided
> configuration file.
> 
> As flows like this rely on configuring the system after boot, they currently
> cannot customize the mitigation policy.  With dynamic mitigations, this
> configuration information can be augmented to include security policy
> information.
> 
> For example, a cloud VM which runs only trusted workloads likely does not
> need any CPU security mitigations applied.  But as this policy information
> is not known at boot time, the kernel will be booted with unnecessary
> mitigations enabled.  With dynamic mitigations, these mitigations can be
> disabled during boot after policy information is retrieved, improving
> performance.
> 
> 2. Mitigation Changes
> 
> Sometimes there are needs to change the mitigation settings in light of new
> security findings.  For example, AMD-SB-1036 advised of a security issue
> with a spectre v2 mitigation and advised using a different one instead.
> 
> With dynamic mitigations, such changes can be made without a reboot/kexec
> which minimizes disruption in environments which cannot easily tolerate
> such an event.
> 
> 3. Mitigation Testing
> 
> Being able to quickly change between different mitigation settings without
> having to restart applications is beneficial when conducting mitigation
> development and testing.
> 
> Note that some bugs have multiple mitigation options, which may have
> varying performance impacts.  Being able to quickly switch between them
> makes evaluating such options easier.
> 
> 
> Implementation Details
> ----------------------
> Re-patching the kernel is expected to be a very rare operation and is done
> under very big hammers.  All tasks are put into the freezer and the
> re-patching is then done under the (new) stop_machine_nmi() routine.
> 
> To re-patch the kernel, it is first reverted back to its compile-time
> state.  The original bytes from alternatives, retpolines, etc. are saved
> during boot so they can later be used to restore the original kernel image.
> After that, the kernel is patched based on the new feature flags.
> 
> This simplifies the re-patch process as restoring the original kernel image
> is relatively straightforward.  In other words, instead of having to
> re-patch from mitigation A to mitigation B directly, we first restore the
> original image and then patch from that to mitigation B, similar to if the
> system had booted with mitigation B selected originally.
> 
> 
> Performance
> -----------
> Testing so far has demonstrated that re-patching takes ~50ms on an AMD EPYC
> 7713 running a typical Ubuntu kernel with around 100 modules loaded.
> 
> Guide to Patch Series
> ---------------------
> As this series is rather lengthy, this may help with understanding it:
> 
> Patches 3-18 focus on "resetting" mitigations.  Every bug that may set feature
> flags, MSRs, static branches, etc. now has matching "reset" functions that will
> undo all these changes.  This is used at the beginning of the re-patch flow.
> 
> Patches 20-22 move various functions and values out of the .init section.  Most
> of the existing mitigation logic was marked as __init and the mitigation
> settings as __ro_after_init but now these can be changed at runtime.  The
> __ro_after_init marking functioned as a defense-in-depth measure but is arguably
> of limited meaningful security value as an attacker who can modify kernel data
> can do a lot worse than change some speculation settings.  As re-patching
> requires being able to modify these settings, it was simplest to remove them
> from that section.
> 
> Patches 23-27 involve linker and related modifications to keep alternative
> information around at runtime instead of free'ing it after boot.  This does
> result in slightly higher runtime memory consumption which is one reason why
> this feature is behind a Kconfig option.  On a typical kernel, this was measured
> at around 2MB of extra kernel memory usage.
> 
> Patches 28-30 focus on the new stop_machine_nmi() which behaves like
> stop_machine() but runs the handler in NMI context, thus ensuring that even NMIs
> cannot interrupt the handler.  As dynamic mitigations involves re-patching
> functions used by NMI entry code, this is required for safety.
> 
> Patches 31-40 focus on support for restoring the kernel text at runtime.  This
> involves saving the original kernel bytes when patched the first time and adding
> support to then restore those later.
> 
> Patches 41-44 start building support for updating code, in particular module
> code at runtime.
> 
> Patches 45-47 focus on support for the Indirect Target Selection mitigation
> which is particularly challenging because it requires runtime memory allocations
> and permission changes which are not possible in NMI context.  As a result, ITS
> memory is pre-allocated before entering NMI context.
> 
> Patch 50 adds the complete function for resetting and re-patching the kernel.
> 
> Patches 51-53 build the sysfs interface for re-patching and support for parsing
> the new options provided.
> 
> Patches 54-56 add debugfs interfaces to values which are important for
> mitigations.  These are useful for userspace test utilities to be able to force
> a CPU to appear to be vulnerable or immune to certain bugs as well as being able
> to help verify if the kernel is correctly mitigating various vulnerabilities.

Although it adds some complexity, this adds a very useful feature. Thanks
for doing this series.

Just curious, for patching indirect branches, was replacing alternatives
with static_calls considered? I haven't looked at the feasibility, but
static_calls seem to be better suited for post-boot patching.

Thinking out loud, doing the patching in a flow similar to suspend-to-RAM
may reduce some corner cases. Barring the BSP, the APs get reinitialized in
that case.
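
For readers following along, the restore-then-patch flow described under
"Implementation Details" in the quoted cover letter can be illustrated with a
small userspace sketch. This is not code from the series: the text buffer is
a plain byte array, and struct patch_site, site_patch(), site_restore() and
site_repatch() are hypothetical names chosen only for illustration. It
assumes a replacement may be shorter than the site it patches, which is what
makes going back to the original bytes first simpler than patching from
mitigation A to mitigation B directly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SITE_LEN 8   /* hypothetical upper bound on a patch site's size */

/* One patchable location in a simulated text buffer (hypothetical type). */
struct patch_site {
	size_t offset;                    /* where the site starts in the text */
	size_t len;                       /* number of bytes the site covers */
	unsigned char orig[MAX_SITE_LEN]; /* compile-time bytes, saved once */
	int saved;                        /* nonzero once orig[] is valid */
};

/*
 * Save the original bytes the first time a site is patched, then write the
 * replacement. Bytes past repl_len are left untouched.
 */
static void site_patch(unsigned char *text, struct patch_site *s,
		       const unsigned char *repl, size_t repl_len)
{
	assert(s->len <= MAX_SITE_LEN);
	assert(repl_len <= s->len);

	if (!s->saved) {
		memcpy(s->orig, text + s->offset, s->len);
		s->saved = 1;
	}
	memcpy(text + s->offset, repl, repl_len);
}

/* Put the saved bytes back; a no-op if the site was never patched. */
static void site_restore(unsigned char *text, const struct patch_site *s)
{
	if (s->saved)
		memcpy(text + s->offset, s->orig, s->len);
}

/*
 * Re-patch by first returning to the original image, so the result is the
 * same as if the new replacement had been applied to a fresh image.
 */
static void site_repatch(unsigned char *text, struct patch_site *s,
			 const unsigned char *repl, size_t repl_len)
{
	site_restore(text, s);
	site_patch(text, s, repl, repl_len);
}
```

In this sketch, applying a 4-byte replacement and then calling site_repatch()
with a 2-byte one leaves the last two bytes of the site at their original
values rather than at leftovers from the first replacement. The real series
additionally has to deal with feature flags, MSRs, static branches, modules
and NMI-safe execution, none of which is modeled here.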