[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID:
<LV3PR12MB92652373139858D54C5D538C94C4A@LV3PR12MB9265.namprd12.prod.outlook.com>
Date: Tue, 4 Nov 2025 16:54:43 +0000
From: "Kaplan, David" <David.Kaplan@....com>
To: Nikolay Borisov <nik.borisov@...e.com>, Thomas Gleixner
<tglx@...utronix.de>, Borislav Petkov <bp@...en8.de>, Peter Zijlstra
<peterz@...radead.org>, Josh Poimboeuf <jpoimboe@...nel.org>, Pawan Gupta
<pawan.kumar.gupta@...ux.intel.com>, Ingo Molnar <mingo@...hat.com>, Dave
Hansen <dave.hansen@...ux.intel.com>, "x86@...nel.org" <x86@...nel.org>, "H .
Peter Anvin" <hpa@...or.com>
CC: Alexander Graf <graf@...zon.com>, Boris Ostrovsky
<boris.ostrovsky@...cle.com>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>
Subject: RE: [RFC PATCH 50/56] x86/alternative: Add re-patch support
[AMD Official Use Only - AMD Internal Distribution Only]
> -----Original Message-----
> From: Nikolay Borisov <nik.borisov@...e.com>
> Sent: Friday, October 31, 2025 5:23 AM
> To: Kaplan, David <David.Kaplan@....com>; Thomas Gleixner
> <tglx@...utronix.de>; Borislav Petkov <bp@...en8.de>; Peter Zijlstra
> <peterz@...radead.org>; Josh Poimboeuf <jpoimboe@...nel.org>; Pawan
> Gupta <pawan.kumar.gupta@...ux.intel.com>; Ingo Molnar
> <mingo@...hat.com>; Dave Hansen <dave.hansen@...ux.intel.com>;
> x86@...nel.org; H . Peter Anvin <hpa@...or.com>
> Cc: Alexander Graf <graf@...zon.com>; Boris Ostrovsky
> <boris.ostrovsky@...cle.com>; linux-kernel@...r.kernel.org
> Subject: Re: [RFC PATCH 50/56] x86/alternative: Add re-patch support
>
> Caution: This message originated from an External Source. Use proper caution
> when opening attachments, clicking links, or responding.
>
>
> On 10/13/25 17:34, David Kaplan wrote:
> > Updating alternatives is done under the biggest hammers possible. The
> > freezer is used to freeze all processes and kernel threads at safe
> > points to ensure they are not in the middle of a sequence we're about to
> > patch. Then stop_machine_nmi() synchronizes all CPUs and puts them into
> > a tight spin loop while re-patching occurs. The actual patching is done
> > using simple memcpy, just like during boot.
> >
> > Signed-off-by: David Kaplan <david.kaplan@....com>
> > ---
> > arch/x86/include/asm/alternative.h | 6 ++
> > arch/x86/kernel/alternative.c | 131
> +++++++++++++++++++++++++++++
> > 2 files changed, 137 insertions(+)
> >
> > diff --git a/arch/x86/include/asm/alternative.h
> b/arch/x86/include/asm/alternative.h
> > index 61ce8a4b1aa6..f0b863292c3c 100644
> > --- a/arch/x86/include/asm/alternative.h
> > +++ b/arch/x86/include/asm/alternative.h
> > @@ -19,6 +19,7 @@
> > #ifndef __ASSEMBLER__
> >
> > #include <linux/stddef.h>
> > +#include <linux/static_call_types.h>
> >
> > /*
> > * Alternative inline assembly for SMP.
> > @@ -89,6 +90,9 @@ extern s32 __cfi_sites[], __cfi_sites_end[];
> > extern s32 __ibt_endbr_seal[], __ibt_endbr_seal_end[];
> > extern s32 __smp_locks[], __smp_locks_end[];
> >
> > +extern struct static_call_site __start_static_call_sites[],
> > + __stop_static_call_sites[];
> > +
> > /*
> > * Debug flag that can be tested to see whether alternative
> > * instructions were patched in already:
> > @@ -98,6 +102,8 @@ extern int alternatives_patched;
> > struct module;
> >
> > #ifdef CONFIG_DYNAMIC_MITIGATIONS
> > +extern void cpu_update_alternatives(void);
> > +extern void cpu_prepare_repatch_alternatives(void);
> > extern void reset_retpolines(s32 *start, s32 *end, struct module *mod);
> > extern void reset_returns(s32 *start, s32 *end, struct module *mod);
> > extern void reset_alternatives(struct alt_instr *start, struct alt_instr *end,
> > diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
> > index 23bb3386ec5e..613cb645bd9f 100644
> > --- a/arch/x86/kernel/alternative.c
> > +++ b/arch/x86/kernel/alternative.c
> > @@ -6,12 +6,15 @@
> > #include <linux/vmalloc.h>
> > #include <linux/memory.h>
> > #include <linux/execmem.h>
> > +#include <linux/stop_machine.h>
> > +#include <linux/freezer.h>
> >
> > #include <asm/text-patching.h>
> > #include <asm/insn.h>
> > #include <asm/ibt.h>
> > #include <asm/set_memory.h>
> > #include <asm/nmi.h>
> > +#include <asm/bugs.h>
> >
> > int __read_mostly alternatives_patched;
> >
> > @@ -3468,4 +3471,132 @@ void its_free_all(struct module *mod)
> > its_page = NULL;
> > }
> > #endif
> > +static atomic_t thread_ack;
> > +
> > +/*
> > + * This function is called by ALL online CPUs but only CPU0 will do the
> > + * re-patching. It is important that all other cores spin in the tight loop
> > + * below (and not in multi_cpu_stop) because they cannot safely do return
> > + * instructions while returns are being patched. Therefore, spin them here
> > + * (with interrupts disabled) until CPU0 has finished its work.
> > + */
> > +static int __cpu_update_alternatives(void *__unused)
> > +{
> > + if (smp_processor_id()) {
> > + atomic_dec(&thread_ack);
> > + while (!READ_ONCE(alternatives_patched))
> > + cpu_relax();
> > +
> > + cpu_bugs_update_speculation_msrs();
> > + } else {
> > + repatch_in_progress = true;
> > +
> > + /* Wait for all cores to enter this function. */
> > + while (atomic_read(&thread_ack))
> > + cpu_relax();
> > +
> > + /* These must be un-done in the opposite order in which they were
> applied. */
> > + reset_alternatives(__alt_instructions, __alt_instructions_end, NULL);
> > + reset_builtin_callthunks();
> > + reset_returns(__return_sites, __return_sites_end, NULL);
> > + reset_retpolines(__retpoline_sites, __retpoline_sites_end, NULL);
> > +
> > + apply_retpolines(__retpoline_sites, __retpoline_sites_end, NULL);
> > + apply_returns(__return_sites, __return_sites_end, NULL);
>
> This triggers the following splat:
>
> [ 363.467469] BUG: sleeping function called from invalid context at
> kernel/locking/mutex.c:575
> [ 363.467472] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 18, name:
> migration/0
>
<snip>
>
> The reason being apply_returns->__static_call_fixup acquires text_mutex from
> NMI context.
>
Thank you for testing the code and reporting this! I am looking into how best to resolve this.
--David Kaplan
Powered by blists - more mailing lists