Message-ID: <174956686826.1494782.11512582667456262594.stgit@mhiramat.tok.corp.google.com>
Date: Tue, 10 Jun 2025 23:47:48 +0900
From: "Masami Hiramatsu (Google)" <mhiramat@...nel.org>
To: Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>
Cc: Steven Rostedt <rostedt@...dmis.org>,
x86@...nel.org,
Naresh Kamboju <naresh.kamboju@...aro.org>,
open list <linux-kernel@...r.kernel.org>,
Linux trace kernel <linux-trace-kernel@...r.kernel.org>,
lkft-triage@...ts.linaro.org,
Stephen Rothwell <sfr@...b.auug.org.au>,
Arnd Bergmann <arnd@...db.de>,
Dan Carpenter <dan.carpenter@...aro.org>,
Anders Roxell <anders.roxell@...aro.org>
Subject: [RFC PATCH 2/2] x86: alternative: Invalidate the cache for updated instructions
From: Masami Hiramatsu (Google) <mhiramat@...nel.org>
Invalidate the cache after replacing INT3 with the new instruction.
This prevents other CPUs from seeing the removed INT3 in their
cache after serializing the pipeline.
LKFT reported an oops caused by INT3, but no INT3 appears in the
dumped code. This means the INT3 was removed after the CPU hit
it.
## Test log
ftrace-stress-test: <12>[ 21.971153] /usr/local/bin/kirk[277]:
starting test ftrace-stress-test (ftrace_stress_test.sh 90)
<4>[ 58.997439] Oops: int3: 0000 [#1] SMP PTI
<4>[ 58.998089] CPU: 0 UID: 0 PID: 323 Comm: sh Not tainted
6.15.0-next-20250605 #1 PREEMPT(voluntary)
<4>[ 58.998152] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009),
BIOS 1.16.3-debian-1.16.3-2 04/01/2014
<4>[ 58.998260] RIP: 0010:_raw_spin_lock+0x5/0x50
<4>[ 58.998563] Code: 5d e9 ff 12 00 00 66 66 2e 0f 1f 84 00 00 00
00 00 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3
0f 1e fa 0f <1f> 44 00 00 55 48 89 e5 53 48 89 fb bf 01 00 00 00 e8 15
12 e4 fe
One possible scenario is that the CPU somehow hits the INT3 (from its
I-cache) after the third step.
------
<CPU0>                                    <CPU1>
                                          Start smp_text_poke_batch_finish().
                                          Start the third step. (remove INT3)
                                          on_each_cpu(do_sync_core)
do_sync_core(do SERIALIZE)
                                          Finish the third step.
Hit INT3 (from I-cache)
                                          Clear text_poke_array_refs[cpu0]
Start smp_text_poke_int3_handler()
Failed to get text_poke_array_refs[cpu0]
Oops: int3
------
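The interleaving above can be sketched as a small deterministic toy model in C. This is only an illustration of the hypothesis described in this message, not kernel code and not evidence about real hardware behaviour: the struct, the `toy_*` helpers, and the single-byte "I-cache" copy are all hypothetical names invented for the sketch, and the model simply assumes that CPU0 can refetch a stale copy after serializing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_INT3 0xCC   /* INT3 opcode byte */
#define TOY_NEW  0x0F   /* stand-in for the first byte of the new instruction */

/* Hypothetical toy state: one patched byte and one cached copy on CPU0. */
struct toy_state {
	uint8_t memory;       /* byte in memory at the patch site */
	uint8_t cpu0_cached;  /* copy CPU0 would refetch (assumed possibly stale) */
	bool    refs_live;    /* models text_poke_array_refs[cpu0] being usable */
};

/* Models serialization alone: CPU0 refetches, but from its cached copy. */
static uint8_t toy_sync_and_refetch(const struct toy_state *s)
{
	return s->cpu0_cached;
}

/* Models the proposed extra step: drop the stale copy before the sync. */
static void toy_invalidate(struct toy_state *s)
{
	s->cpu0_cached = s->memory;
}

/*
 * Replays the third step in the order shown in the timeline.
 * Returns true if CPU0 ends up with an INT3 that nobody can handle.
 */
static bool toy_third_step_oopses(bool invalidate_before_sync)
{
	struct toy_state s = {
		.memory = TOY_INT3, .cpu0_cached = TOY_INT3, .refs_live = true,
	};

	s.memory = TOY_NEW;                 /* CPU1: remove INT3 in memory */
	if (invalidate_before_sync)
		toy_invalidate(&s);         /* CPU1: proposed invalidation */
	uint8_t fetched = toy_sync_and_refetch(&s); /* CPU0: do_sync_core */
	s.refs_live = false;                /* CPU1: finish, clear refs */

	/* CPU0 executes what it fetched; the handler needs live refs. */
	return fetched == TOY_INT3 && !s.refs_live;
}
```

Under these assumptions, `toy_third_step_oopses(false)` reproduces the unhandled INT3 from the timeline, while `toy_third_step_oopses(true)` does not, because the stale copy is dropped before CPU0 serializes.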
The SERIALIZE instruction flushes the pipeline, so the processor must
refetch the instruction. However, the refetch is not guaranteed to come
from memory, because SERIALIZE does not invalidate the cache.
To prevent the removed INT3 from being refetched, we need to invalidate
the cache (flush the TLB) in the third step, before do_sync_core().
Reported-by: Linux Kernel Functional Testing <lkft@...aro.org>
Closes: https://lore.kernel.org/all/CA+G9fYsLu0roY3DV=tKyqP7FEKbOEETRvTDhnpPxJGbA=Cg+4w@mail.gmail.com/
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@...nel.org>
---
arch/x86/kernel/alternative.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index ecfe7b497cad..1b606db48017 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2949,8 +2949,16 @@ void smp_text_poke_batch_finish(void)
do_sync++;
}
- if (do_sync)
+ if (do_sync) {
+ /*
+ * Flush the instructions on the cache, then serialize the
+ * pipeline of each CPU.
+ */
+ flush_tlb_kernel_range((unsigned long)text_poke_addr(&text_poke_array.vec[0]),
+ (unsigned long)text_poke_addr(text_poke_array.vec +
+ text_poke_array.nr_entries - 1));
smp_text_poke_sync_each_cpu();
+ }
/*
* Remove and wait for refs to be zero.