Message-ID: <20250811185044.2227268-1-sohil.mehta@intel.com>
Date: Mon, 11 Aug 2025 11:50:44 -0700
From: Sohil Mehta <sohil.mehta@...el.com>
To: Dave Hansen <dave.hansen@...ux.intel.com>,
x86@...nel.org
Cc: Borislav Petkov <bp@...en8.de>,
"H . Peter Anvin" <hpa@...or.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>,
Dave Hansen <dave.hansen@...el.com>,
Sohil Mehta <sohil.mehta@...el.com>,
Sean Christopherson <seanjc@...gle.com>,
Peter Zijlstra <peterz@...radead.org>,
Vignesh Balasubramanian <vigbalas@....com>,
Rick Edgecombe <rick.p.edgecombe@...el.com>,
Oleg Nesterov <oleg@...hat.com>,
"Chang S . Bae" <chang.seok.bae@...el.com>,
Brian Gerst <brgerst@...il.com>,
Eric Biggers <ebiggers@...gle.com>,
Kees Cook <kees@...nel.org>,
Chao Gao <chao.gao@...el.com>,
Christoph Hellwig <hch@...radead.org>,
Fushuai Wang <wangfushuai@...du.com>,
linux-kernel@...r.kernel.org
Subject: [PATCH v4] x86/fpu: Fix NULL dereference in avx512_status()
From: Fushuai Wang <wangfushuai@...du.com>
Problem
-------
With CONFIG_X86_DEBUG_FPU enabled, reading /proc/[kthread]/arch_status
causes a kernel NULL pointer dereference.

Kernel threads aren't expected to access the FPU state directly. Kernel
usage of FPU registers is contained within kernel_fpu_begin()/_end()
sections.

However, to report AVX-512 usage, the avx512_timestamp variable within
struct fpu needs to be accessed, which triggers a warning in
x86_task_fpu().

For Kthreads:
  proc_pid_arch_status()
    avx512_status()
      x86_task_fpu() => Warning and returns NULL
      x86_task_fpu()->avx512_timestamp => NULL dereference

The warning is a false alarm since the access isn't intended for
modifying the FPU state. All kernel threads (except the init_task) have
a "struct fpu" with an avx512_timestamp variable that is valid to
access. Also, the init_task (PID 0) never follows this path since it is
not exposed in /proc.

Solution
--------
One option is to get rid of the warning in x86_task_fpu() for kernel
threads. However, that warning was recently added and might be useful to
catch any potential misuse of the FPU state in kernel threads.

A better option is to avoid the access altogether. The kernel does not
track AVX-512 usage for kernel threads.

save_fpregs_to_fpstate()->update_avx_timestamp() is never invoked for
kernel threads, so avx512_timestamp is always guaranteed to be 0.

Also, the legacy behavior of reporting "AVX512_elapsed_ms: -1", which
signifies "no AVX-512 usage", is misleading. The kernel usage just isn't
tracked.

For now, update the ABI for kernel threads and do not report AVX-512
usage for them. Reading /proc/[kthread]/arch_status would display no
AVX-512 information. This avoids the NULL dereference as well as the
misleading report.

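For illustration only, the control flow described above can be modeled in
user space. This is a rough sketch, not kernel code: struct task,
model_task_fpu(), model_avx512_status() and the flag values are
hypothetical stand-ins, the accessor is assumed to warn and return NULL
for PF_KTHREAD tasks as the call chain in the Problem section describes,
and the jiffies-to-milliseconds conversion and LONG_MAX cap are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel flags; the values are illustrative. */
#define PF_KTHREAD      0x1u
#define PF_USER_WORKER  0x2u

struct fpu {
	unsigned long avx512_timestamp;
};

struct task {
	unsigned int flags;
	struct fpu fpu;
};

/* Models the debug accessor: warn and return NULL for kernel threads (assumed). */
static struct fpu *model_task_fpu(struct task *t)
{
	if (t->flags & PF_KTHREAD) {
		fprintf(stderr, "WARN: FPU state requested for a kernel thread\n");
		return NULL;
	}
	return &t->fpu;
}

/*
 * Models the fixed reporting path. Returns 0 when nothing is reported,
 * or 1 when *delta holds the value to print (-1 means "no AVX-512 usage").
 */
static int model_avx512_status(struct task *t, unsigned long now, long *delta)
{
	unsigned long timestamp;

	assert(delta != NULL);

	/* Bail out before the accessor is called, so NULL is never dereferenced. */
	if (t->flags & (PF_KTHREAD | PF_USER_WORKER))
		return 0;

	timestamp = model_task_fpu(t)->avx512_timestamp;
	*delta = timestamp ? (long)(now - timestamp) : -1;
	return 1;
}
```

In this model, a task with PF_KTHREAD set yields no report at all, while a
user task whose timestamp is 0 still reports -1.
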
Suggested-by: Dave Hansen <dave.hansen@...el.com>
Fixes: 22aafe3bcb67 ("x86/fpu: Remove init_task FPU state dependencies, add debugging warning for PF_KTHREAD tasks")
Cc: <stable@...r.kernel.org> # v6.15+
Signed-off-by: Fushuai Wang <wangfushuai@...du.com>
Co-developed-by: Sohil Mehta <sohil.mehta@...el.com>
Signed-off-by: Sohil Mehta <sohil.mehta@...el.com>
---
v4:
- No significant change, minor wording improvements.
v3: https://lore.kernel.org/lkml/20250724013422.307954-1-sohil.mehta@intel.com/
- Do not report anything for kernel threads. (DaveH)
- Make the commit message more precise.
v2:
- Avoid making the fix dependent on CONFIG_X86_DEBUG_FPU.
- Include PF_USER_WORKER in the kernel thread check.
- Update commit message for clarity.
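
A possible user-space check for the behavior described in the commit
message, as a sketch only: read_avx512_elapsed() is a hypothetical helper
that scans arch_status-style text for the "AVX512_elapsed_ms:" line. It
assumes the key is followed by whitespace and a signed decimal value.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Scan arch_status-style text from 'f' for an "AVX512_elapsed_ms:" line.
 * Returns 1 and stores the value in *ms if the line is found, 0 otherwise.
 */
static int read_avx512_elapsed(FILE *f, long *ms)
{
	static const char key[] = "AVX512_elapsed_ms:";
	char line[256];

	assert(f != NULL && ms != NULL);

	while (fgets(line, sizeof(line), f)) {
		/* %ld skips the whitespace that separates the key from the value. */
		if (strncmp(line, key, sizeof(key) - 1) == 0 &&
		    sscanf(line + sizeof(key) - 1, "%ld", ms) == 1)
			return 1;
	}
	return 0;
}
```

A caller could fopen() /proc/<pid>/arch_status and pass the stream in. With
this patch applied, the helper would be expected to return 0 for a kernel
thread such as kthreadd (PID 2), and 1 for a user task on an AVX-512
capable system.
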
---
arch/x86/kernel/fpu/xstate.c | 19 ++++++++++---------
1 file changed, 10 insertions(+), 9 deletions(-)
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 12ed75c1b567..28e4fd65c9da 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1881,19 +1881,20 @@ long fpu_xstate_prctl(int option, unsigned long arg2)
 #ifdef CONFIG_PROC_PID_ARCH_STATUS
 /*
  * Report the amount of time elapsed in millisecond since last AVX512
- * use in the task.
+ * use in the task. Report -1 if no AVX-512 usage.
  */
 static void avx512_status(struct seq_file *m, struct task_struct *task)
 {
-	unsigned long timestamp = READ_ONCE(x86_task_fpu(task)->avx512_timestamp);
-	long delta;
+	unsigned long timestamp;
+	long delta = -1;

-	if (!timestamp) {
-		/*
-		 * Report -1 if no AVX512 usage
-		 */
-		delta = -1;
-	} else {
+	/* AVX-512 usage is not tracked for kernel threads. Don't report anything. */
+	if (task->flags & (PF_KTHREAD | PF_USER_WORKER))
+		return;
+
+	timestamp = READ_ONCE(x86_task_fpu(task)->avx512_timestamp);
+
+	if (timestamp) {
 		delta = (long)(jiffies - timestamp);
 		/*
 		 * Cap to LONG_MAX if time difference > LONG_MAX
--
2.43.0