[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <9d02685f-5c19-47d4-8f7f-bb546c0c7504@intel.com>
Date: Mon, 21 Jul 2025 15:33:30 -0700
From: Dave Hansen <dave.hansen@...el.com>
To: Sohil Mehta <sohil.mehta@...el.com>,
Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org
Cc: Borislav Petkov <bp@...en8.de>, "H . Peter Anvin" <hpa@...or.com>,
Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>,
Sean Christopherson <seanjc@...gle.com>,
Peter Zijlstra <peterz@...radead.org>,
Vignesh Balasubramanian <vigbalas@....com>,
Rick Edgecombe <rick.p.edgecombe@...el.com>, Oleg Nesterov
<oleg@...hat.com>, "Chang S . Bae" <chang.seok.bae@...el.com>,
Brian Gerst <brgerst@...il.com>, Eric Biggers <ebiggers@...gle.com>,
Kees Cook <kees@...nel.org>, Chao Gao <chao.gao@...el.com>,
Fushuai Wang <wangfushuai@...du.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 1/2] x86/fpu: Fix NULL dereference in avx512_status()
On 7/21/25 14:53, Sohil Mehta wrote:
> From: Fushuai Wang <wangfushuai@...du.com>
>
> When CONFIG_X86_DEBUG_FPU is set, reading /proc/[kthread]/arch_status
> causes a NULL pointer dereference.
>
> For Kthreads tasks:
> proc_pid_arch_status()
> avx512_status()
> x86_task_fpu() => returns NULL when CONFIG_X86_DEBUG_FPU=y
> x86_task_fpu()->avx512_timestamp => NULL dereference
This seems to imply that CONFIG_X86_DEBUG_FPU _triggers_ the bug.
It certainly makes it obvious, but isn't there a bug with or without
CONFIG_X86_DEBUG_FPU? Even without it, I think it'll access out of the
init_task bounds.
> Kernel threads aren't expected to access FPU state directly. However,
> avx512_timestamp resides within struct fpu which lead to this unique
> situation.
What does this mean? Most kernel threads have a 'struct fpu', right?
> It is uncertain whether kernel threads use AVX-512 in a meaningful way
> that needs userspace reporting. For now, avoid reporting AVX-512 usage
> for kernel threads.
It would be idea if this was more explicit about how this changes the
ABI for kernel threads.
Could we make this more precise, please?
Report "AVX512_elapsed_ms: -1" for kernel tasks, whether they
use AVX-512 or not. This is the same value that is reported for
user tasks which have not been detected using AVX-512.
> diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
> index 9aa9ac8399ae..a75077c645b6 100644
> --- a/arch/x86/kernel/fpu/xstate.c
> +++ b/arch/x86/kernel/fpu/xstate.c
> @@ -1855,19 +1855,18 @@ long fpu_xstate_prctl(int option, unsigned long arg2)
> #ifdef CONFIG_PROC_PID_ARCH_STATUS
> /*
> * Report the amount of time elapsed in millisecond since last AVX512
> - * use in the task.
> + * use in the task. Report -1 if no AVX-512 usage.
> */
> static void avx512_status(struct seq_file *m, struct task_struct *task)
> {
> - unsigned long timestamp = READ_ONCE(x86_task_fpu(task)->avx512_timestamp);
> - long delta;
> + unsigned long timestamp = 0;
> + long delta = -1;
>
> - if (!timestamp) {
> - /*
> - * Report -1 if no AVX512 usage
> - */
> - delta = -1;
> - } else {
> + /* Do not report AVX-512 usage for kernel threads */
> + if (!(task->flags & (PF_KTHREAD | PF_USER_WORKER)))
> + timestamp = READ_ONCE(x86_task_fpu(task)->avx512_timestamp);
> +
> + if (timestamp) {
> delta = (long)(jiffies - timestamp);
> /*
> * Cap to LONG_MAX if time difference > LONG_MAX
Can we just do:
unsigned long timestamp;
long delta;
if (task->flags & (PF_KTHREAD | PF_USER_WORKER))
return;
timestamp = READ_ONCE(x86_task_fpu(task)->avx512_timestamp);
...
for now, please? That way, there's no made up value for kernel threads.
The value just isn't present in the file.
Powered by blists - more mailing lists