Message-ID: <6197fd94-76a9-a391-f290-7001a71add7f@kernel.org>
Date: Sun, 23 May 2021 20:25:35 -0700
From: Andy Lutomirski <luto@...nel.org>
To: "Chang S. Bae" <chang.seok.bae@...el.com>, bp@...e.de,
tglx@...utronix.de, mingo@...nel.org, x86@...nel.org
Cc: len.brown@...el.com, dave.hansen@...el.com, jing2.liu@...el.com,
ravi.v.shankar@...el.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v5 25/28] x86/fpu/xstate: Skip writing zeros to signal
frame for dynamic user states if in INIT-state
On 5/23/21 12:32 PM, Chang S. Bae wrote:
> By default, for xstate features in the INIT-state, XSAVE writes zeros to
> the uncompressed destination buffer.
>
> E.g., if you are not using AVX-512, you will still get a bunch of zeros on
> the signal stack where live AVX-512 data would go.
>
> For 'dynamic user state' (currently only XTILEDATA), explicitly skip this
> data transfer. The result is that the user buffer for the AMX region will
> not be touched by XSAVE.
Why?
>
> Signed-off-by: Chang S. Bae <chang.seok.bae@...el.com>
> Reviewed-by: Len Brown <len.brown@...el.com>
> Cc: x86@...nel.org
> Cc: linux-kernel@...r.kernel.org
> ---
> Changes from v4:
> * Added as a new patch.
> ---
> arch/x86/include/asm/fpu/internal.h | 22 +++++++++++++++++++---
> 1 file changed, 19 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
> index 4a3436684805..131f2557fc85 100644
> --- a/arch/x86/include/asm/fpu/internal.h
> +++ b/arch/x86/include/asm/fpu/internal.h
> @@ -354,11 +354,27 @@ static inline void copy_kernel_to_xregs(struct xregs_state *xstate, u64 mask)
> */
> static inline int copy_xregs_to_user(struct xregs_state __user *buf)
> {
> - u64 mask = current->thread.fpu.state_mask;
> - u32 lmask = mask;
> - u32 hmask = mask >> 32;
> + u64 state_mask = current->thread.fpu.state_mask;
> + u64 dynamic_state_mask;
> + u32 lmask, hmask;
> int err;
>
> + dynamic_state_mask = state_mask & xfeatures_mask_user_dynamic;
> + if (dynamic_state_mask && boot_cpu_has(X86_FEATURE_XGETBV1)) {
> + u64 dynamic_xinuse, dynamic_init;
> + u64 xinuse = xgetbv(1);
> +
> + dynamic_xinuse = xinuse & dynamic_state_mask;
> + dynamic_init = ~(xinuse) & dynamic_state_mask;
> + if (dynamic_init) {
> + state_mask &= ~xfeatures_mask_user_dynamic;
> + state_mask |= dynamic_xinuse;
That's a long-winded way to say:
    state_mask &= ~dynamic_init;
But what happens if we don't have the XGETBV1 feature? Are we making
AMX support depend on XGETBV1?
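As a sanity check on the one-line form above, here is a minimal user-space sketch that compares it with the two-step update in the quoted hunk. The function names (mask_patch_form, mask_short_form, count_mismatches) and the `dynamic` parameter standing in for xfeatures_mask_user_dynamic are made up for illustration; this is not kernel code, and it leaves out the boot_cpu_has(X86_FEATURE_XGETBV1) guard and the xgetbv(1) read entirely.

```c
#include <stdint.h>

/*
 * Two-step update as written in the quoted patch.
 * 'dynamic' stands in for xfeatures_mask_user_dynamic and 'xinuse'
 * for the value returned by xgetbv(1); both are plain parameters here.
 */
uint64_t mask_patch_form(uint64_t state_mask, uint64_t dynamic, uint64_t xinuse)
{
	uint64_t dynamic_state_mask = state_mask & dynamic;
	uint64_t dynamic_xinuse = xinuse & dynamic_state_mask;
	uint64_t dynamic_init = ~xinuse & dynamic_state_mask;

	if (dynamic_init) {
		state_mask &= ~dynamic;
		state_mask |= dynamic_xinuse;
	}
	return state_mask;
}

/* One-line form: clear only the dynamic features that are in INIT state. */
uint64_t mask_short_form(uint64_t state_mask, uint64_t dynamic, uint64_t xinuse)
{
	uint64_t dynamic_init = ~xinuse & state_mask & dynamic;

	return state_mask & ~dynamic_init;
}

/*
 * Compare both forms over every combination of 8-bit inputs.
 * The operations are purely bitwise, so each bit position behaves
 * independently and a narrow width covers all per-bit cases.
 * Returns the number of input combinations where the results differ.
 */
unsigned long count_mismatches(void)
{
	unsigned long bad = 0;

	for (unsigned int s = 0; s < 256; s++)
		for (unsigned int d = 0; d < 256; d++)
			for (unsigned int x = 0; x < 256; x++)
				if (mask_patch_form(s, d, x) != mask_short_form(s, d, x))
					bad++;
	return bad;
}
```

count_mismatches() returning 0 means the two forms agree on every input tried, which matches the algebra: both reduce to (state_mask & ~dynamic) | (state_mask & dynamic & xinuse).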
How does this patch interact with "[PATCH v5 24/28] x86/fpu/xstate: Use
per-task xstate mask for saving xstate in signal frame"? They seem to
be trying to do something similar but not quite the same, and they seem to
be patching the same function. The result seems odd.
Finally, isn't part of the point that we need to avoid even *allocating*
space for non-AMX-using tasks? That would require writing out the
compacted format and/or fiddling with XCR0.