[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190124133449.GC11554@zn.tnic>
Date: Thu, 24 Jan 2019 14:34:49 +0100
From: Borislav Petkov <bp@...en8.de>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: linux-kernel@...r.kernel.org, x86@...nel.org,
Andy Lutomirski <luto@...nel.org>,
Paolo Bonzini <pbonzini@...hat.com>,
Radim Krčmář <rkrcmar@...hat.com>,
kvm@...r.kernel.org, "Jason A. Donenfeld" <Jason@...c4.com>,
Rik van Riel <riel@...riel.com>,
Dave Hansen <dave.hansen@...ux.intel.com>
Subject: Re: [PATCH 07/22] x86/fpu: Remove fpu->initialized
On Wed, Jan 09, 2019 at 12:47:29PM +0100, Sebastian Andrzej Siewior wrote:
> The `initialized' member of the fpu struct is always set to one user
^
for
> tasks and zero for kernel tasks. This avoids saving/restoring the FPU
> registers for kernel threads.
>
> I expect that fpu->initialized is always 1 and the 0 case has been
"I expect" reads funny. Make that impartial and passive and "expecting"
is the wrong formulation. It needs to be a clear statement talking about
the FPU context's state and why that state is always initialized.
> removed or is not important. For instance fpu__drop() sets the value to
> zero and its caller call either fpu__initialize() (which would
"its callers invoke" or so
> set it back to one) or don't return to userland.
>
> The context switch code (switch_fpu_prepare() + switch_fpu_finish())
> can't unconditionally save/restore registers for kernel threads. I have
> no idea what will happen if we restore a zero FPU context for the kernel
> thread (since it never was initialized).
Yeah, avoid those "author is wondering" statements.
> Also it has been agreed that
> for PKRU we don't want a random state (inherited from the previous task)
> but a deterministic one.
Rewrite that to state what the PKRU state is going to be.
> For kernel_fpu_begin() (+end) the situation is similar: The kernel test
> bot told me, that EFI with runtime services uses this before
> alternatives_patched is true. Which means that this function is used too
> early and it wasn't the case before.
>
> For those two cases current->mm is used to determine between user &
> kernel thread.
Now that we start looking at ->mm, I think we should document this
somewhere prominently, maybe
arch/x86/include/asm/fpu/internal.h
or so along with all the logic this patchset changes wrt FPU handling.
Then we wouldn't have to wonder in the future why stuff is being done
the way it is done.
Like the FPU saving on the user stack frame or why this was needed:
- /* Update the thread's fxstate to save the fsave header. */
- if (ia32_fxstate)
- copy_fxregs_to_kernel(fpu);
Some sort of a high-level invariants written down would save us a lot of
head scratching in the future.
> For kernel_fpu_begin() we skip save/restore of the FPU
> registers.
> During the context switch into a kernel thread we don't do anything.
> There is no reason to save the FPU state of a kernel thread.
>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
> ---
> arch/x86/ia32/ia32_signal.c | 17 +++-----
> arch/x86/include/asm/fpu/internal.h | 15 +++----
> arch/x86/include/asm/fpu/types.h | 9 ----
> arch/x86/include/asm/trace/fpu.h | 5 +--
> arch/x86/kernel/fpu/core.c | 68 ++++++++---------------------
> arch/x86/kernel/fpu/init.c | 2 -
> arch/x86/kernel/fpu/regset.c | 19 ++------
> arch/x86/kernel/fpu/xstate.c | 2 -
> arch/x86/kernel/process_32.c | 4 +-
> arch/x86/kernel/process_64.c | 4 +-
> arch/x86/kernel/signal.c | 17 +++-----
> arch/x86/mm/pkeys.c | 7 +--
> 12 files changed, 49 insertions(+), 120 deletions(-)
...
> diff --git a/arch/x86/include/asm/trace/fpu.h b/arch/x86/include/asm/trace/fpu.h
> index 069c04be15076..bd65f6ba950f8 100644
> --- a/arch/x86/include/asm/trace/fpu.h
> +++ b/arch/x86/include/asm/trace/fpu.h
> @@ -13,22 +13,19 @@ DECLARE_EVENT_CLASS(x86_fpu,
>
> TP_STRUCT__entry(
> __field(struct fpu *, fpu)
> - __field(bool, initialized)
> __field(u64, xfeatures)
> __field(u64, xcomp_bv)
> ),
Yikes, can you do that?
rostedt has been preaching that adding members at the end of tracepoints
is ok but not changing them in the middle as that breaks ABI.
Might wanna ping him about it first.
>
> TP_fast_assign(
> __entry->fpu = fpu;
> - __entry->initialized = fpu->initialized;
> if (boot_cpu_has(X86_FEATURE_OSXSAVE)) {
> __entry->xfeatures = fpu->state.xsave.header.xfeatures;
> __entry->xcomp_bv = fpu->state.xsave.header.xcomp_bv;
> }
> ),
> - TP_printk("x86/fpu: %p initialized: %d xfeatures: %llx xcomp_bv: %llx",
> + TP_printk("x86/fpu: %p xfeatures: %llx xcomp_bv: %llx",
> __entry->fpu,
> - __entry->initialized,
> __entry->xfeatures,
> __entry->xcomp_bv
> )
> diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
> index e43296854e379..3a4668c9d24f1 100644
> --- a/arch/x86/kernel/fpu/core.c
> +++ b/arch/x86/kernel/fpu/core.c
> @@ -101,7 +101,7 @@ static void __kernel_fpu_begin(void)
>
> kernel_fpu_disable();
>
> - if (fpu->initialized) {
> + if (current->mm) {
> /*
> * Ignore return value -- we don't care if reg state
> * is clobbered.
> @@ -116,7 +116,7 @@ static void __kernel_fpu_end(void)
> {
> struct fpu *fpu = ¤t->thread.fpu;
>
> - if (fpu->initialized)
> + if (current->mm)
> copy_kernel_to_fpregs(&fpu->state);
>
> kernel_fpu_enable();
> @@ -147,10 +147,9 @@ void fpu__save(struct fpu *fpu)
>
> preempt_disable();
> trace_x86_fpu_before_save(fpu);
> - if (fpu->initialized) {
> - if (!copy_fpregs_to_fpstate(fpu)) {
> - copy_kernel_to_fpregs(&fpu->state);
> - }
> +
> + if (!copy_fpregs_to_fpstate(fpu)) {
> + copy_kernel_to_fpregs(&fpu->state);
> }
WARNING: braces {} are not necessary for single statement blocks
#217: FILE: arch/x86/kernel/fpu/core.c:151:
+ if (!copy_fpregs_to_fpstate(fpu)) {
+ copy_kernel_to_fpregs(&fpu->state);
}
...
> diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
> index 7888a41a03cdb..77d9eb43ccac8 100644
> --- a/arch/x86/kernel/process_32.c
> +++ b/arch/x86/kernel/process_32.c
> @@ -288,10 +288,10 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
> if (prev->gs | next->gs)
> lazy_load_gs(next->gs);
>
> - switch_fpu_finish(next_fpu, cpu);
> -
> this_cpu_write(current_task, next_p);
>
> + switch_fpu_finish(next_fpu, cpu);
> +
> /* Load the Intel cache allocation PQR MSR. */
> resctrl_sched_in();
>
> diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
> index e1983b3a16c43..ffea7c557963a 100644
> --- a/arch/x86/kernel/process_64.c
> +++ b/arch/x86/kernel/process_64.c
> @@ -566,14 +566,14 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
>
> x86_fsgsbase_load(prev, next);
>
> - switch_fpu_finish(next_fpu, cpu);
> -
> /*
> * Switch the PDA and FPU contexts.
> */
> this_cpu_write(current_task, next_p);
> this_cpu_write(cpu_current_top_of_stack, task_top_of_stack(next_p));
>
> + switch_fpu_finish(next_fpu, cpu);
> +
> /* Reload sp0. */
> update_task_stack(next_p);
>
Those moves need at least a comment in the commit message or a separate
patch.
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
Powered by blists - more mailing lists