Message-ID: <20190124133449.GC11554@zn.tnic>
Date:   Thu, 24 Jan 2019 14:34:49 +0100
From:   Borislav Petkov <bp@...en8.de>
To:     Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc:     linux-kernel@...r.kernel.org, x86@...nel.org,
        Andy Lutomirski <luto@...nel.org>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Radim Krčmář <rkrcmar@...hat.com>,
        kvm@...r.kernel.org, "Jason A. Donenfeld" <Jason@...c4.com>,
        Rik van Riel <riel@...riel.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>
Subject: Re: [PATCH 07/22] x86/fpu: Remove fpu->initialized

On Wed, Jan 09, 2019 at 12:47:29PM +0100, Sebastian Andrzej Siewior wrote:
> The `initialized' member of the fpu struct is always set to one user
								 ^
								 for

> tasks and zero for kernel tasks. This avoids saving/restoring the FPU
> registers for kernel threads.
> 
> I expect that fpu->initialized is always 1 and the 0 case has been

"I expect" reads funny. Make that impartial and passive and "expecting"
is the wrong formulation. It needs to be a clear statement talking about
the FPU context's state and why that state is always initialized.

> removed or is not important. For instance fpu__drop() sets the value to
> zero and its caller call either fpu__initialize() (which would

"its callers invoke" or so

> set it back to one) or don't return to userland.
> 
> The context switch code (switch_fpu_prepare() + switch_fpu_finish())
> can't unconditionally save/restore registers for kernel threads. I have
> no idea what will happen if we restore a zero FPU context for the kernel
> thread (since it never was initialized).

Yeah, avoid those "author is wondering" statements.

> Also it has been agreed that
> for PKRU we don't want a random state (inherited from the previous task)
> but a deterministic one.

Rewrite that to state what the PKRU state is going to be.

> For kernel_fpu_begin() (+end) the situation is similar: The kernel test
> bot told me, that EFI with runtime services uses this before
> alternatives_patched is true. Which means that this function is used too
> early and it wasn't the case before.
> 
> For those two cases current->mm is used to determine between user &
> kernel thread.

Now that we start looking at ->mm, I think we should document this
somewhere prominently, maybe

  arch/x86/include/asm/fpu/internal.h

or so along with all the logic this patchset changes wrt FPU handling.
Then we wouldn't have to wonder in the future why stuff is being done
the way it is done.

Like the FPU saving on the user stack frame or why this was needed:

-	/* Update the thread's fxstate to save the fsave header. */
-	if (ia32_fxstate)
-		copy_fxregs_to_kernel(fpu);

Having the high-level invariants written down somewhere would save us a
lot of head scratching in the future.
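
Something along these lines, as a rough sketch only (the helper name is
made up, adjust to taste):

	/* arch/x86/include/asm/fpu/internal.h */

	/*
	 * FPU handling invariants:
	 *
	 *  - Kernel threads have no userspace mm, so current->mm == NULL
	 *    is the test for "kernel thread, don't save/restore FPU
	 *    registers".
	 *  - User tasks (current->mm != NULL) get their FPU registers
	 *    saved/restored across context switches.
	 */
	static inline bool fpu_task_is_user(void)
	{
		return current->mm != NULL;
	}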

> For kernel_fpu_begin() we skip save/restore of the FPU
> registers.
> During the context switch into a kernel thread we don't do anything.
> There is no reason to save the FPU state of a kernel thread.
> 
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
> ---
>  arch/x86/ia32/ia32_signal.c         | 17 +++-----
>  arch/x86/include/asm/fpu/internal.h | 15 +++----
>  arch/x86/include/asm/fpu/types.h    |  9 ----
>  arch/x86/include/asm/trace/fpu.h    |  5 +--
>  arch/x86/kernel/fpu/core.c          | 68 ++++++++---------------------
>  arch/x86/kernel/fpu/init.c          |  2 -
>  arch/x86/kernel/fpu/regset.c        | 19 ++------
>  arch/x86/kernel/fpu/xstate.c        |  2 -
>  arch/x86/kernel/process_32.c        |  4 +-
>  arch/x86/kernel/process_64.c        |  4 +-
>  arch/x86/kernel/signal.c            | 17 +++-----
>  arch/x86/mm/pkeys.c                 |  7 +--
>  12 files changed, 49 insertions(+), 120 deletions(-)

...

> diff --git a/arch/x86/include/asm/trace/fpu.h b/arch/x86/include/asm/trace/fpu.h
> index 069c04be15076..bd65f6ba950f8 100644
> --- a/arch/x86/include/asm/trace/fpu.h
> +++ b/arch/x86/include/asm/trace/fpu.h
> @@ -13,22 +13,19 @@ DECLARE_EVENT_CLASS(x86_fpu,
>  
>  	TP_STRUCT__entry(
>  		__field(struct fpu *, fpu)
> -		__field(bool, initialized)
>  		__field(u64, xfeatures)
>  		__field(u64, xcomp_bv)
>  		),

Yikes, can you do that?

rostedt has been preaching that adding members at the end of tracepoints
is ok but not changing them in the middle as that breaks ABI.

Might wanna ping him about it first.
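
IOW, if the member had to stay, or a new one gets added later, keeping
it at the end would leave the existing field offsets intact. Rough
illustration only:

	TP_STRUCT__entry(
		__field(struct fpu *, fpu)
		__field(u64, xfeatures)
		__field(u64, xcomp_bv)
		/* new members go here, at the end, not in the middle */
		__field(bool, initialized)
		),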

>  
>  	TP_fast_assign(
>  		__entry->fpu		= fpu;
> -		__entry->initialized	= fpu->initialized;
>  		if (boot_cpu_has(X86_FEATURE_OSXSAVE)) {
>  			__entry->xfeatures = fpu->state.xsave.header.xfeatures;
>  			__entry->xcomp_bv  = fpu->state.xsave.header.xcomp_bv;
>  		}
>  	),
> -	TP_printk("x86/fpu: %p initialized: %d xfeatures: %llx xcomp_bv: %llx",
> +	TP_printk("x86/fpu: %p xfeatures: %llx xcomp_bv: %llx",
>  			__entry->fpu,
> -			__entry->initialized,
>  			__entry->xfeatures,
>  			__entry->xcomp_bv
>  	)
> diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
> index e43296854e379..3a4668c9d24f1 100644
> --- a/arch/x86/kernel/fpu/core.c
> +++ b/arch/x86/kernel/fpu/core.c
> @@ -101,7 +101,7 @@ static void __kernel_fpu_begin(void)
>  
>  	kernel_fpu_disable();
>  
> -	if (fpu->initialized) {
> +	if (current->mm) {
>  		/*
>  		 * Ignore return value -- we don't care if reg state
>  		 * is clobbered.
> @@ -116,7 +116,7 @@ static void __kernel_fpu_end(void)
>  {
>  	struct fpu *fpu = &current->thread.fpu;
>  
> -	if (fpu->initialized)
> +	if (current->mm)
>  		copy_kernel_to_fpregs(&fpu->state);
>  
>  	kernel_fpu_enable();
> @@ -147,10 +147,9 @@ void fpu__save(struct fpu *fpu)
>  
>  	preempt_disable();
>  	trace_x86_fpu_before_save(fpu);
> -	if (fpu->initialized) {
> -		if (!copy_fpregs_to_fpstate(fpu)) {
> -			copy_kernel_to_fpregs(&fpu->state);
> -		}
> +
> +	if (!copy_fpregs_to_fpstate(fpu)) {
> +		copy_kernel_to_fpregs(&fpu->state);
>  	}

WARNING: braces {} are not necessary for single statement blocks
#217: FILE: arch/x86/kernel/fpu/core.c:151:
+       if (!copy_fpregs_to_fpstate(fpu)) {
+               copy_kernel_to_fpregs(&fpu->state);
        }
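
IOW, drop the braces:

	if (!copy_fpregs_to_fpstate(fpu))
		copy_kernel_to_fpregs(&fpu->state);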


...

> diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
> index 7888a41a03cdb..77d9eb43ccac8 100644
> --- a/arch/x86/kernel/process_32.c
> +++ b/arch/x86/kernel/process_32.c
> @@ -288,10 +288,10 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
>  	if (prev->gs | next->gs)
>  		lazy_load_gs(next->gs);
>  
> -	switch_fpu_finish(next_fpu, cpu);
> -
>  	this_cpu_write(current_task, next_p);
>  
> +	switch_fpu_finish(next_fpu, cpu);
> +
>  	/* Load the Intel cache allocation PQR MSR. */
>  	resctrl_sched_in();
>  
> diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
> index e1983b3a16c43..ffea7c557963a 100644
> --- a/arch/x86/kernel/process_64.c
> +++ b/arch/x86/kernel/process_64.c
> @@ -566,14 +566,14 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
>  
>  	x86_fsgsbase_load(prev, next);
>  
> -	switch_fpu_finish(next_fpu, cpu);
> -
>  	/*
>  	 * Switch the PDA and FPU contexts.
>  	 */
>  	this_cpu_write(current_task, next_p);
>  	this_cpu_write(cpu_current_top_of_stack, task_top_of_stack(next_p));
>  
> +	switch_fpu_finish(next_fpu, cpu);
> +
>  	/* Reload sp0. */
>  	update_task_stack(next_p);
>  

Those moves need at least a comment in the commit message or a separate
patch.
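
Even a short comment above the call would do; the actual rationale
needs to come from you. Placeholder sketch:

	this_cpu_write(current_task, next_p);

	/*
	 * <why switch_fpu_finish() must run after current_task has been
	 *  updated goes here>
	 */
	switch_fpu_finish(next_fpu, cpu);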

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.
