Message-ID: <20181109185202.GF21243@zn.tnic>
Date:   Fri, 9 Nov 2018 19:52:02 +0100
From:   Borislav Petkov <bp@...en8.de>
To:     Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        Ingo Molnar <mingo@...nel.org>
Cc:     linux-kernel@...r.kernel.org, x86@...nel.org,
        Andy Lutomirski <luto@...nel.org>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Radim Krčmář <rkrcmar@...hat.com>,
        kvm@...r.kernel.org, "Jason A. Donenfeld" <Jason@...c4.com>,
        Rik van Riel <riel@...riel.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>
Subject: Re: [PATCH 02/23] x86/fpu: Remove fpu->initialized usage in
 __fpu__restore_sig()

On Fri, Nov 09, 2018 at 06:35:21PM +0100, Sebastian Andrzej Siewior wrote:
> fpu__drop() sets ->initialized to 0. As a result the context switch

"... the context switch path landing in switch_fpu_prepare()... " is what you
mean, right?

> will not save current FPU registers and so _not_ write to fpu->state.
> This also means that CPU's FPU register will be random (inherited from
> the last context)

You mean, the FPU regs will have random values, yes.

> after the context switch. This is also true for usage
> in softirq via kernel_fpu_begin().

So far so good.

Except, maybe because I'm dense about FPU, I'm still missing something.

You have this path:

__fpu__restore_sig
|-> fpu__clear
 |-> fpu__drop

and that happens on the sigreturn() path.

Now, the context switch happens ... when exactly?

After the sigreturn is done?

It must be because then you'd get that ->state corruption after
->initialized has been cleared.

Right?
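
To make the suspected window concrete, here is a toy user-space model. It
is only a sketch under assumptions, not kernel code: every toy_* name is
hypothetical, a single int stands in for the FPU registers and another for
fpu->state, and it assumes the window sits between setting ->initialized
back to 1 and reloading the registers.

```c
/* Toy model only: all names here are hypothetical, none are kernel APIs. */
struct toy_fpu {
	int initialized;	/* stands in for fpu->initialized */
	int state;		/* stands in for the saved fpu->state */
};

static int toy_cpu_regs;	/* stands in for the CPU's FPU registers */

/* Switching away: save the registers only if the FPU state is "live". */
static void toy_switch_out(struct toy_fpu *fpu)
{
	if (fpu->initialized)
		fpu->state = toy_cpu_regs;
}

/* Switching back: reload the registers only if the FPU state is "live". */
static void toy_switch_in(struct toy_fpu *fpu)
{
	if (fpu->initialized)
		toy_cpu_regs = fpu->state;
}

/* Some other task runs in between and leaves its own values behind. */
static void toy_other_task_runs(void)
{
	toy_cpu_regs = 0xdead;
}

/*
 * Model of the sigreturn restore. Returns what the registers hold at the
 * end. If switch_in_window is nonzero, a context switch is simulated
 * after ->initialized is set but before the registers are reloaded.
 */
static int toy_restore_sig(struct toy_fpu *fpu, int user_value,
			   int switch_in_window)
{
	fpu->initialized = 0;		/* the fpu__drop() step */
	fpu->state = user_value;	/* state copied from the signal frame */
	fpu->initialized = 1;

	if (switch_in_window) {
		toy_switch_out(fpu);	/* stale registers overwrite ->state */
		toy_other_task_runs();
		toy_switch_in(fpu);
	}

	toy_cpu_regs = fpu->state;	/* the fpu__restore() step */
	return toy_cpu_regs;
}
```

With toy_cpu_regs preset to a stale value such as 7, calling
toy_restore_sig(&fpu, 42, 0) yields 42, while toy_restore_sig(&fpu, 42, 1)
yields 7: the value from the signal frame is lost because the simulated
switch saved the stale registers over ->state.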

<snip a bunch of stuff, we'll get back to it later>

> So. The fix would be:
> @@ -344,10 +344,10 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
>                         sanitize_restored_xstate(tsk, &env, xfeatures, fx_only);
>                 }
>  
> +               local_bh_disable();
>                 fpu->initialized = 1;
> -               preempt_disable();
>                 fpu__restore(fpu);
> -               preempt_enable();
> +               local_bh_enable();
>  
>                 return err;
>         } else {
> 
> local_bh_disable() due to possible kernel_fpu_begin() usage in softirq.
> How much do we care here about a theoretical race on 32bit anyway? I
> don't think someone complained :) I would have to rebase my queue…
> otherwise…

Funny you should mention that.

But this very much rings a bell about a very elusive bug we had on
32-bit at the time. See attached mbox (yeah, the web archives were crap
and I couldn't find the links so I'm sending you the whole thread).

And at the time Ingo said that there's something still missing about
*why* it would happen.

And I think it is this context switch happening right after the
sigreturn - *AFAICT* - which would cause this.

I could very well be off but this smells very similar to your thing.

Hmmm.

-- 
Regards/Gruss,
    Boris.


Download attachment "fpu.mbox" of type "application/mbox" (145503 bytes)
