Message-ID: <20160217081646.GA32354@gmail.com>
Date:	Wed, 17 Feb 2016 09:16:46 +0100
From:	Ingo Molnar <mingo@...nel.org>
To:	Andy Lutomirski <luto@...capital.net>
Cc:	Borislav Petkov <bp@...en8.de>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	X86 ML <x86@...nel.org>
Subject: Re: WARNING: CPU: 0 PID: 3031 at
 ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130()


* Andy Lutomirski <luto@...capital.net> wrote:

> On Feb 15, 2016 12:14 PM, "Borislav Petkov" <bp@...en8.de> wrote:
> >
> > ---
> > From: Borislav Petkov <bp@...e.de>
> > Date: Mon, 15 Feb 2016 19:50:33 +0100
> > Subject: [RFC PATCH] x86/FPU: Fix double FPU regs activation
> >
> > On the entry_INT80_32->do_syscall_32_irqs_on path on 32-bit we run with
> > interrupts enabled.
> 
> I would change this a little bit.
> 
> sys_sigreturn calls fpu__restore_sig with interrupts enabled.  When
> restoring a 32-bit signal frame, it can happen that...
> 
> > And it can happen that we get preempted right after
> > setting ->fpstate_active in a task's FPU.
> >
> > After we get preempted, we switch between tasks merrily and eventually
> > are about to switch to that task above whose ->fpstate_active we
> > set. We enter __switch_to() and do switch_fpu_prepare(). Our task gets
> > ->fpregs_active set, we find ourselves back on the call stack below and
> > especially in __fpu__restore_sig() which sets ->fpregs_active again.
> >
> > Leading to that whoops below.
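
The sequence described in the quoted changelog can be sketched as a small user-space model. This is only an illustration, not kernel code: the two flag names come from the thread, while struct toy_fpu, the toy_* helpers, the warnings counter, and the rule that a switch-in activates the registers whenever ->fpstate_active is set are assumptions made for the sketch.

```c
#include <assert.h>
#include <stdio.h>

/* Toy model of the two flags named in the thread; not the kernel's struct fpu. */
struct toy_fpu {
	int fpstate_active;	/* task has a valid FPU state */
	int fpregs_active;	/* that state is live in the registers */
	int warnings;		/* hypothetical stand-in for the WARN splat */
};

/* Activate the registers; count a warning if they were already active. */
static void toy_fpregs_activate(struct toy_fpu *fpu)
{
	if (fpu->fpregs_active) {
		fpu->warnings++;
		fprintf(stderr, "WARNING: double fpregs activation\n");
	}
	fpu->fpregs_active = 1;
}

/* Assumed behaviour of the context switch back to the task. */
static void toy_switch_in(struct toy_fpu *fpu)
{
	if (fpu->fpstate_active)
		toy_fpregs_activate(fpu);
}

/*
 * Model of the signal-frame restore path. A non-zero 'preempt' injects a
 * preemption right after ->fpstate_active is set, which is the window the
 * changelog describes. Returns the number of warnings seen.
 */
static int toy_restore_sig(int preempt)
{
	struct toy_fpu fpu = { 0, 0, 0 };

	fpu.fpstate_active = 1;
	if (preempt)
		toy_switch_in(&fpu);	/* scheduled out and back in */
	toy_fpregs_activate(&fpu);	/* second activation if preempted */

	assert(fpu.fpregs_active);
	return fpu.warnings;
}
```

In this model, toy_restore_sig(0) completes without a warning, while toy_restore_sig(1) activates the registers twice and counts one warning. Closing the window (for example by keeping preemption off between the two steps) would be one way to avoid the second activation, though the sketch does not claim that is what the RFC patch does.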

So I'm wondering why this started triggering only now. Is this a pre-existing bug 
that somehow got triggered via:

  58122bf1d856 x86/fpu: Default eagerfpu=on on all CPUs

? If yes, then we need a plausible theory of why it never triggered on modern 
Intel CPUs, which have had eagerfpu enabled for years.

Or was it perhaps caused by one of the other changes in tip:x86/fpu:

  c6ab109f7e0e x86/fpu: Speed up lazy FPU restores slightly
  a20d7297045f x86/fpu: Fold fpu_copy() into fpu__copy()
  5ed73f40735c x86/fpu: Fix FNSAVE usage in eagerfpu mode
  4ecd16ec7059 x86/fpu: Fix math emulation in eager fpu mode

?

Which would make this a recently introduced regression.

Thanks,

	Ingo