Message-ID: <CALCETrVU8RvcDAUPfwoW9FVvgyn3z-5R86+4-mXtubpTd4YiKg@mail.gmail.com>
Date:	Thu, 11 Feb 2016 17:16:00 -0800
From:	Andy Lutomirski <luto@...capital.net>
To:	Borislav Petkov <bp@...en8.de>
Cc:	x86-ml <x86@...nel.org>, lkml <linux-kernel@...r.kernel.org>
Subject: Re: WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530
 fpu__restore+0x90/0x130()

On Thu, Feb 11, 2016 at 3:47 PM, Andy Lutomirski <luto@...capital.net> wrote:
> On Thu, Feb 11, 2016 at 11:27 AM, Borislav Petkov <bp@...en8.de> wrote:
>> Hey Andy,
>>
>> can you make any sense of this:
>>
>> [   90.573923] ------------[ cut here ]------------
>> [   90.574977] WARNING: CPU: 0 PID: 3031 at ./arch/x86/include/asm/fpu/internal.h:530 fpu__restore+0x90/0x130()
>> [   90.576108] Modules linked in: hid_generic usbhid hid snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic iTCO_wdt iTCO_vendor_support arc4 x86_pkg_temp_thermal coretemp kvm_intel kvm irqbypass crc32_pclmul crc32c_intel iwldvm mac80211 aesni_intel xts snd_hda_intel input_leds aes_i586 snd_hda_codec sdhci_pci lrw iwlwifi snd_hwdep gf128mul snd_hda_core ablk_helper cryptd ehci_pci pcspkr serio_raw xhci_pci sdhci snd_pcm sg mmc_core i2c_i801 cfg80211 lpc_ich mfd_core e1000e snd_timer ehci_hcd xhci_hcd thinkpad_acpi nvram wmi snd battery soundcore led_class ac thermal
>> [   90.580570] CPU: 0 PID: 3031 Comm: bash Not tainted 4.5.0-rc3+ #1
>> [   90.581380] Hardware name: LENOVO 2320CTO/2320CTO, BIOS G2ET86WW (2.06 ) 11/13/2012
>> [   90.582325]  00000000 00000286 f158be4c c12cce56 00000000 00000000 f158be80 c10567fb
>> [   90.583179]  c1866c2c 00000000 00000bd7 c1859e8c 00000212 c1025ab0 00000212 c1025ab0
>> [   90.584142]  f2012b00 f2011f00 f2012d80 f158be90 c10568d2 00000009 00000000 f158bea4
>> [   90.585002] Call Trace:
>> [   90.585854]  [<c12cce56>] dump_stack+0x5f/0x89
>> [   90.586703]  [<c10567fb>] warn_slowpath_common+0x8b/0xc0
>> [   90.587559]  [<c1025ab0>] ? fpu__restore+0x90/0x130
>> [   90.588520]  [<c1025ab0>] ? fpu__restore+0x90/0x130
>> [   90.589353]  [<c10568d2>] warn_slowpath_null+0x22/0x30
>> [   90.590175]  [<c1025ab0>] fpu__restore+0x90/0x130
>> [   90.590993]  [<c1027098>] __fpu__restore_sig+0x268/0x4c0
>> [   90.591816]  [<c102751f>] fpu__restore_sig+0x2f/0x50
>> [   90.592636]  [<c101a6c9>] restore_sigcontext+0xe9/0x110
>> [   90.593449]  [<c101af3c>] sys_sigreturn+0x9c/0xb0
>> [   90.594263]  [<c1001bd9>] do_syscall_32_irqs_on+0x59/0x340
>> [   90.595079]  [<c169979d>] entry_INT80_32+0x31/0x31
>> [   90.595922] ---[ end trace be617bef2982f473 ]---
>>
>> This is rc3 + latest tip/master and it happened when I did "make
>> mrproper" in the kernel repo.
>>
>> From a quick stare, it looks to me we're running do_syscall_32_irqs_on()
>> with IRQs on, sys_sigreturn() does current_pt_regs() but
>> __fpu__restore_sig() derefs current again and could be that that second
>> "current" is another current which already has ->fpregs_active set ?
>>
>> FPU + signal handling code in a single backtrace. My favourite!
>>
>> :-\
>
> Ugh.
>
> Can you send all the fpu info that the kernel prints really early when it boots?
>

Are you running 32-bit userspace by any chance?  I'm guessing you're
hitting this in __fpu__restore_sig():

        fpu__drop(fpu);
        if (__copy_from_user(&fpu->state.xsave, buf_fx, state_size) ||
            __copy_from_user(&env, buf, sizeof(env))) {
            fpstate_init(&fpu->state);
            err = -1;
        } else {
            sanitize_restored_xstate(tsk, &env, xfeatures, fx_only);
        }

        fpu->fpstate_active = 1;

<-- preempted right here

        if (use_eager_fpu()) {
            preempt_disable();
            fpu__restore(fpu);
            preempt_enable();
        }

I don't see why this code deserves to work.  If I'm right, it can be
fixed by pulling the preempt_disable out of the if (use_eager_fpu())
to right above the fpstate_active = 1 line.  Don't bother trying to
optimize the !use_eager_fpu() case.

Once someone gets around to eagerly *allocating* the FPU context and
dropping CR0.TS usage entirely, then even that won't be enough unless
we do my suggestion of deferring FPU restore to
prepare_exit_to_usermode.  (Doing that will make all of this much,
much more sane.)


--Andy