Message-ID: <20150304190651.GA5589@redhat.com>
Date:	Wed, 4 Mar 2015 20:06:51 +0100
From:	Oleg Nesterov <oleg@...hat.com>
To:	Dave Hansen <dave.hansen@...el.com>,
	Quentin Casasnovas <quentin.casasnovas@...cle.com>
Cc:	Andy Lutomirski <luto@...capital.net>,
	Borislav Petkov <bp@...e.de>, Ingo Molnar <mingo@...nel.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Pekka Riikonen <priikone@....fi>,
	Rik van Riel <riel@...hat.com>,
	Suresh Siddha <sbsiddha@...il.com>,
	LKML <linux-kernel@...r.kernel.org>,
	"Yu, Fenghua" <fenghua.yu@...el.com>
Subject: Re: Oops with tip/x86/fpu

Thanks. I'll try to investigate tomorrow.

Well, the kernel crashes because xrstor_state() is buggy; Quentin already
has a fix.

But the #GP itself still needs to be explained...
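
For reference while looking for the cause, here is a rough sketch of the XSAVE-header conditions under which XRSTORS raises #GP, as I read the Intel SDM. It is not kernel code and not exhaustive (for example, it skips the MXCSR reserved-bit check). The function name and the `enabled_mask` parameter (standing for XCR0 | IA32_XSS) are made up for illustration.

```python
import struct

XCOMP_BV_COMPACTED = 1 << 63  # XCOMP_BV bit 63 marks the compacted format


def xrstors_gp_reasons(xsave_area, base_addr, enabled_mask):
    """Return reasons XRSTORS would #GP on this buffer (empty list: none found).

    xsave_area   -- raw bytes of the XSAVE area (at least 576 bytes)
    base_addr    -- address the buffer would be restored from
    enabled_mask -- hypothetical stand-in for XCR0 | IA32_XSS
    """
    reasons = []
    if base_addr % 64:
        reasons.append("XSAVE area is not 64-byte aligned")

    # The 64-byte XSAVE header follows the 512-byte legacy region:
    # XSTATE_BV at byte 512, XCOMP_BV at byte 520, the rest reserved.
    xstate_bv, xcomp_bv = struct.unpack_from("<QQ", xsave_area, 512)
    reserved = xsave_area[512 + 16:512 + 64]

    if not xcomp_bv & XCOMP_BV_COMPACTED:
        reasons.append("XCOMP_BV[63] is clear (standard-format buffer)")
    if xcomp_bv & ~XCOMP_BV_COMPACTED & ~enabled_mask:
        reasons.append("XCOMP_BV sets a component that is not enabled")
    if xstate_bv & ~xcomp_bv:
        reasons.append("XSTATE_BV sets a bit that XCOMP_BV does not")
    if any(reserved):
        reasons.append("reserved header bytes 63:16 are not zero")
    return reasons


# Hypothetical buffer whose header was set up for the standard format.
buf = bytearray(576)
struct.pack_into("<QQ", buf, 512, 0x1f, 0)
for reason in xrstors_gp_reasons(bytes(buf), 0xffff88007f5f0000, 0x1f):
    print(reason)
```

The address used above is the RDI value from the oops, which is 64-byte aligned, so alignment alone would not explain the fault. The header contents in the example are invented; the real buffer would have to be dumped to know which condition, if any, applies.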

On 03/04, Dave Hansen wrote:
>
> I'm running a commit from the tip/x86/fpu branch: ae486033b98.  It's on
> a system which I normally boot with 'noxsaves'.  When I boot without
> 'noxsaves' it is getting a GPF around the time that init is forked off.

And I assume that (before this commit) the kernel runs fine if you boot
without 'noxsaves'?

> 
> The full oops is below, but addr2line points to the "alternative_input("
> line in xrstor_state().
> 
> The one that oopses has this in bootup:
> 
>    xsave: enabled xstate_bv 0x1f, cntxt size 0x3c0 using compacted form
> 
> The one that works says:
> 
>    xsave: enabled xstate_bv 0x1f, cntxt size 0x440 using standard form
> 
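
The two context sizes above are consistent with each other. A small sketch of the arithmetic, assuming the usual Intel offsets and sizes for the AVX and MPX components (on real hardware these come from CPUID leaf 0xD, so the table below is an assumption, not something read from this machine):

```python
# Assumed (standard-format offset, size) per state component; the real
# values are reported by CPUID.(EAX=0xD, ECX=component).
COMPONENTS = {
    2: ("AVX", 0x240, 0x100),
    3: ("MPX BNDREGS", 0x3c0, 0x40),
    4: ("MPX BNDCSR", 0x400, 0x40),
}
# x87/SSE legacy region (512 bytes) plus the XSAVE header (64 bytes).
LEGACY_AND_HEADER = 512 + 64


def standard_size(mask):
    """Standard form: each component lives at its fixed offset."""
    end = LEGACY_AND_HEADER
    for bit, (_name, offset, size) in COMPONENTS.items():
        if (mask >> bit) & 1:
            end = max(end, offset + size)
    return end


def compacted_size(mask):
    """Compacted form: enabled components are packed back to back.

    This ignores the optional 64-byte alignment flag, which does not
    matter for the components assumed here.
    """
    total = LEGACY_AND_HEADER
    for bit, (_name, _offset, size) in sorted(COMPONENTS.items()):
        if (mask >> bit) & 1:
            total += size
    return total


print(hex(standard_size(0x1f)))   # 0x440
print(hex(compacted_size(0x1f)))  # 0x3c0
```

The 0x80 difference is the unused gap between the end of the AVX state (0x340) and the start of BNDREGS (0x3c0) in the standard layout.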
> I bisected it down to:
> 
> > commit 110d7f7513bbb916b8654da9e2973ac5bed929a9
> > Author: Oleg Nesterov <oleg@...hat.com>
> > Date:   Mon Jan 19 19:52:12 2015 +0100
> > 
> >     x86/fpu: Don't abuse FPU in kernel threads if use_eager_fpu()
> >     
> >     AFAICS, there is no reason why kernel threads should have FPU context
> >     even if use_eager_fpu() == T. Now that interrupted_kernel_fpu_idle()
> >     does not check __thread_has_fpu() in the use_eager_fpu() case, we
> >     can remove the init_fpu() code from eager_fpu_init() and change
> >     flush_thread() called by do_execve() to initialize FPU.
> >     
> >     Note: of course, the change in flush_thread() is horrible and must be
> >     cleaned up. We need the new helper, and flush_thread() should return the
> >     error if init_fpu() fails.
> 
> It disassembles to:
> 
> > All code
> > ========
> >    0:	00 00                	add    %al,(%rax)
> >    2:	48 c7 c7 58 a4 12 82 	mov    $0xffffffff8212a458,%rdi
> >    9:	e8 03 13 14 00       	callq  0x141311
> >    e:	db e2                	fnclex 
> >   10:	0f 77                	emms   
> >   12:	db 83 3c 05 00 00    	fildl  0x53c(%rbx)
> >   18:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
> >   1d:	b8 ff ff ff ff       	mov    $0xffffffff,%eax
> >   22:	48 8b bb 40 05 00 00 	mov    0x540(%rbx),%rdi
> >   29:	89 c2                	mov    %eax,%edx
> >   2b:*	48 0f c7 1f          	xrstors64 (%rdi)		<-- trapping instruction
> >   2f:	31 c0                	xor    %eax,%eax
> >   31:	45 31 e4             	xor    %r12d,%r12d
> >   34:	85 c0                	test   %eax,%eax
> >   36:	48 c7 c7 a8 a4 12 82 	mov    $0xffffffff8212a4a8,%rdi
> >   3d:	41                   	rex.B
> >   3e:	0f                   	.byte 0xf
> >   3f:	95                   	xchg   %eax,%ebp
> 
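
To double-check the disassembly by hand: the marked bytes 48 0f c7 1f are REX.W, the two-byte opcode 0F C7, and a ModRM byte whose reg field selects the instruction. A minimal sketch follows; the helper name is made up, and it only handles the plain register-indirect form seen in this oops.

```python
# Memory-operand instructions in the 0F C7 opcode group, keyed by ModRM.reg.
XSAVE_GROUP = {3: "xrstors", 4: "xsavec", 5: "xsaves"}
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]

CODE = ("Code: 00 00 48 c7 c7 58 a4 12 82 e8 03 13 14 00 db e2 0f 77 db 83 "
        "3c 05 00 00 0f 1f 44 00 00 b8 ff ff ff ff 48 8b bb 40 05 00 00 89 "
        "c2 <48> 0f c7 1f 31 c0 45 31 e4 85 c0 48 c7 c7 a8 a4 12 82 41 0f 95")


def decode_trapping(code_line):
    """Decode the instruction at the <xx> marker of an oops 'Code:' line."""
    tokens = code_line.split()
    start = next(i for i, t in enumerate(tokens) if t.startswith("<"))
    raw = [int(t.strip("<>"), 16) for t in tokens[start:start + 4]]
    rex, op1, op2, modrm = raw
    if rex != 0x48 or (op1, op2) != (0x0F, 0xC7):
        raise ValueError("not a REX.W 0F C7 instruction")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    # Only the simple (%reg) form: no displacement, SIB, or RIP-relative.
    if mod != 0 or rm in (4, 5) or reg not in XSAVE_GROUP:
        raise ValueError("unsupported encoding for this sketch")
    return "%s64 (%%%s)" % (XSAVE_GROUP[reg], REG64[rm])


print(decode_trapping(CODE))  # xrstors64 (%rdi)
```

ModRM 0x1f splits into mod=0, reg=3, rm=7, which gives XRSTORS with the buffer address in RDI, matching the disassembly above.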
> ...
> > [   14.193801] Freeing unused kernel memory: 560K (ffff880001974000 - ffff880001a00000)
> > [   14.203661] Freeing unused kernel memory: 1916K (ffff880001e21000 - ffff880002000000)
> > [   14.213132] general protection fault: 0000 [#1] SMP 
> > [   14.218786] Modules linked in:
> > [   14.222273] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 3.19.0-00430-gae48603-dirty #1428
> > [   14.231375] Hardware name: Intel Corporation Skylake Client platform/Skylake Y LPDDR3 RVP3, BIOS SKLSE2P1.86C.X062.R00.1411270820 11/27/2014
> > [   14.245698] task: ffff8801485a8000 ti: ffff880148620000 task.ti: ffff880148620000
> > [   14.254189] RIP: 0010:[<ffffffff81004eda>]  [<ffffffff81004eda>] math_state_restore+0x13a/0x380
> > [   14.264076] RSP: 0000:ffff880148623b98  EFLAGS: 00010296
> > [   14.270090] RAX: 00000000ffffffff RBX: ffff8801485a8000 RCX: 0000000000000000
> > [   14.278186] RDX: 00000000ffffffff RSI: 0000000000000000 RDI: ffff88007f5f0000
> > [   14.286277] RBP: ffff880148623bb8 R08: 0000000000000000 R09: ffff88007f5f0000
> > [   14.294371] R10: 0000000000000001 R11: 0000000000000000 R12: ffff8801485a8000
> > [   14.302468] R13: ffff88007f5e0000 R14: ffff8801485a8000 R15: ffffffff821ca800
> > [   14.310574] FS:  0000000000000000(0000) GS:ffff88014e440000(0000) knlGS:0000000000000000
> > [   14.319794] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [   14.326323] CR2: 0000000000000000 CR3: 000000007f820000 CR4: 00000000003407e0
> > [   14.334420] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [   14.342516] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > [   14.350612] Stack:
> > [   14.352896]  ffff8801485a8000 0000000000000000 ffff8801485a8000 ffff88007f5e0000
> > [   14.361366]  ffff880148623be8 ffffffff8101210d 0000000000000000 ffff88007f590db0
> > [   14.369810]  ffff8801485a8000 ffff88007f5e0000 ffff880148623c58 ffffffff811f5074
> > [   14.378267] Call Trace:
> > [   14.381056]  [<ffffffff8101210d>] flush_thread+0x1ad/0x270
> > [   14.387281]  [<ffffffff811f5074>] flush_old_exec+0x774/0xee0
> > [   14.393702]  [<ffffffff81256703>] load_elf_binary+0x353/0x1870
> > [   14.400317]  [<ffffffff811f3f47>] ? search_binary_handler+0x97/0x1f0
> > [   14.407532]  [<ffffffff810c491c>] ? do_raw_read_unlock+0x2c/0x50
> > [   14.414361]  [<ffffffff811f3f38>] search_binary_handler+0x88/0x1f0
> > [   14.421374]  [<ffffffff81255fc4>] load_script+0x274/0x2b0
> > [   14.427503]  [<ffffffff811f3ee8>] ? search_binary_handler+0x38/0x1f0
> > [   14.434722]  [<ffffffff810c491c>] ? do_raw_read_unlock+0x2c/0x50
> > [   14.441563]  [<ffffffff811f3f38>] search_binary_handler+0x88/0x1f0
> > [   14.448577]  [<ffffffff811f6436>] do_execveat_common.isra.32+0x746/0xa30
> > [   14.456184]  [<ffffffff811f6386>] ? do_execveat_common.isra.32+0x696/0xa30
> > [   14.463988]  [<ffffffff8194ad50>] ? rest_init+0x150/0x150
> > [   14.470115]  [<ffffffff811f674c>] do_execve+0x2c/0x30
> > [   14.475848]  [<ffffffff8100023b>] run_init_process+0x2b/0x30
> > [   14.482264]  [<ffffffff8194ad92>] kernel_init+0x42/0xf0
> > [   14.488222]  [<ffffffff8196b67c>] ret_from_fork+0x7c/0xb0
> > [   14.494351]  [<ffffffff8194ad50>] ? rest_init+0x150/0x150
> > [   14.500481] Code: 00 00 48 c7 c7 58 a4 12 82 e8 03 13 14 00 db e2 0f 77 db 83 3c 05 00 00 0f 1f 44 00 00 b8 ff ff ff ff 48 8b bb 40 05 00 00 89 c2 <48> 0f c7 1f 31 c0 45 31 e4 85 c0 48 c7 c7 a8 a4 12 82 41 0f 95 
> > [   14.522792] RIP  [<ffffffff81004eda>] math_state_restore+0x13a/0x380
> > [   14.530031]  RSP <ffff880148623b98>
> > [   14.534061] ---[ end trace f99d58de7d83269b ]---
> > [   14.539711] usb 1-5: New USB device found, idVendor=14dd, idProduct=1007
> > [   14.549577] usb 1-5: New USB device strings: Mfr=1, Product=2, SerialNumber=7
> > [   14.560957] usb 1-5: Product: D2CIM-DVUSB
> > [   14.567717] usb 1-5: Manufacturer: Raritan
> > [   14.573636] usb 1-5: SerialNumber: HUX45017210000007
> > [   14.579421] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
> > [   14.579421] 
> > [   14.580548] usb 1-5: ep 0x81 - rounding interval to 64 microframes, ep desc says 80 microframes
> > [   14.580595] usb 1-5: ep 0x82 - rounding interval to 64 microframes, ep desc says 80 microframes
> > [   14.580634] usb 1-5: ep 0x83 - rounding interval to 64 microframes, ep desc says 80 microframes
> > [   14.592305] input: Raritan D2CIM-DVUSB as /devices/pci0000:00/0000:00:14.0/usb1/1-5/1-5:1.0/0003:14DD:1007.0001/input/input7
> > [   14.632243] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)
> > [   14.656356] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
> > [   14.656356] 
> > 
> 
> Config is here:
> 
> https://www.sr71.net/~dave/intel/config-20150303
