Message-ID: <20180502131227.2sgtgg7y44vduqgw@wfg-t540p.sh.intel.com>
Date: Wed, 2 May 2018 21:12:27 +0800
From: Fengguang Wu <fengguang.wu@...el.com>
To: x86@...nel.org
Cc: Paolo Bonzini <pbonzini@...hat.com>,
Radim Krčmář <rkrcmar@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>,
"H. Peter Anvin" <hpa@...or.com>,
Andy Lutomirski <luto@...nel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
kvm@...r.kernel.org, linux-kernel@...r.kernel.org, lkp@...org
Subject: [async_page_fault] PANIC: double fault, error_code: 0x0
Hello,
FYI, this happens in mainline kernel 4.17.0-rc3.
It dates back to at least v4.16.
It occurs in 2 out of 2 boots, and only with
CONFIG_IA32_EMULATION enabled.
[ 0.001000] Good, all 261 testcases passed! |
[ 0.001000] ---------------------------------
[ 0.001000] ACPI: Core revision 20180313
[ 0.001000] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604467 ns
[ 0.001000] hpet clockevent registered
[ 0.001000] PANIC: double fault, error_code: 0x0
[ 0.001000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.17.0-rc3 #248
[ 0.001000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 0.001000] RIP: 0010:async_page_fault+0x3/0x30:
async_page_fault at arch/x86/entry/entry_64.S:1163
[ 0.001000] RSP: 0000:ffffc90000000000 EFLAGS: 00010082
[ 0.001000] RAX: fffff5200000000e RBX: 0000000000000003 RCX: ffffffff82a00a20
[ 0.001000] RDX: dffffc0000000000 RSI: 0000000000000003 RDI: ffffffff8342c368
[ 0.001000] RBP: ffffc900000000f8 R08: 0000000000000000 R09: 0000000000000000
[ 0.001000] R10: 0000000000000000 R11: 0000000000000000 R12: ffffc90000000158
[ 0.001000] R13: fffff52000000048 R14: ffffffff8342bb80 R15: 0000000000000000
[ 0.001000] FS: 0000000000000000(0000) GS:ffffc90000000000(0000) knlGS:0000000000000000
[ 0.001000] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.001000] CR2: ffffc8fffffffff8 CR3: 0000000003424000 CR4: 00000000000006b0
[ 0.001000] Call Trace:
[ 0.001000] Code: 48 89 e7 48 8b 74 24 78 48 c7 44 24 78 ff ff ff ff e8 02 1b 6d fe e9 fd 01 00 00 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 <e8> 08 01 00 00 48 89 e7 48 8b 74 24 78 48 c7 44 24 78 ff ff ff
[ 0.001000] Kernel panic - not syncing: Machine halted.
[ 0.001000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.17.0-rc3 #248
[ 0.001000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 0.001000] Call Trace:
[ 0.001000] <#DF>
[ 0.001000] dump_stack+0x162/0x221:
dump_stack at lib/dump_stack.c:115
[ 0.001000] ? arch_local_irq_restore+0x44/0x44:
rcu_read_lock at include/linux/rcupdate.h:629
arch_local_irq_restore+0x44/0x44:
cr4_set_bits at arch/x86/include/asm/tlbflush.h:264
arch_local_irq_restore+0x44/0x44:
dump_header at mm/oom_kill.c:423
arch_local_irq_restore+0x44/0x44:
dump_stack at lib/dump_stack.c:89
[ 0.001000] ? trace_hardirqs_off_caller+0x14f/0x350:
trace_hardirqs_off_caller at kernel/locking/lockdep.c:2922
[ 0.001000] panic+0x1ca/0x380:
panic at kernel/panic.c:195
[ 0.001000] ? refcount_error_report+0x290/0x290:
panic at kernel/panic.c:136
[ 0.001000] df_debug+0x2d/0x30:
df_debug at ??:?
[ 0.001000] do_double_fault+0xa0/0xc0:
do_double_fault at arch/x86/kernel/traps.c:450 (discriminator 1)
[ 0.001000] double_fault+0x23/0x30:
double_fault at arch/x86/entry/entry_64.S:994
[ 0.001000] RIP: 0010:async_page_fault+0x3/0x30:
async_page_fault at arch/x86/entry/entry_64.S:1163
[ 0.001000] RSP: 0000:ffffc90000000000 EFLAGS: 00010082
[ 0.001000] RAX: fffff5200000000e RBX: 0000000000000003 RCX: ffffffff82a00a20
[ 0.001000] RDX: dffffc0000000000 RSI: 0000000000000003 RDI: ffffffff8342c368
[ 0.001000] RBP: ffffc900000000f8 R08: 0000000000000000 R09: 0000000000000000
[ 0.001000] R10: 0000000000000000 R11: 0000000000000000 R12: ffffc90000000158
[ 0.001000] R13: fffff52000000048 R14: ffffffff8342bb80 R15: 0000000000000000
[ 0.001000] ? restore_regs_and_return_to_kernel+0x2e/0x2e:
native_irq_return_iret at arch/x86/entry/entry_64.S:752
[ 0.001000] </#DF>
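As a reading aid for the register dump above, the following sketch decodes the instruction bytes at the `<e8>` marker and compares CR2 with RSP. It assumes standard x86-64 encoding, where opcode `e8` is a near `call` with a signed 32-bit little-endian displacement that pushes an 8-byte return address. The variable names are illustrative and not part of the original report.

```python
# Values copied from the first register dump (4.17.0-rc3).
RSP = 0xffffc90000000000
CR2 = 0xffffc8fffffffff8

# Bytes at the faulting RIP, starting at the "<e8>" marker in the "Code:" line.
insn = bytes.fromhex("e808010000")

opcode = insn[0]
# Near call: opcode 0xe8 followed by a signed 32-bit little-endian displacement.
disp = int.from_bytes(insn[1:5], "little", signed=True)

# A call pushes an 8-byte return address, so it writes to RSP - 8.
push_addr = RSP - 8

print(f"opcode=0x{opcode:02x} disp=0x{disp:x}")
print(f"push target=0x{push_addr:x} matches CR2: {push_addr == CR2}")
print(f"RSP page-aligned: {RSP % 4096 == 0}")
```

The faulting address is the 8-byte slot just below a page-aligned RSP, which is consistent with the `call` at `async_page_fault+0x3` pushing its return address onto the page below the stack pointer.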
The full dmesg, kconfig and reproduce scripts are attached.
Among them, there are 2 occurrences of "BUG: stack guard page was hit":
[ 1.717675] gfs2: path_lookup on rootfs returned error -2
[ 1.719152] mount (320) used greatest stack depth: 13960 bytes left
Configuring network interfaces...
Kernel tests: Boot OK!
[ 12.877799] trinity-main uses obsolete (PF_INET,SOCK_PACKET)
[ 12.915462] BUG: stack guard page was hit at 0000000056aa719a (stack is 00000000f79fc0b1..00000000bac11f32)
[ 12.916489] kernel stack overflow (double-fault): 0000 [#1] SMP
[ 12.917111] Modules linked in:
[ 12.917435] CPU: 0 PID: 314 Comm: bootlogd Not tainted 4.16.0 #328
[ 12.918082] RIP: 0010:async_page_fault+0xa/0x50
[ 12.918553] RSP: 0018:ffffc900004b0000 EFLAGS: 00010046
[ 12.919099] RAX: 0000000000000000 RBX: ffffe8ffffc02a20 RCX: 0000000000000000
[ 12.919834] RDX: ffffc900004b00d8 RSI: ffffe8ffffc02a20 RDI: ffffffff820504e0
[ 12.920559] RBP: ffffc900004b0068 R08: 0000000000000000 R09: 0000000000000000
[ 12.921294] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 12.922027] R13: ffffe8ffffc02a20 R14: ffff88001a150a08 R15: 0000000000000000
[ 12.923097] FS: 00007f4c60af5700(0000) GS:ffff88001f800000(0000) knlGS:0000000000000000
[ 12.924166] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 12.924757] CR2: ffffc900004afff8 CR3: 000000001c715004 CR4: 00000000001606f0
[ 12.925497] Call Trace:
[ 12.925770] ? perf_trace_x86_exceptions+0x29/0xd0
[ 12.926332] do_page_fault+0xf6/0x137
[ 12.926719] do_async_page_fault+0x2f/0xa6
[ 12.927156] async_page_fault+0x25/0x50
[ 12.927557] RIP: 0010:perf_trace_x86_exceptions+0x29/0xd0
[ 12.928122] RSP: 0018:ffffc900004b0180 EFLAGS: 00010046
[ 12.928664] RAX: 0000000000000000 RBX: ffffe8ffffc02a20 RCX: 0000000000000000
[ 12.929396] RDX: ffffc900004b0228 RSI: ffffe8ffffc02a20 RDI: ffffffff820504e0
[ 12.930135] RBP: ffffc900004b01b8 R08: 0000000000000000 R09: 0000000000000000
[ 12.930870] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 12.931595] R13: ffffe8ffffc02a20 R14: ffff88001a150a08 R15: 0000000000000000
[ 12.932330] do_page_fault+0xf6/0x137
[ 12.932714] do_async_page_fault+0x2f/0xa6
Kernel tests: Boot OK!
[ 3.976899] random: trinity: uninitialized urandom read (4 bytes read)
[ 4.010691] random: trinity: uninitialized urandom read (4 bytes read)
01 00 00 00 40 00
[ 17.082506] trinity-main uses obsolete (PF_INET,SOCK_PACKET)
[ 17.164797] BUG: stack guard page was hit at (ptrval) (stack is (ptrval).. (ptrval))
[ 17.165932] kernel stack overflow (double-fault): 0000 [#1] SMP PTI
[ 17.166657] Modules linked in:
[ 17.167041] CPU: 0 PID: 335 Comm: bootlogd Not tainted 4.17.0-rc2 #100
[ 17.167811] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 17.168801] RIP: 0010:trace_hardirqs_off_thunk+0xf/0x1c
[ 17.169423] RSP: 0018:ffffc90000554000 EFLAGS: 00010087
[ 17.170023] RAX: 0000000081800a70 RBX: 0000000000000001 RCX: ffffffff81800a70
[ 17.170833] RDX: 0000000000000000 RSI: ffffffff81800ea8 RDI: ffffffff82038fe0
[ 17.171651] RBP: ffffc90000554040 R08: 0000000000000000 R09: 0000000000000000
[ 17.172470] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 17.173279] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 17.174148] FS: 00007f2b380b0700(0000) GS:ffff88001f800000(0000) knlGS:0000000000000000
[ 17.175043] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 17.175692] CR2: ffffc90000553ff8 CR3: 000000001c684002 CR4: 00000000001606b0
[ 17.176515] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 17.177322] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 17.178129] Call Trace:
[ 17.178430] ? restore_regs_and_return_to_kernel+0x33/0x33
[ 17.179050] ? async_page_fault+0x8/0x30
[ 17.179476] error_entry+0x84/0x100
[ 17.179872] RIP: 0010:perf_trace_x86_exceptions+0x44/0xf0
[ 17.180502] RSP: 0018:ffffc90000554100 EFLAGS: 00010046 ORIG_RAX: 0000000000000000
[ 17.181370] RAX: 0000000000000000 RBX: ffffe8ffffc021d8 RCX: 0000000000000000
[ 17.182202] RDX: ffffc90000554188 RSI: ffffe8ffffc021d8 RDI: ffffffff82038fe0
[ 17.183028] RBP: ffffc90000554140 R08: 0000000000000000 R09: 0000000000000000
[ 17.183839] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff82038fe0
[ 17.184638] R13: ffffe8ffffc021d8 R14: 0000000000000000 R15: ffffc90000554188
[ 17.185463] ? async_page_fault+0x8/0x30
[ 17.185924] do_page_fault+0x230/0x320
[ 17.186370] async_page_fault+0x1e/0x30
[ 17.186806] RIP: 0010:perf_trace_x86_exceptions+0x44/0xf0
[ 17.187431] RSP: 0018:ffffc90000554230 EFLAGS: 00010046
[ 17.188031] RAX: 0000000000000000 RBX: ffffe8ffffc021d8 RCX: 0000000000000000
[ 17.188821] RDX: ffffc900005542b8 RSI: ffffe8ffffc021d8 RDI: ffffffff82038fe0
[ 17.189645] RBP: ffffc90000554270 R08: 0000000000000000 R09: 0000000000000000
[ 17.190480] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff82038fe0
[ 17.191282] R13: ffffe8ffffc021d8 R14: 0000000000000000 R15: ffffc900005542b8
[ 17.192109] do_page_fault+0x230/0x320
[ 17.192555] async_page_fault+0x1e/0x30
[ 17.193005] RIP: 0010:perf_trace_x86_exceptions+0x44/0xf0
...
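The nested frames in the 4.17.0-rc2 trace above can be summarised with a little arithmetic. This sketch uses only RSP and CR2 values copied from that trace and shows the stack pointer descending through the repeated `perf_trace_x86_exceptions` page faults until it reaches a page boundary. Reading that pattern as recursion is an interpretation, not something the log states.

```python
# RSP values from the 4.17.0-rc2 trace, outermost listed frame first.
frames = [
    0xffffc90000554230,  # perf_trace_x86_exceptions+0x44, after async_page_fault+0x1e
    0xffffc90000554100,  # perf_trace_x86_exceptions+0x44, after error_entry+0x84
    0xffffc90000554000,  # trace_hardirqs_off_thunk+0xf, the faulting context
]
CR2 = 0xffffc90000553ff8

# Distance between consecutive listed frames (the stack grows downwards).
deltas = [hi - lo for hi, lo in zip(frames, frames[1:])]

lowest = frames[-1]
print("frame deltas:", [hex(d) for d in deltas])
print("lowest RSP page-aligned:", lowest % 4096 == 0)
print("CR2 is 8 bytes below lowest RSP:", lowest - 8 == CR2)
```

The same relation between CR2 and RSP holds for the 4.16.0 trace (RSP `ffffc900004b0000`, CR2 `ffffc900004afff8`).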
Thanks,
Fengguang
View attachment "dmesg-vm-lkp-nhm-dp1-yocto-ia32-4:20180502034036:x86_64-randconfig-s4-05020254:4.17.0-rc3:248" of type "text/plain" (22259 bytes)
View attachment ".config" of type "text/plain" (107467 bytes)
View attachment "job-script" of type "text/plain" (3954 bytes)
View attachment "reproduce-vm-lkp-nhm-dp1-yocto-ia32-4:20180502034036:x86_64-randconfig-s4-05020254:4.17.0-rc3:248" of type "text/plain" (2014 bytes)