Message-ID: <20171009141706.5xbqsvgwer4a246s@wfg-t540p.sh.intel.com>
Date: Mon, 9 Oct 2017 22:17:06 +0800
From: Fengguang Wu <fengguang.wu@...el.com>
To: Josh Poimboeuf <jpoimboe@...hat.com>
Cc: Peter Zijlstra <peterz@...radead.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Byungchul Park <byungchul.park@....com>,
Ingo Molnar <mingo@...nel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
LKP <lkp@...org>
Subject: Re: [lockdep] b09be676e0 BUG: unable to handle kernel NULL pointer dereference at 000001f2

On Mon, Oct 09, 2017 at 08:26:05AM -0500, Josh Poimboeuf wrote:
>On Mon, Oct 09, 2017 at 08:55:04PM +0800, Fengguang Wu wrote:
>> On Mon, Oct 09, 2017 at 08:21:13PM +0800, Fengguang Wu wrote:
>> > On Mon, Oct 09, 2017 at 12:50:55PM +0200, Peter Zijlstra wrote:
>> > > > Fengguang, if you're still listening, could you please rerun the tests
>> > > > on top of ce07a9415f26, with the attached patches also applied?
>> > >
>> > > Ping!? it would be very good to get feedback on this asap.
>> >
>> > Sorry for the delay!
>> >
>> > > > From e7840ad76515f0b5061fcdd098b57b7c01b61482 Mon Sep 17 00:00:00 2001
>> > > > Message-Id: <e7840ad76515f0b5061fcdd098b57b7c01b61482.1507215196.git.jpoimboe@...hat.com>
>> > > > From: Josh Poimboeuf <jpoimboe@...hat.com>
>> > > > Date: Thu, 5 Oct 2017 09:43:59 -0500
>> > > > Subject: [PATCH 1/2] unwinder fixes
>> > > >
>> > > > ---
>> > > > arch/x86/kernel/unwind_frame.c | 33 ++++++++++++++++++++++++++++++---
>> >
>> > I just tested 316 boots and saw 7 WARNINGs:
>> >
>> > [ 404.948035] WARNING: kernel stack frame pointer at c6ea3ecd in init:212 has bad value (null)
>> > [ 298.118383] WARNING: kernel stack frame pointer at cde07dad in init:1 has bad value bc000000
>> > [ 112.848677] WARNING: kernel stack frame pointer at cde07dbd in swapper/0:1 has bad value c2000000
>> > [ 127.942417] WARNING: kernel stack frame pointer at cf95de71 in rb_producer:50 has bad value 03cf95de
>> > [ 4.736938] WARNING: kernel stack frame pointer at bf643d59 in kworker/0:1:15 has bad value b5000000
>> > [ 308.260066] WARNING: kernel stack frame pointer at bde07da5 in udevd:155 has bad value b5bfa17b
>> >
>> > [ 277.473596] WARNING: CPU: 0 PID: 520 at kernel/locking/lockdep.c:3841 check_flags+0x119/0x1b0
>
>The unwinder patch I sent had a few bugs: it broke frame pointer
>encoding (causing the '?' entries on the lockdep stack trace) and it
>didn't disable the frame pointer warnings. Here's the fixed version.
>
>Fengguang, can you do a round of tests with this patch and the lockdep
>patch I sent before? Thanks!

It works! I ran 500 boots and found only 1 occurrence of the warning
below, which looks unrelated to the current issue.

[ 187.855027] init: plymouth-splash main process (418) terminated with status 1
[ 187.953296] init: networking main process (419) terminated with status 1
[ 191.697721] ------------[ cut here ]------------
[ 191.699318] WARNING: CPU: 0 PID: 424 at kernel/locking/lockdep.c:3928 check_flags+0x119/0x1b0
[ 191.700967] CPU: 0 PID: 424 Comm: trinity-main Not tainted 4.14.0-rc3-00002-gc394639 #1
[ 191.702200] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[ 191.703476] task: c82fec80 task.stack: c8bbe000
[ 191.704194] EIP: check_flags+0x119/0x1b0
[ 191.704809] EFLAGS: 00010086 CPU: 0
[ 191.705380] EAX: 0000002e EBX: c82fec80 ECX: 00000107 EDX: b8afe274
[ 191.716483] ESI: c8003400 EDI: 00000000 EBP: c6de5c5c ESP: c6de5c54
[ 191.717457] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[ 191.718350] CR0: 80050033 CR2: 0858f014 CR3: 18035000 CR4: 00000690
[ 191.719405] Call Trace:
[ 191.719823] <SOFTIRQ>
[ 191.720241] lock_acquire+0x3d/0x230
[ 191.720819] ? perf_event_output_forward+0x14/0x180
[ 191.721601] ? __rcu_read_lock+0x3/0x20
[ 191.722218] perf_event_output_forward+0x41/0x180
[ 191.722958] ? perf_prepare_sample+0x830/0x830
[ 191.723657] ? __perf_event_account_interrupt+0x215/0x240
[ 191.724508] ? perf_prepare_sample+0x830/0x830
[ 191.725213] __perf_event_overflow+0x98/0x150
[ 191.725898] perf_swevent_overflow+0x9e/0xe0
[ 191.736679] perf_swevent_event+0x153/0x1a0
[ 191.737345] perf_tp_event+0x110/0x440
[ 191.737943] ? check_preemption_disabled+0x3d/0x1a0
[ 191.738759] ? check_preemption_disabled+0x3d/0x1a0
[ 191.739568] ? debug_smp_processor_id+0x12/0x20
[ 191.740332] ? perf_trace_buf_alloc+0xf9/0x1c0
[ 191.741107] perf_ftrace_function_call+0xe0/0xf0
[ 191.741866] ? __local_bh_enable+0x99/0xa0
[ 191.742586] ? ftrace_ops_no_ops+0x334/0x380
[ 191.743263] ftrace_ops_no_ops+0x334/0x380
[ 191.743924] ? check_preemption_disabled+0x3d/0x1a0
[ 191.744695] ? __local_bh_enable+0x99/0xa0
[ 191.745350] ? preempt_count_sub+0x8/0x2e0
[ 191.746000] ftrace_stub+0x14/0x1c
[ 191.756703] ? preempt_count_sub+0xd/0x2e0
[ 191.757404] ? trace_softirqs_on+0xf2/0x150
[ 191.758079] __local_bh_enable+0x99/0xa0
[ 191.758786] __do_softirq+0x6a5/0x9c0
[ 191.759440] ? __irqentry_text_end+0x6/0x6
[ 191.760096] do_softirq_own_stack+0x30/0x40
[ 191.760788] </SOFTIRQ>
[ 191.761230] irq_exit+0x56/0xd0
[ 191.761796] smp_apic_timer_interrupt+0x48d/0x6f0
[ 191.762571] apic_timer_interrupt+0x3a/0x40
[ 191.763298] EIP: lock_acquire+0x1d6/0x230
[ 191.763933] EFLAGS: 00000246 CPU: 0
[ 191.764548] EAX: 00000246 EBX: c82fec80 ECX: 6b96419b EDX: 00000000
[ 191.765567] ESI: 00000246 EDI: 00000000 EBP: c8bbfe64 ESP: c8bbfe30
[ 191.776469] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[ 191.777395] ? cgroup1_procs_write+0xb/0x20
[ 191.778068] ? handle_pte_fault+0x55f/0x1cc0
[ 191.778837] _raw_spin_lock+0x42/0x50
[ 191.779447] ? handle_pte_fault+0x55f/0x1cc0
[ 191.780119] handle_pte_fault+0x55f/0x1cc0
[ 191.780849] handle_mm_fault+0x531/0x700
[ 191.781504] ? handle_mm_fault+0x72/0x700
[ 191.782194] __do_page_fault+0xa8a/0xbd0
[ 191.782855] do_page_fault+0x2cc/0x422
[ 191.783471] ? kvm_read_and_reset_pf_reason+0x70/0x70
[ 191.784293] do_async_page_fault+0x26/0x60
[ 191.796340] common_exception+0x3d/0x42
[ 191.796973] EIP: 0xa7de393e
[ 191.797401] EFLAGS: 00010206 CPU: 0
[ 191.797924] EAX: 00000000 EBX: a7f0eff4 ECX: 00001ff1 EDX: 0858f010
[ 191.798902] ESI: 0858d008 EDI: 00002009 EBP: 00000004 ESP: afe7c120
[ 191.799828] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
[ 191.800679] ? kvm_read_and_reset_pf_reason+0x70/0x70
[ 191.801453] Code: 00 85 c0 74 75 e8 88 8d 2f 00 85 c0 74 6c 83 3d cc c1 57 ba 00 75 63 c7 44 24 04 34 da 92 b9 c7 04 24 0a 93 91 b9 e8 ec 3b 01 00 <0f> ff eb 4b 8d 76 00 8b 0d 88 2a 54 ba 85 c9 75 3e 64 a1 4c 68
[ 191.804438] ---[ end trace 70000c51373576aa ]---
[ 191.805188] irq event stamp: 362178
[ 191.805708] hardirqs last enabled at (362176): [<b9490faa>] restore_all+0xf/0x25
[ 191.816989] hardirqs last disabled at (362177): [<b9496420>] __do_softirq+0xf0/0x9c0
[ 191.818122] softirqs last enabled at (362178): [<b94969d5>] __do_softirq+0x6a5/0x9c0
[ 191.819324] softirqs last disabled at (362171): [<b8a10ce0>] do_softirq_own_stack+0x30/0x40
[ 196.174500] init: tty4 main process ended, respawning
[ 196.249472] init: tty5 main process (375) terminated with status 1
[ 196.251170] init: tty5 main process ended, respawning
[ 196.270393] init: tty2 main process (377) terminated with status 1
[ 196.272059] init: tty2 main process ended, respawning
[ 196.428413] init: tty3 main process ended, respawning
[ 196.430660] init: tty6 main process (379) terminated with status 1
[ 196.437890] init: tty6 main process ended, respawning
[init] Using pid_max = 32768
[init] Kernel was tainted on startup. Will ignore flags that are already set.
[init] Started watchdog process, PID is 430
[main] Main thread is alive.
[main] Setsockopt(1 a 80d3000 7e) on fd 8 [1:1:1]
[main] Setsockopt(1 a 80d3000 13) on fd 11 [1:5:1]
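
For anyone repeating this kind of multi-boot run, here is a minimal sketch of how the "WARNING: CPU: ..." lines could be tallied from saved dmesg output. It is not the LKP tooling; the function name tally_warnings and the regex are illustrative assumptions, and the sample lines are copied from the log above.

```python
import re
from collections import Counter

# Matches kernel warning headers such as:
# [ 191.699318] WARNING: CPU: 0 PID: 424 at kernel/locking/lockdep.c:3928 check_flags+0x119/0x1b0
# Hypothetical pattern: it only covers the "WARNING: CPU: ... at file:line func+off" form.
WARN_RE = re.compile(
    r"WARNING: CPU: \d+ PID: \d+ at (?P<site>\S+:\d+) (?P<func>[^\s+]+)\+"
)


def tally_warnings(lines):
    """Count warnings per (source location, function) over an iterable of dmesg lines."""
    counts = Counter()
    for line in lines:
        match = WARN_RE.search(line)
        if match:
            counts[(match.group("site"), match.group("func"))] += 1
    return counts


sample = [
    "[ 191.697721] ------------[ cut here ]------------",
    "[ 191.699318] WARNING: CPU: 0 PID: 424 at kernel/locking/lockdep.c:3928 check_flags+0x119/0x1b0",
    "[ 196.174500] init: tty4 main process ended, respawning",
]

for (site, func), n in tally_warnings(sample).items():
    print(f"{n} x {func} at {site}")
```

Feeding the lines of every boot's dmesg file through the same function gives a per-site count, which makes it easy to see whether a given warning occurs once in 500 boots or more often.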
Thanks,
Fengguang
View attachment "dmesg-quantal-vp-31:20171009220929:i386-randconfig-i0-201739:4.14.0-rc3-00002-gc394639:1" of type "text/plain" (85273 bytes)