Message-ID: <20140206095646.GE17971@localhost>
Date: Thu, 6 Feb 2014 17:56:46 +0800
From: Fengguang Wu <fengguang.wu@...el.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
LKML <linux-kernel@...r.kernel.org>
Subject: [sched/preempt] INFO: rcu_sched self-detected stall on CPU { 1}
Hi Peter,
We noticed the RCU stalls below, which hang the system.
The problem was bisected to the following commit:
commit 8cb75e0c4ec9786b81439761eac1d18d4a931af3
Author:     Peter Zijlstra <peterz@...radead.org>
AuthorDate: Wed Nov 20 12:22:37 2013 +0100
Commit:     Ingo Molnar <mingo@...nel.org>
CommitDate: Mon Jan 13 17:38:55 2014 +0100

    sched/preempt: Fix up missed PREEMPT_NEED_RESCHED folding

 arch/x86/include/asm/mwait.h |  2 +-
 include/linux/preempt.h      | 15 +++++++++++++++
 include/linux/sched.h        | 15 +++++++++++++++
 kernel/cpu/idle.c            | 17 ++++++++++-------
 kernel/sched/core.c          |  3 +--
 5 files changed, 42 insertions(+), 10 deletions(-)
[ 85.786775] INFO: rcu_sched self-detected stall on CPU { 1} (t=15000 jiffies g=233 c=232 q=1940)
[ 85.788745] sending NMI to all CPUs:
[ 85.789539] NMI backtrace for cpu 0
[ 85.790389] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.13.0-rc8-wl-05610-gb524b38-dirty #1
[ 85.792043] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 85.793072] task: ffffffff82211440 ti: ffffffff82200000 task.ti: ffffffff82200000
[ 85.794631] RIP: 0010:[<ffffffff81067bfa>] [<ffffffff81067bfa>] native_safe_halt+0x6/0x8
[ 85.796314] RSP: 0018:ffffffff82201ec8 EFLAGS: 00000246
[ 85.797286] RAX: 0000000000000000 RBX: ffffffff82201fd8 RCX: 00000000ffffffff
[ 85.798507] RDX: 0100000000000000 RSI: 0000000000000000 RDI: 0000000000000046
[ 85.799680] RBP: ffffffff82201ec8 R08: 0000000000000000 R09: 0000000000000000
[ 85.800862] R10: 0000000000000000 R11: 0000000000000400 R12: 0000000000000000
[ 85.802055] R13: ffffffff82201fd8 R14: ffffffff82201fd8 R15: 00000000ffffffff
[ 85.803243] FS: 0000000000000000(0000) GS:ffff88011fc00000(0000) knlGS:0000000000000000
[ 85.804872] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 85.806014] CR2: 00007feec94d2000 CR3: 000000007f7a2000 CR4: 00000000000006f0
[ 85.807265] Stack:
[ 85.807907] ffffffff82201ee8 ffffffff81041811 ffffffff82201fd8 ffffffff82201fd8
[ 85.809808] ffffffff82201ef8 ffffffff81041f3d ffffffff82201f40 ffffffff8110be4c
[ 85.811649] ffff88011ffa3ec0 35ceeb7d84596041 ffffffffffffffff ffffffff823ef8d0
[ 85.813483] Call Trace:
[ 85.814198] [<ffffffff81041811>] default_idle+0x38/0xc1
[ 85.815171] [<ffffffff81041f3d>] arch_cpu_idle+0x18/0x28
[ 85.816152] [<ffffffff8110be4c>] cpu_startup_entry+0x178/0x245
[ 85.817201] [<ffffffff81a022c3>] rest_init+0x87/0x89
[ 85.818204] [<ffffffff8234be04>] start_kernel+0x435/0x440
[ 85.819197] [<ffffffff8234b7dd>] ? repair_env_string+0x58/0x58
[ 85.820241] [<ffffffff8234b120>] ? early_idt_handlers+0x120/0x120
[ 85.821311] [<ffffffff8234b498>] x86_64_start_reservations+0x2a/0x2c
[ 85.822434] [<ffffffff8234b5d5>] x86_64_start_kernel+0x13b/0x148
[ 85.823545] Code: 48 89 e5 0f 09 5d c3 55 48 89 e5 9c 58 5d c3 55 48 89 e5 57 9d 5d c3 55 48 89 e5 fa 5d c3 55 48 89 e5 fb 5d c3 55 48 89 e5 fb f4 <5d> c3 55 48 89 e5 f4 5d c3 55 49 89 ca 49 89 d1 8b 07 48 89 e5
[ 608.380273] INFO: rcu_sched self-detected stall on CPU { 0} (t=15000 jiffies g=5755 c=5754 q=18379)
[ 608.383101] sending NMI to all CPUs:
[ 608.383940] NMI backtrace for cpu 0
[ 608.384786] CPU: 0 PID: 692 Comm: kswapd0 Not tainted 3.13.0-rc7-00177-g8cb75e0 #1
[ 608.392624] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 608.393750] task: ffff8801195f5a00 ti: ffff8800b6468000 task.ti: ffff8800b6468000
[ 608.395395] RIP: 0010:[<ffffffff81060dd2>] [<ffffffff81060dd2>] flat_send_IPI_mask+0x7e/0xac
[ 608.397299] RSP: 0018:ffff88011fc03dd0 EFLAGS: 00010046
[ 608.398342] RAX: 0000000000000c00 RBX: 0000000000000046 RCX: 0000000000000007
[ 608.399622] RDX: ffffffff82221ca0 RSI: 0000000000000002 RDI: 0000000000000300
[ 608.400863] RBP: ffff88011fc03df0 R08: 0000000000000000 R09: 0000000000000000
[ 608.402152] R10: 000000000000c120 R11: 0000000000000000 R12: 0000000000000002
[ 608.403398] R13: 0000000000000c00 R14: 000000000000000f R15: 0000000000000000
[ 608.404650] FS: 0000000000000000(0000) GS:ffff88011fc00000(0000) knlGS:0000000000000000
[ 608.406431] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 608.407528] CR2: 00000000006e5dc8 CR3: 000000000220c000 CR4: 00000000000006f0
[ 608.408821] Stack:
[ 608.409485] 0000000000002710 ffff88011fc0d800 0000000000000000 ffffffff8224da00
[ 608.411472] ffff88011fc03e00 ffffffff81061079 ffff88011fc03e18 ffffffff8105e2b1
[ 608.413416] ffffffff8224da00 ffff88011fc03e70 ffffffff81111e9b ffff88011fc13040
[ 608.415381] Call Trace:
[ 608.416100] <IRQ>
[ 608.416398] [<ffffffff81061079>] flat_send_IPI_all+0x1f/0x4a
[ 608.418058] [<ffffffff8105e2b1>] arch_trigger_all_cpu_backtrace+0x52/0x84
[ 608.419311] [<ffffffff81111e9b>] rcu_check_callbacks+0x205/0x57f
[ 608.420480] [<ffffffff8111b057>] ? tick_sched_do_timer+0x2f/0x2f
[ 608.421628] [<ffffffff810d044e>] update_process_times+0x3d/0x65
[ 608.422779] [<ffffffff8111af6c>] tick_sched_handle+0x37/0x43
[ 608.423872] [<ffffffff8111b091>] tick_sched_timer+0x3a/0x58
[ 608.424958] [<ffffffff810e3ac7>] __run_hrtimer+0x96/0x19a
[ 608.426059] [<ffffffff810e42ff>] hrtimer_interrupt+0xe8/0x1e3
[ 608.427167] [<ffffffff8105cbf1>] local_apic_timer_interrupt+0x54/0x57
[ 608.428390] [<ffffffff81a0f123>] smp_apic_timer_interrupt+0x3f/0x50
[ 608.429561] [<ffffffff81a0de32>] apic_timer_interrupt+0x72/0x80
[ 608.430686] <EOI>
[ 608.430977] [<ffffffff814f2f9a>] ? find_last_bit+0x4a/0x4a
[ 608.432785] [<ffffffff8116d25f>] ? zone_watermark_ok_safe+0x61/0xad
[ 608.433954] [<ffffffff81177dac>] zone_balanced+0x1e/0x43
[ 608.435042] [<ffffffff8117b461>] balance_pgdat+0x3b6/0x501
[ 608.436132] [<ffffffff8117b8eb>] kswapd+0x33f/0x3c7
[ 608.437169] [<ffffffff810fb058>] ? __wake_up_sync+0x12/0x12
[ 608.438272] [<ffffffff8117b5ac>] ? balance_pgdat+0x501/0x501
[ 608.439367] [<ffffffff810e155c>] kthread+0xdb/0xe3
[ 608.440394] [<ffffffff810e1481>] ? kthread_create_on_node+0x16f/0x16f
[ 608.441586] [<ffffffff81a0d1bc>] ret_from_fork+0x7c/0xb0
[ 608.442637] [<ffffffff810e1481>] ? kthread_create_on_node+0x16f/0x16f
[ 608.443852] Code: eb f0 44 89 f0 c1 e0 18 89 04 25 10 33 5f ff 44 89 e0 44 09 e8 41 81 cd 00 04 00 00 41 83 fc 02 41 0f 44 c5 89 04 25 00 33 5f ff <f6> c7 02 75 11 48 89 df 57 9d 0f 1f 44 00 00 e8 6e 94 0e 00 eb
Full dmesgs and kconfig are attached.
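For anyone triaging the attached dmesgs, the stall headers can be pulled out with a small script. This is only a sketch: the regular expression is inferred from the two headers quoted above and may not match other stall message variants, and `STALL_RE`, `parse_stalls` and `SAMPLE` are names made up for illustration.

```python
import re

# Pattern inferred from the two stall headers quoted in this report
# (an assumption; other kernel versions may print a different layout).
STALL_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+INFO: (?P<flavor>\w+) self-detected stall on CPU"
    r"\s*\{\s*(?P<cpus>[\d\s]*)\}\s*"
    r"\(t=(?P<t>\d+) jiffies g=(?P<g>\d+) c=(?P<c>\d+) q=(?P<q>\d+)\)"
)


def parse_stalls(text):
    """Return one dict per stall header found in a dmesg dump."""
    stalls = []
    for m in STALL_RE.finditer(text):
        stalls.append({
            "timestamp": float(m.group("ts")),   # seconds since boot
            "flavor": m.group("flavor"),         # e.g. rcu_sched
            "cpus": [int(c) for c in m.group("cpus").split()],
            "t": int(m.group("t")),              # t= field, in jiffies
            "g": int(m.group("g")),              # g=, c=, q= copied as printed
            "c": int(m.group("c")),
            "q": int(m.group("q")),
        })
    return stalls


# The two headers from this report, used as sample input.
SAMPLE = """\
[ 85.786775] INFO: rcu_sched self-detected stall on CPU { 1} (t=15000 jiffies g=233 c=232 q=1940)
[ 608.380273] INFO: rcu_sched self-detected stall on CPU { 0} (t=15000 jiffies g=5755 c=5754 q=18379)
"""

if __name__ == "__main__":
    for stall in parse_stalls(SAMPLE):
        print(stall)
```

To run it against a saved attachment, read the file and pass its contents to `parse_stalls`; the NMI backtraces that follow each header are not parsed.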
Thanks,
Fengguang
View attachment "dmesg-ltp" of type "text/plain" (173831 bytes)
View attachment "dmesg-xfstests" of type "text/plain" (167293 bytes)
View attachment "x86_64-lkp" of type "text/plain" (80581 bytes)