Message-ID: <20140206095646.GE17971@localhost>
Date:	Thu, 6 Feb 2014 17:56:46 +0800
From:	Fengguang Wu <fengguang.wu@...el.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	LKML <linux-kernel@...r.kernel.org>
Subject: [sched/preempt] INFO: rcu_sched self-detected stall on CPU { 1}

Hi Peter,

We noticed the RCU stalls below, which hang the system.
The problem was bisected to:

commit 8cb75e0c4ec9786b81439761eac1d18d4a931af3
Author:     Peter Zijlstra <peterz@...radead.org>
AuthorDate: Wed Nov 20 12:22:37 2013 +0100
Commit:     Ingo Molnar <mingo@...nel.org>
CommitDate: Mon Jan 13 17:38:55 2014 +0100

    sched/preempt: Fix up missed PREEMPT_NEED_RESCHED folding
    
 arch/x86/include/asm/mwait.h |  2 +-
 include/linux/preempt.h      | 15 +++++++++++++++
 include/linux/sched.h        | 15 +++++++++++++++
 kernel/cpu/idle.c            | 17 ++++++++++-------
 kernel/sched/core.c          |  3 +--
 5 files changed, 42 insertions(+), 10 deletions(-)

[   85.786775] INFO: rcu_sched self-detected stall on CPU { 1}  (t=15000 jiffies g=233 c=232 q=1940)
[   85.788745] sending NMI to all CPUs:
[   85.789539] NMI backtrace for cpu 0
[   85.790389] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.13.0-rc8-wl-05610-gb524b38-dirty #1
[   85.792043] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[   85.793072] task: ffffffff82211440 ti: ffffffff82200000 task.ti: ffffffff82200000
[   85.794631] RIP: 0010:[<ffffffff81067bfa>]  [<ffffffff81067bfa>] native_safe_halt+0x6/0x8
[   85.796314] RSP: 0018:ffffffff82201ec8  EFLAGS: 00000246
[   85.797286] RAX: 0000000000000000 RBX: ffffffff82201fd8 RCX: 00000000ffffffff
[   85.798507] RDX: 0100000000000000 RSI: 0000000000000000 RDI: 0000000000000046
[   85.799680] RBP: ffffffff82201ec8 R08: 0000000000000000 R09: 0000000000000000
[   85.800862] R10: 0000000000000000 R11: 0000000000000400 R12: 0000000000000000
[   85.802055] R13: ffffffff82201fd8 R14: ffffffff82201fd8 R15: 00000000ffffffff
[   85.803243] FS:  0000000000000000(0000) GS:ffff88011fc00000(0000) knlGS:0000000000000000
[   85.804872] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[   85.806014] CR2: 00007feec94d2000 CR3: 000000007f7a2000 CR4: 00000000000006f0
[   85.807265] Stack:
[   85.807907]  ffffffff82201ee8 ffffffff81041811 ffffffff82201fd8 ffffffff82201fd8
[   85.809808]  ffffffff82201ef8 ffffffff81041f3d ffffffff82201f40 ffffffff8110be4c
[   85.811649]  ffff88011ffa3ec0 35ceeb7d84596041 ffffffffffffffff ffffffff823ef8d0
[   85.813483] Call Trace:
[   85.814198]  [<ffffffff81041811>] default_idle+0x38/0xc1
[   85.815171]  [<ffffffff81041f3d>] arch_cpu_idle+0x18/0x28
[   85.816152]  [<ffffffff8110be4c>] cpu_startup_entry+0x178/0x245
[   85.817201]  [<ffffffff81a022c3>] rest_init+0x87/0x89
[   85.818204]  [<ffffffff8234be04>] start_kernel+0x435/0x440
[   85.819197]  [<ffffffff8234b7dd>] ? repair_env_string+0x58/0x58
[   85.820241]  [<ffffffff8234b120>] ? early_idt_handlers+0x120/0x120
[   85.821311]  [<ffffffff8234b498>] x86_64_start_reservations+0x2a/0x2c
[   85.822434]  [<ffffffff8234b5d5>] x86_64_start_kernel+0x13b/0x148
[   85.823545] Code: 48 89 e5 0f 09 5d c3 55 48 89 e5 9c 58 5d c3 55 48 89 e5 57 9d 5d c3 55 48 89 e5 fa 5d c3 55 48 89 e5 fb 5d c3 55 48 89 e5 fb f4 <5d> c3 55 48 89 e5 f4 5d c3 55 49 89 ca 49 89 d1 8b 07 48 89 e5 


[  608.380273] INFO: rcu_sched self-detected stall on CPU { 0}  (t=15000 jiffies g=5755 c=5754 q=18379)
[  608.383101] sending NMI to all CPUs:
[  608.383940] NMI backtrace for cpu 0
[  608.384786] CPU: 0 PID: 692 Comm: kswapd0 Not tainted 3.13.0-rc7-00177-g8cb75e0 #1
[  608.392624] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  608.393750] task: ffff8801195f5a00 ti: ffff8800b6468000 task.ti: ffff8800b6468000
[  608.395395] RIP: 0010:[<ffffffff81060dd2>]  [<ffffffff81060dd2>] flat_send_IPI_mask+0x7e/0xac
[  608.397299] RSP: 0018:ffff88011fc03dd0  EFLAGS: 00010046
[  608.398342] RAX: 0000000000000c00 RBX: 0000000000000046 RCX: 0000000000000007
[  608.399622] RDX: ffffffff82221ca0 RSI: 0000000000000002 RDI: 0000000000000300
[  608.400863] RBP: ffff88011fc03df0 R08: 0000000000000000 R09: 0000000000000000
[  608.402152] R10: 000000000000c120 R11: 0000000000000000 R12: 0000000000000002
[  608.403398] R13: 0000000000000c00 R14: 000000000000000f R15: 0000000000000000
[  608.404650] FS:  0000000000000000(0000) GS:ffff88011fc00000(0000) knlGS:0000000000000000
[  608.406431] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  608.407528] CR2: 00000000006e5dc8 CR3: 000000000220c000 CR4: 00000000000006f0
[  608.408821] Stack:
[  608.409485]  0000000000002710 ffff88011fc0d800 0000000000000000 ffffffff8224da00
[  608.411472]  ffff88011fc03e00 ffffffff81061079 ffff88011fc03e18 ffffffff8105e2b1
[  608.413416]  ffffffff8224da00 ffff88011fc03e70 ffffffff81111e9b ffff88011fc13040
[  608.415381] Call Trace:
[  608.416100]  <IRQ> 
[  608.416398]  [<ffffffff81061079>] flat_send_IPI_all+0x1f/0x4a
[  608.418058]  [<ffffffff8105e2b1>] arch_trigger_all_cpu_backtrace+0x52/0x84
[  608.419311]  [<ffffffff81111e9b>] rcu_check_callbacks+0x205/0x57f
[  608.420480]  [<ffffffff8111b057>] ? tick_sched_do_timer+0x2f/0x2f
[  608.421628]  [<ffffffff810d044e>] update_process_times+0x3d/0x65
[  608.422779]  [<ffffffff8111af6c>] tick_sched_handle+0x37/0x43
[  608.423872]  [<ffffffff8111b091>] tick_sched_timer+0x3a/0x58
[  608.424958]  [<ffffffff810e3ac7>] __run_hrtimer+0x96/0x19a
[  608.426059]  [<ffffffff810e42ff>] hrtimer_interrupt+0xe8/0x1e3
[  608.427167]  [<ffffffff8105cbf1>] local_apic_timer_interrupt+0x54/0x57
[  608.428390]  [<ffffffff81a0f123>] smp_apic_timer_interrupt+0x3f/0x50
[  608.429561]  [<ffffffff81a0de32>] apic_timer_interrupt+0x72/0x80
[  608.430686]  <EOI> 
[  608.430977]  [<ffffffff814f2f9a>] ? find_last_bit+0x4a/0x4a
[  608.432785]  [<ffffffff8116d25f>] ? zone_watermark_ok_safe+0x61/0xad
[  608.433954]  [<ffffffff81177dac>] zone_balanced+0x1e/0x43
[  608.435042]  [<ffffffff8117b461>] balance_pgdat+0x3b6/0x501
[  608.436132]  [<ffffffff8117b8eb>] kswapd+0x33f/0x3c7
[  608.437169]  [<ffffffff810fb058>] ? __wake_up_sync+0x12/0x12
[  608.438272]  [<ffffffff8117b5ac>] ? balance_pgdat+0x501/0x501
[  608.439367]  [<ffffffff810e155c>] kthread+0xdb/0xe3
[  608.440394]  [<ffffffff810e1481>] ? kthread_create_on_node+0x16f/0x16f
[  608.441586]  [<ffffffff81a0d1bc>] ret_from_fork+0x7c/0xb0
[  608.442637]  [<ffffffff810e1481>] ? kthread_create_on_node+0x16f/0x16f
[  608.443852] Code: eb f0 44 89 f0 c1 e0 18 89 04 25 10 33 5f ff 44 89 e0 44 09 e8 41 81 cd 00 04 00 00 41 83 fc 02 41 0f 44 c5 89 04 25 00 33 5f ff <f6> c7 02 75 11 48 89 df 57 9d 0f 1f 44 00 00 e8 6e 94 0e 00 eb 

Full dmesgs and kconfig are attached.
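
For triaging the attached dmesgs, the stall header lines can be pulled out
with a small script. The following is only a sketch: the function name
parse_stall and the regular expression are illustrative assumptions matched
against the two stall lines quoted above, not a kernel-provided tool, and
other kernel versions may format the message differently. The field
meanings noted in the comments follow my reading of
Documentation/RCU/stallwarn.txt and should be checked against the tree
under test.

```python
import re

# Hypothetical pattern, written against the two stall lines in this report.
STALL_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+INFO: (?P<flavor>\w+) self-detected stall on CPU "
    r"\{\s*(?P<cpus>[\d\s]+)\}\s+"
    r"\(t=(?P<t>\d+) jiffies g=(?P<g>\d+) c=(?P<c>\d+) q=(?P<q>\d+)\)"
)


def parse_stall(line):
    """Return the fields of an RCU self-detected stall line, or None."""
    m = STALL_RE.search(line)
    if m is None:
        return None
    return {
        "timestamp": float(m.group("ts")),   # printk timestamp in seconds
        "flavor": m.group("flavor"),         # e.g. rcu_sched
        "cpus": [int(x) for x in m.group("cpus").split()],
        "t": int(m.group("t")),              # believed: stall length in jiffies
        "g": int(m.group("g")),              # believed: current grace period
        "c": int(m.group("c")),              # believed: last completed grace period
        "q": int(m.group("q")),              # believed: queued RCU callbacks
    }


if __name__ == "__main__":
    sample = (
        "[   85.786775] INFO: rcu_sched self-detected stall on CPU { 1}  "
        "(t=15000 jiffies g=233 c=232 q=1940)"
    )
    print(parse_stall(sample))
```

Lines that are not stall headers (register dumps, call traces) return None,
so the function can be applied to every line of a dmesg file as a filter.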

Thanks,
Fengguang

View attachment "dmesg-ltp" of type "text/plain" (173831 bytes)

View attachment "dmesg-xfstests" of type "text/plain" (167293 bytes)

View attachment "x86_64-lkp" of type "text/plain" (80581 bytes)
