Message-ID: <CANfgCY2bSyw9n9tYZW46yRHXM-fF+XqobVOjjE_uOWcPaUWHCA@mail.gmail.com>
Date:	Fri, 11 Dec 2015 22:17:33 +1100
From:	Ross Green <rgkernel@...il.com>
To:	linux-kernel@...r.kernel.org
Cc:	paulmck@...ux.vnet.ibm.com, mingo@...nel.org,
	jiangshanlai@...il.com, dipankar@...ibm.com,
	akpm@...ux-foundation.org, mathieu.desnoyers@...icios.com,
	josh@...htriplett.org, tglx@...utronix.de, peterz@...radead.org,
	rostedt@...dmis.org, dhowells@...hat.com, edumazet@...gle.com,
	dvhart@...ux.intel.com, fweisbec@...il.com, oleg@...hat.com,
	bobby.prani@...il.com
Subject: rcu_preempt self-detected stall on CPU from 4.4-rc4, since 3.17

I have been getting these stalls in kernels going back to 3.17.

This stall usually occurs under light load but often takes several
days to show itself. I have not found any simple way to trigger it;
indeed, heavy workloads seem not to provoke the fault.
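
One thing I am tempted to try, assuming the usual rcupdate module
parameter is available in this config, is shortening the RCU CPU stall
timeout so the detector fires on much shorter stalls while I am
watching (this only makes the warning appear sooner, it should not
make the underlying stall any more likely):

  # boot with a shorter stall timeout (value in seconds)
  rcupdate.rcu_cpu_stall_timeout=5

  # or change it at runtime on an already-booted kernel
  echo 5 > /sys/module/rcupdate/parameters/rcu_cpu_stall_timeout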

Anyone have any thoughts here?

I thought the recent patch from peterz to kernel/sched/wait.c might
help the situation, but alas, after a few days of running 4.4-rc4 the
following turned up.

[179922.003570] INFO: rcu_preempt self-detected stall on CPU
[179922.008178] INFO: rcu_preempt detected stalls on CPUs/tasks:

[179922.008178]         0-...: (1 ticks this GP) idle=a91/1/0 softirq=1296733/1296733 fqs=0
[179922.008178]
[179922.008209] (detected by 1, t=8775 jiffies, g=576439, c=576438, q=102)
[179922.008209] Task dump for CPU 0:
[179922.008209] swapper/0       R
[179922.008209]  running
[179922.008209]     0     0      0 0x00000000
[179922.008209] Backtrace:

[179922.008239] Backtrace aborted due to bad frame pointer <c0907f54>
[179922.008239] rcu_preempt kthread starved for 8775 jiffies! g576439 c576438 f0x0 s3 ->state=0x1

[179922.060302]         0-...: (1 ticks this GP) idle=a91/1/0 softirq=1296733/1296733 fqs=0
[179922.068023]          (t=8775 jiffies g=576439 c=576438 q=102)
[179922.073913] rcu_preempt kthread starved for 8775 jiffies! g576439 c576438 f0x2 s3 ->state=0x100
[179922.083587] Task dump for CPU 0:
[179922.087097] swapper/0       R running      0     0      0 0x00000000
[179922.093292] Backtrace:
[179922.096313] [<c0013ea8>] (dump_backtrace) from [<c00140a4>] (show_stack+0x18/0x1c)
[179922.104675]  r7:c0908514 r6:80080193 r5:00000000 r4:c090aca8
[179922.110809] [<c001408c>] (show_stack) from [<c005a858>] (sched_show_task+0xbc/0x110)
[179922.119049] [<c005a79c>] (sched_show_task) from [<c005ccd4>] (dump_cpu_task+0x40/0x44)
[179922.127624]  r5:c0917680 r4:00000000
[179922.131042] [<c005cc94>] (dump_cpu_task) from [<c0082268>] (rcu_dump_cpu_stacks+0x9c/0xdc)
[179922.140350]  r5:c0917680 r4:00000001
[179922.143157] [<c00821cc>] (rcu_dump_cpu_stacks) from [<c008637c>] (rcu_check_callbacks+0x504/0x8e4)
[179922.153808]  r9:c0908514 r8:c0917680 r7:00000066 r6:2eeab000 r5:c0904300 r4:ef7af300
[179922.161499] [<c0085e78>] (rcu_check_callbacks) from [<c00895d0>] (update_process_times+0x40/0x6c)
[179922.170898]  r10:c009a584 r9:00000001 r8:ef7abc4c r7:0000a3a3 r6:4ec3391c r5:00000000
[179922.179901]  r4:c090aca8
[179922.182708] [<c0089590>] (update_process_times) from [<c009a580>] (tick_sched_handle+0x50/0x54)
[179922.192108]  r5:c0907f10 r4:ef7abe40
[179922.195983] [<c009a530>] (tick_sched_handle) from [<c009a5d4>] (tick_sched_timer+0x50/0x94)
[179922.204895] [<c009a584>] (tick_sched_timer) from [<c0089fe4>] (__hrtimer_run_queues+0x110/0x1a0)
[179922.214324]  r7:00000000 r6:ef7abc40 r5:ef7abe40 r4:ef7abc00
[179922.220428] [<c0089ed4>] (__hrtimer_run_queues) from [<c008a674>] (hrtimer_interrupt+0xac/0x1f8)
[179922.227111]  r10:ef7abc78 r9:ef7abc98 r8:ef7abc14 r7:ef7abcb8 r6:ffffffff r5:00000003
[179922.238220]  r4:ef7abc00
[179922.238220] [<c008a5c8>] (hrtimer_interrupt) from [<c00170ec>] (twd_handler+0x38/0x48)
[179922.238220]  r10:c09084e8 r9:fa241100 r8:00000011 r7:ef028780 r6:c092574c r5:ef005cc0
[179922.257110]  r4:00000001
[179922.257110] [<c00170b4>] (twd_handler) from [<c007c8f8>] (handle_percpu_devid_irq+0x74/0x8c)
[179922.269683]  r5:ef005cc0 r4:ef7b1740
[179922.269683] [<c007c884>] (handle_percpu_devid_irq) from [<c0078454>] (generic_handle_irq+0x2c/0x3c)
[179922.283233]  r9:fa241100 r8:ef008000 r7:00000001 r6:00000000 r5:00000000 r4:c09013e8
[179922.290985] [<c0078428>] (generic_handle_irq) from [<c007872c>] (__handle_domain_irq+0x64/0xbc)
[179922.300842] [<c00786c8>] (__handle_domain_irq) from [<c00094c0>] (gic_handle_irq+0x50/0x90)
[179922.303222]  r9:fa241100 r8:fa240100 r7:c0907f10 r6:fa24010c r5:c09087a8 r4:c0925748
[179922.315216] [<c0009470>] (gic_handle_irq) from [<c0014bd4>] (__irq_svc+0x54/0x90)
[179922.319000] Exception stack(0xc0907f10 to 0xc0907f58)
[179922.331542] 7f00:                                     00000000 ef7ab390 fe600000 00000000
[179922.331542] 7f20: c0906000 c090849c c0900364 c06a8124 c0907f80 c0944563 c09084e8 c0907f6c
[179922.349029] 7f40: c0907f4c c0907f60 c00287ac c0010ba8 60080113 ffffffff
[179922.349029]  r9:c0944563 r8:c0907f80 r7:c0907f44 r6:ffffffff r5:60080113 r4:c0010ba8
[179922.357116] [<c0010b80>] (arch_cpu_idle) from [<c006f034>] (default_idle_call+0x28/0x34)
[179922.368926] [<c006f00c>] (default_idle_call) from [<c006f154>] (cpu_startup_entry+0x114/0x18c)
[179922.368926] [<c006f040>] (cpu_startup_entry) from [<c069fc6c>] (rest_init+0x90/0x94)
[179922.385284]  r7:ffffffff r4:00000002
[179922.393463] [<c069fbdc>] (rest_init) from [<c08bbcec>] (start_kernel+0x370/0x37c)
[179922.400421]  r5:c0947000 r4:00000000
[179922.400421] [<c08bb97c>] (start_kernel) from [<8000807c>] (0x8000807c)
$
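
If it would help, next time this shows up I can also try to capture the
state of the rcu_preempt kthread itself, since the report says it is
being starved. A rough sketch of what I have in mind, assuming the
magic SysRq key is enabled in this config:

  # allow all SysRq functions, then dump every task's state and stack
  echo 1 > /proc/sys/kernel/sysrq
  echo t > /proc/sysrq-trigger

and then pull the resulting task dump (including rcu_preempt) out of
the dmesg buffer.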
