lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 21 Nov 2017 19:19:07 +0800
From:   Fengguang Wu <fengguang.wu@...el.com>
To:     netdev@...r.kernel.org
Cc:     Thomas Graf <tgraf@...g.ch>,
        Herbert Xu <herbert@...dor.apana.org.au>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        "David S. Miller" <davem@...emloft.net>,
        Florian Westphal <fw@...len.de>, linux-kernel@...r.kernel.org,
        lkp@...org
Subject: [rht_deferred_worker] BUG: workqueue lockup - pool cpus=0 node=0
 flags=0x0 nice=0 stuck for 62s!

Hello,

FYI this happens in mainline kernel 4.14.0-10859-gcf9b077.
It at least dates back to v4.5 .

It occurs in 2 out of 2 boots.

[  204.036012] Average test time: 28300548370
[  204.045272] Testing concurrent rhashtable access from 10 threads
[  206.134970] Writes:  Total: 2  Max/Min: 0/0   Fail: 0
[  267.575158] Writes:  Total: 2  Max/Min: 0/0   Fail: 0
[  329.015853] Writes:  Total: 2  Max/Min: 0/0   Fail: 0
[  368.268565] BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 62s!
[  368.294584] Showing busy workqueues and worker pools:
[  368.303142] workqueue events: flags=0x0
[  368.310492]   pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=4/256
[  368.312885]     in-flight: 3:rht_deferred_worker
[  368.312885]     pending: rht_deferred_worker, tsc_refine_calibration_work, vmstat_shepherd
[  368.342255] pool 0: cpus=0 node=0 flags=0x0 nice=0 hung=62s workers=2 idle: 16
[  368.356480] INFO: task swapper/0:1 blocked for more than 120 seconds.
[  368.451098]       Not tainted 4.14.0-10859-gcf9b077 #13
[  368.564524] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  368.726003] swapper/0       D    0     1      0 0x80000000
[  368.845359] Call Trace:
[  368.908714]  ? __schedule+0x81a/0x84b:
context_switch at kernel/sched/core.c:2801
  (inlined by) __schedule at kernel/sched/core.c:3374
[  368.984507]  ? schedule+0x59/0x84:
__preempt_count_sub at arch/x86/include/asm/preempt.h:107 (discriminator 1)
  (inlined by) schedule at kernel/sched/core.c:3434 (discriminator 1)
[  369.067920]  ? schedule_timeout+0x1b/0xb9:
schedule_timeout at kernel/time/timer.c:1766
[  369.155414]  ? _raw_spin_unlock_irq+0x28/0x6a:
__raw_spin_unlock_irq at include/linux/spinlock_api_smp.h:168
  (inlined by) _raw_spin_unlock_irq at kernel/locking/spinlock.c:199
[  369.241353]  ? _raw_spin_unlock_irq+0x4a/0x6a:
__raw_spin_unlock_irq at include/linux/spinlock_api_smp.h:182
  (inlined by) _raw_spin_unlock_irq at kernel/locking/spinlock.c:199
[  369.334592]  ? __wait_for_common+0x186/0x24a:
do_wait_for_common at kernel/sched/completion.c:91
  (inlined by) __wait_for_common at kernel/sched/completion.c:112
[  369.431098]  ? console_conditional_schedule+0x3c/0x3c:
schedule_timeout at kernel/time/timer.c:1751
[  369.557839]  ? do_task_dead+0x46/0x46:
default_wake_function at kernel/sched/core.c:3626
[  369.651937]  ? wait_for_completion+0x19/0x1c:
wait_for_completion at kernel/sched/completion.c:145
[  369.752007]  ? kthread_stop+0x196/0x326:
kthread_stop at kernel/kthread.c:531
[  369.847831]  ? test_rht_init+0x963/0x9da:
test_rht_init at lib/test_rhashtable.c:663
[  369.942078]  ? ___siphash_aligned+0x10/0x10
[  370.077915]  ? test_rhltable+0xd1e/0xd1e:
test_rht_init at lib/test_rhashtable.c:565
[  370.171118]  ? do_one_initcall+0x96/0x19b:
do_one_initcall at init/main.c:827
[  370.272905]  ? parse_args+0x19d/0x250:
parse_one at kernel/params.c:156
  (inlined by) parse_args at kernel/params.c:191
[  370.358882]  ? kernel_init_freeable+0xd2/0x1b8:
do_initcall_level at init/main.c:886
  (inlined by) do_initcalls at init/main.c:901
  (inlined by) do_basic_setup at init/main.c:919
  (inlined by) kernel_init_freeable at init/main.c:1067
[  370.465397]  ? kernel_init_freeable+0xee/0x1b8:
do_initcall_level at init/main.c:892
  (inlined by) do_initcalls at init/main.c:901
  (inlined by) do_basic_setup at init/main.c:919
  (inlined by) kernel_init_freeable at init/main.c:1067
[  370.571169]  ? rest_init+0xb3/0xb3:
kernel_init at init/main.c:991
[  370.658826]  ? kernel_init+0xd/0x15f:
kernel_init at init/main.c:996
[  370.746031]  ? ret_from_fork+0x19/0x24:
ret_from_fork at arch/x86/entry/entry_32.S:299
[  370.839497]
[  370.839497] Showing all locks held in the system:
[  370.978033] 2 locks held by khungtaskd/17:
[  370.985872]  #0:  (rcu_read_lock){....}, at: [<7911eae2>] rcu_read_lock+0x0/0x6c:
rcu_read_lock at include/linux/rcupdate.h:628

rcu_read_lock+0x0/0x6c:
rcu_read_lock at include/linux/rcupdate.h:628

rcu_read_lock+0x0/0x6c:
rcu_read_lock at include/linux/rcupdate.h:628

rcu_read_lock+0x0/0x6c:
rcu_read_lock at include/linux/rcupdate.h:628

rcu_read_lock+0x0/0x6c:
rcu_read_lock at include/linux/rcupdate.h:628

rcu_read_lock+0x0/0x6c:
rcu_read_lock at include/linux/rcupdate.h:628

rcu_read_lock+0x0/0x6c:
rcu_read_lock at include/linux/rcupdate.h:628
[  370.999617]  #1:  (tasklist_lock){.+.+}, at: [<790bb514>] debug_show_all_locks+0x3a/0x13d:
debug_show_all_locks at kernel/locking/lockdep.c:4554
[  371.014651]
[  371.018049] =============================================
[  371.018049]
[  371.651885] NMI backtrace for cpu 0
[  371.668798] CPU: 0 PID: 17 Comm: khungtaskd Not tainted 4.14.0-10859-gcf9b077 #13
[  371.670706] Call Trace:
[  371.670706]  ? show_stack+0x5e/0x66:
show_stack at arch/x86/kernel/dumpstack.c:179
[  371.670706]  ? dump_stack+0xc1/0x121:
__dump_stack at lib/dump_stack.c:17
  (inlined by) dump_stack at lib/dump_stack.c:53
[  371.670706]  ? nmi_cpu_backtrace+0xf7/0x110:
nmi_cpu_backtrace at lib/nmi_backtrace.c:106
[  371.670706]  ? nmi_trigger_cpumask_backtrace+0xcf/0x1b2:
cpumask_empty at include/linux/cpumask.h:494
  (inlined by) nmi_trigger_cpumask_backtrace at lib/nmi_backtrace.c:67
[  371.670706]  ? lapic_can_unplug_cpu+0xb5/0xb5:
nmi_raise_cpu_backtrace at arch/x86/kernel/apic/hw_nmi.c:32
[  371.670706]  ? watchdog+0x524/0x593:
check_hung_task at kernel/hung_task.c:133
  (inlined by) check_hung_uninterruptible_tasks at kernel/hung_task.c:190
  (inlined by) watchdog at kernel/hung_task.c:249
[  371.670706]  ? kthread+0x154/0x159:
kthread at kernel/kthread.c:238
[  371.670706]  ? rcu_read_lock+0x6c/0x6c:
rootfs_mount at init/do_mounts.c:609

rcu_read_lock+0x6c/0x6c:
rcu_read_unlock at include/linux/rcupdate.h:682

rcu_read_lock+0x6c/0x6c:
ptrace_event_pid at include/linux/ptrace.h:172

rcu_read_lock+0x6c/0x6c:
write_seqlock at include/linux/seqlock.h:448

rcu_read_lock+0x6c/0x6c:
copy_overflow at include/linux/thread_info.h:125

rcu_read_lock+0x6c/0x6c:
__ptrace_may_access at kernel/ptrace.c:272

rcu_read_lock+0x6c/0x6c:
__sigqueue_alloc at kernel/signal.c:368
[  371.670706]  ? queuing_blocked+0x68/0x68:
kthread at kernel/kthread.c:198
[  371.670706]  ? ret_from_fork+0x19/0x24:
ret_from_fork at arch/x86/entry/entry_32.S:299
[  372.922491] Kernel panic - not syncing: hung_task: blocked tasks
[  372.923689] CPU: 0 PID: 17 Comm: khungtaskd Not tainted 4.14.0-10859-gcf9b077 #13
[  372.923689] Call Trace:
[  372.923689]  ? show_stack+0x5e/0x66:
show_stack at arch/x86/kernel/dumpstack.c:179
[  372.923689]  ? dump_stack+0xc1/0x121:
__dump_stack at lib/dump_stack.c:17
  (inlined by) dump_stack at lib/dump_stack.c:53
[  372.923689]  ? panic+0xec/0x2b6:
panic at kernel/panic.c:192
[  372.923689]  ? watchdog+0x52e/0x593:
__read_once_size at include/linux/compiler.h:178
  (inlined by) check_hung_uninterruptible_tasks at kernel/hung_task.c:180
  (inlined by) watchdog at kernel/hung_task.c:249
[  372.923689]  ? kthread+0x154/0x159:
kthread at kernel/kthread.c:238
[  372.923689]  ? rcu_read_lock+0x6c/0x6c:
rootfs_mount at init/do_mounts.c:609

rcu_read_lock+0x6c/0x6c:
rcu_read_unlock at include/linux/rcupdate.h:682

rcu_read_lock+0x6c/0x6c:
ptrace_event_pid at include/linux/ptrace.h:172

rcu_read_lock+0x6c/0x6c:
write_seqlock at include/linux/seqlock.h:448

rcu_read_lock+0x6c/0x6c:
copy_overflow at include/linux/thread_info.h:125

rcu_read_lock+0x6c/0x6c:
__ptrace_may_access at kernel/ptrace.c:272

rcu_read_lock+0x6c/0x6c:
__sigqueue_alloc at kernel/signal.c:368
[  372.923689]  ? queuing_blocked+0x68/0x68:
kthread at kernel/kthread.c:198
[  372.923689]  ? ret_from_fork+0x19/0x24:
ret_from_fork at arch/x86/entry/entry_32.S:299
[  372.923689] Kernel Offset: disabled

Attached the full dmesg, kconfig and reproduce scripts.

Thanks,
Fengguang

View attachment "dmesg-vm-lkp-wsx03-openwrt-i386-21:20171117091656:i386-randconfig-i0-201746:4.14.0-10859-gcf9b077:13" of type "text/plain" (38757 bytes)

View attachment ".config" of type "text/plain" (91660 bytes)

View attachment "job-script" of type "text/plain" (3732 bytes)

View attachment "reproduce-vm-lkp-wsx03-openwrt-i386-21:20171117091656:i386-randconfig-i0-201746:4.14.0-10859-gcf9b077:13" of type "text/plain" (2182 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ