linux-kernel - [btrfs_destroy_workqueue] WARNING: CPU: 0 PID: 6954 at kernel/workqueue.c:4142 destroy

Message-ID: <20180419052548.ewio4pf6i7ptqj54@wfg-t540p.sh.intel.com>
Date:   Thu, 19 Apr 2018 13:25:48 +0800
From:   Fengguang Wu <fengguang.wu@...el.com>
To:     linux-btrfs@...r.kernel.org
Cc:     Chris Mason <clm@...com>, Josef Bacik <jbacik@...com>,
        David Sterba <dsterba@...e.com>, Jeff Mahoney <jeffm@...e.com>,
        linux-kernel@...r.kernel.org, lkp@...org
Subject: [btrfs_destroy_workqueue] WARNING: CPU: 0 PID: 6954 at
 kernel/workqueue.c:4142 destroy_workqueue+0x64/0x1e0

Hello,

FYI, this happens in mainline kernel 4.17.0-rc1.
It dates back to at least v4.14-rc2.

It's triggered when running fio tests. It's really hard to reproduce
(it happened only once in 4.17-rc1 and several times in v4.14-rc2), and
all bisect attempts have failed so far.

[  133.751073] WARNING: stack going in the wrong direction? ip=__schedule+0x489/0x830:
						perf_sw_event_sched at include/linux/perf_event.h:1062
						 (inlined by) perf_event_task_sched_out at include/linux/perf_event.h:1100
						 (inlined by) prepare_task_switch at kernel/sched/core.c:2636
						 (inlined by) context_switch at kernel/sched/core.c:2813
						 (inlined by) __schedule at kernel/sched/core.c:3490
[  134.048965] perf: interrupt took too long (9682 > 9626), lowering kernel.perf_event_max_sample_rate to 20000
[  134.472390] perf: interrupt took too long (12178 > 12102), lowering kernel.perf_event_max_sample_rate to 16000
[  234.324541] 2018-04-17 16:08:50 umount /fs/pmem0
[  234.324546]
[  240.185400] WARNING: CPU: 0 PID: 6954 at kernel/workqueue.c:4142 destroy_workqueue+0x64/0x1e0:
						destroy_workqueue at kernel/workqueue.c:4142 (discriminator 1)
[  240.197915] Modules linked in: btrfs xor zstd_decompress zstd_compress xxhash raid6_pq dm_mod sr_mod cdrom intel_rapl sd_mod sg sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp mgag200 kvm_intel ttm kvm irqbypass crct10dif_pclmul crc32_pclmul drm_kms_helper crc32c_intel ghash_clmulni_intel syscopyarea nd_pmem(O) dax_pmem(O) snd_pcm pcbc sysfillrect device_dax(O) nd_btt(O) snd_timer sysimgblt aesni_intel fb_sys_fops nd_e820(O) crypto_simd ipmi_si libnvdimm(O) snd soundcore ahci mxm_wmi cryptd ipmi_devintf wdat_wdt dcdbas nfit_test_iomap(O) libahci pcspkr drm megaraid_sas glue_helper libata ipmi_msghandler wmi acpi_power_meter shpchp ip_tables
[  240.267813] CPU: 0 PID: 6954 Comm: umount Tainted: G           O      4.17.0-rc1 #1
[  240.277473] Hardware name: Dell Inc. PowerEdge R630/0CNCJW, BIOS 2.1.7 06/16/2016
[  240.286967] RIP: 0010:destroy_workqueue+0x64/0x1e0:
						destroy_workqueue at kernel/workqueue.c:4142 (discriminator 1)
[  240.293463] RSP: 0018:ffffc90021b0fde0 EFLAGS: 00010202
[  240.300462] RAX: ffff884072f4c058 RBX: ffff88407a17dc00 RCX: ffff884072f4c000
[  240.309628] RDX: ffff884072f4c058 RSI: 0000000000000000 RDI: ffffffff820cfa30
[  240.318802] RBP: ffff88407a17dc20 R08: ffffc90021b0fd40 R09: 0000000000000000
[  240.327990] R10: ffffc90021b0fdb8 R11: 0000000000000000 R12: ffff8820347f4fc0
[  240.337306] R13: ffff884079c23138 R14: ffffffff82d47aa0 R15: 0000000000000000
[  240.346546] FS:  00007f941fee2840(0000) GS:ffff882067600000(0000) knlGS:0000000000000000
[  240.356876] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  240.364602] CR2: 00007ff793645258 CR3: 0000004074f4a005 CR4: 00000000001606f0
[  240.373917] Call Trace:
[  240.378060]  btrfs_destroy_workqueue+0x40/0x110 [btrfs]
[  240.385322]  btrfs_stop_all_workers+0x2d/0xf0 [btrfs]
[  240.392397]  close_ctree+0x133/0x2f0 [btrfs]
[  240.398581]  generic_shutdown_super+0x6c/0x120:
						__read_once_size at include/linux/compiler.h:188
						 (inlined by) list_empty at include/linux/list.h:203
						 (inlined by) generic_shutdown_super at fs/super.c:442
[  240.404956]  kill_anon_super+0xe/0x20:
						kill_anon_super at fs/super.c:1038
[  240.410482]  btrfs_kill_super+0x13/0x100 [btrfs]
[  240.417076]  deactivate_locked_super+0x3f/0x70:
						deactivate_locked_super at fs/super.c:320
[  240.423483]  cleanup_mnt+0x3b/0x70:
						cleanup_mnt at fs/namespace.c:1174
[  240.428737]  task_work_run+0xa3/0xe0:
						task_work_run at kernel/task_work.c:115 (discriminator 1)
[  240.434199]  exit_to_usermode_loop+0x9e/0xa0:
						tracehook_notify_resume at include/linux/tracehook.h:191
						 (inlined by) exit_to_usermode_loop at arch/x86/entry/common.c:166
[  240.440456]  do_syscall_64+0x16c/0x180:
						prepare_exit_to_usermode at arch/x86/entry/common.c:196
						 (inlined by) syscall_return_slowpath at arch/x86/entry/common.c:265
						 (inlined by) do_syscall_64 at arch/x86/entry/common.c:290
[  240.446133]  entry_SYSCALL_64_after_hwframe+0x44/0xa9:
						entry_SYSCALL_64_after_hwframe at arch/x86/entry/entry_64.S:247
[  240.453285] RIP: 0033:0x7f941f7c7277
[  240.458782] RSP: 002b:00007fffb9a17cf8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
[  240.468778] RAX: 0000000000000000 RBX: 000000000158b6e0 RCX: 00007f941f7c7277
[  240.478299] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 000000000158b8c0
[  240.487824] RBP: 000000000158b8c0 R08: 0000000000000000 R09: 0000000000000015
[  240.497317] R10: 00000000000006b0 R11: 0000000000000246 R12: 00007f941fcc9e44
[  240.506810] R13: 0000000000000000 R14: 0000000000000000 R15: 00007fffb9a17f80
[  240.516315] Code: c2 74 19 8b 30 85 f6 74 f1 0f 0b 48 89 ef e8 84 9f 8c 00 5b 5d 41 5c e9 cb fa ff ff 48 39 8b a0 00 00 00 74 0a 83 79 18 01 7e 04 <0f> 0b eb dc 8b 41 58 85 c0 0f 85 3d 01 00 00 48 8b 41 60 48 8d
[  240.540655] ---[ end trace faf649c5bf432714 ]---
[  240.547594] Showing busy workqueues and worker pools:
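
For anyone scanning the attached dmesg for further occurrences, the header line of the splat above can be picked apart with a small script. This is only a minimal sketch, assuming the header keeps the "WARNING: CPU: N PID: N at file:line func+off/size" layout shown in this log; the names WARN_RE and parse_warning are made up for illustration and are not part of any kernel or LKP tooling.

```python
import re

# Matches the header line of a kernel WARNING splat in the layout seen above.
# Lines such as "WARNING: stack going in the wrong direction?" do not match,
# because they lack the "CPU: ... PID: ... at file:line" part.
WARN_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) "
    r"at (?P<file>\S+):(?P<line>\d+) "
    r"(?P<func>\w+)\+(?P<off>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
)


def parse_warning(line):
    """Return the fields of a WARNING header line as a dict, or None if no match."""
    m = WARN_RE.search(line)
    if m is None:
        return None
    d = m.groupdict()
    return {
        "timestamp": float(d["ts"]),
        "cpu": int(d["cpu"]),
        "pid": int(d["pid"]),
        "file": d["file"],
        "line": int(d["line"]),
        "function": d["func"],
        "offset": int(d["off"], 16),  # byte offset into the function
        "size": int(d["size"], 16),   # total size of the function in bytes
    }


sample = (
    "[  240.185400] WARNING: CPU: 0 PID: 6954 at kernel/workqueue.c:4142 "
    "destroy_workqueue+0x64/0x1e0:"
)
info = parse_warning(sample)
print(info["file"], info["line"], info["function"], hex(info["offset"]))
```

Running parse_warning over every line of a saved dmesg file and keeping the non-None results would give a quick list of splats to compare across the 4.17-rc1 and v4.14-rc2 runs.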

Attached the full dmesg and kconfig.

Thanks,
Fengguang

View attachment "dmesg-lkp-hsw-ep6:20180417161212:x86_64-rhel-7.2:gcc-7:4.17.0-rc1:1" of type "text/plain" (155102 bytes)

View attachment ".config" of type "text/plain" (164169 bytes)