Message-ID: <ed075ba8-b916-9a5e-8c11-c99f47cd9ce9@candelatech.com>
Date: Wed, 9 Aug 2023 11:28:20 -0700
From: Ben Greear <greearb@...delatech.com>
To: Yu Kuai <yukuai1@...weicloud.com>,
David Howells <dhowells@...hat.com>,
Theodore Ts'o <tytso@....edu>,
Andrew Morton <akpm@...ux-foundation.org>,
Song Liu <song@...nel.org>, Christoph Hellwig <hch@....de>,
Alasdair Kergon <agk@...hat.com>
Cc: linux-raid@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, dm-devel@...hat.com,
linux-ext4@...r.kernel.org, "yukuai (C)" <yukuai3@...wei.com>
Subject: Re: [dm-devel] Processes hung in "D" state in ext4, mm, md and
dmcrypt (lockdep output)
On 7/26/23 19:38, Yu Kuai wrote:
> Hi,
>
>> On 2023/07/26 18:02, David Howells wrote:
>> Hi,
>>
>> With 6.5-rc2 (6.5.0-0.rc2.20230721gitf7e3a1bafdea.20.fc39.x86_64), I'm seeing
>> a bunch of processes getting stuck in the D state on my desktop after a few
>> hours of reading email and compiling stuff. It's happened every day this week
>> so far and I managed to grab stack traces of the stuck processes this morning
>> (see attached).
>>
>> There are two blockdevs involved below, /dev/md2 and /dev/md3. md3 is a raid1
>> array built from two partitions, with an ext4 filesystem on it. md2 is similar, but
>> it's dm-crypted and ext4 is on top of that.
Hello,
I have been hitting deadlocks in the 6.5 series, and I finally managed to catch one with a
lockdep kernel and a serial console active.
My kernels carry a lot of wifi patches on top of mainline, but I see no indication in this
backtrace that the problem is network related. A quick search of the mailing list found
this thread, and it looks like it might be similar to what I am seeing. I can test
patches and try other debug options if anyone has suggestions:
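For anyone wanting to reproduce this kind of capture, the reports below come from the
standard hung-task detector plus lockdep; a sketch of the knobs involved (run as root;
the exact values on my box may differ):

```shell
# Keep hung-task reports coming instead of stopping after the default
# count (the log below ends with a "reports are suppressed" message).
sysctl -w kernel.hung_task_warnings=-1

# Flag tasks stuck in D state after 180 seconds, matching the
# "blocked for more than 180 seconds" threshold seen below.
sysctl -w kernel.hung_task_timeout_secs=180

# A manual dump of all blocked (D-state) tasks can also be triggered
# via the magic SysRq interface at any time.
echo 1 > /proc/sys/kernel/sysrq
echo w > /proc/sysrq-trigger
```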
INFO: task mdX_raid1:488 blocked for more than 180 seconds.
Tainted: G W 6.5.0-rc4+ #6
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:mdX_raid1 state:D stack:0 pid:488 ppid:2 flags:0x00004000
Call Trace:
<TASK>
__schedule+0x7aa/0x16e0
? io_schedule_timeout+0xb0/0xb0
? lockdep_hardirqs_on_prepare+0x132/0x200
? _raw_spin_unlock_irqrestore+0x2b/0x50
? _raw_spin_unlock_irqrestore+0x2b/0x50
schedule+0x89/0x110
md_super_wait+0x110/0x160
? md_super_write+0x1b0/0x1b0
? print_usage_bug.part.0+0x3d0/0x3d0
? __mutex_lock+0xd22/0x1200
? destroy_sched_domains_rcu+0x30/0x30
? lock_acquire+0x15a/0x3b0
? md_bitmap_daemon_work+0x2fc/0x6a0
? find_held_lock+0x83/0xa0
md_bitmap_wait_writes+0x15f/0x170
? behind_writes_used_reset+0x40/0x40
? mark_held_locks+0x65/0x90
? lockdep_hardirqs_on_prepare+0x132/0x200
? _raw_spin_unlock_irq+0x1f/0x40
md_bitmap_daemon_work+0x483/0x6a0
? md_bitmap_write_all+0xe0/0xe0
? lock_acquire+0x15a/0x3b0
? lock_timer_base+0x93/0xd0
md_check_recovery+0x7b/0xa50
raid1d+0xdb/0x30b0 [raid1]
? lockdep_hardirqs_on_prepare+0x200/0x200
? lockdep_hardirqs_on_prepare+0x200/0x200
? lockdep_hardirqs_on_prepare+0x132/0x200
? _raw_spin_unlock_irqrestore+0x2b/0x50
? raid1_read_request+0x13b0/0x13b0 [raid1]
? finish_wait+0x4a/0xc0
? find_held_lock+0x83/0xa0
? lock_release+0x1d1/0x3b0
? md_thread+0x1fe/0x280
? reacquire_held_locks+0x270/0x270
? mark_held_locks+0x24/0x90
? lockdep_hardirqs_on_prepare+0x132/0x200
md_thread+0x175/0x280
? md_submit_discard_bio+0x210/0x210
? mark_held_locks+0x24/0x90
? destroy_sched_domains_rcu+0x30/0x30
? lockdep_hardirqs_on_prepare+0x132/0x200
? _raw_spin_unlock_irqrestore+0x2b/0x50
? md_submit_discard_bio+0x210/0x210
kthread+0x173/0x1b0
? kthread_complete_and_exit+0x20/0x20
ret_from_fork+0x28/0x50
? kthread_complete_and_exit+0x20/0x20
ret_from_fork_asm+0x11/0x20
</TASK>
INFO: task jbd2/dm-4-8:526 blocked for more than 180 seconds.
Tainted: G W 6.5.0-rc4+ #6
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:jbd2/dm-4-8 state:D stack:0 pid:526 ppid:2 flags:0x00004000
Call Trace:
<TASK>
__schedule+0x7aa/0x16e0
? io_schedule_timeout+0xb0/0xb0
? lock_sync+0xf0/0xf0
? find_held_lock+0x83/0xa0
? __wait_on_bit+0x95/0x100
? reacquire_held_locks+0x270/0x270
? submit_bio_noacct_nocheck+0x516/0x5a0
schedule+0x89/0x110
io_schedule+0x68/0xa0
bit_wait_io+0x12/0x80
__wait_on_bit+0x3a/0x100
? bit_wait+0x80/0x80
out_of_line_wait_on_bit+0xd7/0x110
? __wait_on_bit+0x100/0x100
? cpuacct_css_alloc+0xd0/0xd0
jbd2_journal_commit_transaction+0x2767/0x2ec0
? jbd2_journal_finish_inode_data_buffers+0x50/0x50
? find_held_lock+0x83/0xa0
? lock_release+0x1d1/0x3b0
? __timer_delete_sync+0x170/0x210
? reacquire_held_locks+0x270/0x270
? lockdep_hardirqs_on_prepare+0x132/0x200
? _raw_spin_unlock_irqrestore+0x2b/0x50
? timer_shutdown+0xe0/0xe0
? do_raw_read_unlock+0x80/0x80
kjournald2+0x16c/0x560
? jbd2_write_superblock+0x450/0x450
? mark_held_locks+0x24/0x90
? destroy_sched_domains_rcu+0x30/0x30
? lockdep_hardirqs_on_prepare+0x132/0x200
? _raw_spin_unlock_irqrestore+0x2b/0x50
? jbd2_write_superblock+0x450/0x450
kthread+0x173/0x1b0
? kthread_complete_and_exit+0x20/0x20
ret_from_fork+0x28/0x50
? kthread_complete_and_exit+0x20/0x20
ret_from_fork_asm+0x11/0x20
</TASK>
INFO: task journal-offline:190685 blocked for more than 180 seconds.
Tainted: G W 6.5.0-rc4+ #6
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:journal-offline state:D stack:0 pid:190685 ppid:1 flags:0x00000002
Call Trace:
<TASK>
__schedule+0x7aa/0x16e0
? io_schedule_timeout+0xb0/0xb0
? folio_wait_bit_common+0x11e/0x470
? find_held_lock+0x83/0xa0
? lock_release+0x1d1/0x3b0
? reacquire_held_locks+0x270/0x270
? do_raw_spin_lock+0x114/0x1a0
schedule+0x89/0x110
io_schedule+0x68/0xa0
folio_wait_bit_common+0x20b/0x470
? filemap_get_folios_contig+0x4e0/0x4e0
? filemap_invalidate_unlock_two+0x30/0x30
? lockdep_hardirqs_on_prepare+0x200/0x200
folio_wait_writeback+0x16/0x100
__filemap_fdatawait_range+0xc1/0x150
? filemap_get_folios_tag+0x5c0/0x5c0
? find_held_lock+0x83/0xa0
? __filemap_fdatawrite_range+0xc6/0x120
? delete_from_page_cache_batch+0x450/0x450
? strscpy+0xc5/0x190
file_write_and_wait_range+0x9b/0xc0
ext4_sync_file+0x134/0x640
__x64_sys_fsync+0x3e/0x60
do_syscall_64+0x34/0xb0
entry_SYSCALL_64_after_hwframe+0x46/0xb0
RIP: 0033:0x7f9088d0868a
RSP: 002b:00007f90869fd9f0 EFLAGS: 00000293 ORIG_RAX: 000000000000004a
RAX: ffffffffffffffda RBX: 000055efd35f5940 RCX: 00007f9088d0868a
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000000002a
RBP: 000055efd361a510 R08: 0000000000000000 R09: 00007fff65ac277f
R10: 00007f9088c8cc1e R11: 0000000000000293 R12: 000055efd2c527a0
R13: 0000000000000002 R14: 00007fff65ac2680 R15: 00007f90861fe000
</TASK>
INFO: task jbd2/dm-14-8:958 blocked for more than 180 seconds.
Tainted: G W 6.5.0-rc4+ #6
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:jbd2/dm-14-8 state:D stack:0 pid:958 ppid:2 flags:0x00004000
Call Trace:
<TASK>
__schedule+0x7aa/0x16e0
? io_schedule_timeout+0xb0/0xb0
? blk_mq_flush_plug_list+0x10/0x30
? __blk_flush_plug+0x1be/0x210
? lockdep_hardirqs_on_prepare+0x132/0x200
? _raw_spin_unlock_irqrestore+0x2b/0x50
? _raw_spin_unlock_irqrestore+0x2b/0x50
schedule+0x89/0x110
md_write_start+0x45e/0x590
? md_seq_show+0xc80/0xc80
? mark_lock+0x10b/0xd30
? destroy_sched_domains_rcu+0x30/0x30
? __lock_acquire+0x910/0x3240
raid1_make_request+0x17c/0x18d0 [raid1]
? __lock_acquire+0x910/0x3240
? lockdep_hardirqs_on_prepare+0x200/0x200
? lock_acquire+0x15a/0x3b0
? md_handle_request+0x230/0x5e0
? find_held_lock+0x83/0xa0
? raid1_end_write_request+0x720/0x720 [raid1]
? lock_release+0x1d1/0x3b0
? md_handle_request+0x29b/0x5e0
? reacquire_held_locks+0x270/0x270
? lock_acquire+0x15a/0x3b0
? bio_associate_blkg_from_css+0x209/0x9d0
? find_held_lock+0x83/0xa0
? lock_is_held_type+0xa6/0x110
md_handle_request+0x2dc/0x5e0
? mddev_init_writes_pending+0x60/0x60
? lock_release+0x1d1/0x3b0
? bio_associate_blkg+0x71/0x130
? reacquire_held_locks+0x270/0x270
? bio_associate_blkg_from_css+0x2e1/0x9d0
raid_map+0x61/0x80 [dm_raid]
__map_bio+0x85/0x290
dm_submit_bio+0x4b9/0xca0
? dm_get_live_table+0xd0/0xd0
? reacquire_held_locks+0x270/0x270
? lock_release+0x1d1/0x3b0
? lock_is_held_type+0xa6/0x110
__submit_bio+0x139/0x180
? __bio_queue_enter+0x350/0x350
? lockdep_hardirqs_on_prepare+0x132/0x200
? seqcount_lockdep_reader_access.constprop.0+0x82/0x90
? seqcount_lockdep_reader_access.constprop.0+0x82/0x90
? ktime_get+0x1c/0xb0
submit_bio_noacct_nocheck+0x2a1/0x5a0
? fscrypt_dio_supported+0xb0/0xb0
? blk_alloc_queue+0x310/0x310
? __probestub_jbd2_shrink_scan_enter+0x1/0x10
? submit_bio_noacct+0x310/0xa60
jbd2_journal_commit_transaction+0x18ed/0x2ec0
? jbd2_journal_finish_inode_data_buffers+0x50/0x50
? find_held_lock+0x83/0xa0
? lock_release+0x1d1/0x3b0
? __timer_delete_sync+0x170/0x210
? reacquire_held_locks+0x270/0x270
? lockdep_hardirqs_on_prepare+0x132/0x200
? _raw_spin_unlock_irqrestore+0x2b/0x50
? timer_shutdown+0xe0/0xe0
? mark_held_locks+0x65/0x90
kjournald2+0x16c/0x560
? jbd2_write_superblock+0x450/0x450
? mark_held_locks+0x24/0x90
? destroy_sched_domains_rcu+0x30/0x30
? lockdep_hardirqs_on_prepare+0x132/0x200
? _raw_spin_unlock_irqrestore+0x2b/0x50
? jbd2_write_superblock+0x450/0x450
kthread+0x173/0x1b0
? kthread_complete_and_exit+0x20/0x20
ret_from_fork+0x28/0x50
? kthread_complete_and_exit+0x20/0x20
ret_from_fork_asm+0x11/0x20
</TASK>
INFO: task kworker/u16:0:145831 blocked for more than 181 seconds.
Tainted: G W 6.5.0-rc4+ #6
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/u16:0 state:D stack:0 pid:145831 ppid:2 flags:0x00004000
Workqueue: writeback wb_workfn (flush-252:14)
Call Trace:
<TASK>
__schedule+0x7aa/0x16e0
? io_schedule_timeout+0xb0/0xb0
? blk_mq_flush_plug_list+0x10/0x30
? __blk_flush_plug+0x1be/0x210
? lockdep_hardirqs_on_prepare+0x132/0x200
? _raw_spin_unlock_irqrestore+0x2b/0x50
? _raw_spin_unlock_irqrestore+0x2b/0x50
schedule+0x89/0x110
md_write_start+0x45e/0x590
? md_seq_show+0xc80/0xc80
? mark_lock+0x10b/0xd30
? __writeback_inodes_wb+0x6b/0x130
? destroy_sched_domains_rcu+0x30/0x30
? __lock_acquire+0x910/0x3240
raid1_make_request+0x17c/0x18d0 [raid1]
? __lock_acquire+0x910/0x3240
? lockdep_hardirqs_on_prepare+0x200/0x200
? lock_acquire+0x15a/0x3b0
? md_handle_request+0x230/0x5e0
? find_held_lock+0x83/0xa0
? raid1_end_write_request+0x720/0x720 [raid1]
? lock_release+0x1d1/0x3b0
? md_handle_request+0x29b/0x5e0
? reacquire_held_locks+0x270/0x270
? lock_acquire+0x15a/0x3b0
? bio_associate_blkg_from_css+0x209/0x9d0
? find_held_lock+0x83/0xa0
? lock_is_held_type+0xa6/0x110
md_handle_request+0x2dc/0x5e0
? mddev_init_writes_pending+0x60/0x60
? lock_release+0x1d1/0x3b0
? bio_associate_blkg+0x71/0x130
? reacquire_held_locks+0x270/0x270
? bio_associate_blkg_from_css+0x2e1/0x9d0
raid_map+0x61/0x80 [dm_raid]
__map_bio+0x85/0x290
dm_submit_bio+0x4b9/0xca0
? dm_get_live_table+0xd0/0xd0
? reacquire_held_locks+0x270/0x270
? lock_release+0x1d1/0x3b0
? lock_is_held_type+0xa6/0x110
__submit_bio+0x139/0x180
? __bio_queue_enter+0x350/0x350
? lockdep_hardirqs_on_prepare+0x132/0x200
? seqcount_lockdep_reader_access.constprop.0+0x82/0x90
? seqcount_lockdep_reader_access.constprop.0+0x82/0x90
? ktime_get+0x1c/0xb0
submit_bio_noacct_nocheck+0x2a1/0x5a0
? fscrypt_set_bio_crypt_ctx+0x6c/0x120
? fscrypt_dio_supported+0xb0/0xb0
? blk_alloc_queue+0x310/0x310
? guard_bio_eod+0x51/0x310
? submit_bio_noacct+0x310/0xa60
__block_write_full_folio+0x3ae/0x6d0
? bdev_statx_dioalign+0xe0/0xe0
? block_invalidate_folio+0x2d0/0x2d0
writepage_cb+0x34/0xb0
write_cache_pages+0x257/0x650
? dirty_background_bytes_handler+0x20/0x20
? folio_clear_dirty_for_io+0x1b0/0x1b0
? do_writepages+0x25e/0x300
? find_held_lock+0x83/0xa0
? reacquire_held_locks+0x270/0x270
? do_raw_spin_lock+0x114/0x1a0
? spin_bug+0xe0/0xe0
do_writepages+0x192/0x300
? dirty_ratio_handler+0x50/0x50
? lock_release+0x1d1/0x3b0
? writeback_sb_inodes+0x2b2/0x7c0
? reacquire_held_locks+0x270/0x270
? do_raw_spin_lock+0x114/0x1a0
__writeback_single_inode+0x8f/0x620
writeback_sb_inodes+0x36d/0x7c0
? sync_inode_metadata+0xc0/0xc0
? down_read_trylock+0x16d/0x290
? trylock_super+0x11/0x60
__writeback_inodes_wb+0x6b/0x130
wb_writeback+0x447/0x540
? __writeback_inodes_wb+0x130/0x130
? mark_held_locks+0x65/0x90
? get_nr_inodes+0x5f/0xa0
wb_workfn+0x5b4/0x7c0
? inode_wait_for_writeback+0x30/0x30
? lock_acquire+0x15a/0x3b0
? process_one_work+0x53f/0xa30
? lock_sync+0xf0/0xf0
? mark_held_locks+0x24/0x90
process_one_work+0x622/0xa30
? lock_sync+0xf0/0xf0
? pwq_dec_nr_in_flight+0x100/0x100
? spin_bug+0xe0/0xe0
? mark_held_locks+0x24/0x90
worker_thread+0x8a/0x6f0
? process_one_work+0xa30/0xa30
kthread+0x173/0x1b0
? kthread_complete_and_exit+0x20/0x20
ret_from_fork+0x28/0x50
? kthread_complete_and_exit+0x20/0x20
ret_from_fork_asm+0x11/0x20
</TASK>
INFO: task kworker/u16:3:174452 blocked for more than 181 seconds.
Tainted: G W 6.5.0-rc4+ #6
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/u16:3 state:D stack:0 pid:174452 ppid:2 flags:0x00004000
Workqueue: writeback wb_workfn (flush-252:4)
Call Trace:
<TASK>
__schedule+0x7aa/0x16e0
? io_schedule_timeout+0xb0/0xb0
? blk_mq_flush_plug_list+0x10/0x30
? __blk_flush_plug+0x1be/0x210
? lockdep_hardirqs_on_prepare+0x132/0x200
? _raw_spin_unlock_irqrestore+0x2b/0x50
? _raw_spin_unlock_irqrestore+0x2b/0x50
schedule+0x89/0x110
md_super_wait+0x110/0x160
? md_super_write+0x1b0/0x1b0
? destroy_sched_domains_rcu+0x30/0x30
? write_page+0x1a3/0x890
md_bitmap_wait_writes+0x15f/0x170
? behind_writes_used_reset+0x40/0x40
? __percpu_counter_sum+0x16/0xe0
? find_held_lock+0x83/0xa0
? lock_release+0x1d1/0x3b0
? __percpu_counter_sum+0xca/0xe0
md_bitmap_unplug+0x1c5/0x210
flush_bio_list+0x19a/0x1a0 [raid1]
raid1_unplug+0x63/0x130 [raid1]
__blk_flush_plug+0x189/0x210
? blk_start_plug_nr_ios+0xd0/0xd0
blk_finish_plug+0x37/0x50
wb_writeback+0x48c/0x540
? __writeback_inodes_wb+0x130/0x130
wb_workfn+0x51a/0x7c0
? inode_wait_for_writeback+0x30/0x30
? lock_acquire+0x15a/0x3b0
? process_one_work+0x53f/0xa30
? lock_sync+0xf0/0xf0
? mark_held_locks+0x24/0x90
process_one_work+0x622/0xa30
? lock_sync+0xf0/0xf0
? pwq_dec_nr_in_flight+0x100/0x100
? spin_bug+0xe0/0xe0
? mark_held_locks+0x24/0x90
worker_thread+0x8a/0x6f0
? process_one_work+0xa30/0xa30
kthread+0x173/0x1b0
? kthread_complete_and_exit+0x20/0x20
ret_from_fork+0x28/0x50
? kthread_complete_and_exit+0x20/0x20
ret_from_fork_asm+0x11/0x20
</TASK>
INFO: task btserver:188934 blocked for more than 181 seconds.
Tainted: G W 6.5.0-rc4+ #6
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:btserver state:D stack:0 pid:188934 ppid:188373 flags:0x00000002
Call Trace:
<TASK>
__schedule+0x7aa/0x16e0
? io_schedule_timeout+0xb0/0xb0
? lock_release+0x1d1/0x3b0
? mark_held_locks+0x65/0x90
schedule+0x89/0x110
io_schedule+0x68/0xa0
bit_wait_io+0x12/0x80
__wait_on_bit_lock+0xcf/0x120
? bit_wait+0x80/0x80
out_of_line_wait_on_bit_lock+0xd7/0x110
? __wait_on_bit_lock+0x120/0x120
? cpuacct_css_alloc+0xd0/0xd0
do_get_write_access+0x458/0x830
jbd2_journal_get_write_access+0x76/0xa0
__ext4_journal_get_write_access+0xea/0x210
ext4_orphan_add+0x432/0x8b0
? stop_this_handle+0x210/0x210
? kasan_set_track+0x21/0x30
? __kasan_slab_alloc+0x62/0x70
? ext4_fc_destroy_dentry_cache+0x10/0x10
? kmem_cache_alloc+0x22e/0x290
? rcu_is_watching+0x1f/0x40
? jbd2__journal_start+0x22d/0x340
ext4_truncate+0x43c/0x930
? ext4_punch_hole+0x7a0/0x7a0
? __do_fault+0x290/0x290
ext4_setattr+0xc43/0x1100
? make_vfsgid+0x76/0xf0
notify_change+0x306/0x6c0
do_truncate+0xdc/0x160
? file_open_root+0x200/0x200
? ext4_file_write_iter+0xcd0/0xcd0
path_openat+0xf37/0x1120
? path_lookupat+0x2e0/0x2e0
? ___slab_alloc+0x995/0x1300
do_filp_open+0x151/0x270
? may_open_dev+0x50/0x50
? lock_is_held_type+0xa6/0x110
? reacquire_held_locks+0x270/0x270
? do_raw_spin_lock+0x114/0x1a0
? spin_bug+0xe0/0xe0
? do_raw_spin_unlock+0x8e/0xe0
? _raw_spin_unlock+0x1a/0x30
? alloc_fd+0x19b/0x320
do_sys_openat2+0x103/0x130
? build_open_flags+0x240/0x240
__x64_sys_openat+0xe0/0x130
? __ia32_compat_sys_open+0x120/0x120
? lockdep_hardirqs_on_prepare+0x132/0x200
? syscall_enter_from_user_mode+0x1c/0x50
do_syscall_64+0x34/0xb0
entry_SYSCALL_64_after_hwframe+0x46/0xb0
RIP: 0033:0x7f63771018e8
RSP: 002b:00007fffd4942040 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
RAX: ffffffffffffffda RBX: 0000000000000241 RCX: 00007f63771018e8
RDX: 0000000000000241 RSI: 0000000003c27440 RDI: 00000000ffffff9c
RBP: 0000000003c27440 R08: 0000000000000004 R09: 0000000000000001
R10: 00000000000001b6 R11: 0000000000000246 R12: 0000000003c27440
R13: 0000000000a316da R14: 0000000000000001 R15: 0000000000b0bbb0
</TASK>
Showing all locks held in the system:
1 lock held by rcu_tasks_kthre/12:
#0: ffffffff83840590 (rcu_tasks.tasks_gp_mutex){+.+.}-{4:4}, at: rcu_tasks_one_gp+0x32/0x690
1 lock held by rcu_tasks_trace/13:
#0: ffffffff83840290 (rcu_tasks_trace.tasks_gp_mutex){+.+.}-{4:4}, at: rcu_tasks_one_gp+0x32/0x690
1 lock held by khungtaskd/64:
#0: ffffffff83841160 (rcu_read_lock){....}-{1:3}, at: debug_show_all_locks+0x29/0x230
1 lock held by mdX_raid1/488:
#0: ffff88810aeac5e8 (&mddev->bitmap_info.mutex){+.+.}-{4:4}, at: md_bitmap_daemon_work+0x83/0x6a0
1 lock held by mdX_raid1/913:
#0: ffff888121ccc5e8 (&mddev->bitmap_info.mutex){+.+.}-{4:4}, at: md_bitmap_daemon_work+0x83/0x6a0
1 lock held by jbd2/dm-14-8/958:
#0: ffff88810dffe680 (&md->io_barrier){.+.+}-{0:0}, at: dm_get_live_table+0x0/0xd0
1 lock held by fuse mainloop/2513:
#0: ffff88811f35e468 (&pipe->mutex/1){+.+.}-{4:4}, at: splice_file_to_pipe+0x21/0xd0
1 lock held by fuse mainloop/2515:
#0: ffff88811c744468 (&pipe->mutex/1){+.+.}-{4:4}, at: splice_file_to_pipe+0x21/0xd0
4 locks held by kworker/u16:0/145831:
#0: ffff88810b797938 ((wq_completion)writeback){+.+.}-{0:0}, at: process_one_work+0x53f/0xa30
#1: ffff88814efc7dc0 ((work_completion)(&(&wb->dwork)->work)){+.+.}-{0:0}, at: process_one_work+0x53f/0xa30
#2: ffff88810a3080e0 (&type->s_umount_key#53){.+.+}-{4:4}, at: trylock_super+0x11/0x60
#3: ffff88810dffe680 (&md->io_barrier){.+.+}-{0:0}, at: dm_get_live_table+0x0/0xd0
2 locks held by kworker/u16:3/174452:
#0: ffff88810b797938 ((wq_completion)writeback){+.+.}-{0:0}, at: process_one_work+0x53f/0xa30
#1: ffff88812bd77dc0 ((work_completion)(&(&wb->dwork)->work)){+.+.}-{0:0}, at: process_one_work+0x53f/0xa30
2 locks held by kworker/4:1/184807:
#0: ffff88810dc6e138 ((wq_completion)md){+.+.}-{0:0}, at: process_one_work+0x53f/0xa30
#1: ffff888145e5fdc0 ((work_completion)(&mddev->flush_work)){+.+.}-{0:0}, at: process_one_work+0x53f/0xa30
4 locks held by btserver/188934:
#0: ffff88812f7ee400 (sb_writers#9){.+.+}-{0:0}, at: path_openat+0xd77/0x1120
#1: ffff888147320400 (&sb->s_type->i_mutex_key#15){++++}-{4:4}, at: do_truncate+0xca/0x160
#2: ffff888147320590 (mapping.invalidate_lock#2){++++}-{4:4}, at: ext4_setattr+0x671/0x1100
#3: ffff88812f7ec980 (jbd2_handle){++++}-{0:0}, at: start_this_handle+0x91b/0xaf0
=============================================
INFO: task mdX_raid1:488 blocked for more than 364 seconds.
Tainted: G W 6.5.0-rc4+ #6
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:mdX_raid1 state:D stack:0 pid:488 ppid:2 flags:0x00004000
Call Trace:
<TASK>
__schedule+0x7aa/0x16e0
? io_schedule_timeout+0xb0/0xb0
? lockdep_hardirqs_on_prepare+0x132/0x200
? _raw_spin_unlock_irqrestore+0x2b/0x50
? _raw_spin_unlock_irqrestore+0x2b/0x50
schedule+0x89/0x110
md_super_wait+0x110/0x160
? md_super_write+0x1b0/0x1b0
? print_usage_bug.part.0+0x3d0/0x3d0
? __mutex_lock+0xd22/0x1200
? destroy_sched_domains_rcu+0x30/0x30
? lock_acquire+0x15a/0x3b0
? md_bitmap_daemon_work+0x2fc/0x6a0
? find_held_lock+0x83/0xa0
md_bitmap_wait_writes+0x15f/0x170
? behind_writes_used_reset+0x40/0x40
? mark_held_locks+0x65/0x90
? lockdep_hardirqs_on_prepare+0x132/0x200
? _raw_spin_unlock_irq+0x1f/0x40
md_bitmap_daemon_work+0x483/0x6a0
? md_bitmap_write_all+0xe0/0xe0
? lock_acquire+0x15a/0x3b0
? lock_timer_base+0x93/0xd0
md_check_recovery+0x7b/0xa50
raid1d+0xdb/0x30b0 [raid1]
? lockdep_hardirqs_on_prepare+0x200/0x200
? lockdep_hardirqs_on_prepare+0x200/0x200
? lockdep_hardirqs_on_prepare+0x132/0x200
? _raw_spin_unlock_irqrestore+0x2b/0x50
? raid1_read_request+0x13b0/0x13b0 [raid1]
? finish_wait+0x4a/0xc0
? find_held_lock+0x83/0xa0
? lock_release+0x1d1/0x3b0
? md_thread+0x1fe/0x280
? reacquire_held_locks+0x270/0x270
? mark_held_locks+0x24/0x90
? lockdep_hardirqs_on_prepare+0x132/0x200
md_thread+0x175/0x280
? md_submit_discard_bio+0x210/0x210
? mark_held_locks+0x24/0x90
? destroy_sched_domains_rcu+0x30/0x30
? lockdep_hardirqs_on_prepare+0x132/0x200
? _raw_spin_unlock_irqrestore+0x2b/0x50
? md_submit_discard_bio+0x210/0x210
kthread+0x173/0x1b0
? kthread_complete_and_exit+0x20/0x20
ret_from_fork+0x28/0x50
? kthread_complete_and_exit+0x20/0x20
ret_from_fork_asm+0x11/0x20
</TASK>
INFO: task jbd2/dm-4-8:526 blocked for more than 364 seconds.
Tainted: G W 6.5.0-rc4+ #6
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:jbd2/dm-4-8 state:D stack:0 pid:526 ppid:2 flags:0x00004000
Call Trace:
<TASK>
__schedule+0x7aa/0x16e0
? io_schedule_timeout+0xb0/0xb0
? lock_sync+0xf0/0xf0
? find_held_lock+0x83/0xa0
? __wait_on_bit+0x95/0x100
? reacquire_held_locks+0x270/0x270
? submit_bio_noacct_nocheck+0x516/0x5a0
schedule+0x89/0x110
io_schedule+0x68/0xa0
bit_wait_io+0x12/0x80
__wait_on_bit+0x3a/0x100
? bit_wait+0x80/0x80
out_of_line_wait_on_bit+0xd7/0x110
? __wait_on_bit+0x100/0x100
? cpuacct_css_alloc+0xd0/0xd0
jbd2_journal_commit_transaction+0x2767/0x2ec0
? jbd2_journal_finish_inode_data_buffers+0x50/0x50
? find_held_lock+0x83/0xa0
? lock_release+0x1d1/0x3b0
? __timer_delete_sync+0x170/0x210
? reacquire_held_locks+0x270/0x270
? lockdep_hardirqs_on_prepare+0x132/0x200
? _raw_spin_unlock_irqrestore+0x2b/0x50
? timer_shutdown+0xe0/0xe0
? do_raw_read_unlock+0x80/0x80
kjournald2+0x16c/0x560
? jbd2_write_superblock+0x450/0x450
? mark_held_locks+0x24/0x90
? destroy_sched_domains_rcu+0x30/0x30
? lockdep_hardirqs_on_prepare+0x132/0x200
? _raw_spin_unlock_irqrestore+0x2b/0x50
? jbd2_write_superblock+0x450/0x450
kthread+0x173/0x1b0
? kthread_complete_and_exit+0x20/0x20
ret_from_fork+0x28/0x50
? kthread_complete_and_exit+0x20/0x20
ret_from_fork_asm+0x11/0x20
</TASK>
INFO: task journal-offline:190685 blocked for more than 365 seconds.
Tainted: G W 6.5.0-rc4+ #6
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:journal-offline state:D stack:0 pid:190685 ppid:1 flags:0x00000002
Call Trace:
<TASK>
__schedule+0x7aa/0x16e0
? io_schedule_timeout+0xb0/0xb0
? folio_wait_bit_common+0x11e/0x470
? find_held_lock+0x83/0xa0
? lock_release+0x1d1/0x3b0
? reacquire_held_locks+0x270/0x270
? do_raw_spin_lock+0x114/0x1a0
schedule+0x89/0x110
io_schedule+0x68/0xa0
folio_wait_bit_common+0x20b/0x470
? filemap_get_folios_contig+0x4e0/0x4e0
? filemap_invalidate_unlock_two+0x30/0x30
? lockdep_hardirqs_on_prepare+0x200/0x200
folio_wait_writeback+0x16/0x100
__filemap_fdatawait_range+0xc1/0x150
? filemap_get_folios_tag+0x5c0/0x5c0
? find_held_lock+0x83/0xa0
? __filemap_fdatawrite_range+0xc6/0x120
? delete_from_page_cache_batch+0x450/0x450
? strscpy+0xc5/0x190
file_write_and_wait_range+0x9b/0xc0
ext4_sync_file+0x134/0x640
__x64_sys_fsync+0x3e/0x60
do_syscall_64+0x34/0xb0
entry_SYSCALL_64_after_hwframe+0x46/0xb0
RIP: 0033:0x7f9088d0868a
RSP: 002b:00007f90869fd9f0 EFLAGS: 00000293 ORIG_RAX: 000000000000004a
RAX: ffffffffffffffda RBX: 000055efd35f5940 RCX: 00007f9088d0868a
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000000000000002a
RBP: 000055efd361a510 R08: 0000000000000000 R09: 00007fff65ac277f
R10: 00007f9088c8cc1e R11: 0000000000000293 R12: 000055efd2c527a0
R13: 0000000000000002 R14: 00007fff65ac2680 R15: 00007f90861fe000
</TASK>
Future hung task reports are suppressed, see sysctl kernel.hung_task_warnings
Showing all locks held in the system:
1 lock held by rcu_tasks_kthre/12:
#0: ffffffff83840590 (rcu_tasks.tasks_gp_mutex){+.+.}-{4:4}, at: rcu_tasks_one_gp+0x32/0x690
1 lock held by rcu_tasks_trace/13:
#0: ffffffff83840290 (rcu_tasks_trace.tasks_gp_mutex){+.+.}-{4:4}, at: rcu_tasks_one_gp+0x32/0x690
1 lock held by khungtaskd/64:
#0: ffffffff83841160 (rcu_read_lock){....}-{1:3}, at: debug_show_all_locks+0x29/0x230
1 lock held by mdX_raid1/488:
#0: ffff88810aeac5e8 (&mddev->bitmap_info.mutex){+.+.}-{4:4}, at: md_bitmap_daemon_work+0x83/0x6a0
1 lock held by mdX_raid1/913:
#0: ffff888121ccc5e8 (&mddev->bitmap_info.mutex){+.+.}-{4:4}, at: md_bitmap_daemon_work+0x83/0x6a0
1 lock held by jbd2/dm-14-8/958:
#0: ffff88810dffe680 (&md->io_barrier){.+.+}-{0:0}, at: dm_get_live_table+0x0/0xd0
1 lock held by fuse mainloop/2513:
#0: ffff88811f35e468 (&pipe->mutex/1){+.+.}-{4:4}, at: splice_file_to_pipe+0x21/0xd0
1 lock held by fuse mainloop/2515:
#0: ffff88811c744468 (&pipe->mutex/1){+.+.}-{4:4}, at: splice_file_to_pipe+0x21/0xd0
4 locks held by kworker/u16:0/145831:
#0: ffff88810b797938 ((wq_completion)writeback){+.+.}-{0:0}, at: process_one_work+0x53f/0xa30
#1: ffff88814efc7dc0 ((work_completion)(&(&wb->dwork)->work)){+.+.}-{0:0}, at: process_one_work+0x53f/0xa30
#2: ffff88810a3080e0 (&type->s_umount_key#53){.+.+}-{4:4}, at: trylock_super+0x11/0x60
#3: ffff88810dffe680 (&md->io_barrier){.+.+}-{0:0}, at: dm_get_live_table+0x0/0xd0
2 locks held by kworker/u16:3/174452:
#0: ffff88810b797938 ((wq_completion)writeback){+.+.}-{0:0}, at: process_one_work+0x53f/0xa30
#1: ffff88812bd77dc0 ((work_completion)(&(&wb->dwork)->work)){+.+.}-{0:0}, at: process_one_work+0x53f/0xa30
2 locks held by kworker/4:1/184807:
#0: ffff88810dc6e138 ((wq_completion)md){+.+.}-{0:0}, at: process_one_work+0x53f/0xa30
#1: ffff888145e5fdc0 ((work_completion)(&mddev->flush_work)){+.+.}-{0:0}, at: process_one_work+0x53f/0xa30
4 locks held by btserver/188934:
#0: ffff88812f7ee400 (sb_writers#9){.+.+}-{0:0}, at: path_openat+0xd77/0x1120
#1: ffff888147320400 (&sb->s_type->i_mutex_key#15){++++}-{4:4}, at: do_truncate+0xca/0x160
#2: ffff888147320590 (mapping.invalidate_lock#2){++++}-{4:4}, at: ext4_setattr+0x671/0x1100
#3: ffff88812f7ec980 (jbd2_handle){++++}-{0:0}, at: start_this_handle+0x91b/0xaf0
3 locks held by grep/190700:
#0: ffff888124a38848 (&f->f_pos_lock){+.+.}-{4:4}, at: __fdget_pos+0x40/0x50
#1: ffff88812f7ee400 (sb_writers#9){.+.+}-{0:0}, at: filemap_read+0x5ab/0x5c0
#2: ffff88812f7ec980 (jbd2_handle){++++}-{0:0}, at: start_this_handle+0x91b/0xaf0
=============================================
Thanks,
Ben
--
Ben Greear <greearb@...delatech.com>
Candela Technologies Inc http://www.candelatech.com