Message-ID: <Z2I1HAKhKrCR51XO@fedora>
Date: Wed, 18 Dec 2024 10:36:12 +0800
From: Ming Lei <ming.lei@...hat.com>
To: Mikhail Gavrilov <mikhail.v.gavrilov@...il.com>
Cc: Christoph Hellwig <hch@....de>, axboe@...nel.dk,
Linux List Kernel Mailing <linux-kernel@...r.kernel.org>,
Linux regressions mailing list <regressions@...ts.linux.dev>,
linux-block@...r.kernel.org, Damien Le Moal <dlemoal@...nel.org>
Subject: Re: 6.13/regression/bisected - after commit f1be1788a32e I see in
the kernel log "possible circular locking dependency detected"

On Wed, Dec 18, 2024 at 06:51:31AM +0500, Mikhail Gavrilov wrote:
> Hi,
> After commit f1be1788a32e I see in the kernel log "possible circular
> locking dependency detected" with the following stack trace:
> [ 740.877178] ======================================================
> [ 740.877180] WARNING: possible circular locking dependency detected
> [ 740.877182] 6.13.0-rc3-f44d154d6e3d+ #392 Tainted: G W L
> [ 740.877184] ------------------------------------------------------
> [ 740.877186] btrfs-transacti/839 is trying to acquire lock:
> [ 740.877188] ffff888182336a50
> (&q->q_usage_counter(io)#2){++++}-{0:0}, at: __submit_bio+0x335/0x520
> [ 740.877197]
> but task is already holding lock:
> [ 740.877198] ffff8881826f7048 (btrfs-tree-00){++++}-{4:4}, at:
> btrfs_tree_read_lock_nested+0x27/0x170
> [ 740.877205]
> which lock already depends on the new lock.
>
> [ 740.877206]
> the existing dependency chain (in reverse order) is:
> [ 740.877207]
> -> #4 (btrfs-tree-00){++++}-{4:4}:
> [ 740.877211] lock_release+0x397/0xd90
> [ 740.877215] up_read+0x1b/0x30
> [ 740.877217] btrfs_search_slot+0x16c9/0x31f0
> [ 740.877220] btrfs_lookup_inode+0xa9/0x360
> [ 740.877222] __btrfs_update_delayed_inode+0x131/0x760
> [ 740.877225] btrfs_async_run_delayed_root+0x4bc/0x630
> [ 740.877226] btrfs_work_helper+0x1b5/0xa50
> [ 740.877228] process_one_work+0x899/0x14b0
> [ 740.877231] worker_thread+0x5e6/0xfc0
> [ 740.877233] kthread+0x2d2/0x3a0
> [ 740.877235] ret_from_fork+0x31/0x70
> [ 740.877238] ret_from_fork_asm+0x1a/0x30
> [ 740.877240]
> -> #3 (&delayed_node->mutex){+.+.}-{4:4}:
> [ 740.877244] __mutex_lock+0x1ab/0x12c0
> [ 740.877247] __btrfs_release_delayed_node.part.0+0xa0/0xd40
> [ 740.877249] btrfs_evict_inode+0x44d/0xc20
> [ 740.877252] evict+0x3a4/0x840
> [ 740.877255] dispose_list+0xf0/0x1c0
> [ 740.877257] prune_icache_sb+0xe3/0x160
> [ 740.877259] super_cache_scan+0x30d/0x4f0
> [ 740.877261] do_shrink_slab+0x349/0xd60
> [ 740.877264] shrink_slab+0x7a4/0xd20
> [ 740.877266] shrink_one+0x403/0x830
> [ 740.877268] shrink_node+0x2337/0x3a60
> [ 740.877270] balance_pgdat+0xa4f/0x1500
> [ 740.877272] kswapd+0x4f3/0x940
> [ 740.877274] kthread+0x2d2/0x3a0
> [ 740.877276] ret_from_fork+0x31/0x70
> [ 740.877278] ret_from_fork_asm+0x1a/0x30
> [ 740.877280]
> -> #2 (fs_reclaim){+.+.}-{0:0}:
> [ 740.877283] fs_reclaim_acquire+0xc9/0x110
> [ 740.877286] __kmalloc_noprof+0xeb/0x690
> [ 740.877288] sd_revalidate_disk.isra.0+0x4356/0x8e00
> [ 740.877291] sd_probe+0x869/0xfa0
> [ 740.877293] really_probe+0x1e0/0x8a0
> [ 740.877295] __driver_probe_device+0x18c/0x370
> [ 740.877297] driver_probe_device+0x4a/0x120
> [ 740.877299] __device_attach_driver+0x162/0x270
> [ 740.877300] bus_for_each_drv+0x115/0x1a0
> [ 740.877303] __device_attach_async_helper+0x1a0/0x240
> [ 740.877305] async_run_entry_fn+0x97/0x4f0
> [ 740.877307] process_one_work+0x899/0x14b0
> [ 740.877309] worker_thread+0x5e6/0xfc0
> [ 740.877310] kthread+0x2d2/0x3a0
> [ 740.877312] ret_from_fork+0x31/0x70
> [ 740.877314] ret_from_fork_asm+0x1a/0x30
> [ 740.877316]
> -> #1 (&q->limits_lock){+.+.}-{4:4}:
> [ 740.877320] __mutex_lock+0x1ab/0x12c0
> [ 740.877321] nvme_update_ns_info_block+0x476/0x2630 [nvme_core]
> [ 740.877332] nvme_update_ns_info+0xbe/0xa60 [nvme_core]
> [ 740.877339] nvme_alloc_ns+0x1589/0x2c40 [nvme_core]
> [ 740.877346] nvme_scan_ns+0x579/0x660 [nvme_core]
> [ 740.877353] async_run_entry_fn+0x97/0x4f0
> [ 740.877355] process_one_work+0x899/0x14b0
> [ 740.877357] worker_thread+0x5e6/0xfc0
> [ 740.877358] kthread+0x2d2/0x3a0
> [ 740.877360] ret_from_fork+0x31/0x70
> [ 740.877362] ret_from_fork_asm+0x1a/0x30
> [ 740.877364]
> -> #0 (&q->q_usage_counter(io)#2){++++}-{0:0}:

This is another deadlock caused by the dependency between q->limits_lock
and q->q_usage_counter, the same as the one under discussion here:

https://lore.kernel.org/linux-block/20241216080206.2850773-2-ming.lei@redhat.com/

In the chain above, step #1 shows q->limits_lock being acquired in the
nvme scan path while q->q_usage_counter is already held (the queue is
frozen), and the btrfs transaction then closes the cycle by entering the
queue while holding the tree lock. The dependency of
queue_limits_start_update() on blk_mq_freeze_queue() should be cut.
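
To make the ordering concrete, here is a minimal sketch for illustration
only; it is not the patch in the thread above, and the two wrapper
functions are made up for the example. The first variant nests
queue_limits_start_update() inside a frozen-queue section, which is what
records the dependency shown at #1 above (q->limits_lock taken while
q->q_usage_counter is held); the second prepares the limits update first
and freezes only around the commit, so that edge is never recorded:

#include <linux/blkdev.h>
#include <linux/blk-mq.h>

/*
 * Problematic ordering: q->limits_lock is acquired while the queue is
 * already frozen, so lockdep records q_usage_counter -> limits_lock.
 */
static int example_update_under_freeze(struct request_queue *q)
{
	struct queue_limits lim;
	int ret;

	blk_mq_freeze_queue(q);			/* acquires q_usage_counter in lockdep terms */
	lim = queue_limits_start_update(q);	/* takes q->limits_lock */
	/* ... adjust lim fields for the new device parameters ... */
	ret = queue_limits_commit_update(q, &lim);	/* drops q->limits_lock */
	blk_mq_unfreeze_queue(q);
	return ret;
}

/*
 * Reordered: the limits update is prepared first and the queue is frozen
 * only around the commit, so limits_lock is never taken under the freeze.
 */
static int example_update_freeze_for_commit(struct request_queue *q)
{
	struct queue_limits lim;
	int ret;

	lim = queue_limits_start_update(q);	/* takes q->limits_lock */
	/* ... adjust lim fields for the new device parameters ... */
	blk_mq_freeze_queue(q);
	ret = queue_limits_commit_update(q, &lim);	/* drops q->limits_lock */
	blk_mq_unfreeze_queue(q);
	return ret;
}

The second ordering still records limits_lock -> q_usage_counter, but not
the q_usage_counter -> limits_lock edge that this splat needs to close
the cycle.
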
Thanks,
Ming