[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20250826074204.390111-2-zhengqixing@huaweicloud.com>
Date: Tue, 26 Aug 2025 15:42:03 +0800
From: Zheng Qixing <zhengqixing@...weicloud.com>
To: agk@...hat.com,
snitzer@...nel.org,
mpatocka@...hat.com,
axboe@...nel.dk,
ming.lei@...hat.com
Cc: dm-devel@...ts.linux.dev,
linux-kernel@...r.kernel.org,
yukuai3@...wei.com,
yi.zhang@...wei.com,
yangerkun@...wei.com,
houtao1@...wei.com,
zhengqixing@...wei.com
Subject: [PATCH-next 1/2] dm: fix queue start/stop imbalance under suspend/load/resume races
From: Zheng Qixing <zhengqixing@...wei.com>
When suspend and load run concurrently, before q->mq_ops is set in
blk_mq_init_allocated_queue(), __dm_suspend() skip dm_stop_queue(). As a
result, the queue's quiesce depth is not incremented.
Later, once table load has finished and __dm_resume() runs, which triggers
q->quiesce_depth ==0 warning in blk_mq_unquiesce_queue():
Call Trace:
<TASK>
dm_start_queue+0x16/0x20 [dm_mod]
__dm_resume+0xac/0xb0 [dm_mod]
dm_resume+0x12d/0x150 [dm_mod]
do_resume+0x2c2/0x420 [dm_mod]
dev_suspend+0x30/0x130 [dm_mod]
ctl_ioctl+0x402/0x570 [dm_mod]
dm_ctl_ioctl+0x23/0x30 [dm_mod]
Fix this by explicitly tracking whether the request queue was
stopped in __dm_suspend() via a new DMF_QUEUE_STOPPED flag.
Only call dm_start_queue() in __dm_resume() if the queue was
actually stopped.
Fixes: e70feb8b3e68 ("blk-mq: support concurrent queue quiesce/unquiesce")
Signed-off-by: Zheng Qixing <zhengqixing@...wei.com>
---
drivers/md/dm-core.h | 1 +
drivers/md/dm.c | 8 +++++---
2 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/md/dm-core.h b/drivers/md/dm-core.h
index c889332e533b..0070e4462ee2 100644
--- a/drivers/md/dm-core.h
+++ b/drivers/md/dm-core.h
@@ -162,6 +162,7 @@ struct mapped_device {
#define DMF_SUSPENDED_INTERNALLY 7
#define DMF_POST_SUSPENDING 8
#define DMF_EMULATE_ZONE_APPEND 9
+#define DMF_QUEUE_STOPPED 10
static inline sector_t dm_get_size(struct mapped_device *md)
{
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index a44e8c2dccee..7222f20c1a83 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -2960,8 +2960,10 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map,
* Stop md->queue before flushing md->wq in case request-based
* dm defers requests to md->wq from md->queue.
*/
- if (dm_request_based(md))
+ if (dm_request_based(md)) {
dm_stop_queue(md->queue);
+ set_bit(DMF_QUEUE_STOPPED, &md->flags);
+ }
flush_workqueue(md->wq);
@@ -2983,7 +2985,7 @@ static int __dm_suspend(struct mapped_device *md, struct dm_table *map,
if (r < 0) {
dm_queue_flush(md);
- if (dm_request_based(md))
+ if (test_and_clear_bit(DMF_QUEUE_STOPPED, &md->flags))
dm_start_queue(md->queue);
unlock_fs(md);
@@ -3067,7 +3069,7 @@ static int __dm_resume(struct mapped_device *md, struct dm_table *map)
* so that mapping of targets can work correctly.
* Request-based dm is queueing the deferred I/Os in its request_queue.
*/
- if (dm_request_based(md))
+ if (test_and_clear_bit(DMF_QUEUE_STOPPED, &md->flags))
dm_start_queue(md->queue);
unlock_fs(md);
--
2.39.2
Powered by blists - more mailing lists