Message-Id: <20231110172834.3939490-9-yukuai1@huaweicloud.com>
Date: Sat, 11 Nov 2023 01:28:34 +0800
From: Yu Kuai <yukuai1@...weicloud.com>
To: song@...nel.org, xni@...hat.com, yukuai3@...wei.com, neilb@...e.de
Cc: linux-kernel@...r.kernel.org, linux-raid@...r.kernel.org,
yukuai1@...weicloud.com, yi.zhang@...wei.com, yangerkun@...wei.com
Subject: [PATCH -next 8/8] dm-raid: fix a deadlock in md_stop()
From: Yu Kuai <yukuai3@...wei.com>
After commit db5e653d7c9f ("md: delay choosing sync action to
md_start_sync()"), md_start_sync() takes 'reconfig_mutex'. However, in
order to make sure event_work is done, __md_stop() flushes the workqueue
with 'reconfig_mutex' held. Hence, if sync_work is still pending, a
deadlock is triggered:

md_stop                         md_start_sync
 mddev_lock
                                 mddev_lock
 flush_workqueue -> deadlock

Currently, __md_stop() is the only place that flushes the workqueue with
'reconfig_mutex' held, and event_work is only used by dm-raid. Instead of
splitting sync_work out of the workqueue, fix this problem the easy way by
moving the flush of event_work to dm-raid, where 'reconfig_mutex' is not
held. This is safe because do_table_event() doesn't relate to mdadm and can
be called after md_stop().
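
The ordering can be illustrated outside the kernel with a minimal
user-space sketch. It is only an analogy under stated assumptions: a
pthread mutex stands in for 'reconfig_mutex', a thread stands in for the
pending work item, and pthread_join() stands in for the flush. All toy_*
names are hypothetical and do not exist in md or dm-raid.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins: none of these names exist in the kernel. */
struct toy_dev {
	pthread_mutex_t reconfig_mutex;	/* plays the role of 'reconfig_mutex' */
	bool work_done;			/* set once the work item has run */
};

/* Like md_start_sync(): the work item needs the mutex to make progress. */
static void *toy_work_fn(void *arg)
{
	struct toy_dev *dev = arg;

	pthread_mutex_lock(&dev->reconfig_mutex);
	dev->work_done = true;
	pthread_mutex_unlock(&dev->reconfig_mutex);
	return NULL;
}

/*
 * Teardown in the order raid_dtr() uses after this patch:
 * lock, stop, unlock, and only then wait for the work item.
 */
static bool toy_dtr(void)
{
	struct toy_dev dev = { .work_done = false };
	pthread_t worker;
	int rc;

	pthread_mutex_init(&dev.reconfig_mutex, NULL);

	pthread_mutex_lock(&dev.reconfig_mutex);	/* mddev_lock_nointr() */

	/* The work item is queued while the lock is held, so it stays pending. */
	rc = pthread_create(&worker, NULL, toy_work_fn, &dev);
	assert(rc == 0);
	(void)rc;

	/*
	 * md_stop() would run here. Calling pthread_join() at this point
	 * would never return: the worker waits for the mutex we hold.
	 */

	pthread_mutex_unlock(&dev.reconfig_mutex);	/* mddev_unlock() */

	/* Waiting for the work is safe once the lock has been released. */
	pthread_join(worker, NULL);

	pthread_mutex_destroy(&dev.reconfig_mutex);
	return dev.work_done;
}
```

Calling toy_dtr() returns true once the worker has run. Moving the
pthread_join() call above pthread_mutex_unlock() reproduces the hang
sketched in the diagram above.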
Fixes: db5e653d7c9f ("md: delay choosing sync action to md_start_sync()")
Signed-off-by: Yu Kuai <yukuai3@...wei.com>
Signed-off-by: Yu Kuai <yukuai1@...weicloud.com>
---
drivers/md/dm-raid.c | 3 +++
drivers/md/md.c | 3 ---
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c
index a4692f8f98ee..51f15c20f621 100644
--- a/drivers/md/dm-raid.c
+++ b/drivers/md/dm-raid.c
@@ -3317,6 +3317,9 @@ static void raid_dtr(struct dm_target *ti)
 	mddev_lock_nointr(&rs->md);
 	md_stop(&rs->md);
 	mddev_unlock(&rs->md);
+
+	if (work_pending(&rs->md.event_work))
+		flush_work(&rs->md.event_work);
 	raid_set_free(rs);
 }
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 35f3dd7db369..8f5df249448d 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -6378,9 +6378,6 @@ static void __md_stop(struct mddev *mddev)
 	struct md_personality *pers = mddev->pers;
 	md_bitmap_destroy(mddev);
 	mddev_detach(mddev);
-	/* Ensure ->event_work is done */
-	if (mddev->event_work.func)
-		flush_workqueue(md_misc_wq);
 	spin_lock(&mddev->lock);
 	mddev->pers = NULL;
 	spin_unlock(&mddev->lock);
--
2.39.2