[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20240130021843.3608859-2-yukuai1@huaweicloud.com>
Date: Tue, 30 Jan 2024 10:18:30 +0800
From: Yu Kuai <yukuai1@...weicloud.com>
To: mpatocka@...hat.com,
	heinzm@...hat.com,
	xni@...hat.com,
	agk@...hat.com,
	snitzer@...nel.org,
	dm-devel@...ts.linux.dev,
	song@...nel.org,
	yukuai3@...wei.com,
	jbrassow@....redhat.com,
	neilb@...e.de,
	shli@...com,
	akpm@...l.org
Cc: linux-kernel@...r.kernel.org,
	linux-raid@...r.kernel.org,
	yukuai1@...weicloud.com,
	yi.zhang@...wei.com,
	yangerkun@...wei.com
Subject: [PATCH v4 01/14] md: don't ignore suspended array in md_check_recovery()
From: Yu Kuai <yukuai3@...wei.com>
mddev_suspend() never stop sync_thread, hence it doesn't make sense to
ignore suspended array in md_check_recovery(), which might cause
sync_thread can't be unregistered.
After commit f52f5c71f3d4 ("md: fix stopping sync thread"), following
hang can be triggered by test shell/integrity-caching.sh:
1) suspend the array:
raid_postsuspend
 mddev_suspend
2) stop the array:
raid_dtr
 md_stop
  __md_stop_writes
   stop_sync_thread
    set_bit(MD_RECOVERY_INTR, &mddev->recovery);
    md_wakeup_thread_directly(mddev->sync_thread);
    wait_event(..., !test_bit(MD_RECOVERY_RUNNING, &mddev->recovery))
3) sync thread done:
md_do_sync
 set_bit(MD_RECOVERY_DONE, &mddev->recovery);
 md_wakeup_thread(mddev->thread);
4) daemon thread can't unregister sync thread:
md_check_recovery
 if (mddev->suspended)
   return; -> return directly
 md_read_sync_thread
 clear_bit(MD_RECOVERY_RUNNING, &mddev->recovery);
 -> MD_RECOVERY_RUNNING can't be cleared, hence step 2 hang;
This problem is not just related to dm-raid, fix it by ignoring
suspended array in md_check_recovery(). And follow up patches will
improve dm-raid better to frozen sync thread during suspend.
Reported-by: Mikulas Patocka <mpatocka@...hat.com>
Closes: https://lore.kernel.org/all/8fb335e-6d2c-dbb5-d7-ded8db5145a@redhat.com/
Fixes: 68866e425be2 ("MD: no sync IO while suspended")
Fixes: f52f5c71f3d4 ("md: fix stopping sync thread")
Signed-off-by: Yu Kuai <yukuai3@...wei.com>
---
 drivers/md/md.c | 3 ---
 1 file changed, 3 deletions(-)
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 2266358d8074..07b80278eaa5 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -9469,9 +9469,6 @@ static void md_start_sync(struct work_struct *ws)
  */
 void md_check_recovery(struct mddev *mddev)
 {
-	if (READ_ONCE(mddev->suspended))
-		return;
-
 	if (mddev->bitmap)
 		md_bitmap_daemon_work(mddev);
 
-- 
2.39.2
Powered by blists - more mailing lists
 
