[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAPhsuW4sF=jAyA+Q=2tFBBAApjcW=gWXndDNX6t3nrAfnk_zZA@mail.gmail.com>
Date: Wed, 6 Dec 2023 00:30:10 -0800
From: Song Liu <song@...nel.org>
To: Yu Kuai <yukuai1@...weicloud.com>
Cc: agk@...hat.com, snitzer@...nel.org, mpatocka@...hat.com,
dm-devel@...ts.linux.dev, yukuai3@...wei.com,
janpieter.sollie@...net.be, linux-kernel@...r.kernel.org,
linux-raid@...r.kernel.org, yi.zhang@...wei.com,
yangerkun@...wei.com
Subject: Re: [PATCH -next] md: split MD_RECOVERY_NEEDED out of mddev_resume
On Sun, Dec 3, 2023 at 7:18 PM Yu Kuai <yukuai1@...weicloud.com> wrote:
>
> From: Yu Kuai <yukuai3@...wei.com>
>
> New mddev_resume() calls are added to synchroniza IO with array
> reconfiguration, however, this introduce a regression while adding it in
> md_start_sync():
>
> 1) someone set MD_RECOVERY_NEEDED first;
> 2) daemon thread grab reconfig_mutex, then clear MD_RECOVERY_NEEDED and
> queue a new sync work;
> 3) daemon thread release reconfig_mutex;
> 4) in md_start_sync
> a) check that there are spares that can be added/removed, then suspend
> the array;
> b) remove_and_add_spares may not be called, or called without really
> add/remove spares;
> c) resume the array, then set MD_RECOVERY_NEEDED again!
>
> Loop between 2 - 4, then mddev_suspend() will be called quite often, for
> consequence, normal IO will be quite slow.
>
> Fix this problem by spliting MD_RECOVERY_NEEDED out of mddev_resume(), so
> that md_start_sync() won't set such flag and hence the loop will be broken.
I hope we don't leak set_bit MD_RECOVERY_NEEDED to all call
sites of mddev_resume().
How about something like the following instead?
Please also incorporate feedback from Paul in the next version.
Thanks,
Song
diff --git i/drivers/md/md.c w/drivers/md/md.c
index c94373d64f2c..2d53e1b57070 100644
--- i/drivers/md/md.c
+++ w/drivers/md/md.c
@@ -490,7 +490,7 @@ int mddev_suspend(struct mddev *mddev, bool interruptible)
}
EXPORT_SYMBOL_GPL(mddev_suspend);
-void mddev_resume(struct mddev *mddev)
+static void __mddev_resume(struct mddev *mddev, bool recovery_needed)
{
lockdep_assert_not_held(&mddev->reconfig_mutex);
@@ -507,12 +507,18 @@ void mddev_resume(struct mddev *mddev)
percpu_ref_resurrect(&mddev->active_io);
wake_up(&mddev->sb_wait);
- set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
+ if (recovery_needed)
+ set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
md_wakeup_thread(mddev->thread);
md_wakeup_thread(mddev->sync_thread); /* possibly kick off a reshape */
mutex_unlock(&mddev->suspend_mutex);
}
+
+void mddev_resume(struct mddev *mddev)
+{
+ __mddev_resume(mddev, true);
+}
EXPORT_SYMBOL_GPL(mddev_resume);
/*
@@ -9403,7 +9409,9 @@ static void md_start_sync(struct work_struct *ws)
goto not_running;
}
- suspend ? mddev_unlock_and_resume(mddev) : mddev_unlock(mddev);
+ mddev_unlock(mddev);
+ if (suspend)
+ __mddev_resume(mddev, false);
md_wakeup_thread(mddev->sync_thread);
sysfs_notify_dirent_safe(mddev->sysfs_action);
md_new_event();
@@ -9415,7 +9423,9 @@ static void md_start_sync(struct work_struct *ws)
clear_bit(MD_RECOVERY_REQUESTED, &mddev->recovery);
clear_bit(MD_RECOVERY_CHECK, &mddev->recovery);
clear_bit(MD_RECOVERY_RUNNING, &mddev->recovery);
- suspend ? mddev_unlock_and_resume(mddev) : mddev_unlock(mddev);
+ mddev_unlock(mddev);
+ if (suspend)
+ __mddev_resume(mddev, false);
wake_up(&resync_wait);
if (test_and_clear_bit(MD_RECOVERY_RECOVER, &mddev->recovery) &&
Powered by blists - more mailing lists