[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <269ac5cb-aa09-02ca-4150-c90cd5a72e06@huaweicloud.com>
Date: Wed, 6 Dec 2023 19:36:41 +0800
From: Yu Kuai <yukuai1@...weicloud.com>
To: Song Liu <song@...nel.org>, Yu Kuai <yukuai1@...weicloud.com>
Cc: agk@...hat.com, snitzer@...nel.org, mpatocka@...hat.com,
dm-devel@...ts.linux.dev, janpieter.sollie@...net.be,
linux-kernel@...r.kernel.org, linux-raid@...r.kernel.org,
yi.zhang@...wei.com, yangerkun@...wei.com,
"yukuai (C)" <yukuai3@...wei.com>
Subject: Re: [PATCH -next] md: split MD_RECOVERY_NEEDED out of mddev_resume
Hi,
在 2023/12/06 16:30, Song Liu 写道:
> On Sun, Dec 3, 2023 at 7:18 PM Yu Kuai <yukuai1@...weicloud.com> wrote:
>>
>> From: Yu Kuai <yukuai3@...wei.com>
>>
>> New mddev_resume() calls are added to synchroniza IO with array
>> reconfiguration, however, this introduce a regression while adding it in
>> md_start_sync():
>>
>> 1) someone set MD_RECOVERY_NEEDED first;
>> 2) daemon thread grab reconfig_mutex, then clear MD_RECOVERY_NEEDED and
>> queue a new sync work;
>> 3) daemon thread release reconfig_mutex;
>> 4) in md_start_sync
>> a) check that there are spares that can be added/removed, then suspend
>> the array;
>> b) remove_and_add_spares may not be called, or called without really
>> add/remove spares;
>> c) resume the array, then set MD_RECOVERY_NEEDED again!
>>
>> Loop between 2 - 4, then mddev_suspend() will be called quite often, for
>> consequence, normal IO will be quite slow.
>>
>> Fix this problem by spliting MD_RECOVERY_NEEDED out of mddev_resume(), so
>> that md_start_sync() won't set such flag and hence the loop will be broken.
>
> I hope we don't leak set_bit MD_RECOVERY_NEEDED to all call
> sites of mddev_resume().
There are also some other mddev_resume() that is added later and don't
need recovery, so md_start_sync() is not the only place:
- md_setup_drive
- rdev_attr_store
- suspend_lo_store
- suspend_hi_store
- autorun_devices
- md_ioct
- r5c_disable_writeback_async
- error path from new_dev_store(), ...
I'm not sure add a new helper is a good idea, because all above apis
should use new helper as well.
>
> How about something like the following instead?
>
> Please also incorporate feedback from Paul in the next version.
Of course.
Thanks,
Kuai
>
> Thanks,
> Song
>
> diff --git i/drivers/md/md.c w/drivers/md/md.c
> index c94373d64f2c..2d53e1b57070 100644
> --- i/drivers/md/md.c
> +++ w/drivers/md/md.c
> @@ -490,7 +490,7 @@ int mddev_suspend(struct mddev *mddev, bool interruptible)
> }
> EXPORT_SYMBOL_GPL(mddev_suspend);
>
> -void mddev_resume(struct mddev *mddev)
> +static void __mddev_resume(struct mddev *mddev, bool recovery_needed)
> {
> lockdep_assert_not_held(&mddev->reconfig_mutex);
>
> @@ -507,12 +507,18 @@ void mddev_resume(struct mddev *mddev)
> percpu_ref_resurrect(&mddev->active_io);
> wake_up(&mddev->sb_wait);
>
> - set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
> + if (recovery_needed)
> + set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
> md_wakeup_thread(mddev->thread);
> md_wakeup_thread(mddev->sync_thread); /* possibly kick off a reshape */
>
> mutex_unlock(&mddev->suspend_mutex);
> }
> +
> +void mddev_resume(struct mddev *mddev)
> +{
> + __mddev_resume(mddev, true);
> +}
> EXPORT_SYMBOL_GPL(mddev_resume);
>
> /*
> @@ -9403,7 +9409,9 @@ static void md_start_sync(struct work_struct *ws)
> goto not_running;
> }
>
> - suspend ? mddev_unlock_and_resume(mddev) : mddev_unlock(mddev);
> + mddev_unlock(mddev);
> + if (suspend)
> + __mddev_resume(mddev, false);
> md_wakeup_thread(mddev->sync_thread);
> sysfs_notify_dirent_safe(mddev->sysfs_action);
> md_new_event();
> @@ -9415,7 +9423,9 @@ static void md_start_sync(struct work_struct *ws)
> clear_bit(MD_RECOVERY_REQUESTED, &mddev->recovery);
> clear_bit(MD_RECOVERY_CHECK, &mddev->recovery);
> clear_bit(MD_RECOVERY_RUNNING, &mddev->recovery);
> - suspend ? mddev_unlock_and_resume(mddev) : mddev_unlock(mddev);
> + mddev_unlock(mddev);
> + if (suspend)
> + __mddev_resume(mddev, false);
>
> wake_up(&resync_wait);
> if (test_and_clear_bit(MD_RECOVERY_RECOVER, &mddev->recovery) &&
>
> .
>
Powered by blists - more mailing lists