Message-ID: <6c7008df-942e-13b1-2e70-a058e96ab0e9@cloud.ionos.com>
Date: Tue, 26 Jan 2021 12:14:49 +0100
From: Guoqing Jiang <guoqing.jiang@...ud.ionos.com>
To: Donald Buczek <buczek@...gen.mpg.de>, Song Liu <song@...nel.org>,
linux-raid@...r.kernel.org,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
it+raid@...gen.mpg.de
Subject: Re: md_raid: mdX_raid6 looping after sync_action "check" to "idle"
transition
Hi Donald,
On 1/26/21 10:50, Donald Buczek wrote:
[...]
>>>
>>> diff --git a/drivers/md/md.c b/drivers/md/md.c
>>> index 2d21c298ffa7..f40429843906 100644
>>> --- a/drivers/md/md.c
>>> +++ b/drivers/md/md.c
>>> @@ -4687,11 +4687,13 @@ action_store(struct mddev *mddev, const char *page, size_t len)
>>>  			clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
>>>  		if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery) &&
>>>  		    mddev_lock(mddev) == 0) {
>>> +			set_bit(MD_ALLOW_SB_UPDATE, &mddev->flags);
>>>  			flush_workqueue(md_misc_wq);
>>>  			if (mddev->sync_thread) {
>>>  				set_bit(MD_RECOVERY_INTR, &mddev->recovery);
>>>  				md_reap_sync_thread(mddev);
>>>  			}
>>> +			clear_bit(MD_ALLOW_SB_UPDATE, &mddev->flags);
>>>  			mddev_unlock(mddev);
>>>  		}
>>>  	} else if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery))
>>
>> Yes, it could resolve the deadlock, but I am not sure it is the
>> right way, given that we only set ALLOW_SB_UPDATE in suspend, which
>> makes sense since the IO will be quiesced there, but the write-idle
>> action can't guarantee the same thing.
>
> Thinking (and documenting) MD_ALLOW_SB_UPDATE as "the holder of
> reconfig_mutex promises not to make any changes which would exclude
> superblocks from being written" might make it easier to accept the usage.
I am not sure it is safe to set the flag here, since writing "idle"
can't prevent IO from the filesystem, while mddev_suspend can guarantee
that.
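For reference, this is roughly how the flag is used today (paraphrasing
mddev_suspend() from memory, not quoting the exact source): it is only set
across the window in which the array is being quiesced, so the md thread may
still write the superblock on behalf of the suspending thread while that
thread waits for the in-flight IO to drain.

	/* rough paraphrase of mddev_suspend(), not the exact source */
	void mddev_suspend(struct mddev *mddev)
	{
		lockdep_assert_held(&mddev->reconfig_mutex);
		if (mddev->suspended++)
			return;
		synchronize_rcu();
		wake_up(&mddev->sb_wait);
		set_bit(MD_ALLOW_SB_UPDATE, &mddev->flags);
		smp_mb__after_atomic();
		/* new IO is already blocked (->suspended is set), so let
		 * md_check_recovery() update the superblock while we wait
		 * for the in-flight IO to drain
		 */
		wait_event(mddev->sb_wait, atomic_read(&mddev->active_io) == 0);
		mddev->pers->quiesce(mddev, 1);
		clear_bit_unlock(MD_ALLOW_SB_UPDATE, &mddev->flags);
		wait_event(mddev->sb_wait, !test_bit(MD_UPDATING_SB, &mddev->flags));

		del_timer_sync(&mddev->safemode_timer);
	}

So in the suspend path the flag is paired with an actual quiesce of the
array, while writing "idle" to sync_action doesn't give us that.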
>
>> I prefer to make the resync thread not wait forever here.
>>
[...]
>>
>> - sh = raid5_get_active_stripe(conf, new_sector, previous,
>> + sh = raid5_get_active_stripe(conf, new_sector, previous, 0,
>
>
> Mistake here (fourth argument added instead of third)
Thanks for checking.
[...]
> Unfortunately, this patch did not fix the issue.
>
> root@...th:~/linux# cat /proc/$(pgrep md3_resync)/stack
> [<0>] raid5_get_active_stripe+0x1e7/0x6b0
> [<0>] raid5_sync_request+0x2a7/0x3d0
> [<0>] md_do_sync.cold+0x3ee/0x97c
> [<0>] md_thread+0xab/0x160
> [<0>] kthread+0x11b/0x140
> [<0>] ret_from_fork+0x22/0x30
>
> which is ( according to objdump -d -Sl drivers/md/raid5.o ) at
> https://elixir.bootlin.com/linux/v5.11-rc5/source/drivers/md/raid5.c#L735
>
> Isn't it still the case that the superblocks are not written, therefore
> stripes are not processed, therefore the number of active stripes is not
> decreasing? Who is expected to wake up the conf->wait_for_stripe waiters?
Hmm, how about waking the waiter up in the while loop of raid5d?
@@ -6520,6 +6532,11 @@ static void raid5d(struct md_thread *thread)
 			md_check_recovery(mddev);
 			spin_lock_irq(&conf->device_lock);
 		}
+
+		if ((atomic_read(&conf->active_stripes)
+		     < (conf->max_nr_stripes * 3 / 4) ||
+		    (test_bit(MD_RECOVERY_INTR, &mddev->recovery))))
+			wake_up(&conf->wait_for_stripe);
 	}
 	pr_debug("%d stripes handled\n", handled);
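For reference, the waiter which the stack trace points at (raid5.c:735)
sleeps on conf->wait_for_stripe with roughly the following condition
(quoting from memory, the v5.11-rc5 code may differ in detail):

	wait_event_lock_irq(conf->wait_for_stripe,
			    !list_empty(conf->inactive_list + hash) &&
			    (atomic_read(&conf->active_stripes)
			     < (conf->max_nr_stripes * 3 / 4) ||
			     !test_bit(R5_INACTIVE_BLOCKED, &conf->cache_state)),
			    *(conf->hash_locks + hash));

A wake_up() only helps if that condition can actually become true, hence
the fallback below.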
If the issue still appears, then I will change the waiter to break out
just when MD_RECOVERY_INTR is set, something like:

	wait_event_lock_irq(conf->wait_for_stripe,
			    (test_bit(MD_RECOVERY_INTR, &mddev->recovery) && sync_req) ||
			    /* the previous condition */,
			    *(conf->hash_locks + hash));
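Here sync_req would be a new parameter of raid5_get_active_stripe(), passed
as true only from raid5_sync_request(), so the normal IO paths keep the
current behaviour. Just a rough idea of what would follow the early break
(the caller would need to handle the NULL return as well):

	/* rough idea only: bail out so the resync thread can notice
	 * MD_RECOVERY_INTR and exit instead of waiting for stripes
	 */
	if (sync_req && test_bit(MD_RECOVERY_INTR, &mddev->recovery)) {
		spin_unlock_irq(conf->hash_locks + hash);
		return NULL;
	}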
Thanks,
Guoqing