Message-ID: <d760721e-face-7a19-ba3b-7cc59475ef77@huaweicloud.com>
Date: Wed, 19 Nov 2025 16:29:34 +0800
From: Li Nan <linan666@...weicloud.com>
To: Yu Kuai <yukuai@...as.com>, linux-raid@...r.kernel.org
Cc: linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] md/raid5: fix IO hang when array is broken with IO inflight
On 2025/11/17 16:55, Yu Kuai wrote:
> The following test can cause an IO hang:
>
> mdadm -CvR /dev/md0 -l10 -n4 /dev/sd[abcd] --assume-clean --chunk=64K --bitmap=none
> sleep 5
> echo 1 > /sys/block/sda/device/delete
> echo 1 > /sys/block/sdb/device/delete
> echo 1 > /sys/block/sdc/device/delete
> echo 1 > /sys/block/sdd/device/delete
>
> dd if=/dev/md0 of=/dev/null bs=8k count=1 iflag=direct
>
> Root cause:
>
> 1) All disks are removed; however, all rdevs in the array are still in
>    sync, so IO is issued normally.
>
> 2) IO to sda fails and setting badblocks fails, so sda is marked faulty
>    and MD_SB_CHANGE_PENDING is set.
>
> 3) Error recovery tries to recover this IO from the other disks, so IO is
>    issued to sdb, sdc, and sdd.
>
> 4) IO to sdb fails and setting badblocks fails again; the array is now
>    broken and becomes read-only.
>
> 5) IO to sdc and sdd fails; however, the stripe can't be handled anymore
>    because MD_SB_CHANGE_PENDING is set:
>
> handle_stripe
> handle_stripe
>  if (test_bit MD_SB_CHANGE_PENDING)
>   set_bit STRIPE_HANDLE
>   goto finish
>   // skip handling failed stripe
>
> release_stripe
>  if (test_bit STRIPE_HANDLE)
>   list_add_tail conf->handle_list
>
> 6) Later, raid5d can't handle the failed stripe either:
>
> raid5d
>  md_check_recovery
>   md_update_sb
>    if (!md_is_rdwr())
>     // can't clear pending bit
>     return
>  if (test_bit MD_SB_CHANGE_PENDING)
>   break;
>   // can't handle failed stripe
>
> Since MD_SB_CHANGE_PENDING can never be cleared for a read-only array,
> fix this problem by skipping this check for a read-only array.
>
> Fixes: d87f064f5874 ("md: never update metadata when array is read-only.")
> Signed-off-by: Yu Kuai <yukuai@...as.com>
> ---
> drivers/md/raid5.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index cdbc7eba5c54..e57ce3295292 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -4956,7 +4956,8 @@ static void handle_stripe(struct stripe_head *sh)
>  		goto finish;
>
>  	if (s.handle_bad_blocks ||
> -	    test_bit(MD_SB_CHANGE_PENDING, &conf->mddev->sb_flags)) {
> +	    (md_is_rdwr(conf->mddev) &&
> +	     test_bit(MD_SB_CHANGE_PENDING, &conf->mddev->sb_flags))) {
I am not sure where mddev->ro is set to MD_RDONLY. Is it set via a user's ioctl?
>  		set_bit(STRIPE_HANDLE, &sh->state);
>  		goto finish;
>  	}
> @@ -6768,7 +6769,8 @@ static void raid5d(struct md_thread *thread)
>  		int batch_size, released;
>  		unsigned int offset;
>
> -		if (test_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags))
> +		if (md_is_rdwr(mddev) &&
> +		    test_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags))
>  			break;
>
>  		released = release_stripe_list(conf, conf->temp_inactive_list);
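
For reference, the condition changed by both hunks can be modelled in userspace roughly as below. This is only a sketch, not kernel code: struct mddev_model and the two stripe_deferred_*() helpers are hypothetical names made up for illustration, and the two booleans merely stand in for md_is_rdwr(mddev) and the MD_SB_CHANGE_PENDING bit in sb_flags.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two pieces of mddev state the patch tests. */
struct mddev_model {
	bool rdwr;              /* models md_is_rdwr(mddev) */
	bool sb_change_pending; /* models MD_SB_CHANGE_PENDING in sb_flags */
};

/* Check before the patch: defer whenever a superblock update is pending. */
static bool stripe_deferred_old(const struct mddev_model *m)
{
	return m->sb_change_pending;
}

/*
 * Check after the patch: defer only while the array is still read-write,
 * i.e. while md_update_sb() is still able to clear the pending bit.
 */
static bool stripe_deferred_new(const struct mddev_model *m)
{
	return m->rdwr && m->sb_change_pending;
}
```

With rdwr = false and sb_change_pending = true (the broken array from the reproducer), the old check defers the stripe forever, while the new check lets it be handled. For a read-write array the two checks behave the same.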
--
Thanks,
Nan