linux-kernel - Re: [PATCH 2/2] md/raid5: fix IO hang when array is broken with IO inflight


Message-ID: <d760721e-face-7a19-ba3b-7cc59475ef77@huaweicloud.com>
Date: Wed, 19 Nov 2025 16:29:34 +0800
From: Li Nan <linan666@...weicloud.com>
To: Yu Kuai <yukuai@...as.com>, linux-raid@...r.kernel.org
Cc: linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] md/raid5: fix IO hang when array is broken with IO
 inflight



On 2025/11/17 16:55, Yu Kuai wrote:
> The following test can cause an IO hang:
> 
> mdadm -CvR /dev/md0 -l10 -n4 /dev/sd[abcd] --assume-clean --chunk=64K --bitmap=none
> sleep 5
> echo 1 > /sys/block/sda/device/delete
> echo 1 > /sys/block/sdb/device/delete
> echo 1 > /sys/block/sdc/device/delete
> echo 1 > /sys/block/sdd/device/delete
> 
> dd if=/dev/md0 of=/dev/null bs=8k count=1 iflag=direct
> 
> Root cause:
> 
> 1) All disks are removed, but all rdevs in the array are still in sync,
> so IO is issued normally.
> 
> 2) IO to sda fails and setting badblocks fails, so sda becomes faulty
> and MD_SB_CHANGE_PENDING is set.
> 
> 3) Error recovery tries to recover this IO from the other disks, so IO
> is issued to sdb, sdc, and sdd.
> 
> 4) IO to sdb fails and setting badblocks fails again; the array is now
> broken and becomes read-only.
> 
> 5) IO to sdc and sdd fails, but the stripe can't be handled anymore
> because MD_SB_CHANGE_PENDING is set:
> 
> handle_stripe
>   handle_stripe
>   if (test_bit MD_SB_CHANGE_PENDING)
>    set_bit STRIPE_HANDLE
>    goto finish
>    // skip handling failed stripe
> 
> release_stripe
>   if (test_bit STRIPE_HANDLE)
>    list_add_tail conf->handle_list
> 
> 6) Later, raid5d can't handle the failed stripe either:
> 
> raid5d
>   md_check_recovery
>    md_update_sb
>     if (!md_is_rdwr())
>      // can't clear pending bit
>      return
>   if (test_bit MD_SB_CHANGE_PENDING)
>    break;
>    // can't handle failed stripe
> 
> Since MD_SB_CHANGE_PENDING can never be cleared for a read-only array,
> fix this problem by skipping this check for read-only arrays.
> 
> Fixes: d87f064f5874 ("md: never update metadata when array is read-only.")
> Signed-off-by: Yu Kuai <yukuai@...as.com>
> ---
>   drivers/md/raid5.c | 6 ++++--
>   1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index cdbc7eba5c54..e57ce3295292 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -4956,7 +4956,8 @@ static void handle_stripe(struct stripe_head *sh)
>   		goto finish;
>   
>   	if (s.handle_bad_blocks ||
> -	    test_bit(MD_SB_CHANGE_PENDING, &conf->mddev->sb_flags)) {
> +	    (md_is_rdwr(conf->mddev) &&
> +	     test_bit(MD_SB_CHANGE_PENDING, &conf->mddev->sb_flags))) {


I am not sure where mddev->ro is set to MD_RDONLY. Is it set through a user ioctl?


>   		set_bit(STRIPE_HANDLE, &sh->state);
>   		goto finish;
>   	}
> @@ -6768,7 +6769,8 @@ static void raid5d(struct md_thread *thread)
>   		int batch_size, released;
>   		unsigned int offset;
>   
> -		if (test_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags))
> +		if (md_is_rdwr(mddev) &&
> +		    test_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags))
>   			break;
>   
>   		released = release_stripe_list(conf, conf->temp_inactive_list);

-- 
Thanks,
Nan