linux-kernel - Re: [PATCH] md/raid1,raid10: don't broken array on failfast metadata write fails

Message-ID: <36f78ba0-ac3b-5d97-89f3-2b09d49d1701@huaweicloud.com>
Date: Wed, 13 Aug 2025 08:59:02 +0800
From: Yu Kuai <yukuai1@...weicloud.com>
To: Kenta Akagi <k@...l.me>, Song Liu <song@...nel.org>,
 Mariusz Tkaczyk <mtkaczyk@...nel.org>
Cc: linux-raid@...r.kernel.org, linux-kernel@...r.kernel.org,
 "yukuai (C)" <yukuai3@...wei.com>
Subject: Re: [PATCH] md/raid1,raid10: don't broken array on failfast metadata
 write fails

Hi,

On 2025/08/12 17:01, Kenta Akagi wrote:
> It is not intended for the array to fail when a metadata write with
> MD_FAILFAST fails.
> After commit 9631abdbf406 ("md: Set MD_BROKEN for RAID1 and RAID10"),
> when md_error is called on the last device in RAID1/10,
> the MD_BROKEN flag is set on the array.
> Because of this, a failfast metadata write failure will
> put the array into the "broken" state.
> 
> If rdev is not Faulty even after calling md_error,
> the rdev is the last device, and there is nothing except
> MD_BROKEN that prevents writes to the array.
> Therefore, by clearing MD_BROKEN, the array will not become
> "broken" after a failfast metadata write failure.

I don't understand this. I think MD_BROKEN is expected here: the last
rdev had an I/O error while updating metadata, so the array is now broken
and can only be read afterwards. Allowing this broken array to be used
read-write might cause more severe problems, such as data loss.

Thanks,
Kuai

> 
> Fixes: 9631abdbf406 ("md: Set MD_BROKEN for RAID1 and RAID10")
> Signed-off-by: Kenta Akagi <k@...l.me>
> ---
>   drivers/md/md.c | 1 +
>   drivers/md/md.h | 2 +-
>   2 files changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index ac85ec73a409..3ec4abf02fa0 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -1002,6 +1002,7 @@ static void super_written(struct bio *bio)
>   		md_error(mddev, rdev);
>   		if (!test_bit(Faulty, &rdev->flags)
>   		    && (bio->bi_opf & MD_FAILFAST)) {
> +			clear_bit(MD_BROKEN, &mddev->flags);
>   			set_bit(MD_SB_NEED_REWRITE, &mddev->sb_flags);
>   			set_bit(LastDev, &rdev->flags);
>   		}
> diff --git a/drivers/md/md.h b/drivers/md/md.h
> index 51af29a03079..2f87bcc5d834 100644
> --- a/drivers/md/md.h
> +++ b/drivers/md/md.h
> @@ -332,7 +332,7 @@ struct md_cluster_operations;
>    *			       resync lock, need to release the lock.
>    * @MD_FAILFAST_SUPPORTED: Using MD_FAILFAST on metadata writes is supported as
>    *			    calls to md_error() will never cause the array to
> - *			    become failed.
> + *			    become failed while fail_last_dev is not set.
>    * @MD_HAS_PPL:  The raid array has PPL feature set.
>    * @MD_HAS_MULTIPLE_PPLS: The raid array has multiple PPLs feature set.
>    * @MD_NOT_READY: do_md_run() is active, so 'array_state', ust not report that
>
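
The flag transitions under discussion can be sketched as a small user-space model. This is only an illustration built from the description in the quoted commit message and the diff above, not kernel code: the struct and function names (model_array, model_rdev, model_md_error, model_super_written_error, broken_after_failfast_failure) are hypothetical, the model assumes fail_last_dev is not set, and the real md_error() path for RAID1/10 does more than is shown here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-ins for the kernel flag bits; not kernel code. */
struct model_array {
	bool broken;          /* stands in for MD_BROKEN in mddev->flags */
	bool sb_need_rewrite; /* stands in for MD_SB_NEED_REWRITE in mddev->sb_flags */
	int working_devs;     /* member devices still in service */
};

struct model_rdev {
	bool faulty;   /* stands in for Faulty in rdev->flags */
	bool last_dev; /* stands in for LastDev in rdev->flags */
};

/*
 * Models md_error() for RAID1/10 as the quoted commit message describes it,
 * assuming fail_last_dev is not set: an error on the last device marks the
 * array broken and leaves that device non-Faulty.
 */
static void model_md_error(struct model_array *a, struct model_rdev *r)
{
	assert(a->working_devs >= 1);
	if (a->working_devs == 1) {
		a->broken = true;
		return;
	}
	r->faulty = true;
	a->working_devs--;
}

/*
 * Models the error branch of super_written() shown in the diff.
 * clear_broken == true selects the behaviour proposed by the patch.
 */
static void model_super_written_error(struct model_array *a,
				      struct model_rdev *r,
				      bool failfast, bool clear_broken)
{
	model_md_error(a, r);
	if (!r->faulty && failfast) {
		if (clear_broken)
			a->broken = false;
		a->sb_need_rewrite = true;
		r->last_dev = true;
	}
}

/* Returns the modelled MD_BROKEN state after a failed failfast metadata
 * write to the only remaining device. */
static bool broken_after_failfast_failure(bool clear_broken)
{
	struct model_array a = { .broken = false, .sb_need_rewrite = false,
				 .working_devs = 1 };
	struct model_rdev r = { .faulty = false, .last_dev = false };

	model_super_written_error(&a, &r, true, clear_broken);
	return a.broken;
}
```

In this model, broken_after_failfast_failure(false) returns true (the behaviour the reply considers expected), while broken_after_failfast_failure(true) returns false (the behaviour the patch proposes). The model says nothing about whether the metadata on the last device is actually consistent afterwards, which is the concern raised in the reply.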