Message-ID: <e88ac955-9733-4e57-830b-d326557d189a@fwd.mgml.me>
Date: Sat, 20 Sep 2025 15:30:29 +0900
From: Kenta Akagi <k@....mgml.me>
To: Yu Kuai <hailan@...uai.org.cn>, yukuai1@...weicloud.com, song@...nel.org,
        mtkaczyk@...nel.org, shli@...com, jgq516@...il.com
Cc: linux-raid@...r.kernel.org, linux-kernel@...r.kernel.org,
        yukuai3@...wei.com, k@....mgml.me
Subject: Re: [PATCH v4 4/9] md/raid1,raid10: Don't set MD_BROKEN on failfast
 bio failure

Hi,

I have changed my email address because our primary MX server
suddenly started rejecting non-DKIM mail.

On 2025/09/19 10:36, Yu Kuai wrote:
> Hi,
> 
> On 2025/9/18 23:22, Kenta Akagi wrote:
>>>> @@ -470,7 +470,7 @@ static void raid1_end_write_request(struct bio *bio)
>>>>                (bio->bi_opf & MD_FAILFAST) &&
>>>>                /* We never try FailFast to WriteMostly devices */
>>>>                !test_bit(WriteMostly, &rdev->flags)) {
>>>> -            md_error(r1_bio->mddev, rdev);
>>>> +            md_bio_failure_error(r1_bio->mddev, rdev, bio);
>>>>            }
>>> Can the following Faulty check be replaced with the return value?
>> In the case where raid1_end_write_request is called for a non-failfast IO,
>> and the rdev has already been marked Faulty by another bio, it must not retry either.
>> I think it would be simpler not to use a return value here.
> 
> You can just add the Faulty check inside md_bio_failure_error() as well, along
> with both the failfast and writemostly checks.

Sorry, I'm not sure I understand this part. 
In raid1_end_write_request, this code path is also used for a regular bio,
not only for FailFast.

Do you mean that md_bio_failure_error should be changed as follows?
* If the rdev is Faulty, immediately return true.
* If the given bio is Failfast and the rdev is not the lastdev, call md_error.
* If the given bio is not Failfast, do nothing and return false.
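
The three rules above could be sketched in plain userspace C roughly as
follows. This is only an illustration under assumptions: the struct fields,
sketch_md_error() and sketch_bio_failure_error() are hypothetical stand-ins,
not the kernel's types and not the actual patch. Folding the WriteMostly test
into the helper follows the suggestion quoted above, and the return value for
a failfast bio on the lastdev (false here) is a guess, since the list does
not spell it out.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-ins; not the kernel's md_rdev or bio. */
struct sketch_rdev {
	bool faulty;       /* models test_bit(Faulty, &rdev->flags) */
	bool failfast;     /* models test_bit(FailFast, &rdev->flags) */
	bool write_mostly; /* models test_bit(WriteMostly, &rdev->flags) */
	bool last_dev;     /* assumed: this rdev is the last working device */
};

struct sketch_bio {
	bool failfast;     /* models bio->bi_opf & MD_FAILFAST */
};

/* Stand-in for md_error(): here it only marks the device Faulty. */
static void sketch_md_error(struct sketch_rdev *rdev)
{
	rdev->faulty = true;
}

/*
 * Returns true when the caller may treat the rdev as Faulty and skip
 * write-error handling, false when it should set R1BIO_WriteError.
 */
static bool sketch_bio_failure_error(struct sketch_rdev *rdev,
				     const struct sketch_bio *bio)
{
	/* Rule 1: the rdev is already Faulty, return true immediately. */
	if (rdev->faulty)
		return true;

	/* Rule 3: not a failfast bio, do nothing and return false. */
	if (!(rdev->failfast && bio->failfast && !rdev->write_mostly))
		return false;

	/* Rule 2: failfast bio and not the lastdev, call md_error. */
	if (!rdev->last_dev)
		sketch_md_error(rdev);

	/* Assumption: report whether the rdev ended up Faulty. */
	return rdev->faulty;
}
```

With a helper shaped like this, the caller would need only a single test of
the return value, which is what makes the separate Faulty check redundant.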

And then apply this?
This is complicated. Wouldn't it be better to keep the Faulty check as it is?

@@ -466,18 +466,12 @@ static void raid1_end_write_request(struct bio *bio)
                        set_bit(MD_RECOVERY_NEEDED, &
                                conf->mddev->recovery);

-               if (test_bit(FailFast, &rdev->flags) &&
-                   (bio->bi_opf & MD_FAILFAST) &&
-                   /* We never try FailFast to WriteMostly devices */
-                   !test_bit(WriteMostly, &rdev->flags)) {
-                       md_error(r1_bio->mddev, rdev);
-               }
-
                /*
                 * When the device is faulty, it is not necessary to
                 * handle write error.
                 */
-               if (!test_bit(Faulty, &rdev->flags))
+               if (!test_bit(Faulty, &rdev->flags) ||
+                   !md_bio_failure_error(r1_bio->mddev, rdev, bio))
                        set_bit(R1BIO_WriteError, &r1_bio->state);
                else {
                        /* Finished with this branch */


Or do you mean a fix like this?

@@ -466,23 +466,24 @@ static void raid1_end_write_request(struct bio *bio)
                        set_bit(MD_RECOVERY_NEEDED, &
                                conf->mddev->recovery);

-               if (test_bit(FailFast, &rdev->flags) &&
-                   (bio->bi_opf & MD_FAILFAST) &&
-                   /* We never try FailFast to WriteMostly devices */
-                   !test_bit(WriteMostly, &rdev->flags)) {
-                       md_error(r1_bio->mddev, rdev);
-               }
-
                /*
                 * When the device is faulty, it is not necessary to
                 * handle write error.
                 */
-               if (!test_bit(Faulty, &rdev->flags))
-                       set_bit(R1BIO_WriteError, &r1_bio->state);
-               else {
+               if (test_bit(Faulty, &rdev->flags) ||
+                   (
+                   test_bit(FailFast, &rdev->flags) &&
+                   (bio->bi_opf & MD_FAILFAST) &&
+                   /* We never try FailFast to WriteMostly devices */
+                   !test_bit(WriteMostly, &rdev->flags) &&
+                   md_bio_failure_error(r1_bio->mddev, rdev, bio)
+                   )
+               ) {
                        /* Finished with this branch */
                        r1_bio->bios[mirror] = NULL;
                        to_put = bio;
+               } else {
+                       set_bit(R1BIO_WriteError, &r1_bio->state);
                }
        } else {
                /*
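
To make the branch condition of this second variant easier to check, here is
a small userspace sketch of it as a standalone predicate. The struct and
function names are hypothetical, and helper_result merely stands in for
whatever md_bio_failure_error() would return; it is not a statement about the
real helper.

```c
#include <stdbool.h>

/* Hypothetical inputs, each modelling one test from the diff above. */
struct write_fail_ctx {
	bool rdev_faulty;       /* test_bit(Faulty, &rdev->flags) */
	bool rdev_failfast;     /* test_bit(FailFast, &rdev->flags) */
	bool rdev_write_mostly; /* test_bit(WriteMostly, &rdev->flags) */
	bool bio_failfast;      /* bio->bi_opf & MD_FAILFAST */
	bool helper_result;     /* assumed result of md_bio_failure_error() */
	int helper_calls;       /* how many times the helper was evaluated */
};

/* Stand-in for the md_bio_failure_error() call; counts evaluations. */
static bool call_helper(struct write_fail_ctx *c)
{
	c->helper_calls++;
	return c->helper_result;
}

/* true: "Finished with this branch"; false: set R1BIO_WriteError. */
static bool finished_with_branch(struct write_fail_ctx *c)
{
	return c->rdev_faulty ||
	       (c->rdev_failfast && c->bio_failfast &&
		!c->rdev_write_mostly && call_helper(c));
}
```

Because of short-circuit evaluation, the helper is reached only for a
failfast bio on a FailFast, non-WriteMostly rdev that is not already Faulty.
A regular bio on a healthy rdev falls through to R1BIO_WriteError without
calling it.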

Thanks,
Akagi

> Thanks,
> Kuai
> 
> 
>