Message-ID: <0106019957e1b14a-9a2f2aec-fca2-40f9-9c65-bcdabac6299b-000000@ap-northeast-1.amazonses.com>
Date: Wed, 17 Sep 2025 13:33:52 +0000
From: Kenta Akagi <k@...l.me>
To: linan666@...weicloud.com, song@...nel.org, yukuai3@...wei.com, 
	mtkaczyk@...nel.org, shli@...com, jgq516@...il.com
Cc: linux-raid@...r.kernel.org, linux-kernel@...r.kernel.org, k@...l.me
Subject: Re: [PATCH v4 6/9] md/raid1,raid10: Fix missing retries Failfast
 write bios on no-bbl rdevs



On 2025/09/17 19:06, Li Nan wrote:
> 
> 
> On 2025/9/15 11:42, Kenta Akagi wrote:
>> In the current implementation, write failures are not retried on rdevs
>> with badblocks disabled. This is because narrow_write_error, which issues
>> retry bios, immediately returns when badblocks are disabled. As a result,
>> a single write failure on such an rdev will immediately mark it as Faulty.
>>
> 
> IMO, there's no need to add extra logic for scenarios where badblocks
> is not enabled. Do you have real-world scenarios where badblocks is
> disabled?

No, badblocks are enabled in my environment.
I'm fine with not adding it, but I still think it's worth adding a WARN_ON like:

@@ -2553,13 +2554,17 @@ static bool narrow_write_error(struct r1bio *r1_bio, int i)
  fail = true;
+ WARN_ON( (bio->bi_opf & MD_FAILFAST) && (rdev->badblocks.shift < 0) );
  if (!narrow_write_error(r1_bio, m))
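
To make the condition easier to read in isolation, here is a minimal userspace
sketch of the predicate the WARN_ON would check. It is only a model:
MODEL_FAILFAST and model_should_warn() are hypothetical names, and the flag
value is not the kernel's MD_FAILFAST.

```c
#include <stdbool.h>

/* Hypothetical flag value for illustration only; not the kernel's MD_FAILFAST. */
#define MODEL_FAILFAST 0x1u

/*
 * Models the condition in the proposed WARN_ON: a failed write bio carries
 * the failfast flag while the rdev has badblocks disabled (shift < 0).
 */
bool model_should_warn(unsigned int bi_opf, int badblocks_shift)
{
	return (bi_opf & MODEL_FAILFAST) && (badblocks_shift < 0);
}
```

The warning would fire only for the combination of a failfast bio and an rdev
without badblocks; every other combination stays silent.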

What do you think?


Thanks,
Akagi

>> The retry mechanism appears to have been implemented under the assumption
>> that a bad block is involved in the failure. However, the retry after
>> MD_FAILFAST write failure depends on this code, and a Failfast write request
>> may fail for reasons unrelated to bad blocks.
>>
>> Consequently, if failfast is enabled and badblocks are disabled on all
>> rdevs, and all rdevs encounter a failfast write bio failure at the same
>> time, no retries will occur and the entire array can be lost.
>>
>> This commit adds a path in narrow_write_error to retry writes even on rdevs
>> where bad blocks are disabled, and failed bios marked with MD_FAILFAST will
>> use this path. For non-failfast cases, the behavior remains unchanged: no
>> retry writes are attempted to rdevs with bad blocks disabled.
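
For illustration only, the behavioural change described above can be modelled
in userspace C. This is a sketch under stated assumptions, not the patch
itself: struct model_rdev and both functions are hypothetical, and a negative
shift stands in for "badblocks disabled".

```c
#include <stdbool.h>

/* Hypothetical stand-in for the relevant rdev state; not the real md struct. */
struct model_rdev {
	int badblocks_shift;	/* < 0 models "badblocks disabled" */
};

/*
 * Described current behaviour: retry bios are issued only when badblocks
 * are enabled, regardless of whether the failed bio was failfast.
 */
bool model_retries_before(const struct model_rdev *rdev, bool failfast)
{
	(void)failfast;		/* the failfast flag plays no role here */
	return rdev->badblocks_shift >= 0;
}

/*
 * Described behaviour with the patch: failfast bios are also retried on
 * rdevs with badblocks disabled; non-failfast behaviour is unchanged.
 */
bool model_retries_after(const struct model_rdev *rdev, bool failfast)
{
	return rdev->badblocks_shift >= 0 || failfast;
}
```

The two functions differ only for a failfast bio on an rdev with badblocks
disabled, which is the case the commit message is concerned with.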
>>
>> Fixes: 1919cbb23bf1 ("md/raid10: add failfast handling for writes.")
>> Fixes: 212e7eb7a340 ("md/raid1: add failfast handling for writes.")
>> Signed-off-by: Kenta Akagi <k@...l.me>
>> ---
>>   drivers/md/raid1.c  | 44 +++++++++++++++++++++++++++++---------------
>>   drivers/md/raid10.c | 37 ++++++++++++++++++++++++-------------
>>   2 files changed, 53 insertions(+), 28 deletions(-)
>>                  !test_bit(Faulty, &rdev->flags) &&
> 
> -- 
> Thanks,
> Nan
> 
> 

