Message-ID: <26827.5265.32245.268569@quad.stoffel.home>
Date: Wed, 17 Sep 2025 16:05:37 -0400
From: "John Stoffel" <john@...ffel.org>
To: linan666@...weicloud.com
Cc: song@...nel.org,
yukuai3@...wei.com,
linux-raid@...r.kernel.org,
linux-kernel@...r.kernel.org,
yangerkun@...wei.com,
yi.zhang@...wei.com
Subject: Re: [PATCH] md/raid1: skip recovery of already synced areas
>>>>> "linan666" == linan666 <linan666@...weicloud.com> writes:
> From: Li Nan <linan122@...wei.com>
> When a new disk is added during running recovery, the kernel may
> restart recovery from the beginning of the device and submit write
> io to ranges that have already been synchronized.
Isn't it better to be safe than sorry? If the resync fails for some
reason, how can we be sure the devices really are in sync if you don't
force the re-write?
> Reproduce:
> mdadm -CR /dev/md0 -l1 -n3 /dev/sda missing missing
> mdadm --add /dev/md0 /dev/sdb
> sleep 10
> cat /proc/mdstat # partially synchronized
> mdadm --add /dev/md0 /dev/sdc
> cat /proc/mdstat # start from 0
> iostat 1 sdb sdc # sdb has io, too
> If 'rdev->recovery_offset' is ahead of the current recovery sector,
> read from that device instead of issuing a write. It prevents
> unnecessary writes while still preserving the chance to back up data
> if it is the last copy.
Are you saying that sdb here can continue writing from block N, but
that your change will only force sdc to start writing from block 0?
Your description of the problem isn't really clear.
I think it's because you're using the word device for both the MD
device itself, as well as the underlying device(s) or component(s) of
the MD device.
> Signed-off-by: Li Nan <linan122@...wei.com>
> ---
> drivers/md/raid1.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> index 3e422854cafb..ac5a9b73157a 100644
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -2894,7 +2894,8 @@ static sector_t raid1_sync_request(struct mddev *mddev, sector_t sector_nr,
> test_bit(Faulty, &rdev->flags)) {
> if (i < conf->raid_disks)
> still_degraded = true;
> - } else if (!test_bit(In_sync, &rdev->flags)) {
> + } else if (!test_bit(In_sync, &rdev->flags) &&
> + rdev->recovery_offset <= sector_nr) {
> 			bio->bi_opf = REQ_OP_WRITE;
> 			bio->bi_end_io = end_sync_write;
> write_targets ++;
> @@ -2903,6 +2904,9 @@ static sector_t raid1_sync_request(struct mddev *mddev, sector_t sector_nr,
> sector_t first_bad = MaxSector;
> sector_t bad_sectors;
> + if (!test_bit(In_sync, &rdev->flags))
> + good_sectors = min(rdev->recovery_offset - sector_nr,
> + (u64)good_sectors);
> if (is_badblock(rdev, sector_nr, good_sectors,
> &first_bad, &bad_sectors)) {
> if (first_bad > sector_nr)
> --
> 2.39.2