lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <65ab0815-daf0-c614-7fa5-cab095d513b1@huaweicloud.com>
Date: Tue, 5 Aug 2025 09:15:43 +0800
From: Yu Kuai <yukuai1@...weicloud.com>
To: Xiao Ni <xni@...hat.com>, Yu Kuai <yukuai1@...weicloud.com>
Cc: hch@....de, corbet@....net, song@...nel.org, agk@...hat.com,
 snitzer@...nel.org, mpatocka@...hat.com, linan122@...wei.com, hare@...e.de,
 linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
 linux-raid@...r.kernel.org, dm-devel@...ts.linux.dev, yi.zhang@...wei.com,
 yangerkun@...wei.com, johnny.chenyi@...wei.com,
 "yukuai (C)" <yukuai3@...wei.com>
Subject: Re: [PATCH v5 08/11] md/md-bitmap: add a new method blocks_synced()
 in bitmap_operations

Hi,

在 2025/08/04 17:15, Xiao Ni 写道:
> Hi Kuai
> 
> I found one thing. The interface ->blocks_synced doesn't work in the
> write io path. So there is a risk of data corruption. For example:
> 
> mdadm -CR /dev/md0 -l5 -n5 /dev/loop[0-4] --bitmap=lockless (all bits
> are unwritten because lazy initial recovery)
> 1. D1 D2 D3 D4 P1, a small write hits D2. rmw is 2 (need to read old
> data of D2 and P1), rcw is 3 (need to read D1 D3 and D4).
> 2. D2 disk fails
> 3. read data from disk2. It needs to calculate the data from other
> disks. But the result is not the real data which was written to D2
> 
> So ->blocks_synced needs to be checked in handle_stripe_dirtying.

This is correct, the write path is missing. Will fix it in the next
version.

Thanks,
Kuai

> 
> Best Regards
> Xiao
> 
> On Fri, Aug 1, 2025 at 3:11 PM Yu Kuai <yukuai1@...weicloud.com> wrote:
>>
>> From: Yu Kuai <yukuai3@...wei.com>
>>
>> Currently, raid456 must perform a whole array initial recovery to build
>> initail xor data, then IO to the array won't have to read all the blocks
>> in underlying disks.
>>
>> This behavior will affect IO performance a lot, and nowadays there are
>> huge disks and the initial recovery can take a long time. Hence llbitmap
>> will support lazy initial recovery in following patches. This method is
>> used to check if data blocks is synced or not, if not then IO will still
>> have to read all blocks for raid456.
>>
>> Signed-off-by: Yu Kuai <yukuai3@...wei.com>
>> Reviewed-by: Christoph Hellwig <hch@....de>
>> Reviewed-by: Hannes Reinecke <hare@...e.de>
>> Reviewed-by: Xiao Ni <xni@...hat.com>
>> Reviewed-by: Li Nan <linan122@...wei.com>
>> ---
>>   drivers/md/md-bitmap.h | 1 +
>>   drivers/md/raid5.c     | 6 ++++++
>>   2 files changed, 7 insertions(+)
>>
>> diff --git a/drivers/md/md-bitmap.h b/drivers/md/md-bitmap.h
>> index 95453696c68e..5f41724cbcd8 100644
>> --- a/drivers/md/md-bitmap.h
>> +++ b/drivers/md/md-bitmap.h
>> @@ -90,6 +90,7 @@ struct bitmap_operations {
>>          md_bitmap_fn *end_discard;
>>
>>          sector_t (*skip_sync_blocks)(struct mddev *mddev, sector_t offset);
>> +       bool (*blocks_synced)(struct mddev *mddev, sector_t offset);
>>          bool (*start_sync)(struct mddev *mddev, sector_t offset,
>>                             sector_t *blocks, bool degraded);
>>          void (*end_sync)(struct mddev *mddev, sector_t offset, sector_t *blocks);
>> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
>> index 5285e72341a2..2121f0ff5e30 100644
>> --- a/drivers/md/raid5.c
>> +++ b/drivers/md/raid5.c
>> @@ -3748,6 +3748,7 @@ static int want_replace(struct stripe_head *sh, int disk_idx)
>>   static int need_this_block(struct stripe_head *sh, struct stripe_head_state *s,
>>                             int disk_idx, int disks)
>>   {
>> +       struct mddev *mddev = sh->raid_conf->mddev;
>>          struct r5dev *dev = &sh->dev[disk_idx];
>>          struct r5dev *fdev[2] = { &sh->dev[s->failed_num[0]],
>>                                    &sh->dev[s->failed_num[1]] };
>> @@ -3762,6 +3763,11 @@ static int need_this_block(struct stripe_head *sh, struct stripe_head_state *s,
>>                   */
>>                  return 0;
>>
>> +       /* The initial recover is not done, must read everything */
>> +       if (mddev->bitmap_ops && mddev->bitmap_ops->blocks_synced &&
>> +           !mddev->bitmap_ops->blocks_synced(mddev, sh->sector))
>> +               return 1;
>> +
>>          if (dev->toread ||
>>              (dev->towrite && !test_bit(R5_OVERWRITE, &dev->flags)))
>>                  /* We need this block to directly satisfy a request */
>> --
>> 2.39.2
>>
> 
> 
> .
> 


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ