lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 31 May 2023 09:22:54 +0800
From:   Yu Kuai <yukuai1@...weicloud.com>
To:     Guoqing Jiang <guoqing.jiang@...ux.dev>,
        Yu Kuai <yukuai1@...weicloud.com>, song@...nel.org,
        pmenzel@...gen.mpg.de
Cc:     linux-raid@...r.kernel.org, linux-kernel@...r.kernel.org,
        yi.zhang@...wei.com, yangerkun@...wei.com,
        "yukuai (C)" <yukuai3@...wei.com>
Subject: Re: [PATCH v2] md/raid5: don't allow concurrent reshape with recovery

Hi,

在 2023/05/31 9:06, Guoqing Jiang 写道:
> 
> 
> On 5/29/23 21:34, Yu Kuai wrote:
>> From: Yu Kuai <yukuai3@...wei.com>
>>
>> Commit 0aecb06e2249 ("md/raid5: don't allow replacement while reshape
>> is in progress") fixes that replacement can be set if reshape is
>> interrupted, which will cause that array can't be assembled.
>>
>> There is a similar problem on the other side, if recovery is
>> interrupted, then reshape can start, which will cause the same problem.
>>
>> Fix the problem by not starting to reshape while recovery is still in
>> progress.
>>
>> Signed-off-by: Yu Kuai <yukuai3@...wei.com>
>> ---
>> Changes in v2:
>>   - fix some typo in commit message.
>>
>>   drivers/md/raid5.c | 8 ++++++++
>>   1 file changed, 8 insertions(+)
>>
>> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
>> index 8686d629e3f2..6615abf54d3f 100644
>> --- a/drivers/md/raid5.c
>> +++ b/drivers/md/raid5.c
>> @@ -8525,6 +8525,7 @@ static int raid5_start_reshape(struct mddev *mddev)
>>       struct r5conf *conf = mddev->private;
>>       struct md_rdev *rdev;
>>       int spares = 0;
>> +    int i;
>>       unsigned long flags;
>>       if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery))
>> @@ -8536,6 +8537,13 @@ static int raid5_start_reshape(struct mddev 
>> *mddev)
>>       if (has_failed(conf))
>>           return -EINVAL;
>> +    /* raid5 can't handle concurrent reshape and recovery */
>> +    if (mddev->recovery_cp < MaxSector)
>> +        return -EBUSY;
>> +    for (i = 0; i < conf->raid_disks; i++)
>> +        if (rdev_mdlock_deref(mddev, conf->disks[i].replacement))
>> +            return -EBUSY;
>> +
> 
> Does it mean reshape and recovery  can happen in parallel without the 
> change?
> I really doubt about it given any kind of internal io (resync, reshape 
> and recovery)
> is handled by resync thread. And IIUC either md_do_sync or 
> md_check_recovery
> should avoid it, no need to do it in personality layer.
> 

They can't, in this case recovery is interrupted, then recovery can't
make progress, and md_check_recovery() will start reshape, and after
reshape is done, recovery will continue, and data will be corrupted
because raid456 reshape doesn't handle replacement.

And by the way in raid456 is that if system reboot, this array can't be
assembled, raid5_run() will fail if reshape and replacement are both
set.

Thanks,
Kuai

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ