Message-ID: <59a46919-6c6d-46cb-1fe4-5ded849617e1@huaweicloud.com>
Date: Mon, 23 Sep 2024 16:18:34 +0800
From: Yu Kuai <yukuai1@...weicloud.com>
To: John Garry <john.g.garry@...cle.com>, Yu Kuai <yukuai1@...weicloud.com>,
 axboe@...nel.dk, hch@....de
Cc: linux-block@...r.kernel.org, linux-kernel@...r.kernel.org,
 linux-raid@...r.kernel.org, martin.petersen@...cle.com,
 "yangerkun@...wei.com" <yangerkun@...wei.com>,
 "yukuai (C)" <yukuai3@...wei.com>
Subject: Re: [PATCH RFC 5/6] md/raid1: Handle bio_split() errors

Hi,

On 2024/09/23 15:44, John Garry wrote:
> On 23/09/2024 07:15, Yu Kuai wrote:
>>>>
>>>> This way, BLK_STS_IOERR will always be returned, perhaps what you want
>>>> is to return the error code from bio_split()?
>>>
>>> Yeah, I would like to return that error code, so maybe I can encode 
>>> it in the master_bio directly or pass as an arg to raid_end_bio_io().
>>
>> That's fine, however, I think the change can introduce problems in some
>> corner cases, for example there is a rdev with badblocks and a slow rdev
>> with full copy. Currently raid1_read_request() will split this bio to
>> read some from fast rdev, and read the badblocks region from slow rdev.
>>
>> We need a new branch in read_balance() to choose a rdev with full copy.
> 
> Sure, I do realize that the mirror'ing personalities need more 
> sophisticated error handling changes (than what I presented).
> 
> However, in raid1_read_request() we do the read_balance() and then the 
> bio_split() attempt. So what are you suggesting we do for the 
> bio_split() error? Is it to retry without the bio_split()?
> 
> To me bio_split() should not fail. If it does, it is likely ENOMEM or 
> some other bug being exposed, so I am not sure that retrying with 
> skipping bio_split() is the right approach (if that is what you are 
> suggesting).

bio_split_to_limits() is already called from md_submit_bio(), so here the
bio should only be split because of badblocks or resync. We have to
return an error for resync; for badblocks, however, we can still try to
find a rdev without badblocks, so that bio_split() is not needed. In this
case we need to retry and tell read_balance() to skip rdevs with
badblocks.

This can only happen if the full copy only exists on slow disks. It
really is a corner case, and it is not related to the new error path you
introduce for atomic writes. I don't mind this version for now; it is
just something I noticed if bio_split() can fail.
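
To make the retry idea above a bit more concrete, here is a rough
userspace model in C. It is not kernel code and not a patch: struct
model_rdev, the model_*() helpers, the skip_badblocks flag and the
split_fails switch are all made up for illustration, and the real
read_balance() takes no such argument today. The resync case, where an
error has to be returned, is left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a raid1 member; not the kernel's md_rdev. */
struct model_rdev {
	bool slow;       /* e.g. a write-mostly device */
	long bad_start;  /* first bad sector, or -1 if there are none */
	long bad_len;    /* length of the bad range in sectors */
};

/* Sectors readable from @r starting at @sector, at most @len. */
static long readable_sectors(const struct model_rdev *r, long sector, long len)
{
	if (r->bad_start < 0 || r->bad_start >= sector + len ||
	    r->bad_start + r->bad_len <= sector)
		return len;                 /* no overlap with badblocks */
	if (r->bad_start <= sector)
		return 0;                   /* read starts inside badblocks */
	return r->bad_start - sector;       /* readable up to the bad range */
}

/*
 * Pick a device for the read: prefer a fast device, then the longest
 * readable run. With @skip_badblocks set (the assumed new branch), only
 * devices that can serve the whole range are considered.
 */
static int model_read_balance(const struct model_rdev *rdevs, size_t n,
			      long sector, long len, bool skip_badblocks,
			      long *max_sectors)
{
	int best = -1;
	long best_len = 0;

	for (size_t i = 0; i < n; i++) {
		long ok = readable_sectors(&rdevs[i], sector, len);

		if (ok == 0 || (skip_badblocks && ok < len))
			continue;
		if (best < 0 ||
		    (rdevs[best].slow && !rdevs[i].slow) ||
		    (rdevs[best].slow == rdevs[i].slow && ok > best_len)) {
			best = (int)i;
			best_len = ok;
		}
	}
	*max_sectors = best_len;
	return best;
}

/*
 * Model of the read path: balance first; if a split would be needed and
 * the split fails (simulated by @split_fails), retry and skip devices
 * with badblocks. Returns the chosen device index, or -1 on error.
 */
static int model_read_request(const struct model_rdev *rdevs, size_t n,
			      long sector, long len, bool split_fails,
			      long *max_sectors)
{
	int disk = model_read_balance(rdevs, n, sector, len, false, max_sectors);

	if (disk < 0)
		return -1;
	if (*max_sectors < len && split_fails)
		disk = model_read_balance(rdevs, n, sector, len, true,
					  max_sectors);
	return disk;
}
```

With a fast device that has badblocks at sectors 8-11 and a slow device
holding a full copy, a 16-sector read normally goes to the fast device
for 8 sectors (the rest needs a split); if the split fails, the retry
picks the slow device for all 16 sectors. If no device has a full copy,
the retry finds nothing and the read has to fail.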

Thanks,
Kuai

> 
> Thanks,
> John
> 
> .
> 

