linux-kernel - Re: [PATCH RFC 5/6] md/raid1: Handle bio

Message-ID: <0cf7985e-e7ac-4503-827b-eb2a0fd6ef67@oracle.com>
Date: Wed, 23 Oct 2024 12:16:19 +0100
From: John Garry <john.g.garry@...cle.com>
To: Yu Kuai <yukuai1@...weicloud.com>, axboe@...nel.dk, hch@....de
Cc: linux-block@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-raid@...r.kernel.org, martin.petersen@...cle.com,
        "yangerkun@...wei.com" <yangerkun@...wei.com>,
        "yukuai (C)" <yukuai3@...wei.com>
Subject: Re: [PATCH RFC 5/6] md/raid1: Handle bio_split() errors

On 23/09/2024 10:38, Yu Kuai wrote:
>>>>>
>>>>> We need a new branch in read_balance() to choose a rdev with full 
>>>>> copy.
>>>>
>>>> Sure, I do realize that the mirror'ing personalities need more 
>>>> sophisticated error handling changes (than what I presented).
>>>>
>>>> However, in raid1_read_request() we do the read_balance() and then 
>>>> the bio_split() attempt. So what are you suggesting we do for the 
>>>> bio_split() error? Is it to retry without the bio_split()?
>>>>
>>>> To me bio_split() should not fail. If it does, it is likely ENOMEM 
>>>> or some other bug being exposed, so I am not sure that retrying with 
>>>> skipping bio_split() is the right approach (if that is what you are 
>>>> suggesting).
>>>
>>> bio_split_to_limits() is already called from md_submit_bio(), so here
>>> the bio should only be split because of badblocks or resync. We have to
>>> return an error for resync; however, for badblocks, we can still try to
>>> find a rdev without badblocks so that bio_split() is not needed. And we
>>> need to retry and inform read_balance() to skip rdevs with badblocks in
>>> this case.
>>>
>>> This can only happen if the full copy only exists on slow disks. This
>>> really is a corner case, and it is not related to your new error path
>>> for atomic writes. I don't mind this version for now; it is just
>>> something I noticed if bio_split() can fail.
>>

Hi Kuai,

I am just coming back to this topic now.

Previously I was saying that we should error and end the bio if we need 
to split for an atomic write due to bad blocks (BB). Continued below.

>> Are you saying that some improvement needs to be made to the current 
>> code for badblocks handling, like initially try to skip bio_split()?
>>
>> Apart from that, what about the change in raid10_write_request(), 
>> w.r.t error handling?
>>
>> There, I think that we need to do some tidy-up if bio_split() fails, 
>> i.e. undo the increase in rdev->nr_pending made when looping over 
>> conf->copies.
>>
>> BTW, feel free to comment in patch 6/6 for that.
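
To illustrate the tidy-up described above, here is a minimal user-space sketch, not the raid10 code itself. struct toy_rdev, toy_split() and toy_write_request() are hypothetical stand-ins; only the idea of dropping every nr_pending reference taken in the copies loop when the split fails comes from the discussion.

```c
#include <errno.h>

/* Hypothetical stand-in for struct md_rdev; only the counter matters here. */
struct toy_rdev {
    int nr_pending;
};

/* Hypothetical stand-in for bio_split(): fails with -ENOMEM when asked to. */
static int toy_split(int should_fail)
{
    return should_fail ? -ENOMEM : 0;
}

/*
 * Take one reference per copy, then attempt the split. If the split
 * fails, drop every reference taken above before returning the error.
 * On success the references stay held (the real code would drop them
 * when the I/O completes; that part is not modelled here).
 */
static int toy_write_request(struct toy_rdev *rdevs, int copies,
                             int split_fails)
{
    int i, err;

    for (i = 0; i < copies; i++)
        rdevs[i].nr_pending++;

    err = toy_split(split_fails);
    if (err) {
        for (i = 0; i < copies; i++)
            rdevs[i].nr_pending--;
        return err;
    }

    return 0;
}
```

The point of the sketch is only that the error path has to mirror the loop that took the references, so that no rdev is left pinned after a failed split.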
> 
> Yes, raid1/raid10 writes are the same. If you want to enable atomic writes
> for raid1/raid10, you must add a new branch to handle badblocks now;
> otherwise, as long as one copy contains any badblocks, an atomic write
> will fail, while theoretically I think it can work.

Can you please expand on what you mean by this last sentence, "I think 
it can work"?

Indeed, IMO, the chance of encountering a device that both has BBs and 
supports atomic writes is low, so there is no need to try to make it work 
(if it were possible) - I think that we should just report EIO.
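
As a rough user-space sketch of that policy (not the md code): a write whose range intersects a bad-block range would normally need splitting, but an atomic write cannot be split, so it is failed with EIO instead. struct bb_range, overlaps() and check_write() are hypothetical names used only for illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one bad-block range on a member device. */
struct bb_range {
    uint64_t start;   /* first bad sector */
    uint64_t len;     /* number of bad sectors */
};

/* True if [sector, sector + nr) intersects the bad range. */
static bool overlaps(const struct bb_range *bb, uint64_t sector, uint64_t nr)
{
    return sector < bb->start + bb->len && bb->start < sector + nr;
}

/*
 * Returns 0 if the write can be issued whole, 1 if a regular write
 * would need to be split around bad blocks, and -EIO if the write is
 * atomic and would need splitting (an atomic write must not be split).
 */
static int check_write(const struct bb_range *bbs, int nr_bbs,
                       uint64_t sector, uint64_t nr, bool atomic)
{
    for (int i = 0; i < nr_bbs; i++) {
        if (overlaps(&bbs[i], sector, nr))
            return atomic ? -EIO : 1;
    }
    return 0;
}
```

With a single bad range covering sectors 100..107, an atomic write of 16 sectors at sector 96 gets -EIO, while the same write at sector 200 is issued unchanged.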

Thanks,
John