Message-ID: <5a16f8c2-d868-48cf-96c8-a0d99e440ca5@oracle.com>
Date: Wed, 23 Oct 2024 13:11:53 +0100
From: John Garry <john.g.garry@...cle.com>
To: Geoff Back <geoff@...onlair.co.uk>, Yu Kuai <yukuai1@...weicloud.com>,
        axboe@...nel.dk, hch@....de
Cc: linux-block@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-raid@...r.kernel.org, martin.petersen@...cle.com,
        "yangerkun@...wei.com" <yangerkun@...wei.com>,
        "yukuai (C)" <yukuai3@...wei.com>
Subject: Re: [PATCH RFC 5/6] md/raid1: Handle bio_split() errors

On 23/10/2024 12:46, Geoff Back wrote:
>>> Yes, raid1/raid10 writes are the same. If you want to enable atomic
>>> writes for raid1/raid10, you must add a new branch to handle badblocks
>>> now; otherwise, as long as one copy contains any badblocks, the atomic
>>> write will fail, while theoretically I think it could work.
>> Can you please expand on what you mean by that last sentence, "I think
>> it can work"?
>>
>> Indeed, IMO, the chance of encountering a device that both has BBs and
>> supports atomic writes is low, so there is no need to try to make it
>> work (even if it were possible) - I think that we should just report EIO.
>>
>> Thanks,
>> John
>>
>>
> Hi all,
> 
> Looking at this from a different angle: what does the bad blocks system
> actually gain us in modern environments?  All the physical storage
> devices I can think of (including all HDDs and SSDs, NVMe or otherwise)
> have internal mechanisms for remapping faulty blocks, and therefore
> unrecoverable blocks don't become visible at the Linux kernel level
> until after the physical storage device has exhausted its internal
> supply of replacement blocks.  At that point the physical device is
> already catastrophically failing, and in the case of SSDs will likely
> have already transitioned to a read-only state.  Using bad-blocks at
> the kernel level to map around additional faulty blocks then has no
> benefit that I can see, and the device is unlikely to remain even
> marginally usable for any useful length of time anyway.
> 
> It seems to me that the bad-blocks capability is a legacy from the
> distant past, when HDDs did not do internal block remapping and the
> kernel could therefore keep a disk usable by mapping out individual
> blocks in software.
> If that is the case and there isn't some other way in which bad-blocks
> is still beneficial, might it be better to drop it altogether rather
> than implement complex code to work around its effects?

I am not proposing to drop it. That is another topic.

I am just saying that I don't expect BBs on a device which supports 
atomic writes. As such, the solution for that case is simple - for an 
atomic write which covers BBs in any rdev, just error that write.
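
To make that concrete, here is a rough sketch of the idea (illustrative
only, not the actual patch - the atomic_write_covers_badblocks() helper
is hypothetical, and locking/RCU around the rdev access is omitted;
is_badblock() and the r1conf fields are the existing md interfaces):

	/*
	 * Hypothetical helper: returns true if any rdev has a known bad
	 * block inside the range that the atomic write would cover.
	 */
	static bool atomic_write_covers_badblocks(struct r1conf *conf,
						  sector_t sector, int sectors)
	{
		int i;

		for (i = 0; i < conf->raid_disks; i++) {
			struct md_rdev *rdev = conf->mirrors[i].rdev;
			sector_t first_bad;
			int bad_sectors;

			if (rdev && is_badblock(rdev, sector, sectors,
						&first_bad, &bad_sectors))
				return true;
		}
		return false;
	}

	/* In raid1_write_request(), before any bio_split() is attempted: */
	if ((bio->bi_opf & REQ_ATOMIC) &&
	    atomic_write_covers_badblocks(conf, bio->bi_iter.bi_sector,
					  bio_sectors(bio))) {
		bio->bi_status = BLK_STS_IOERR;
		bio_endio(bio);
		return;
	}

i.e. never try to split an atomic write around the bad range - just
complete it with an error.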

> 
> Of course I'm happy to be corrected if there's still a real benefit to
> having it, just because I can't see one doesn't mean there isn't one.

I don't know whether BB support really has any benefit for modern 
devices at all.

Thanks,
John

