linux-ext4 - Re: [PATCH] e2fsprogs: Try again to solve unreliable io case


Message-ID: <cf1aae86-4192-3554-cf6f-21b21ee9f953@huawei.com>
Date:   Mon, 10 May 2021 11:35:59 +0800
From:   Zhiqiang Liu <liuzhiqiang26@...wei.com>
To:     Theodore Ts'o <tytso@....edu>
CC:     Haotian Li <lihaotian9@...wei.com>,
        Ext4 Developers List <linux-ext4@...r.kernel.org>,
        "harshad shirwadkar," <harshadshirwadkar@...il.com>,
        linfeilong <linfeilong@...wei.com>
Subject: Re: [PATCH] e2fsprogs: Try again to solve unreliable io case

friendly ping ...

On 2021/4/24 12:46, Zhiqiang Liu wrote:
>
> On 2021/4/23 23:46, Theodore Ts'o wrote:
>> On Fri, Apr 23, 2021 at 10:18:09AM +0800, Zhiqiang Liu wrote:
>>> Thanks for your reply.
>>> We have actually hit this problem in an IP SAN environment.
>>> When running 'fsck -a <remote-device>', short-term fluctuations or
>>> abnormalities may occur on the network. Although the driver makes
>>> a best effort, some IO errors may still occur. So adding retries in
>>> e2fsprogs can further improve the reliability of the repair
>>> process.
>> But why doesn't this happen when the file system is mounted, and why
>> is that acceptable?   And why not change the driver to do more retries?
>>
>>    		      	      	  - Ted
>>
> Actually, this can also happen when the filesystem is mounted. The difference
> is that a mounted filesystem can rely on the journal to ensure consistency.
>
> For example, if an IO error occurs when calling io_channel_write_byte()
> to update the superblock, the checksum may not be written to disk successfully.
> A checksum error will then occur, and the filesystem cannot be
> repaired with 'fsck -y|a|f'.
>
> This situation has a very low probability. To improve the reliability of
> the repair process, retries in e2fsprogs may be necessary.
>
> Regards
> Zhiqiang Liu.
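
For illustration only, the bounded-retry idea discussed in this thread could be sketched roughly as below. This is not the code from the patch and does not use the libext2fs API: `write_fn`, `write_with_retry`, `struct flaky` and `flaky_write` are hypothetical names, and treating only EIO as a transient error is an assumption made for the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical write callback: returns 0 on success or an errno value. */
typedef int (*write_fn)(void *ctx, const void *buf, size_t len);

/*
 * Call `fn` until it succeeds or `max_tries` attempts have been made.
 * Only EIO is retried (an assumption); any other result, including
 * success, ends the loop. Returns 0 on success, else the last error.
 */
static int write_with_retry(write_fn fn, void *ctx,
                            const void *buf, size_t len, int max_tries)
{
    int err = EINVAL;

    assert(max_tries > 0);
    for (int attempt = 0; attempt < max_tries; attempt++) {
        err = fn(ctx, buf, len);
        if (err != EIO)
            break;
    }
    return err;
}

/* Stand-in for an unreliable device: fails with EIO a fixed number
 * of times, then succeeds. It also counts how often it was called. */
struct flaky {
    int failures_left;
    int calls;
};

static int flaky_write(void *ctx, const void *buf, size_t len)
{
    struct flaky *f = ctx;

    (void)buf;
    (void)len;
    f->calls++;
    if (f->failures_left > 0) {
        f->failures_left--;
        return EIO;
    }
    return 0;
}
```

With a device that fails twice and a limit of five attempts, the wrapper succeeds on the third call. With a limit lower than the number of failures it gives up and returns EIO, so the caller still sees the error.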