linux-ext4 - Re: [PATCH] ext4: Return EIO on read error in ext4_find

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <97EC040D-FF3D-4921-8CF0-F338542BAA42@dilger.ca>
Date:   Mon, 26 Jun 2017 14:45:56 -0600
From:   Andreas Dilger <adilger@...ger.ca>
To:     Tahsin Erdogan <tahsin@...gle.com>
Cc:     Theodore Ts'o <tytso@....edu>,
        Khazhismel Kumykov <khazhy@...gle.com>,
        linux-ext4 <linux-ext4@...r.kernel.org>,
        lkml <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] ext4: Return EIO on read error in ext4_find_entry

> On Jun 26, 2017, at 1:22 PM, Tahsin Erdogan <tahsin@...gle.com> wrote:
> 
> On Thu, Jun 22, 2017 at 4:23 PM, Khazhismel Kumykov <khazhy@...gle.com> wrote:
>> -                       /* read error, skip block & hope for the best */
>>                        EXT4_ERROR_INODE(dir, "reading directory lblock %lu",
>>                                         (unsigned long) block);
>>                        brelse(bh);
>> -                       goto next;
>> +                       ret = ERR_PTR(-EIO);
>> +                       goto cleanup_and_exit;
> 
> EXT4_ERROR_INODE() triggers ext4_handle_error() which performs error
> handling. I think it should be removed here because we are returning
> the error to the caller and there is nothing more drastic about this
> error than other error return paths in the same function.

The issue is that this represents filesystem corruption, since the
IO error in the leaf block means some files cannot be looked up, and
new files cannot be created in this directory.  Running e2fsck will
repair this problem, so ext4_handle_error() will mark the filesystem
in error so that e2fsck will be run on the next restart and this error
is not lost because it is only returned to the application.

The sysadmin already has the option to declare the behaviour of
ext4_handle_error() - "continue" (which seems to what you are already
running), "remount-ro" (what we are using), or "panic" the node (which
some HA systems use).

To my reading, this patch only applies to the "continue" case, and it
is good to make this handling more robust.  Similar work was done with
the block and inode bitmaps (EXT4_{MB_GRP,GROUP_INFO}_BBITMAP_CORRUPT
and EXT4_{MB_GRP,GROUP_INFO}_IBITMAP_CORRUPT) to skip allocations in
those groups if an error is detected.

Cheers, Andreas

Download attachment "signature.asc" of type "application/pgp-signature" (196 bytes)