linux-ext4 - Re: [PATCH] ext4: Return EIO on read error in ext4_find

Message-Id: <54BEB476-F6E0-4421-B381-92442457910F@dilger.ca>
Date:   Fri, 23 Jun 2017 17:34:23 -0600
From:   Andreas Dilger <adilger@...ger.ca>
To:     Theodore Ts'o <tytso@....edu>
Cc:     Khazhismel Kumykov <khazhy@...gle.com>,
        linux-ext4 <linux-ext4@...r.kernel.org>,
        lkml <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] ext4: Return EIO on read error in ext4_find_entry

On Jun 23, 2017, at 5:26 PM, Theodore Ts'o <tytso@....edu> wrote:
> 
> On Fri, Jun 23, 2017 at 03:33:46PM -0700, Khazhismel Kumykov wrote:
>> 
>> Giving up early or checking future blocks both work, critical thing
>> here is not returning NULL after seeing a read error.
>> Previously to this the behavior was to continue to check future blocks
>> after a read error, and it seemed OK.
> 
> Whether or not it is OK probably depends on how big the directory is.
> If we need to suffer through N long error retries, whether they are
> caused by long SCSI error retries or long iSCSI error retries, sooner
> or later it's going to be problematic if the process which is taking
> forever to search through the whole directory is watched by some kind
> of health monitoring service or other watchdog timer.

I think this is a problem regardless of what the filesystem does: if the
block device is broken, there will be a lot of retries and/or errors.  I
agree it doesn't make sense to return a benign error like "ENOENT" if
there are IO errors.

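To illustrate the behaviour being agreed on, here is a minimal userspace
sketch, not the actual ext4_find_entry() code.  The struct dir_block type,
the read_fails flag, and find_entry() are hypothetical names made up for the
example.  It keeps searching later blocks after a read error, but reports
-EIO rather than -ENOENT if the name was not found and any block was
unreadable:

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory model of a directory block (not ext4's format). */
struct dir_block {
    int read_fails;        /* nonzero simulates an I/O error on this block */
    const char *names[4];  /* entry names; unused slots are NULL */
};

/*
 * Search all blocks for `name`.
 * Returns the index of the block containing the name on success,
 * -EIO if the name was not found and at least one block was unreadable,
 * -ENOENT only if every block was read and none contained the name.
 */
int find_entry(const struct dir_block *blocks, size_t nblocks,
               const char *name)
{
    int saw_io_error = 0;

    for (size_t i = 0; i < nblocks; i++) {
        if (blocks[i].read_fails) {
            saw_io_error = 1;  /* remember the failure, keep searching */
            continue;
        }
        for (size_t j = 0; j < 4 && blocks[i].names[j] != NULL; j++) {
            if (strcmp(blocks[i].names[j], name) == 0)
                return (int)i;
        }
    }
    /* Never report "no such entry" when part of the directory is unknown. */
    return saw_io_error ? -EIO : -ENOENT;
}
```

A "give up early" variant would return -EIO at the first unreadable block
instead of setting the flag.  Either way the caller cannot mistake an
unreadable directory for a missing file.
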
> Still, I agree that there will be some cases where instead of "Fast
> fail", having the file server try as hard as possible to fetch the file
> from the failing disk is worthwhile.  I tend to be focused on the
> cluster file system case where, if it's going to take several hundred
> milliseconds to fetch the file, you're better off getting it from one
> of the other replicated copies on another server, or starting the
> reed-solomon reconstruction.

Sure, but that is a problem independent of the readdir case I think?

> However, if you have an
> architecture where the only copy of the file is on the particular file
> server (perhaps because you are depending on RAID instead of n=3
> replication or reed-solomon erasure codes), having the file server try
> as hard as possible to find the file is a good thing.
> 
> I wonder if the right answer is to have "fastfail" and "nofastfail"
> mount option.

Wouldn't it just make sense to mount the filesystem with "errors=remount-ro"
or "errors=panic" in your case, where you can easily give up on a single node
if it detects device-level errors, rather than "errors=continue" as it seems
you currently have?  This is what we do in HA environments: we fail the
storage over to a backup server in case the problem is with the node, SCSI
cards, cables, etc. and not the disk, and we disable further automatic
failback to avoid node ping-pong if there is actually a media error.
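
As a rough sketch of selecting one of these behaviours programmatically, the
following passes an "errors=" option to mount(2) on Linux.  The helper names
errors_option() and mount_ext4_with_policy() are made up for illustration,
the device and mount point are whatever the caller supplies, and the call
needs root privileges:

```c
#include <stddef.h>
#include <string.h>
#include <sys/mount.h>

/*
 * Map a policy name to the ext4 "errors=" mount option string.
 * Returns NULL for anything other than the three behaviours named above.
 */
const char *errors_option(const char *policy)
{
    if (strcmp(policy, "continue") == 0)
        return "errors=continue";
    if (strcmp(policy, "remount-ro") == 0)
        return "errors=remount-ro";
    if (strcmp(policy, "panic") == 0)
        return "errors=panic";
    return NULL;
}

/*
 * Mount an ext4 filesystem with the chosen error policy.
 * dev and dir are caller-supplied placeholders.  Returns -1 for an
 * unknown policy, otherwise whatever mount(2) returns.
 */
int mount_ext4_with_policy(const char *dev, const char *dir,
                           const char *policy)
{
    const char *opts = errors_option(policy);

    if (opts == NULL)
        return -1;
    return mount(dev, dir, "ext4", 0, opts);
}
```

The same option string can be given with "-o" on the mount command line or
in the options field of /etc/fstab.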

Cheers, Andreas





