linux-ext4 - Re: ext4_clear_journal_err: Filesystem error recorded from previous mount: IO failure

Message-ID: <4CC5EA78.1010005@ddn.com>
Date:	Mon, 25 Oct 2010 22:37:12 +0200
From:	Bernd Schubert <bschubert@....com>
To:	Eric Sandeen <sandeen@...hat.com>
CC:	Andreas Dilger <andreas.dilger@...cle.com>,
	Ric Wheeler <rwheeler@...hat.com>, Ted Ts'o <tytso@....edu>,
	Amir Goldstein <amir73il@...il.com>,
	Bernd Schubert <bs_lists@...ef.fastmail.fm>,
	Ext4 Developers List <linux-ext4@...r.kernel.org>
Subject: Re: ext4_clear_journal_err: Filesystem error recorded from previous
 mount: IO failure

On 10/25/2010 09:43 PM, Eric Sandeen wrote:
> 
> Now, extN has this feature of recording fs errors in the superblock,
> but I'm not sure we distinguish between "errors which require a fsck"
> and others?

That is definitely a good question: is it right to set a generic error
flag if 'only' I/O errors came up?  The problem is that the error flag
comes from ext4_error() and ext4_abort(), which are called all over the
code and which do not distinguish between a plain I/O error and a real
filesystem issue.
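
For reference, here is a minimal userspace sketch of how that generic
flag can be read before mounting. It assumes the standard ext2/3/4
on-disk layout: the primary superblock starts at byte 1024, with s_magic
at offset 0x38 and s_state at offset 0x3A inside it. The function names
(parse_state, read_state) are made up for this illustration and are not
part of e2fsprogs or the kernel.

```python
import struct
import sys

SUPERBLOCK_OFFSET = 1024   # primary superblock starts 1024 bytes into the device
S_MAGIC_OFFSET = 0x38      # little-endian 16-bit s_magic
S_STATE_OFFSET = 0x3A      # little-endian 16-bit s_state, directly after s_magic
EXT_MAGIC = 0xEF53         # shared by ext2, ext3 and ext4
STATE_VALID_FS = 0x0001    # filesystem was cleanly unmounted
STATE_ERROR_FS = 0x0002    # errors detected (the generic error flag)


def parse_state(sb: bytes) -> dict:
    """Decode s_state from a raw superblock buffer (hypothetical helper)."""
    if len(sb) < S_STATE_OFFSET + 2:
        raise ValueError("superblock buffer too short")
    # s_magic and s_state are adjacent, so one unpack reads both fields.
    magic, state = struct.unpack_from("<HH", sb, S_MAGIC_OFFSET)
    if magic != EXT_MAGIC:
        raise ValueError("not an ext2/3/4 superblock (magic 0x%04x)" % magic)
    return {
        "cleanly_unmounted": bool(state & STATE_VALID_FS),
        "errors_recorded": bool(state & STATE_ERROR_FS),
    }


def read_state(device: str) -> dict:
    """Read the primary superblock from a device or image file."""
    with open(device, "rb") as f:
        f.seek(SUPERBLOCK_OFFSET)
        return parse_state(f.read(1024))


if __name__ == "__main__" and len(sys.argv) == 2:
    print(read_state(sys.argv[1]))
```

The flag is a single bit, so a check like this can only say that some
error was recorded. It cannot tell an I/O failure from real metadata
corruption, which is exactly the limitation described above.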

> 
> Anyway your characterization of xfs is wrong, IMHO, it's:
> 
> Mount (possibly replaying the journal) because all should be well,
> we have faith in our hardware and our software.
> If during runtime the fs encounters a severe metadata error, it will
> shut down, and this is your cue to unmount and run xfs_repair, then
> remount.  Doesn't seem backwards to me.  ;)  Requiring that fsck
> prior to the first mount makes no sense for a journaling fs.
> 
> However, Bernd's issue is probably an issue in general with XFS
> as well (which doesn't record error state on-disk) - how to quickly
> know whether the filesystem you're about to mount in a cluster has
> a -known- integrity issue from a previous mount and really does
> require a fsck.
> 
> For XFS, you have to have monitored the previous mount, I guess,
> and watched for any errors the kernel threw when it encountered them.


It would really be helpful if filesystems provided a health file, as
Lustre does.  A generic VFS proc/sys file or ioctl would give us a
common interface for this.  I probably should write a patch for it ;)

Cheers,
Bernd

