linux-ext4 - Re: ext4_clear_journal_err: Filesystem error recorded from previous mount: IO failure


Message-Id: <201010221942.49915.bs_lists@aakef.fastmail.fm>
Date:	Fri, 22 Oct 2010 19:42:49 +0200
From:	Bernd Schubert <bs_lists@...ef.fastmail.fm>
To:	"Ted Ts'o" <tytso@....edu>
Cc:	linux-ext4@...r.kernel.org, Bernd Schubert <bschubert@....com>
Subject: Re: ext4_clear_journal_err: Filesystem error recorded from previous mount: IO failure

On Friday, October 22, 2010, Ted Ts'o wrote:
> On Fri, Oct 22, 2010 at 03:33:29PM +0200, Bernd Schubert wrote:
> > is it really a good idea to allow the filesystem to mount if something
> > like that comes up? I really would prefer if mount would abort.
> > 
> > Oct 22 12:37:36 vm7 kernel: [ 1227.814294] LDISKFS-fs warning (device sfa0074): ldiskfs_clear_journal_err: Filesystem error recorded from previous mount: IO failure
> > Oct 22 12:37:36 vm7 kernel: [ 1227.814314] LDISKFS-fs warning (device sfa0074): ldiskfs_clear_journal_err: Marking fs in need of filesystem check.
> > 
> > (please ignore "ldiskfs", it was just renamed to that by Lustre, but is
> > ext4 based as in RHEL5.5, so 2.6.32-ish).
> 
> Did you try running e2fsck first?  If it detects the error after
> running the journal, it will run the file system check right then and
> there.  If it doesn't, it's a bug.  If you're not running e2fsck

I *think* I got those messages at least once even though I had run e2fsck,
but I'm not sure.

> first, and the filesystem had previously detected inconsistencies, the
> long-standing tradition is to allow that, since root should know what
> it's doing.

No, it is far more difficult than that. The devices are managed by pacemaker,
which means: I/O errors come up -> Lustre complains about that in its proc
file. Pacemaker monitoring fails, so pacemaker stops the device and starts it
again. If that does not succeed, it tries to start it on the fail-over system.
I also cannot tell pacemaker not to try to restart after an error, as that
would completely defeat the purpose of an HA solution.
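
To illustrate the kind of check a restart path like this could make before
mounting, here is a minimal sketch. It is not taken from pacemaker or Lustre;
the function names are hypothetical, and treating "errors corrected" as safe
to mount is an assumed policy. The exit status bits are the ones documented in
the e2fsck(8) man page.

```python
import subprocess

# Exit status bits as documented in e2fsck(8).
E2FSCK_BITS = {
    1: "errors corrected",
    2: "errors corrected, reboot recommended",
    4: "errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "canceled by user request",
    128: "shared library error",
}


def decode_e2fsck_status(status):
    """Return the conditions encoded in an e2fsck exit status."""
    return [text for bit, text in E2FSCK_BITS.items() if status & bit]


def safe_to_mount(status):
    """Assumed policy: mount only if nothing was found or all was repaired."""
    # Bits 1 and 2 mean every detected error was corrected.
    # Any other bit set means the check failed or left errors behind.
    return status & ~(1 | 2) == 0


def check_device(device):
    """Hypothetical wrapper: run 'e2fsck -p' and return its exit status."""
    return subprocess.run(["e2fsck", "-p", device]).returncode
```

A resource agent's start action could call check_device() first and refuse to
mount (and to fail over) when safe_to_mount() returns False, instead of
mounting a filesystem that is already marked as needing a check.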

> 
> And there are times when you do want to mount a filesystem with known
> errors; for example, in the case of the root file system, we have
> always allowed a read-only mount to continue, so that we can run
> e2fsck without requiring a rescue CD 99% of the time.

Yes, it seems a mount option is missing here. 


Thanks,
Bernd


-- 
Bernd Schubert
DataDirect Networks