Message-Id: <201010222054.42083.bs_lists@aakef.fastmail.fm>
Date: Fri, 22 Oct 2010 20:54:41 +0200
From: Bernd Schubert <bs_lists@...ef.fastmail.fm>
To: "Ted Ts'o" <tytso@....edu>
Cc: linux-ext4@...r.kernel.org, Bernd Schubert <bschubert@....com>
Subject: Re: ext4_clear_journal_err: Filesystem error recorded from previous mount: IO failure
On Friday, October 22, 2010, Ted Ts'o wrote:
> On Fri, Oct 22, 2010 at 07:42:49PM +0200, Bernd Schubert wrote:
> > No, it is far more difficult than that. The devices are managed by
> > pacemaker. Which means: I/O errors come up -> Lustre complains
> > about that in its proc file. Pacemaker monitoring fails, so
> > pacemaker stops the device and starts it again.
>
> I'm not sure what errors you're referring to, but if the errors are
There are multiple ways for Lustre to tell you that there is a problem.
Errors related to the underlying filesystem are just one of many.
> related to file system inconsistencies, by definition umounting and
> re-mounting isn't going to fix things, and could result in more
> damage. For certain errors, you really do need to run e2fsck before
> remounting the device.
Yes, and that is exactly why I'm asking for another mount option that refuses
the mount when the filesystem itself already knows it has recorded errors.
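To illustrate the kind of check meant here, the following is only a userspace
sketch, not the proposed mount option and not kernel code. It assumes the
standard ext2/3/4 on-disk layout (primary superblock at byte 1024, s_magic at
offset 0x38, s_state at offset 0x3A, error flag 0x0002). The function names
are made up for this example. It also looks only at the filesystem superblock,
so an error that is so far recorded only in the journal would not be seen.

```python
import struct

SUPERBLOCK_OFFSET = 1024   # primary superblock starts at byte 1024
S_MAGIC_OFFSET = 0x38      # s_magic (le16), immediately followed by s_state (le16)
EXT4_SUPER_MAGIC = 0xEF53
EXT4_ERROR_FS = 0x0002     # s_state bit: errors detected


def read_fs_state(path):
    """Return the raw s_state field of the ext2/3/4 superblock at `path`."""
    with open(path, "rb") as dev:
        dev.seek(SUPERBLOCK_OFFSET + S_MAGIC_OFFSET)
        raw = dev.read(4)
    if len(raw) != 4:
        raise ValueError("device or image too short to hold a superblock")
    magic, state = struct.unpack("<HH", raw)
    if magic != EXT4_SUPER_MAGIC:
        raise ValueError("no ext2/3/4 superblock magic found")
    return state


def has_recorded_errors(path):
    """True if the superblock says errors were detected (e2fsck is due)."""
    return bool(read_fs_state(path) & EXT4_ERROR_FS)
```

A cluster resource agent could call has_recorded_errors() on the block device
before mounting and refuse to start the resource if it returns True, which is
roughly the behaviour the mount option would provide inside the kernel.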
>
> Can you not change pacemaker to stop the device, run e2fsck, and then
> remount the file system?
I am sure I could spend the next 4 weeks writing code that would make that
possible with Lustre and pacemaker. But at the same time, it seems far easier
to add another mount flag to ext4...
I also cannot simply set max_failcount=1 in pacemaker, as that would be
completely against the HA concept. There are so many ways to increase the
failcount, for example Lustre bugs (ext4 unrelated), pacemaker bugs, human
errors (something missing on one node, but available on another), etc. A few
failures (ext4 unrelated) are absolutely 'normal' over a couple of months and
there is no reason not to allow that.
I'm not asking you to implement another feature; I'm asking whether a patch
that adds a new option would be accepted. I also cannot promise to implement
it any time soon, given that I will leave DDN at the end of November. But it
seems to be an option that is useful for everyone, including on my desktop. So
I will either do it over the next 4 weeks when I find a minute, or during
x-mas or so.
Thanks,
Bernd
--
Bernd Schubert
DataDirect Networks