[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-id: <20080604215834.GA2961@webber.adilger.int>
Date: Wed, 04 Jun 2008 15:58:34 -0600
From: Andreas Dilger <adilger@....com>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: Jan Kara <jack@...e.cz>,
Hidehiro Kawai <hidehiro.kawai.ez@...achi.com>, sct@...hat.com,
linux-kernel@...r.kernel.org, linux-ext4@...r.kernel.org,
jbacik@...hat.com, cmm@...ibm.com, tytso@....edu,
yumiko.sugita.yf@...achi.com, satoshi.oshima.fk@...achi.com
Subject: Re: [PATCH 1/5] jbd: strictly check for write errors on data buffers
On Jun 04, 2008 11:19 -0700, Andrew Morton wrote:
> On Wed, 4 Jun 2008 12:19:25 +0200 Jan Kara <jack@...e.cz> wrote:
>
> > On Tue 03-06-08 15:30:50, Andrew Morton wrote:
> > > On Mon, 02 Jun 2008 19:43:57 +0900
> > > Hidehiro Kawai <hidehiro.kawai.ez@...achi.com> wrote:
> > >
> > > >
> > > > In ordered mode, we should abort journaling when an I/O error has
> > > > occurred on a file data buffer in the committing transaction.
> > >
> > > Why should we do that?
> > I see two reasons:
> > 1) If fs below us is returning IO errors, we don't really know how severe
> > it is so it's safest to stop accepting writes. Also user notices the
> > problem early this way. I agree that with the growing size of disks and
> > thus probability of seeing IO error, we should probably think of something
> > cleverer than this but aborting seems better than just doing nothing.
> >
> > 2) If the IO error is just transient (i.e., link to NAS is disconnected for
> > a while), we would silently break ordering mode guarantees (user could be
> > able to see old / uninitialized data).
> >
>
> Does any other filesystem driver turn the fs read-only on the first
> write-IO-error?
>
> It seems like a big policy change to me. For a lot of applications
> it's effectively a complete outage and people might get a bit upset if
> this happens on the first blip from their NAS.
I agree with Andrew. The historical policy of ext2/3/4 is that write
errors for FILE DATA propagate to the application via EIO, regardless
of whether ordered mode is active or not. If filesystem METADATA is
involved, yes this should cause an ext3_error() to be called and the
policy for what to do next is controlled by the administrator.
If the journal is aborted for a file data write error, the node maybe
panics (errors=panic) or is rebooted by the administrator, then e2fsck
is run and no problem is found (which it will not because e2fsck does
not check file data), then all that has been accomplished is to reboot
the node.
Applications should check the error return codes from their writes,
and call fsync on the file before closing it if they are really worried
about whether the data made it to disk safely, otherwise even a read-only
filesystem will not cause the application to notice anything.
Now, if we have a problem where the write error from the journal checkpoint
is not being propagated back to the original buffer, that is a different
issue. For ordered-mode data I don't _think_ that the data buffers are
mirrored when a transaction is closed, so as long as the !buffer_uptodate()
error is propagated back to the caller on fsync() that is enough.
We might also consider having a separate mechanism to handle write failures,
but I don't think that that should be intermixed with the ext3 error handling
for metadata errors.
Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists