Message-id: <20080730211703.GZ3342@webber.adilger.int>
Date: Wed, 30 Jul 2008 15:17:03 -0600
From: Andreas Dilger <adilger@....com>
To: Mike Snitzer <snitzer@...il.com>
Cc: Hidehiro Kawai <hidehiro.kawai.ez@...achi.com>,
akpm@...ux-foundation.org, sct@...hat.com, adilger@...sterfs.com,
linux-kernel@...r.kernel.org, linux-ext4@...r.kernel.org,
jack@...e.cz, jbacik@...hat.com, cmm@...ibm.com, tytso@....edu,
tglx@...utronix.de, yumiko.sugita.yf@...achi.com,
satoshi.oshima.fk@...achi.com
Subject: Re: [PATCH 1/2] ext3: add an option to control error handling on file data

On Jul 30, 2008 11:14 -0400, Mike Snitzer wrote:
> On Tue, Jul 29, 2008 at 10:52 PM, Hidehiro Kawai
> <hidehiro.kawai.ez@...achi.com> wrote:
> > If the journal doesn't abort when it gets an IO error in file data
> > blocks, the file data corruption will spread silently. Because
> > most applications and commands do buffered writes without fsync(),
> > they don't notice the IO error. It's scary for mission critical
> > systems. On the other hand, if the journal aborts whenever it gets
> > an IO error in file data blocks, the system will easily become
> > inoperable. So this patch introduces a filesystem option to
> > determine whether it aborts the journal or just calls printk() when
> > it gets an IO error in file data.
> >
> > If you mount an ext3 fs with the data_err=abort option, it aborts on a
> > file data write error. If you mount it with data_err=ignore, it doesn't
> > abort, it just calls printk(). data_err=abort is the default, because
> > people have used this error handling policy for three years.
>
> Thanks for making this configurable!
>
> But given how surprised many of us were when we found out that
> jbd/ext3 has been aborting on file data block errors, isn't this our
> chance to correct that long-standing oversight? Shouldn't the default be
> data_err=ignore? Or would changing this behavior cause more harm than
> good?
>
> I don't feel strongly either way, having the "data_err" option makes
> this issue moot for me, but I figured I'd raise the question (in the
> interest of review).

Yes, good point. I don't think any of the ext3 maintainers were aware
that the three-year-old patch had introduced the "abort on data error"
behaviour. The default for ext4 is only now changing from errors=continue
(as it is on ext2/3) to errors=remount-ro, so I think it is inconsistent
to have the journal abort on data errors when the filesystem itself does
not.
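
For context on the quoted concern that applications doing buffered writes
without fsync() never notice the IO error, here is a minimal user-space
sketch of the alternative: check write(), then fsync(), then close(). The
helper name write_durably() is hypothetical and not part of the patch; it
only uses standard POSIX calls. It assumes that a failed writeback is
reported to the caller by fsync() (typically as EIO), which is the case
the quoted text says most programs miss.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (an illustration, not code from the patch):
 * write len bytes to path and force them to stable storage.
 * Returns 0 on success, -1 on failure with errno set.
 */
static int write_durably(const char *path, const void *buf, size_t len)
{
    const char *p = buf;
    int saved;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

    if (fd < 0)
        return -1;

    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before writing, retry */
            goto fail;
        }
        p += n;                 /* handle short writes */
        len -= (size_t)n;
    }

    /*
     * A successful write() usually means the data reached the page
     * cache only. A later writeback error is reported here, so this
     * return value must not be ignored.
     */
    if (fsync(fd) < 0)
        goto fail;

    return close(fd);

fail:
    saved = errno;              /* close() may overwrite errno */
    close(fd);
    errno = saved;
    return -1;
}
```

A caller that treats a -1 return as "data not safely written" sees the
error regardless of whether the filesystem is mounted with data_err=abort
or data_err=ignore; the mount option matters most for programs that never
make the fsync() call at all.
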

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.