[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1501246057.8241.1.camel@redhat.com>
Date: Fri, 28 Jul 2017 08:47:37 -0400
From: Jeff Layton <jlayton@...hat.com>
To: Steven Whitehouse <swhiteho@...hat.com>,
Bob Peterson <rpeterso@...hat.com>
Cc: Matthew Wilcox <willy@...radead.org>,
Jeff Layton <jlayton@...nel.org>,
Alexander Viro <viro@...iv.linux.org.uk>,
Jan Kara <jack@...e.cz>,
"J . Bruce Fields" <bfields@...ldses.org>,
Andrew Morton <akpm@...ux-foundation.org>,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, cluster-devel@...hat.com
Subject: Re: [PATCH v2 4/4] gfs2: convert to errseq_t based writeback error
reporting for fsync
On Fri, 2017-07-28 at 13:37 +0100, Steven Whitehouse wrote:
> Hi,
>
>
> On 27/07/17 13:47, Bob Peterson wrote:
> > ----- Original Message -----
> > > On Wed, 2017-07-26 at 12:21 -0700, Matthew Wilcox wrote:
> > > > On Wed, Jul 26, 2017 at 01:55:38PM -0400, Jeff Layton wrote:
> > > > > @@ -668,12 +668,14 @@ static int gfs2_fsync(struct file *file, loff_t
> > > > > start, loff_t end,
> > > > > if (ret)
> > > > > return ret;
> > > > > if (gfs2_is_jdata(ip))
> > > > > - filemap_write_and_wait(mapping);
> > > > > + ret = file_write_and_wait(file);
> > > > > + if (ret)
> > > > > + return ret;
> > > > > gfs2_ail_flush(ip->i_gl, 1);
> > > > > }
> > > >
> > > > Do we want to skip flushing the AIL if there was an error (possibly
> > > > previously encountered)? I'd think we'd want to flush the AIL then report
> > > > the error, like this:
> > > >
> > >
> > > I wondered about that. Note that earlier in the function, we also bail
> > > out without flushing the AIL if sync_inode_metadata fails, so I assumed
> > > that we'd want to do the same here.
> > >
> > > I could definitely be wrong and am fine with changing it if so.
> > > Discarding the error like we do today seems wrong though.
> > >
> > > Bob, thoughts?
> >
> > Hi Jeff, Matthew,
> >
> > I'm not sure there's a right or wrong answer here. I don't know what's
> > best from a "correctness" point of view.
> >
> > I guess I'm leaning toward Jeff's original solution where we don't
> > call gfs2_ail_flush() on error. The main purpose of ail_flush is to
> > go through buffer descriptors (bds) attached to the glock and generate
> > revokes for them in a new transaction. If there's an error condition,
> > trying to go through more hoops will probably just get us into more
> > trouble. If the error is -ENOMEM, we don't want to allocate new memory
> > for the new transaction. If the error is -EIO, we probably don't
> > want to encourage more writing either.
> >
> > So on the one hand, it might be good to get rid of the buffer descriptors
> > so we don't leak memory, but that's probably also done elsewhere.
> > I have not chased down what happens in that case, but the same thing
> > would happen in the existing -EIO case a few lines above.
> >
> > On the other hand, we probably don't want to start a new transaction
> > and start adding revokes to it, and such, due to the error.
> >
> > Perhaps Steve Whitehouse can weigh in?
> >
> > Regards,
> >
> > Bob Peterson
> > Red Hat File Systems
>
> Yes, we probably do want to skip the ail flush if there is an error. We
> don't know whether the error is permanent or transient at that stage. If
> a previous stage of the fsync has failed, then there may be nothing for
> the next stage to do anyway, so it is probably not a big deal either
> way. So long as the error is reported to the caller, then we should be ok,
>
Ok, cool. I'll plan to carry this patch for now as it depends on an
earlier one in the series. One more question though:
Is it correct in the gfs2_is_jdata case to ignore the range that was
passed in from the caller? ->fsync gets start and end arguments, but
this will always write back the whole range. Is that necessary in this
case?
--
Jeff Layton <jlayton@...hat.com>
Powered by blists - more mailing lists