[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <457865DC.3020608@redhat.com>
Date: Thu, 07 Dec 2006 14:05:00 -0500
From: Wendy Cheng <wcheng@...hat.com>
To: Steven Whitehouse <swhiteho@...hat.com>
CC: Andrew Morton <akpm@...l.org>, cluster-devel@...hat.com,
linux-kernel@...r.kernel.org
Subject: Re: [Cluster-devel] Re: [GFS2] Don't flush everything on fdatasync
[70/70]
Steven Whitehouse wrote:
> Hi,
>
> On Fri, 2006-12-01 at 11:09 -0800, Andrew Morton wrote:
>
>>> I was taking my cue here from ext3 which does something similar. The
>>> filemap_fdatawrite() is done by the VFS before this is called with a
>>> filemap_fdatawait() afterwards. This was intended to flush the metadata
>>> via (eventually) ->write_inode() although I guess I should be calling
>>> write_inode_now() instead?
>>>
>> oh I see, you're jsut trying to write the inode itself, not the pages.
>>
>> write_inode_now() will write the pages, which you seem to not want to do.
>> Whatever. The APIs here are a bit awkward.
>>
>
> I've added a comment to explain whats going on here, and also the
> following patch. I know it could be better, but its still an improvement
> on what was there before,
>
>
>
Steve,
I'm in the middle of something else and don't have upstream kernel
source handy at this moment. But I read akpm's comment as
"write_inode_now" would do writepage and that is *not* what you want (?)
(since vfs has done that before this call is invoked). I vaguely
recalled I did try write_inode_now() on GFS1 once but had to replace it
with "sync_inode" on RHEL4 (for the reason that I can't remember at this
moment). I suggest you keep "sync_inode" (at least for a while until we
can prove other call can do better). This "sync_inode" has been well
tested out (with GFS1's fsync call).
There is another issue. It is a gray area. Note that you don't grab any
glock here ... so if someone *has* written something in other nodes,
this sync could miss it (?). This depends on how people expects a
fsync/fdatasync should behave in a cluster filesystem. GFS1 asks for a
shared lock here so it will force other node to flush the data (I
personally think this is a more correct behavior). Your call though.
-- Wendy
> --------------------------------------------------------------------------------------------
>
> >From 34126f9f41901ca9d7d0031c2b11fc0d6c07b72d Mon Sep 17 00:00:00 2001
> From: Steven Whitehouse <swhiteho@...hat.com>
> Date: Thu, 7 Dec 2006 09:13:14 -0500
> Subject: [PATCH] [GFS2] Change gfs2_fsync() to use write_inode_now()
>
> This is a bit better than the previous version of gfs2_fsync()
> although it would be better still if we were able to call a
> function which only wrote the inode & metadata. Its no big deal
> though that this will potentially write the data as well since
> the VFS has already done that before calling gfs2_fsync(). I've
> also added a comment to explain whats going on here.
>
> Signed-off-by: Steven Whitehouse <swhiteho@...hat.com>
> Cc: Andrew Morton <akpm@...l.org>
> ---
> fs/gfs2/ops_file.c | 11 ++++++-----
> 1 files changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/fs/gfs2/ops_file.c b/fs/gfs2/ops_file.c
> index 7bd971b..b3f1e03 100644
> --- a/fs/gfs2/ops_file.c
> +++ b/fs/gfs2/ops_file.c
> @@ -510,6 +510,11 @@ static int gfs2_close(struct inode *inod
> * is set. For stuffed inodes we must flush the log in order to
> * ensure that all data is on disk.
> *
> + * The call to write_inode_now() is there to write back metadata and
> + * the inode itself. It does also try and write the data, but thats
> + * (hopefully) a no-op due to the VFS having already called filemap_fdatawrite()
> + * for us.
> + *
> * Returns: errno
> */
>
> @@ -518,10 +523,6 @@ static int gfs2_fsync(struct file *file,
> struct inode *inode = dentry->d_inode;
> int sync_state = inode->i_state & (I_DIRTY_SYNC|I_DIRTY_DATASYNC);
> int ret = 0;
> - struct writeback_control wbc = {
> - .sync_mode = WB_SYNC_ALL,
> - .nr_to_write = 0,
> - };
>
> if (gfs2_is_jdata(GFS2_I(inode))) {
> gfs2_log_flush(GFS2_SB(inode), GFS2_I(inode)->i_gl);
> @@ -530,7 +531,7 @@ static int gfs2_fsync(struct file *file,
>
> if (sync_state != 0) {
> if (!datasync)
> - ret = sync_inode(inode, &wbc);
> + ret = write_inode_now(inode, 0);
>
> if (gfs2_is_stuffed(GFS2_I(inode)))
> gfs2_log_flush(GFS2_SB(inode), GFS2_I(inode)->i_gl);
>
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists