Message-ID: <20101125100603.GA3164@amd>
Date: Thu, 25 Nov 2010 21:06:03 +1100
From: Nick Piggin <npiggin@...nel.dk>
To: Boaz Harrosh <bharrosh@...asas.com>
Cc: Nick Piggin <npiggin@...nel.dk>, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-ext4@...r.kernel.org,
Roman Zippel <zippel@...ux-m68k.org>,
"Tigran A. Aivazian" <tigran@...azian.fsnet.co.uk>,
OGAWA Hirofumi <hirofumi@...l.parknet.co.jp>,
Dave Kleikamp <shaggy@...ux.vnet.ibm.com>,
Bob Copeland <me@...copeland.com>,
reiserfs-devel@...r.kernel.org,
Christoph Hellwig <hch@...radead.org>,
Evgeniy Dushistov <dushistov@...l.ru>, Jan Kara <jack@...e.cz>
Subject: Re: [RFC][PATCH] Possible data integrity problems in lots of
filesystems?
On Thu, Nov 25, 2010 at 11:28:14AM +0200, Boaz Harrosh wrote:
> Hi Nick.
> Thanks for digging into this issue, I bet it's causing pain, and
> I totally missed it in my tests. I wish I had better xsync+reboot
> tests for all this.
That's no problem, thanks for looking.
> So in that previous patch you had:
> > Index: linux-2.6/fs/exofs/file.c
> > ===================================================================
> > --- linux-2.6.orig/fs/exofs/file.c 2010-11-19 16:50:00.000000000 +1100
> > +++ linux-2.6/fs/exofs/file.c 2010-11-19 16:50:07.000000000 +1100
> > @@ -48,11 +48,6 @@ static int exofs_file_fsync(struct file
> > struct inode *inode = filp->f_mapping->host;
> > struct super_block *sb;
> >
> > - if (!(inode->i_state & I_DIRTY))
> > - return 0;
> > - if (datasync && !(inode->i_state & I_DIRTY_DATASYNC))
> > - return 0;
> > -
> > ret = sync_inode_metadata(inode, 1);
> >
> > /* This is a good place to write the sb */
> >
>
> Is that a good enough fix for the issue in your opinion?
> Or is there more involved?
For the inode dirty bit race problem, yes it should fix it.
sync_inode_metadata basically makes the same checks without
races (in a subsequent patch I re-introduced the datasync
optimisation).
> In exofs there is nothing special to do other than VFS
> management and the final call, by vfs, to .write_inode.
>
> I wish we had a simple_file_fsync() from VFS that does
> what the VFS expects us to do. So when code evolves it
> does not need to change all FSs. This is the third time
> I'm fixing this code trying to second guess the VFS.
Well in your fsync, you need to wait for inode writeback
that might have been started by an asynchronous write_inode.
Also, since you already call sync_inode_metadata, it looks like
you shouldn't need the sync_inode call.
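To make the point about waiting concrete, here is a small userspace
sketch. It is a toy model and not kernel code: the toy_* names, the two
flag bits and the on_disk field are all made up for illustration. It
assumes that asynchronous writeback clears the dirty bit when it starts
rather than when the write completes, so an fsync that only tests the
dirty bit can return before the metadata is stable.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flags for the toy model; these are not the kernel's bits. */
enum { TOY_DIRTY = 1u << 0, TOY_SYNC = 1u << 1 };

struct toy_inode {
	unsigned state;	/* TOY_DIRTY and/or TOY_SYNC */
	bool on_disk;	/* true once the metadata is stable */
};

/* Async writeback begins: dirty bit is cleared, write is now in flight. */
void toy_writeback_start(struct toy_inode *ino)
{
	if (ino->state & TOY_DIRTY) {
		ino->state &= ~TOY_DIRTY;
		ino->state |= TOY_SYNC;
	}
}

/* The in-flight write completes; the metadata is now stable. */
void toy_writeback_finish(struct toy_inode *ino)
{
	if (ino->state & TOY_SYNC) {
		ino->on_disk = true;
		ino->state &= ~TOY_SYNC;
	}
}

/* fsync with the early return: looks only at the dirty bit. */
bool toy_fsync_racy(struct toy_inode *ino)
{
	if (!(ino->state & TOY_DIRTY))
		return ino->on_disk;	/* may still be in flight */
	ino->state &= ~TOY_DIRTY;
	ino->on_disk = true;		/* synchronous write */
	return ino->on_disk;
}

/* fsync that first waits for any in-flight writeback, then writes. */
bool toy_fsync_waiting(struct toy_inode *ino)
{
	toy_writeback_finish(ino);	/* stands in for the wait */
	if (ino->state & TOY_DIRTY) {
		ino->state &= ~TOY_DIRTY;
		ino->on_disk = true;
	}
	return ino->on_disk;
}

/* Scenario: writeback has started but not finished when fsync runs. */
void toy_demo(void)
{
	struct toy_inode a = { TOY_DIRTY, false };
	struct toy_inode b = { TOY_DIRTY, false };

	toy_writeback_start(&a);
	assert(!toy_fsync_racy(&a));	/* returned before data was stable */

	toy_writeback_start(&b);
	assert(toy_fsync_waiting(&b));	/* stable by the time it returns */
}
```

The return value of each toy fsync is whether the metadata was stable at
the moment it returned, which is the property a caller of fsync relies on.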
> Actually the only other thing I need to do in file_fsync
> today is sb_sync. But this is a stupidity (and a bug) that
> I'm fixing soon. So that theoretical simple_file_fsync()
> would be all I need.
>
> Please advise?
> BTW: Do you want me to take the changes through my tree?
At this point I'd just like some review and feedback, we
might get some other opinions on how to fix it, so don't
take the changes quite yet.
I'll cc you again with a broken out patch.
Thanks,
Nick