linux-kernel - Re: [RFC][PATCH] Possible data integrity problems in lots of filesystems?


Message-ID: <20101125114711.GA3622@amd>
Date:	Thu, 25 Nov 2010 22:47:11 +1100
From:	Nick Piggin <npiggin@...nel.dk>
To:	Boaz Harrosh <bharrosh@...asas.com>
Cc:	Nick Piggin <npiggin@...nel.dk>, linux-fsdevel@...r.kernel.org,
	linux-kernel@...r.kernel.org, linux-ext4@...r.kernel.org,
	Roman Zippel <zippel@...ux-m68k.org>,
	"Tigran A. Aivazian" <tigran@...azian.fsnet.co.uk>,
	OGAWA Hirofumi <hirofumi@...l.parknet.co.jp>,
	Dave Kleikamp <shaggy@...ux.vnet.ibm.com>,
	Bob Copeland <me@...copeland.com>,
	reiserfs-devel@...r.kernel.org,
	Christoph Hellwig <hch@...radead.org>,
	Evgeniy Dushistov <dushistov@...l.ru>, Jan Kara <jack@...e.cz>
Subject: Re: [RFC][PATCH] Possible data integrity problems in lots of
 filesystems?

On Thu, Nov 25, 2010 at 12:51:11PM +0200, Boaz Harrosh wrote:
> On 11/25/2010 12:06 PM, Nick Piggin wrote:
> > On Thu, Nov 25, 2010 at 11:28:14AM +0200, Boaz Harrosh wrote:
> 
> >>> Index: linux-2.6/fs/exofs/file.c
> >>> ===================================================================
> >>> --- linux-2.6.orig/fs/exofs/file.c	2010-11-19 16:50:00.000000000 +1100
> >>> +++ linux-2.6/fs/exofs/file.c	2010-11-19 16:50:07.000000000 +1100
> >>> @@ -48,11 +48,6 @@ static int exofs_file_fsync(struct file
> >>>  	struct inode *inode = filp->f_mapping->host;
> >>>  	struct super_block *sb;
> >>>  
> >>> -	if (!(inode->i_state & I_DIRTY))
> >>> -		return 0;
> >>> -	if (datasync && !(inode->i_state & I_DIRTY_DATASYNC))
> >>> -		return 0;
> >>> -
> >>>  	ret = sync_inode_metadata(inode, 1);
> >>>  
> >>>  	/* This is a good place to write the sb */
> >>>
> >>
> >> Is that a good enough fix for the issue in your opinion?
> >> Or is there more involved?
> > 
> > For the inode dirty bit race problem, yes it should fix it.
> > sync_inode_metadata basically makes the same checks without
> > races (in a subsequent patch I re-introduced the datasync
> > optimisation).
> > 
> >  
> 
> > 
> > Well in your fsync, you need to wait for inode writeback
> > that might have been started by an asynchronous write_inode.
> > 
> 
> All I'm calling is sync_inode_metadata(,1) which calls sync_inode()
> which calls writeback_single_inode(sync_mode == WB_SYNC_ALL). It gets
> a little complicated but from the looks of it, even though the
> call to .write_inode() is not under any lock the state machine there
> will do inode_wait_for_writeback() if there was one in motion
> already?
> 
> And it looks like writeback_single_inode() does all the proper
> checks in the correct order for these flags above.
> 
> So the current code in exofs_file_fsync() looks scary to me. I would
> like to push your above patch for this kernel. (I'll repost it)

It still does not get it right, because of the situation I described
above. Background writeout can come in first, clear the inode dirty
bits, and call your ->write_inode for async writeout.

That means you skip doing the exofs_put_io_state(), and (I presume)
this means you aren't waiting for write completion there.

What then happens is that sync_inode_metadata() from your fsync
does not call ->write_inode, because the inode dirty bits are already
clear. It's basically a no-op. So you need to either make your
->write_inode always synchronous, or wait for the outstanding writes
in your ->fsync and ->sync_fs.
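
To make the ordering concrete, here is a minimal userspace sketch of
the race described above. It is a toy model, not kernel code and not
the exofs implementation: every toy_* name is hypothetical, the dirty
flag stands in for the I_DIRTY bits, and I/O completion is simulated
by a plain function call rather than a real wait.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for an inode; all fields and names are hypothetical. */
struct toy_inode {
	bool dirty;        /* models the I_DIRTY bits in i_state */
	int  io_in_flight; /* writes submitted but not yet completed */
	bool on_disk;      /* true once the latest metadata is stable */
};

static void toy_mark_dirty(struct toy_inode *ino)
{
	ino->dirty = true;
	ino->on_disk = false;
}

/* Models an async ->write_inode: submit I/O, return without waiting. */
static void toy_write_inode_async(struct toy_inode *ino)
{
	ino->io_in_flight++;
}

/* Models completion of one outstanding write. */
static void toy_io_complete(struct toy_inode *ino)
{
	assert(ino->io_in_flight > 0);
	if (--ino->io_in_flight == 0)
		ino->on_disk = true;
}

/* Models background writeout: clears dirty bits, writes asynchronously. */
static void toy_background_writeout(struct toy_inode *ino)
{
	if (!ino->dirty)
		return;
	ino->dirty = false;
	toy_write_inode_async(ino);
}

/* Models sync_inode_metadata(inode, 1): acts only on a dirty inode. */
static void toy_sync_inode_metadata(struct toy_inode *ino)
{
	if (!ino->dirty)
		return; /* dirty bits already clear: nothing is written or waited on */
	ino->dirty = false;
	toy_write_inode_async(ino);
	while (ino->io_in_flight > 0)
		toy_io_complete(ino);
}

/* fsync that relies only on the dirty bits; returns true if stable. */
static bool toy_fsync_racy(struct toy_inode *ino)
{
	toy_sync_inode_metadata(ino);
	return ino->on_disk;
}

/* fsync that also waits for writes started by someone else. */
static bool toy_fsync_waiting(struct toy_inode *ino)
{
	toy_sync_inode_metadata(ino);
	while (ino->io_in_flight > 0)
		toy_io_complete(ino);
	return ino->on_disk;
}
```

In this model, calling toy_mark_dirty() and then
toy_background_writeout() leaves the inode clean but with one write
still in flight. toy_fsync_racy() then returns false because it never
waits, while toy_fsync_waiting() returns true.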


> > Also, with your sync_inode_metadata call, you shouldn't need the
> > sync_inode call by the looks.
> >  
> 
> What? You lost me. You mean I don't need to sync_inode_metadata(,wait==1),
> or what did you mean?

Sorry, I was looking at the wrong code, ignore that.

Nick