Date:	Wed, 24 Nov 2010 11:23:05 +1100
From:	Nick Piggin <npiggin@...nel.dk>
To:	Dave Chinner <david@...morbit.com>
Cc:	npiggin@...nel.dk, linux-fsdevel@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [patch 7/7] fs: fix or note I_DIRTY handling bugs in
  filesystems

On Wed, Nov 24, 2010 at 09:51:48AM +1100, Dave Chinner wrote:
> On Wed, Nov 24, 2010 at 01:06:17AM +1100, npiggin@...nel.dk wrote:
> > Comments?
> 
> How did you test the changes?

Not widely yet. I have only checked that a few filesystems pass deadlock
and bug tests. It's still at the RFC stage.

 
> > +++ linux-2.6/fs/xfs/linux-2.6/xfs_file.c	2010-11-24 00:08:03.000000000 +1100
> > @@ -99,6 +99,7 @@ xfs_file_fsync(
> >  	struct xfs_trans	*tp;
> >  	int			error = 0;
> >  	int			log_flushed = 0;
> > +	unsigned		dirty, mask;
> >  
> >  	trace_xfs_file_fsync(ip);
> >  
> > @@ -132,9 +133,16 @@ xfs_file_fsync(
> >  	 * might gets cleared when the inode gets written out via the AIL
> >  	 * or xfs_iflush_cluster.
> >  	 */
> > -	if (((inode->i_state & I_DIRTY_DATASYNC) ||
> > -	    ((inode->i_state & I_DIRTY_SYNC) && !datasync)) &&
> > -	    ip->i_update_core) {
> > +	spin_lock(&inode_lock);
> > +	inode_writeback_begin(inode, 1);
> > +	if (datasync)
> > +		mask = I_DIRTY_DATASYNC;
> > +	else
> > +		mask = I_DIRTY_SYNC | I_DIRTY_DATASYNC;
> > +	dirty = inode->i_state & mask;
> > +	inode->i_state &= ~mask;
> > +	spin_unlock(&inode_lock);
> > +	if (dirty && ip->i_update_core) {
> 
> It looks to me like the pattern "inode_writeback_begin(); get dirty
> state from i_state" repeated for each filesystem is wrong. The
> inode_writeback_begin() helper does this:
> 
> 	inode->i_state &= ~I_DIRTY;
> 
> which clears all the dirty bits from the i_state, which means the
> followup:
> 
> 	dirty = inode->i_state & mask;
> 
> will always result in a zero value for dirty.  IOWs, this seems to
> ensure that ->fsync never sees dirty inodes anymore. This will break
> fsync on XFS, and probably on all the other filesystems you modified
> to use this pattern as well.

Yes, the helper needs to do inode->i_state &= ~I_DIRTY_PAGES. Good
catch, thanks.

I had I_DIRTY there because I was initially going to return the
dirty bits. However, some cases want to check/clear bits at different
times (e.g. background writeout wants to clear DIRTY_PAGES, then do
the pagecache writeback, and then test/clear the metadata dirty bits).
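
To make the problem and the fix concrete, here is a rough userspace
model of the fsync hunk quoted above. It is only a sketch: the flag
values are illustrative, struct model_inode and the model_* functions
are hypothetical stand-ins rather than the code from the patch, and
locking is left out entirely. The only thing modelled is which bits the
begin helper clears before the filesystem samples i_state.

```c
#include <assert.h>

/* Illustrative flag values for this model only; the real definitions
 * live in include/linux/fs.h. */
#define I_DIRTY_SYNC		0x1u
#define I_DIRTY_DATASYNC	0x2u
#define I_DIRTY_PAGES		0x4u
#define I_DIRTY	(I_DIRTY_SYNC | I_DIRTY_DATASYNC | I_DIRTY_PAGES)

/* Hypothetical stand-in for struct inode: only i_state is modelled. */
struct model_inode {
	unsigned i_state;
};

/* Stand-in for inode_writeback_begin(): models nothing but the bit
 * clearing. clear_mask is I_DIRTY for the behaviour Dave pointed out
 * and I_DIRTY_PAGES for the proposed fix. */
static void model_writeback_begin(struct model_inode *inode,
				  unsigned clear_mask)
{
	inode->i_state &= ~clear_mask;
}

/* Mirrors the quoted xfs_file_fsync() hunk: call the begin helper,
 * pick a mask from datasync, sample the dirty bits, then clear them.
 * Returns the sampled bits. */
static unsigned model_fsync_dirty(struct model_inode *inode, int datasync,
				  unsigned begin_clear_mask)
{
	unsigned dirty, mask;

	model_writeback_begin(inode, begin_clear_mask);
	if (datasync)
		mask = I_DIRTY_DATASYNC;
	else
		mask = I_DIRTY_SYNC | I_DIRTY_DATASYNC;
	dirty = inode->i_state & mask;
	inode->i_state &= ~mask;
	return dirty;
}

/* Small self-check of the two behaviours discussed in the thread. */
static void model_self_check(void)
{
	struct model_inode buggy = { I_DIRTY };
	struct model_inode fixed = { I_DIRTY };

	/* Clearing I_DIRTY first leaves nothing for fsync to see. */
	assert(model_fsync_dirty(&buggy, 0, I_DIRTY) == 0);
	/* Clearing only I_DIRTY_PAGES keeps the metadata bits visible. */
	assert(model_fsync_dirty(&fixed, 0, I_DIRTY_PAGES) ==
	       (I_DIRTY_SYNC | I_DIRTY_DATASYNC));
}
```

With begin_clear_mask set to I_DIRTY the sampled value is always zero,
which is the breakage Dave describes. With I_DIRTY_PAGES the metadata
bits survive until the filesystem tests and clears them itself.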

 
> Also, I think the pattern is racy with respect to concurrent page
> cache dirtiers. i.e if the inode was dirtied between writeback and
> ->fsync() in vfs_fsync_range(), then this new code clears the
> I_DIRTY_PAGES bit in i_state without writing back the dirty pages.

That gets caught in the writeback_end helper, same way as for background
writeout. It's useful to do this for the fsync helper so that the inode
actually gets marked clean if the pagecache writeback cleaned
everything.
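
As a sketch of that idea (not the helper from the patch, whose body is
not shown in this mail), the end step can be modelled as re-checking
whether any dirty pages remain and putting I_DIRTY_PAGES back if so.
The nr_dirty_pages counter is a hypothetical stand-in for the dirty
tag on the mapping, the flag value is illustrative, and locking is
omitted.

```c
#include <assert.h>

/* Illustrative value for this model only. */
#define I_DIRTY_PAGES	0x4u

/* Hypothetical stand-in for struct inode plus its mapping. */
struct model_inode {
	unsigned i_state;
	unsigned nr_dirty_pages;	/* stand-in for dirty-tagged pages */
};

/* A concurrent writer dirties one page and marks the inode. */
static void model_dirty_page(struct model_inode *inode)
{
	inode->nr_dirty_pages++;
	inode->i_state |= I_DIRTY_PAGES;
}

/* Begin step: optimistically clear I_DIRTY_PAGES. */
static void model_writeback_begin(struct model_inode *inode)
{
	inode->i_state &= ~I_DIRTY_PAGES;
}

/* End step: if pages are still dirty, the inode is not clean after
 * all, so restore I_DIRTY_PAGES. */
static void model_writeback_end(struct model_inode *inode)
{
	if (inode->nr_dirty_pages)
		inode->i_state |= I_DIRTY_PAGES;
}

/* Small self-check covering the raced and the unraced case. */
static void model_self_check(void)
{
	struct model_inode raced = { 0, 0 };
	struct model_inode quiet = { I_DIRTY_PAGES, 0 };

	/* A page is dirtied after data writeback, before ->fsync(). */
	model_dirty_page(&raced);
	model_writeback_begin(&raced);
	model_writeback_end(&raced);
	assert(raced.i_state & I_DIRTY_PAGES);	/* bit is restored */

	/* No racing dirtier: the inode ends up marked clean. */
	model_writeback_begin(&quiet);
	model_writeback_end(&quiet);
	assert(quiet.i_state == 0);
}
```

In this model the raced inode keeps I_DIRTY_PAGES, so a later writeback
pass still picks it up, while an inode whose pages were all written out
ends with the bit clear.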

> 
> And FWIW, I'm not sure that we want to be propagating the inode_lock
> into every filesystem...
> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@...morbit.com