lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20090914160041.GB25549@duck.suse.cz>
Date:	Mon, 14 Sep 2009 18:00:41 +0200
From:	Jan Kara <jack@...e.cz>
To:	Theodore Tso <tytso@....edu>
Cc:	Jan Kara <jack@...e.cz>,
	"Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>,
	linux-ext4@...r.kernel.org
Subject: Re: fsync on ext[34] working only by an accident

On Thu 10-09-09 12:52:01, Theodore Tso wrote:
> On Thu, Sep 10, 2009 at 04:06:36PM +0200, Jan Kara wrote:
> > > In fact, I think the problem is worse than Jan is pointing out,
> > > because it's not enough that vfs_fq_alloc_space() is calling
> > > mark_inode_dirty(), since that only sets I_DIRTY_SYNC.  When we touch
> > > i_size or i_block[], we need to make sure that I_DIRTY_DATASYNC is
> > > set, so that fdatasync() will force a commit.
> >   Actually no. mark_inode_dirty() dirties inode with I_DIRTY ==
> > (I_DIRTY_SYNC | I_DIRTY_DATASYNC | I_DIRTY_PAGES) so it happens to work.
> > The fact that quota *could* dirty the inode with I_DIRTY_SYNC only
> > is right but that's a separate issue.
> 
> Oops, you're right.  I mixed up I_DIRTY and I_DIRTY_SYNC.  Hmm, what's
> actually a bit surprising is that we don't have a
> mark_inode_dirty_datasync() which only sets I_DIRTY_DATASYNC.
  Yeah. But that's easy to fix ;).

> And shouldn't quota be dirtying the inode using I_DIRTY_DATASYNC only?
> After, all the reason why we need to do this is because it's messing
> with i_size, right?
  Quota does not touch i_size. It touches only i_blocks - that's why it
dirties the inode. Since i_blocks are not really needed to get to the data,
I think I_DIRTY_SYNC for quota is more appropriate.

> >   Thinking about it, it won't work so easily. The problem is that when
> > pdflush decides to write the inode, it unconditionally clears dirty flags.
> > We could redirty the inode in write_inode() but that's IMHO too ugly to
> > bear it.
> 
> Hmm, yes.  I wonder if this is something we should change, and make it
> be the responsibility of the filesystem's write_inode method function
> to clear I_DIRTY_SYNC and I_DIRTY_DATASYNC flags.  That would allow
> the file system to refuse to clean the inode for whatever reason the
> file system saw fit.  That would require sweeping through all of the
> file systems, but it might be useful for more file systems other than
> ext3/ext4.
  Possibly yes. But OTOH from the point of view of pdflush, the inode
actually is clean. It could be even evicted from memory when needed before
the transaction commits - which invalidates my approach to solving the
fsync() trouble. Nasty.
  Also just not clearing dirty bits in write_inode() would probably rather
confuse current writeback code because all inodes would stay dirty until
a transaction commit with no way of getting rid of them. So pdflush would
needlessly burn CPU cycles scanning them and trying to write them. Also
it's not completely clear how we would clear the dirty bits after the
transaction commits since during commit, it could become part of a freshly
started transaction.
  Sigh, it's rather convoluted. Probably we have to somehow pin the inode
in memory until the transaction commits to fix the "remember TID" approach.

								Honza
-- 
Jan Kara <jack@...e.cz>
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ