Message-ID: <20200326032212.GN10776@dread.disaster.area>
Date: Thu, 26 Mar 2020 14:22:12 +1100
From: Dave Chinner <david@...morbit.com>
To: Christoph Hellwig <hch@....de>
Cc: Theodore Ts'o <tytso@....edu>, Jaegeuk Kim <jaegeuk@...nel.org>,
Chao Yu <chao@...nel.org>, Al Viro <viro@...iv.linux.org.uk>,
Richard Weinberger <richard@....at>, linux-xfs@...r.kernel.org,
Eric Biggers <ebiggers@...nel.org>, linux-ext4@...r.kernel.org,
linux-f2fs-devel@...ts.sourceforge.net,
linux-fsdevel@...r.kernel.org, linux-mtd@...ts.infradead.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/4] fs: avoid double-writing the inode on a lazytime
expiration
On Wed, Mar 25, 2020 at 01:28:23PM +0100, Christoph Hellwig wrote:
> In the case that an inode has a dirty timestamp for longer than the
> lazytime expiration timeout (or if all such inodes are being flushed
> out due to a sync or syncfs system call), we need to inform the file
> system that the inode is dirty so that the inode's timestamps can be
> copied out to the on-disk data structures. That's because if the file
> system supports lazytime, it will have ignored the dirty_inode(inode,
> I_DIRTY_TIME) notification when the timestamp was modified in memory.
> Previously, this was accomplished by calling mark_inode_dirty_sync(),
> but that has the unfortunate side effect of also putting the inode on the
> writeback list, and that's not necessary in this case, since we will
> immediately call write_inode() afterwards. Replace the call to
> mark_inode_dirty_sync() with a new lazytime_expired method to clearly
> separate out this case.
Hmmm. Doesn't this cause issues with both iput() and
vfs_fsync_range()? They call mark_inode_dirty_sync() on
I_DIRTY_TIME inodes to move them onto the writeback list so they are
appropriately expired when the inode is written back.
i.e.:
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index 2094386af8ac..e5aafd40dd0f 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -612,19 +612,13 @@ xfs_fs_destroy_inode(
> }
>
> static void
> -xfs_fs_dirty_inode(
> - struct inode *inode,
> - int flag)
> +xfs_fs_lazytime_expired(
> + struct inode *inode)
> {
> struct xfs_inode *ip = XFS_I(inode);
> struct xfs_mount *mp = ip->i_mount;
> struct xfs_trans *tp;
>
> - if (!(inode->i_sb->s_flags & SB_LAZYTIME))
> - return;
> - if (flag != I_DIRTY_SYNC || !(inode->i_state & I_DIRTY_TIME))
> - return;
> -
> if (xfs_trans_alloc(mp, &M_RES(mp)->tr_fsyncts, 0, 0, 0, &tp))
> return;
> xfs_ilock(ip, XFS_ILOCK_EXCL);
> @@ -1053,7 +1047,7 @@ xfs_fs_free_cached_objects(
> static const struct super_operations xfs_super_operations = {
> .alloc_inode = xfs_fs_alloc_inode,
> .destroy_inode = xfs_fs_destroy_inode,
> - .dirty_inode = xfs_fs_dirty_inode,
> + .lazytime_expired = xfs_fs_lazytime_expired,
> .drop_inode = xfs_fs_drop_inode,
> .put_super = xfs_fs_put_super,
> .sync_fs = xfs_fs_sync_fs,
This means XFS no longer updates/logs the current timestamp, because
->dirty_inode(I_DIRTY_SYNC) is no longer called for XFS before
->fsync flushes the inode data and metadata changes to the journal.
Hence the current in-memory timestamps are not present in the log
before the fsync is run, and so we violate the fsync guarantees
lazytime gives for timestamp updates....
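To make the ordering concern concrete, here is a rough userspace sketch of
the fsync path as described above. It is a toy model, not kernel code:
every toy_* name and TOY_* flag is hypothetical, the journal is reduced to
a single "last logged" timestamp, and the real __mark_inode_dirty() does
considerably more than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model only; all toy_ and TOY_ names are hypothetical. */
#define TOY_I_DIRTY_SYNC (1u << 0)
#define TOY_I_DIRTY_TIME (1u << 1)

struct toy_inode;

struct toy_super_ops {
	void (*dirty_inode)(struct toy_inode *inode, unsigned int flags);
	void (*lazytime_expired)(struct toy_inode *inode);
};

struct toy_inode {
	unsigned int state;
	long mem_mtime;	/* timestamp held only in memory */
	long log_mtime;	/* timestamp last copied into the "journal" */
	const struct toy_super_ops *ops;
};

/* Stand-in for logging the timestamps in a transaction. */
static void toy_log_timestamps(struct toy_inode *inode)
{
	inode->log_mtime = inode->mem_mtime;
}

/* Pre-patch behaviour: log timestamps on I_DIRTY_SYNC for lazytime inodes. */
static void toy_dirty_inode(struct toy_inode *inode, unsigned int flags)
{
	if (flags != TOY_I_DIRTY_SYNC || !(inode->state & TOY_I_DIRTY_TIME))
		return;
	toy_log_timestamps(inode);
}

/* Post-patch behaviour: only reachable via the new method. */
static void toy_lazytime_expired(struct toy_inode *inode)
{
	toy_log_timestamps(inode);
}

/* Simplified mark_inode_dirty_sync(): notify the fs, then update state. */
static void toy_mark_inode_dirty_sync(struct toy_inode *inode)
{
	assert(inode && inode->ops);
	if (inode->ops->dirty_inode)
		inode->ops->dirty_inode(inode, TOY_I_DIRTY_SYNC);
	inode->state &= ~TOY_I_DIRTY_TIME;
	inode->state |= TOY_I_DIRTY_SYNC;
}

/* Returns true if the journal holds the current timestamp after "fsync". */
static bool toy_fsync_persists_timestamp(const struct toy_super_ops *ops)
{
	struct toy_inode inode = {
		.state = TOY_I_DIRTY_TIME,
		.mem_mtime = 42,
		.log_mtime = 0,
		.ops = ops,
	};

	/* The vfs_fsync_range()-style step discussed above. */
	if (inode.state & TOY_I_DIRTY_TIME)
		toy_mark_inode_dirty_sync(&inode);

	/* ->fsync would now flush whatever is already in the journal. */
	return inode.log_mtime == inode.mem_mtime;
}

/* Before the patch: the fs provides ->dirty_inode. */
static const struct toy_super_ops toy_ops_before = {
	.dirty_inode = toy_dirty_inode,
};

/* After the patch: only ->lazytime_expired is provided. */
static const struct toy_super_ops toy_ops_after = {
	.lazytime_expired = toy_lazytime_expired,
};
```

In this model, toy_fsync_persists_timestamp(&toy_ops_before) returns true
and toy_fsync_persists_timestamp(&toy_ops_after) returns false, because
nothing on the fsync path calls the new method.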
I haven't quite got it straight in my head if the iput() case has
similar problems, but the fsync case definitely looks broken.
Cheers,
Dave.
--
Dave Chinner
david@...morbit.com