[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220728165332.cu2kiduob2xyvoep@quack3>
Date: Thu, 28 Jul 2022 18:53:32 +0200
From: Jan Kara <jack@...e.cz>
To: Lukas Czerner <lczerner@...hat.com>
Cc: linux-ext4@...r.kernel.org, jlayton@...nel.org, tytso@....edu,
linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH 2/2] fs: record I_DIRTY_TIME even if inode already has
I_DIRTY_INODE
On Thu 28-07-22 15:39:14, Lukas Czerner wrote:
> Currently the I_DIRTY_TIME will never get set if the inode already has
> I_DIRTY_INODE with assumption that it supersedes I_DIRTY_TIME. That's
> true, however ext4 will only update the on-disk inode in
> ->dirty_inode(), not on actual writeback. As a result if the inode
> already has I_DIRTY_INODE state by the time we get to
> __mark_inode_dirty() only with I_DIRTY_TIME, the time was already filled
> into on-disk inode and will not get updated until the next I_DIRTY_INODE
> update, which might never come if we crash or get a power failure.
>
> The problem can be reproduced on ext4 by running xfstest generic/622
> with -o iversion mount option. Fix it by setting I_DIRTY_TIME even if
> the inode already has I_DIRTY_INODE.
As a datapoint I've checked and XFS has the very same problem as ext4.
> Also clear the I_DIRTY_TIME after ->dirty_inode() otherwise it may never
> get cleared.
>
> Signed-off-by: Lukas Czerner <lczerner@...hat.com>
> ---
> fs/fs-writeback.c | 18 +++++++++++++++---
> 1 file changed, 15 insertions(+), 3 deletions(-)
>
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index 05221366a16d..174f01e6b912 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -2383,6 +2383,11 @@ void __mark_inode_dirty(struct inode *inode, int flags)
>
> /* I_DIRTY_INODE supersedes I_DIRTY_TIME. */
> flags &= ~I_DIRTY_TIME;
> + if (inode->i_state & I_DIRTY_TIME) {
> + spin_lock(&inode->i_lock);
> + inode->i_state &= ~I_DIRTY_TIME;
> + spin_unlock(&inode->i_lock);
> + }
Hum, so this is a bit dangerous because inode->i_state may be inconsistent
with the writeback list inode is queued in (wb->b_dirty_time) and these two
are supposed to be in sync. So I rather think we need to make sure we go
through the full round of 'update flags and writeback list' below in case
we need to clear I_DIRTY_TIME from inode->i_state.
> } else {
> /*
> * Else it's either I_DIRTY_PAGES, I_DIRTY_TIME, or nothing.
> @@ -2399,13 +2404,20 @@ void __mark_inode_dirty(struct inode *inode, int flags)
> */
> smp_mb();
>
> - if (((inode->i_state & flags) == flags) ||
> - (dirtytime && (inode->i_state & I_DIRTY_INODE)))
> + if ((inode->i_state & flags) == flags)
> return;
>
> spin_lock(&inode->i_lock);
> - if (dirtytime && (inode->i_state & I_DIRTY_INODE))
> + if (dirtytime && (inode->i_state & I_DIRTY_INODE)) {
> + /*
> + * We've got a new lazytime update. Make sure it's recorded in
> + * i_state, because the time might have already got updated in
> + * ->dirty_inode() and will not get updated until next
> + * I_DIRTY_INODE update.
> + */
> + inode->i_state |= I_DIRTY_TIME;
> goto out_unlock_inode;
> + }
So I'm afraid this combination is not properly handled in
writeback_single_inode() where we have at the end:
if (!(inode->i_state & I_DIRTY_ALL))
inode_cgwb_move_to_attached(inode, wb);
else if (!(inode->i_state & I_SYNC_QUEUED) &&
(inode->i_state & I_DIRTY))
redirty_tail_locked(inode, wb);
So inode that had I_DIRTY_SYNC | I_DIRTY_TIME will not be properly refiled
to wb->b_dirty_time list after writeback was done and I_DIRTY_SYNC got
cleared.
So we need to refine it to something like:
if (!(inode->i_state & I_DIRTY_ALL))
inode_cgwb_move_to_attached(inode, wb);
else if (!(inode->i_state & I_SYNC_QUEUED)) {
if (inode->i_state & I_DIRTY) {
redirty_tail_locked(inode, wb);
} else if (inode->i_state & I_DIRTY_TIME) {
inode->dirtied_when = jiffies;
inode_io_list_move_locked(inode, wb, &wb->b_dirty_time);
}
}
Honza
--
Jan Kara <jack@...e.com>
SUSE Labs, CR
Powered by blists - more mailing lists