[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <p7aznpdg4ue7g3hv7y4wv6lfqp3aoavkdzthz5jgbwtrkc2cnu@gndrpsqaoma2>
Date: Tue, 16 Sep 2025 12:56:41 +0200
From: Jan Kara <jack@...e.cz>
To: Zhang Yi <yi.zhang@...weicloud.com>
Cc: linux-ext4@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org, tytso@....edu, adilger.kernel@...ger.ca, jack@...e.cz,
hsiangkao@...ux.alibaba.com, yi.zhang@...wei.com, libaokun1@...wei.com, yukuai3@...wei.com,
yangerkun@...wei.com
Subject: Re: [PATCH 1/2] jbd2: ensure that all ongoing I/O complete before
freeing blocks
On Tue 16-09-25 17:33:36, Zhang Yi wrote:
> From: Zhang Yi <yi.zhang@...wei.com>
>
> When releasing file system metadata blocks in jbd2_journal_forget(), if
> this buffer has not yet been checkpointed, it may have already been
> written back, currently be in the process of being written back, or has
> not yet written back. jbd2_journal_forget() calls
> jbd2_journal_try_remove_checkpoint() to check the buffer's status and
> add it to the current transaction if it has not been written back. This
> buffer can only be reallocated after the transaction is committed.
>
> jbd2_journal_try_remove_checkpoint() attempts to lock the buffer and
> check its dirty status while holding the buffer lock. If the buffer has
> already been written back, everything proceeds normally. However, there
> are two issues. First, the function returns immediately if the buffer is
> locked by the write-back process. It does not wait for the write-back to
> complete. Consequently, until the current transaction is committed and
> the block is reallocated, there is no guarantee that the I/O will
> complete. This means that ongoing I/O could write stale metadata to the
> newly allocated block, potentially corrupting data. Second, the function
> unlocks the buffer as soon as it detects that the buffer is still dirty.
> If a concurrent write-back occurs immediately after this unlocking and
> before clear_buffer_dirty() is called in jbd2_journal_forget(), data
> corruption can theoretically still occur.
>
> Although these two issues are unlikely to occur in practice since the
> undergoing metadata writeback I/O does not take this long to complete,
> it's better to explicitly ensure that all ongoing I/O operations are
> completed.
>
> Fixes: 597599268e3b ("jbd2: discard dirty data when forgetting an un-journalled buffer")
> Suggested-by: Jan Kara <jack@...e.cz>
> Signed-off-by: Zhang Yi <yi.zhang@...wei.com>
Looks good. Feel free to add:
Reviewed-by: Jan Kara <jack@...e.cz>
Honza
> ---
> fs/jbd2/transaction.c | 13 +++++++++----
> 1 file changed, 9 insertions(+), 4 deletions(-)
>
> diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
> index c7867139af69..3e510564de6e 100644
> --- a/fs/jbd2/transaction.c
> +++ b/fs/jbd2/transaction.c
> @@ -1659,6 +1659,7 @@ int jbd2_journal_forget(handle_t *handle, struct buffer_head *bh)
> int drop_reserve = 0;
> int err = 0;
> int was_modified = 0;
> + int wait_for_writeback = 0;
>
> if (is_handle_aborted(handle))
> return -EROFS;
> @@ -1782,18 +1783,22 @@ int jbd2_journal_forget(handle_t *handle, struct buffer_head *bh)
> }
>
> /*
> - * The buffer is still not written to disk, we should
> - * attach this buffer to current transaction so that the
> - * buffer can be checkpointed only after the current
> - * transaction commits.
> + * The buffer has not yet been written to disk. We should
> + * either clear the buffer or ensure that the ongoing I/O
> + * is completed, and attach this buffer to current
> + * transaction so that the buffer can be checkpointed only
> + * after the current transaction commits.
> */
> clear_buffer_dirty(bh);
> + wait_for_writeback = 1;
> __jbd2_journal_file_buffer(jh, transaction, BJ_Forget);
> spin_unlock(&journal->j_list_lock);
> }
> drop:
> __brelse(bh);
> spin_unlock(&jh->b_state_lock);
> + if (wait_for_writeback)
> + wait_on_buffer(bh);
> jbd2_journal_put_journal_head(jh);
> if (drop_reserve) {
> /* no need to reserve log space for this block -bzzz */
> --
> 2.46.1
>
--
Jan Kara <jack@...e.com>
SUSE Labs, CR
Powered by blists - more mailing lists