Message-Id: <20230531115100.2779605-5-yi.zhang@huaweicloud.com>
Date: Wed, 31 May 2023 19:50:59 +0800
From: Zhang Yi <yi.zhang@...weicloud.com>
To: linux-ext4@...r.kernel.org
Cc: tytso@....edu, adilger.kernel@...ger.ca, jack@...e.cz,
yi.zhang@...wei.com, yi.zhang@...weicloud.com, yukuai3@...wei.com,
chengzhihao1@...wei.com
Subject: [PATCH 4/5] jbd2: Fix wrongly judgement for buffer head removing while doing checkpoint
From: Zhihao Cheng <chengzhihao1@...wei.com>
Consider the following sequence:

 jbd2_journal_commit_transaction
 // there are several dirty buffer heads in transaction->t_checkpoint_list
          P1                             wb_workfn
 jbd2_log_do_checkpoint
  if (buffer_locked(bh)) // false
                                 __block_write_full_page
                                  trylock_buffer(bh)
                                  test_clear_buffer_dirty(bh)
  if (!buffer_dirty(bh))
   __jbd2_journal_remove_checkpoint(jh)
   if (buffer_write_io_error(bh)) // false
                                 >> bh IO error occurs <<
 jbd2_cleanup_journal_tail
  __jbd2_update_log_tail
   jbd2_write_superblock
   // The bh won't be replayed on the next mount.

The checkpoint removes the buffer head while its write is still in
flight, so the later I/O error goes unnoticed and the log tail is
advanced past a block that never reached disk. This can corrupt the
ext4 image; a reproducer is available at [Link].

Since the writeback path clears the buffer's dirty bit only after locking
the buffer head, fix this by checking the dirty bit first and the lock bit
second: the buffer head may be removed from the checkpoint list only if it
is neither dirty nor locked.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217490
Fixes: 470decc613ab ("[PATCH] jbd2: initial copy of files from jbd")
Signed-off-by: Zhihao Cheng <chengzhihao1@...wei.com>
Signed-off-by: Zhang Yi <yi.zhang@...wei.com>
---
fs/jbd2/checkpoint.c | 48 ++++++++++++++++++++++----------------------
1 file changed, 24 insertions(+), 24 deletions(-)
diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c
index 3f52560912a9..620f3d345f3d 100644
--- a/fs/jbd2/checkpoint.c
+++ b/fs/jbd2/checkpoint.c
@@ -204,20 +204,6 @@ int jbd2_log_do_checkpoint(journal_t *journal)
 		jh = transaction->t_checkpoint_list;
 		bh = jh2bh(jh);
 
-		/*
-		 * The buffer may be writing back, or flushing out in the
-		 * last couple of cycles, or re-adding into a new transaction,
-		 * need to check it again until it's unlocked.
-		 */
-		if (buffer_locked(bh)) {
-			get_bh(bh);
-			spin_unlock(&journal->j_list_lock);
-			wait_on_buffer(bh);
-			/* the journal_head may have gone by now */
-			BUFFER_TRACE(bh, "brelse");
-			__brelse(bh);
-			goto retry;
-		}
 		if (jh->b_transaction != NULL) {
 			transaction_t *t = jh->b_transaction;
 			tid_t tid = t->t_tid;
@@ -252,16 +238,7 @@ int jbd2_log_do_checkpoint(journal_t *journal)
 			spin_lock(&journal->j_list_lock);
 			goto restart;
 		}
-		if (!buffer_dirty(bh)) {
-			BUFFER_TRACE(bh, "remove from checkpoint");
-			/*
-			 * If the transaction was released or the checkpoint
-			 * list was empty, we're done.
-			 */
-			if (__jbd2_journal_remove_checkpoint(jh) ||
-			    !transaction->t_checkpoint_list)
-				goto out;
-		} else {
+		if (buffer_dirty(bh)) {
 			/*
 			 * We are about to write the buffer, it could be
 			 * raced by some other transaction shrink or buffer
@@ -275,6 +252,29 @@ int jbd2_log_do_checkpoint(journal_t *journal)
 			journal->j_chkpt_bhs[batch_count++] = bh;
 			transaction->t_chp_stats.cs_written++;
 			transaction->t_checkpoint_list = jh->b_cpnext;
+		} else if (buffer_locked(bh)) {
+			/*
+			 * The buffer may be writing back, or flushing out
+			 * in the last couple of cycles, or re-adding into
+			 * a new transaction, need to check it again until
+			 * it's unlocked.
+			 */
+			get_bh(bh);
+			spin_unlock(&journal->j_list_lock);
+			wait_on_buffer(bh);
+			/* the journal_head may have gone by now */
+			BUFFER_TRACE(bh, "brelse");
+			__brelse(bh);
+			goto retry;
+		} else {
+			BUFFER_TRACE(bh, "remove from checkpoint");
+			/*
+			 * If the transaction was released or the checkpoint
+			 * list was empty, we're done.
+			 */
+			if (__jbd2_journal_remove_checkpoint(jh) ||
+			    !transaction->t_checkpoint_list)
+				goto out;
 		}
 
 		if ((batch_count == JBD2_NR_BATCH) ||
--
2.31.1