[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240429191437.2pxfg43hgdmkgl2m@quack3>
Date: Mon, 29 Apr 2024 21:14:37 +0200
From: Jan Kara <jack@...e.cz>
To: Ye Bin <yebin@...weicloud.com>
Cc: tytso@....edu, adilger.kernel@...ger.ca, linux-ext4@...r.kernel.org,
jack@...e.cz, Ye Bin <yebin10@...wei.com>
Subject: Re: [PATCH v3] jbd2: avoid mount failed when commit block is partial
submitted
On Thu 25-04-24 14:45:15, Ye Bin wrote:
> From: Ye Bin <yebin10@...wei.com>
>
> We encountered a problem that the file system could not be mounted in
> the power-off scenario. The analysis of the file system mirror shows that
> only part of the data is written to the last commit block.
> The valid data of the commit block is concentrated in the first sector.
> However, the data of the entire block is involved in the checksum calculation.
> For different hardware, the minimum atomic unit may be different.
> If the checksum of a committed block is incorrect, clear the data except the
> 'commit_header' and then calculate the checksum. If the checkusm is correct,
> it is considered that the block is partially committed. However, if there are
> valid description/revoke blocks, it is considered that the data is abnormal
> and the log replay is stopped.
>
> Signed-off-by: Ye Bin <yebin10@...wei.com>
The patch as a fix looks good. Feel free to add:
Reviewed-by: Jan Kara <jack@...e.cz>
I'd just note that the maze of branches and gotos in do_one_pass() is
becoming very hard to follow so ideally we should come up with some
refactoring of the function so that it is easier to follow. But that's
definitely something for a separate patch.
Honza
> ---
> fs/jbd2/recovery.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 48 insertions(+)
>
> diff --git a/fs/jbd2/recovery.c b/fs/jbd2/recovery.c
> index 1f7664984d6e..594bf02a709f 100644
> --- a/fs/jbd2/recovery.c
> +++ b/fs/jbd2/recovery.c
> @@ -443,6 +443,27 @@ static int jbd2_commit_block_csum_verify(journal_t *j, void *buf)
> return provided == cpu_to_be32(calculated);
> }
>
> +static bool jbd2_commit_block_csum_verify_partial(journal_t *j, void *buf)
> +{
> + struct commit_header *h;
> + __be32 provided;
> + __u32 calculated;
> + void *tmpbuf;
> +
> + tmpbuf = kzalloc(j->j_blocksize, GFP_KERNEL);
> + if (!tmpbuf)
> + return false;
> +
> + memcpy(tmpbuf, buf, sizeof(struct commit_header));
> + h = tmpbuf;
> + provided = h->h_chksum[0];
> + h->h_chksum[0] = 0;
> + calculated = jbd2_chksum(j, j->j_csum_seed, tmpbuf, j->j_blocksize);
> + kfree(tmpbuf);
> +
> + return provided == cpu_to_be32(calculated);
> +}
> +
> static int jbd2_block_tag_csum_verify(journal_t *j, journal_block_tag_t *tag,
> journal_block_tag3_t *tag3,
> void *buf, __u32 sequence)
> @@ -479,6 +500,7 @@ static int do_one_pass(journal_t *journal,
> int descr_csum_size = 0;
> int block_error = 0;
> bool need_check_commit_time = false;
> + bool has_partial_commit = false;
> __u64 last_trans_commit_time = 0, commit_time;
>
> /*
> @@ -590,6 +612,14 @@ static int do_one_pass(journal_t *journal,
> next_log_block);
> }
>
> + if (pass == PASS_SCAN && has_partial_commit) {
> + pr_err("JBD2: Detect validate descriptor block %lu after incomplete commit block\n",
> + next_log_block);
> + err = -EFSBADCRC;
> + brelse(bh);
> + goto failed;
> + }
> +
> /* If it is a valid descriptor block, replay it
> * in pass REPLAY; if journal_checksums enabled, then
> * calculate checksums in PASS_SCAN, otherwise,
> @@ -810,6 +840,14 @@ static int do_one_pass(journal_t *journal,
> if (pass == PASS_SCAN &&
> !jbd2_commit_block_csum_verify(journal,
> bh->b_data)) {
> + if (jbd2_commit_block_csum_verify_partial(
> + journal,
> + bh->b_data)) {
> + pr_notice("JBD2: Find incomplete commit block in transaction %u block %lu\n",
> + next_commit_ID, next_log_block);
> + has_partial_commit = true;
> + goto chksum_ok;
> + }
> chksum_error:
> if (commit_time < last_trans_commit_time)
> goto ignore_crc_mismatch;
> @@ -824,6 +862,7 @@ static int do_one_pass(journal_t *journal,
> }
> }
> if (pass == PASS_SCAN) {
> + chksum_ok:
> last_trans_commit_time = commit_time;
> head_block = next_log_block;
> }
> @@ -843,6 +882,15 @@ static int do_one_pass(journal_t *journal,
> next_log_block);
> need_check_commit_time = true;
> }
> +
> + if (pass == PASS_SCAN && has_partial_commit) {
> + pr_err("JBD2: Detect validate revoke block %lu after incomplete commit block\n",
> + next_log_block);
> + err = -EFSBADCRC;
> + brelse(bh);
> + goto failed;
> + }
> +
> /* If we aren't in the REVOKE pass, then we can
> * just skip over this block. */
> if (pass != PASS_REVOKE) {
> --
> 2.31.1
>
--
Jan Kara <jack@...e.com>
SUSE Labs, CR
Powered by blists - more mailing lists