Message-ID: <20220613111935.swheyx3p7psvshxn@quack3.lan>
Date: Mon, 13 Jun 2022 13:19:35 +0200
From: Jan Kara <jack@...e.cz>
To: Zhang Yi <yi.zhang@...wei.com>
Cc: linux-ext4@...r.kernel.org, tytso@....edu,
adilger.kernel@...ger.ca, jack@...e.cz, yukuai3@...wei.com
Subject: Re: [PATCH] jbd2: fix outstanding credits assert in jbd2_journal_commit_transaction()

On Sat 11-06-22 21:04:26, Zhang Yi wrote:
> We hit an assertion failure in jbd2_journal_commit_transaction() when
> running fsstress together with request fault injection tests. The
> problem is a race between jbd2_journal_commit_transaction() and
> ext4_end_io_end(). First, ext4_writepages() writes back dirty pages
> and starts a reserved handle. The journal is then aborted due to an
> earlier metadata IO error, and jbd2_journal_abort() starts committing
> the currently running transaction. The commit can race with
> ext4_end_io_end(), so that j_reserved_credits is subtracted twice from
> commit_transaction->t_outstanding_credits. As a result,
> t_outstanding_credits ends up smaller than t_nr_buffers and the
> assertion triggers.
>
> kjournald2                      kworker
>
> jbd2_journal_commit_transaction()
>  write_unlock(&journal->j_state_lock);
>  atomic_sub(j_reserved_credits, t_outstanding_credits); //sub once
>
>                                 jbd2_journal_start_reserved()
>                                  start_this_handle() //detect aborted journal
>                                  jbd2_journal_free_reserved() //get running transaction
>                                   read_lock(&journal->j_state_lock)
>                                   __jbd2_journal_unreserve_handle()
>                                    atomic_sub(j_reserved_credits, t_outstanding_credits);
>                                    //sub again
>                                   read_unlock(&journal->j_state_lock);
>
>  journal->j_running_transaction = NULL;
>  J_ASSERT(t_nr_buffers <= t_outstanding_credits) //bomb!!!
>
> Fix this issue by using journal->j_state_lock to protect the subtraction
> in jbd2_journal_commit_transaction().
>
> Fixes: 96f1e0974575 ("jbd2: avoid long hold times of j_state_lock while committing a transaction")
> Signed-off-by: Zhang Yi <yi.zhang@...wei.com>

Thanks for the analysis and the fix! This is indeed subtle. This fix looks
good to me. Feel free to add:

Reviewed-by: Jan Kara <jack@...e.cz>

								Honza

> ---
> fs/jbd2/commit.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
> index eb315e81f1a6..af1a9191368c 100644
> --- a/fs/jbd2/commit.c
> +++ b/fs/jbd2/commit.c
> @@ -553,13 +553,13 @@ void jbd2_journal_commit_transaction(journal_t *journal)
>  	 */
>  	jbd2_journal_switch_revoke_table(journal);
>  
> +	write_lock(&journal->j_state_lock);
>  	/*
>  	 * Reserved credits cannot be claimed anymore, free them
>  	 */
>  	atomic_sub(atomic_read(&journal->j_reserved_credits),
>  		   &commit_transaction->t_outstanding_credits);
>  
> -	write_lock(&journal->j_state_lock);
>  	trace_jbd2_commit_flushing(journal, commit_transaction);
>  	stats.run.rs_flushing = jiffies;
>  	stats.run.rs_locked = jbd2_time_diff(stats.run.rs_locked,
> --
> 2.31.1
>
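
For readers following along, the double subtraction described in the commit message can be replayed in a small userspace model. This is only a sketch under stated assumptions, not kernel code: struct txn, struct journal, run_scenario() and the credit numbers (6 buffers, 4 reserved credits) are hypothetical stand-ins, plain ints replace the kernel atomics, and the locking is modeled purely by the order in which the steps run.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the jbd2 structures. */
struct txn {
    int t_outstanding_credits;
    int t_nr_buffers;
};

struct journal {
    int j_reserved_credits;
    struct txn *j_running_transaction;
};

/* Where the kworker's unreserve lands relative to the two commit steps. */
enum order { UNRESERVE_BEFORE, UNRESERVE_BETWEEN, UNRESERVE_AFTER };

/* Commit step 1: drop all reserved credits from the committing transaction. */
static void commit_drop_reserved(struct journal *j, struct txn *t)
{
    t->t_outstanding_credits -= j->j_reserved_credits;
}

/* Commit step 2: the transaction stops being the running one. */
static void commit_detach_running(struct journal *j)
{
    j->j_running_transaction = NULL;
}

/* kworker path: give back the credits of one reserved handle. */
static void unreserve_handle(struct journal *j, int handle_credits)
{
    struct txn *t = j->j_running_transaction;

    assert(j->j_reserved_credits >= handle_credits);
    j->j_reserved_credits -= handle_credits;
    if (t)
        t->t_outstanding_credits -= handle_credits;
}

/* Replays one interleaving and returns the final t_outstanding_credits. */
static int run_scenario(enum order when)
{
    /* 10 outstanding credits = 6 for buffers + 4 reserved by one handle. */
    struct txn t = { .t_outstanding_credits = 10, .t_nr_buffers = 6 };
    struct journal j = { .j_reserved_credits = 4,
                         .j_running_transaction = &t };

    if (when == UNRESERVE_BEFORE)
        unreserve_handle(&j, 4);
    commit_drop_reserved(&j, &t);
    if (when == UNRESERVE_BETWEEN)   /* only possible without the fix */
        unreserve_handle(&j, 4);
    commit_detach_running(&j);
    if (when == UNRESERVE_AFTER)
        unreserve_handle(&j, 4);

    return t.t_outstanding_credits;
}
```

In this model, run_scenario(UNRESERVE_BETWEEN) returns 2, which is below t_nr_buffers (6) and corresponds to the failing J_ASSERT. The other two orderings both return 6. Taking j_state_lock before the subtraction makes both commit steps happen under one write-lock hold, so only those two orderings remain possible.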
--
Jan Kara <jack@...e.com>
SUSE Labs, CR