Message-ID: <20210408134551.GC3271@quack2.suse.cz>
Date: Thu, 8 Apr 2021 15:45:51 +0200
From: Jan Kara <jack@...e.cz>
To: Zhang Yi <yi.zhang@...wei.com>
Cc: linux-ext4@...r.kernel.org, tytso@....edu,
adilger.kernel@...ger.ca, jack@...e.cz, yukuai3@...wei.com
Subject: Re: [PATCH 1/3] jbd2: protect buffers release with j_checkpoint_mutex
On Thu 08-04-21 19:36:16, Zhang Yi wrote:
> There is a race between jbd2_journal_try_to_free_buffers() and
> jbd2_journal_destroy(), so jbd2_log_do_checkpoint() may still fail to
> detect the buffer write I/O error flag, which leads to filesystem
> inconsistency.
>
> jbd2_journal_try_to_free_buffers()     ext4_put_super()
>                                         jbd2_journal_destroy()
>   __jbd2_journal_remove_checkpoint()
>   detect buffer write error             jbd2_log_do_checkpoint()
>                                         jbd2_cleanup_journal_tail()
>                                         <--- lead to inconsistency
>   jbd2_journal_abort()
>
> Fix this issue by adding j_checkpoint_mutex to protect journal buffer
> release in jbd2_journal_try_to_free_buffers().
>
> Signed-off-by: Zhang Yi <yi.zhang@...wei.com>

Thanks for the patch, Zhang. I agree with your problem analysis, but I don't
think the solution is correct:

>  	J_ASSERT(PageLocked(page));
>  
> +	mutex_lock(&journal->j_checkpoint_mutex);

We cannot grab j_checkpoint_mutex inside jbd2_journal_try_to_free_buffers()
(or even ext4_releasepage()) because that function is called with the page
lock held, and the page lock ranks below the checkpoint mutex. Generally,
page locks are acquired within a transaction, and thus all locks required to
start a transaction (and j_checkpoint_mutex is one of them) rank above the
page lock.

Also even if the lock ordering was OK, grabbing j_checkpoint_mutex for
every page from memory reclaim just to close this rare race seems like a
performance overkill.
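
The lock-ordering objection above can be illustrated with a small userspace
sketch. It is a toy model only, not kernel code: the rank numbers, the
toy_task structure and the toy_acquire()/toy_release() helpers are all
hypothetical names made up for this illustration. A lower rank number means
the lock ranks "above" and must be taken first.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy ranks (assumption for illustration): smaller value = outer lock. */
enum toy_rank {
	RANK_CHECKPOINT_MUTEX = 1,	/* needed to start a transaction */
	RANK_PAGE_LOCK = 2		/* taken within a transaction */
};

#define TOY_MAX_HELD 8

/* Hypothetical per-task record of the locks currently held, in order. */
struct toy_task {
	int held[TOY_MAX_HELD];
	int depth;
};

/*
 * Try to take a lock of the given rank. Returns false when this would
 * mean taking an outer (or equal-rank) lock while an inner one is held,
 * i.e. a lock-order violation.
 */
static bool toy_acquire(struct toy_task *t, enum toy_rank r)
{
	if (t->depth > 0 && t->held[t->depth - 1] >= (int)r)
		return false;
	assert(t->depth < TOY_MAX_HELD);
	t->held[t->depth++] = (int)r;
	return true;
}

/* Drop the most recently taken lock. */
static void toy_release(struct toy_task *t)
{
	assert(t->depth > 0);
	t->depth--;
}
```

In this model, taking RANK_CHECKPOINT_MUTEX and then RANK_PAGE_LOCK succeeds,
while a task that already holds RANK_PAGE_LOCK (as the release path does) is
refused RANK_CHECKPOINT_MUTEX.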

What we seem to need is a quick way of marking the journal as "IO error
occurred" in __journal_try_to_free_buffer() before actually removing the
buffer from the checkpoint list. Perhaps this marking could even happen
already in __jbd2_journal_remove_checkpoint() and we can reuse it in
jbd2_log_do_checkpoint() for IO error handling as well... And then once we
are in a safer context, we can do:

	if (!is_journal_aborted(journal) && journal_io_error_happened(journal))
		jbd2_journal_abort(...)

Honza
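
A minimal userspace sketch of the "mark now, abort later" idea suggested
above, under stated assumptions: struct toy_journal and every toy_* function
are hypothetical stand-ins invented for this illustration, and the atomic
flag is only one possible way to record the error. It does not reflect how
the kernel implements or eventually fixed this.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for journal_t with only the fields needed here. */
struct toy_journal {
	atomic_bool io_error;	/* assumed "IO error occurred" mark */
	bool aborted;
	int abort_errno;
};

static void toy_journal_init(struct toy_journal *j)
{
	atomic_init(&j->io_error, false);
	j->aborted = false;
	j->abort_errno = 0;
}

/*
 * Reclaim side: runs under the page lock, so it only sets a flag and
 * takes no further locks before the buffer leaves the checkpoint list.
 */
static void toy_remove_checkpoint(struct toy_journal *j, bool write_failed)
{
	if (write_failed)
		atomic_store(&j->io_error, true);
	/* ... the buffer would be removed from the checkpoint list here ... */
}

static bool toy_io_error_happened(struct toy_journal *j)
{
	return atomic_load(&j->io_error);
}

static void toy_journal_abort(struct toy_journal *j, int err)
{
	assert(err < 0);	/* errors are passed as negative errno values */
	j->aborted = true;
	j->abort_errno = err;
}

/* Safer context, e.g. the checkpoint path: act on the recorded error. */
static void toy_check_io_error(struct toy_journal *j)
{
	if (!j->aborted && toy_io_error_happened(j))
		toy_journal_abort(j, -EIO);
}
```

The point of the split is that the reclaim path stays cheap and lock-free,
while the abort decision is deferred to a caller that is allowed to sleep and
take journal locks.
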
>  	head = page_buffers(page);
>  	bh = head;
>  	do {
> @@ -2163,6 +2164,7 @@ int jbd2_journal_try_to_free_buffers(journal_t *journal, struct page *page)
>  	if (has_write_io_error)
>  		jbd2_journal_abort(journal, -EIO);
>  
> +	mutex_unlock(&journal->j_checkpoint_mutex);
>  	return ret;
>  }
>
> --
> 2.25.4
>
--
Jan Kara <jack@...e.com>
SUSE Labs, CR