[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20150728071502.GF18016@dhcp-13-216.nay.redhat.com>
Date: Tue, 28 Jul 2015 15:15:26 +0800
From: Eryu Guan <guaneryu@...il.com>
To: Jan Kara <jack@...e.cz>
Cc: Joseph Qi <joseph.qi@...wei.com>, linux-ext4@...r.kernel.org,
Theodore Ts'o <tytso@....edu>
Subject: Re: [BUG] infinite loop when unmounting ext3/4
On Mon, Jul 27, 2015 at 09:54:40PM +0200, Jan Kara wrote:
> On Thu 23-07-15 22:41:55, Jan Kara wrote:
> > On Mon 20-07-15 15:23:56, Eryu Guan wrote:
> > > On Thu, Jul 16, 2015 at 02:18:16PM +0800, Joseph Qi wrote:
> > > > I got the problem.
> > > > I am not sure why old code uses "result <= 0" even if
> > > > it won't return negative value. Could we use "result == 0" instead of
> > > > "result <= 0"?
> > >
> > > I thought about this too, but I'm not sure if it has other side effects.
> > > Someone else familiar with this code could comment on this?
> >
> > Well, we should rather decide, what is the right behavior of the
> > checkpointing code when the journal is aborted. When journal gets aborted,
> > we are in serious trouble. Our standard answer to this is to stop modifying
> > the filesystem as that has a chance of corrupting it even more. I think
> > that avoiding checkpointing on a filesystem with aborted journal is thus
> > what we really want.
> >
> > To fix the issue you've reported, we just need to teach
> > jbd2_log_do_checkpoint() to cleanup all the buffers in the transactions
> > when the journal is aborted so that journal_destroy() can proceed. I can
> > have a look at it sometime next week unless someone beats me to it.
>
> OK, attached is the patch that fixes the issue for me. Ted, can you please
> pick it up? Thanks!
I confirmed that the attached patch fixed the infinite loop on aborted
journal. Thanks!
Eryu
>
> Honza
> > > > On 2015/7/15 22:30, Eryu Guan wrote:
> > > > > Hi all,
> > > > >
> > > > > I found an infinite loop when unmounting a known bad ext3 image (using
> > > > > ext4 driver) with 4.2-rc1 kernel.
> > > > >
> > > > > The fs image can be found here
> > > > > https://bugzilla.kernel.org/show_bug.cgi?id=10882#c1
> > > > >
> > > > > Reproduce steps:
> > > > > mount -o loop ext3.img /mnt/ext3
> > > > > rm -rf /mnt/ext3/{dev,proc,sys}
> > > > > umount /mnt/ext3 # never return
> > > > >
> > > > > And this issue was introduced by
> > > > > 6f6a6fd jbd2: fix ocfs2 corrupt when updating journal superblock fails
> > > > >
> > > > > It's looping in
> > > > > fs/jbd2/journal.c:jbd2_journal_destroy()
> > > > > ...
> > > > > 1693 while (journal->j_checkpoint_transactions != NULL) {
> > > > > 1694 spin_unlock(&journal->j_list_lock);
> > > > > 1695 mutex_lock(&journal->j_checkpoint_mutex);
> > > > > 1696 jbd2_log_do_checkpoint(journal);
> > > > > 1697 mutex_unlock(&journal->j_checkpoint_mutex);
> > > > > 1698 spin_lock(&journal->j_list_lock);
> > > > > 1699 }
> > > > > ...
> > > > >
> > > > > Because jbd2_cleanup_journal_tail() is returning -EIO on aborted journal
> > > > > now instead of 1, and jbd2_log_do_checkpoint() won't cleanup
> > > > > journal->j_checkpoint_transactions in this while loop.
> > > > >
> > > > > A quick hack would be
> > > > >
> > > > > diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c
> > > > > index 4227dc4..1b2ea47 100644
> > > > > --- a/fs/jbd2/checkpoint.c
> > > > > +++ b/fs/jbd2/checkpoint.c
> > > > > @@ -220,11 +220,13 @@ int jbd2_log_do_checkpoint(journal_t *journal)
> > > > > * don't need checkpointing, just eliminate them from the
> > > > > * journal straight away.
> > > > > */
> > > > > - result = jbd2_cleanup_journal_tail(journal);
> > > > > - trace_jbd2_checkpoint(journal, result);
> > > > > - jbd_debug(1, "cleanup_journal_tail returned %d\n", result);
> > > > > - if (result <= 0)
> > > > > - return result;
> > > > > + if (!is_journal_aborted(journal)) {
> > > > > + result = jbd2_cleanup_journal_tail(journal);
> > > > > + trace_jbd2_checkpoint(journal, result);
> > > > > + jbd_debug(1, "cleanup_journal_tail returned %d\n", result);
> > > > > + if (result <= 0)
> > > > > + return result;
> > > > > + }
> > > > >
> > > > > /*
> > > > > * OK, we need to start writing disk blocks. Take one transaction
> > > > >
> > > > > to restore the old behavior (continue on aborted journal). Maybe someone
> > > > > has a better fix.
> > > > >
> > > > > Thanks,
> > > > > Eryu
> > > > >
> > > > > .
> > > > >
> > > >
> > > >
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> > > the body of a message to majordomo@...r.kernel.org
> > > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > --
> > Jan Kara <jack@...e.com>
> > SUSE Labs, CR
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> > the body of a message to majordomo@...r.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> --
> Jan Kara <jack@...e.com>
> SUSE Labs, CR
> From 2839d314353b6d9e88355dc0b30927d38b2cb7b7 Mon Sep 17 00:00:00 2001
> From: Jan Kara <jack@...e.com>
> Date: Mon, 27 Jul 2015 21:42:08 +0200
> Subject: [PATCH] jbd2: Avoid infinite loop when destroying aborted journal
>
> Commit 6f6a6fda2945 "jbd2: fix ocfs2 corrupt when updating journal
> superblock fails" changed jbd2_cleanup_journal_tail() to return EIO when
> the journal is aborted. That makes logic in jbd2_log_do_checkpoint()
> bail out which is fine, except that jbd2_journal_destroy() expects
> jbd2_log_do_checkpoint() to always make a progress in cleaning the
> journal. Without it jbd2_journal_destroy() just loops in an infinite
> loop.
>
> Fix jbd2_journal_destroy() to cleanup journal checkpoint lists of
> jbd2_log_do_checkpoint() fails with error.
>
> Reported-by: Eryu Guan <guaneryu@...il.com>
> Fixes: 6f6a6fda294506dfe0e3e0a253bb2d2923f28f0a
> Signed-off-by: Jan Kara <jack@...e.com>
> ---
> fs/jbd2/checkpoint.c | 39 +++++++++++++++++++++++++++++++++------
> fs/jbd2/commit.c | 2 +-
> fs/jbd2/journal.c | 11 ++++++++++-
> include/linux/jbd2.h | 3 ++-
> 4 files changed, 46 insertions(+), 9 deletions(-)
>
> diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c
> index 4227dc4f7437..8c44654ce274 100644
> --- a/fs/jbd2/checkpoint.c
> +++ b/fs/jbd2/checkpoint.c
> @@ -417,12 +417,12 @@ int jbd2_cleanup_journal_tail(journal_t *journal)
> * journal_clean_one_cp_list
> *
> * Find all the written-back checkpoint buffers in the given list and
> - * release them.
> + * release them. If 'destroy' is set, clean all buffers unconditionally.
> *
> * Called with j_list_lock held.
> * Returns 1 if we freed the transaction, 0 otherwise.
> */
> -static int journal_clean_one_cp_list(struct journal_head *jh)
> +static int journal_clean_one_cp_list(struct journal_head *jh, bool destroy)
> {
> struct journal_head *last_jh;
> struct journal_head *next_jh = jh;
> @@ -436,7 +436,10 @@ static int journal_clean_one_cp_list(struct journal_head *jh)
> do {
> jh = next_jh;
> next_jh = jh->b_cpnext;
> - ret = __try_to_free_cp_buf(jh);
> + if (!destroy)
> + ret = __try_to_free_cp_buf(jh);
> + else
> + ret = __jbd2_journal_remove_checkpoint(jh) + 1;
> if (!ret)
> return freed;
> if (ret == 2)
> @@ -459,10 +462,11 @@ static int journal_clean_one_cp_list(struct journal_head *jh)
> * journal_clean_checkpoint_list
> *
> * Find all the written-back checkpoint buffers in the journal and release them.
> + * If 'destroy' is set, release all buffers unconditionally.
> *
> * Called with j_list_lock held.
> */
> -void __jbd2_journal_clean_checkpoint_list(journal_t *journal)
> +void __jbd2_journal_clean_checkpoint_list(journal_t *journal, bool destroy)
> {
> transaction_t *transaction, *last_transaction, *next_transaction;
> int ret;
> @@ -476,7 +480,8 @@ void __jbd2_journal_clean_checkpoint_list(journal_t *journal)
> do {
> transaction = next_transaction;
> next_transaction = transaction->t_cpnext;
> - ret = journal_clean_one_cp_list(transaction->t_checkpoint_list);
> + ret = journal_clean_one_cp_list(transaction->t_checkpoint_list,
> + destroy);
> /*
> * This function only frees up some memory if possible so we
> * dont have an obligation to finish processing. Bail out if
> @@ -492,7 +497,7 @@ void __jbd2_journal_clean_checkpoint_list(journal_t *journal)
> * we can possibly see not yet submitted buffers on io_list
> */
> ret = journal_clean_one_cp_list(transaction->
> - t_checkpoint_io_list);
> + t_checkpoint_io_list, destroy);
> if (need_resched())
> return;
> /*
> @@ -506,6 +511,28 @@ void __jbd2_journal_clean_checkpoint_list(journal_t *journal)
> }
>
> /*
> + * Remove buffers from all checkpoint lists as journal is aborted and we just
> + * need to free memory
> + */
> +void jbd2_journal_destroy_checkpoint(journal_t *journal)
> +{
> + /*
> + * We loop because __jbd2_journal_clean_checkpoint_list() may abort
> + * early due to a need of rescheduling.
> + */
> + while (1) {
> + spin_lock(&journal->j_list_lock);
> + if (!journal->j_checkpoint_transactions) {
> + spin_unlock(&journal->j_list_lock);
> + break;
> + }
> + __jbd2_journal_clean_checkpoint_list(journal, true);
> + spin_unlock(&journal->j_list_lock);
> + cond_resched();
> + }
> +}
> +
> +/*
> * journal_remove_checkpoint: called after a buffer has been committed
> * to disk (either by being write-back flushed to disk, or being
> * committed to the log).
> diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
> index b73e0215baa7..362e5f614450 100644
> --- a/fs/jbd2/commit.c
> +++ b/fs/jbd2/commit.c
> @@ -510,7 +510,7 @@ void jbd2_journal_commit_transaction(journal_t *journal)
> * frees some memory
> */
> spin_lock(&journal->j_list_lock);
> - __jbd2_journal_clean_checkpoint_list(journal);
> + __jbd2_journal_clean_checkpoint_list(journal, false);
> spin_unlock(&journal->j_list_lock);
>
> jbd_debug(3, "JBD2: commit phase 1\n");
> diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c
> index 4ff3fad4e9e3..2721513adb1f 100644
> --- a/fs/jbd2/journal.c
> +++ b/fs/jbd2/journal.c
> @@ -1693,8 +1693,17 @@ int jbd2_journal_destroy(journal_t *journal)
> while (journal->j_checkpoint_transactions != NULL) {
> spin_unlock(&journal->j_list_lock);
> mutex_lock(&journal->j_checkpoint_mutex);
> - jbd2_log_do_checkpoint(journal);
> + err = jbd2_log_do_checkpoint(journal);
> mutex_unlock(&journal->j_checkpoint_mutex);
> + /*
> + * If checkpointing failed, just free the buffers to avoid
> + * looping forever
> + */
> + if (err) {
> + jbd2_journal_destroy_checkpoint(journal);
> + spin_lock(&journal->j_list_lock);
> + break;
> + }
> spin_lock(&journal->j_list_lock);
> }
>
> diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
> index edb640ae9a94..eb1cebed3f36 100644
> --- a/include/linux/jbd2.h
> +++ b/include/linux/jbd2.h
> @@ -1042,8 +1042,9 @@ void jbd2_update_log_tail(journal_t *journal, tid_t tid, unsigned long block);
> extern void jbd2_journal_commit_transaction(journal_t *);
>
> /* Checkpoint list management */
> -void __jbd2_journal_clean_checkpoint_list(journal_t *journal);
> +void __jbd2_journal_clean_checkpoint_list(journal_t *journal, bool destroy);
> int __jbd2_journal_remove_checkpoint(struct journal_head *);
> +void jbd2_journal_destroy_checkpoint(journal_t *journal);
> void __jbd2_journal_insert_checkpoint(struct journal_head *, transaction_t *);
>
>
> --
> 2.1.4
>
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists