[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20100526135807.GB2268@atrey.karlin.mff.cuni.cz>
Date: Wed, 26 May 2010 15:58:07 +0200
From: Jan Kara <jack@...e.cz>
To: linux-ext4@...r.kernel.org
Cc: tytso@....edu
Subject: Re: [RFC][PATCH] Journal superblock update should send a barrier
Hi Ted,
> while reading through the checkpointing code I've realized that we
> actually have to send a barrier before each update of journal superblock
> after checkpointing. Attached patch does this. Just I'm not sure whether
> the performance cost won't be too big. In principle, we could make this
> more lightweight by using the fact that transaction commit also sends the
> barrier. So we could check before sending a barrier for transaction commit
> whether we are slowly running out of journal space and if so whether some
> transaction isn't already checkpointed. If yes, we can happily submit
> update of journal superblock after the barrier. In case journal is decently
> large this should solve the checkpointing problem without introducing
> noticeable overhead...
Did you have a chance to look at this?
Honza
> >From e749e83627c35c683114fea32695e581a487e560 Mon Sep 17 00:00:00 2001
> From: Jan Kara <jack@...e.cz>
> Date: Fri, 23 Apr 2010 02:12:32 +0200
> Subject: [PATCH] ext4: Send barrier before updating journal superblock after checkpointing
>
> We have to send a disk barrier after we have finished checkpointing and before
> we update the journal superblock and thus effectively remove transactions from
> the journal. Otherwise the write of journal superblock can be reordered before
> writes of checkpointed journal blocks and thus in case of crash these blocks
> needn't be on the platter leading to filesystem corruption.
>
> Signed-off-by: Jan Kara <jack@...e.cz>
> ---
> fs/jbd2/checkpoint.c | 19 +++++++++----------
> 1 files changed, 9 insertions(+), 10 deletions(-)
>
> diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c
> index 30beb11..b2de17f 100644
> --- a/fs/jbd2/checkpoint.c
> +++ b/fs/jbd2/checkpoint.c
> @@ -519,17 +519,16 @@ int jbd2_cleanup_journal_tail(journal_t *journal)
> spin_unlock(&journal->j_state_lock);
>
> /*
> - * If there is an external journal, we need to make sure that
> - * any data blocks that were recently written out --- perhaps
> - * by jbd2_log_do_checkpoint() --- are flushed out before we
> - * drop the transactions from the external journal. It's
> - * unlikely this will be necessary, especially with a
> - * appropriately sized journal, but we need this to guarantee
> - * correctness. Fortunately jbd2_cleanup_journal_tail()
> - * doesn't get called all that often.
> + * We need to make sure that any data blocks that were recently written
> + * out --- perhaps by jbd2_log_do_checkpoint() --- are flushed out
> + * before we drop the transactions from the journal. Otherwise journal
> + * superblock write could be reordered before writeout of data and thus
> + * we could corrupt the filesystem in case of crash. It's unlikely this
> + * will be necessary, especially with a appropriately sized journal,
> + * but we need this to guarantee correctness. Fortunately
> + * jbd2_cleanup_journal_tail() doesn't get called all that often.
> */
> - if ((journal->j_fs_dev != journal->j_dev) &&
> - (journal->j_flags & JBD2_BARRIER))
> + if (journal->j_flags & JBD2_BARRIER)
> blkdev_issue_flush(journal->j_fs_dev, NULL);
> if (!(journal->j_flags & JBD2_ABORT))
> jbd2_journal_update_superblock(journal, 1);
> --
> 1.6.4.2
>
--
Jan Kara <jack@...e.cz>
SuSE CR Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists