Message-id: <20090824234336.GU5931@webber.adilger.int>
Date: Mon, 24 Aug 2009 17:43:36 -0600
From: Andreas Dilger <adilger@....com>
To: Theodore Tso <tytso@....edu>
Cc: Ric Wheeler <rwheeler@...hat.com>,
Christian Fischer <Christian.Fischer@...terngraphics.com>,
linux-ext4@...r.kernel.org
Subject: Re: Enable asynchronous commits by default patch revoked?
On Aug 24, 2009 19:28 -0400, Theodore Ts'o wrote:
> @@ -132,9 +132,7 @@ static int journal_submit_commit_record(journal_t *journal,
> set_buffer_uptodate(bh);
> bh->b_end_io = journal_end_buffer_io_sync;
>
> - if (journal->j_flags & JBD2_BARRIER &&
> - !JBD2_HAS_INCOMPAT_FEATURE(journal,
> - JBD2_FEATURE_INCOMPAT_ASYNC_COMMIT)) {
> + if (journal->j_flags & JBD2_BARRIER) {
> set_buffer_ordered(bh);
> barrier_done = 1;
> }
>
>
> OK, to be fair, most of the complexity was already in the code;
> the main complexity was simply separating
> journal_write_commit_record() into journal_submit_commit_record() and
> journal_wait_on_commit_record().
>
> We can clean up the patch by recombining these two functions, since
> there was never any point in separating the submission of the commit
> record from where we waited for it. I think whoever implemented this
> thought we could add a bit more parallelism, but in reality all of the
> code between line 709 and line 834 of commit.c (i.e., commit phases 3-5)
> is waiting for the various journal data blocks to be written. So we
> might as well wait for the commit block, which will save a bit of
> scheduling overhead, using the same rationale given in the comment
> found at line 740 of commit.c:
>
> /*
> Wait for the buffers in reverse order. That way we are
> less likely to be woken up until all IOs have completed, and
> so we incur less scheduling load.
> */
Without transaction checksums, waiting on all of the blocks together
is NOT safe. If the commit record is on disk but the rest of the
transaction's blocks are not, then replay may write garbage from the
journal into the filesystem metadata.
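
To make the failure mode concrete, here is a toy model of the replay
decision. It is only a sketch and not jbd2 code: struct toy_txn, the
toy_* helpers, the block sizes, and the simple rolling checksum are all
hypothetical stand-ins chosen for illustration. It assumes the commit
record reached the disk while one journal block did not, and compares
replay with and without a checksum stored in the commit record.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 8   /* toy block size, not a real journal block size */
#define MAX_BLOCKS 4

/* Hypothetical in-memory model of one journal transaction as found on disk. */
struct toy_txn {
    size_t   nblocks;
    uint8_t  data[MAX_BLOCKS][BLOCK_SIZE]; /* journal copies of the blocks */
    bool     commit_on_disk;               /* did the commit record land? */
    uint32_t commit_csum;                  /* checksum kept in the commit record */
};

/* Stand-in checksum over all journal blocks of the transaction. */
uint32_t toy_csum(const struct toy_txn *t)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < t->nblocks; i++)
        for (size_t j = 0; j < BLOCK_SIZE; j++)
            sum = sum * 31u + t->data[i][j];
    return sum;
}

/* Build the transaction as the journal intended to write it. */
void toy_txn_init(struct toy_txn *t, size_t nblocks)
{
    assert(nblocks <= MAX_BLOCKS);
    memset(t, 0, sizeof(*t));
    t->nblocks = nblocks;
    for (size_t i = 0; i < nblocks; i++)
        memset(t->data[i], 'A' + (int)i, BLOCK_SIZE);
    t->commit_csum = toy_csum(t);
    t->commit_on_disk = true;
}

/* Simulate a crash where block idx never reached the disk, so the
 * journal area still holds stale garbage at that position. */
void toy_lose_block(struct toy_txn *t, size_t idx)
{
    assert(idx < t->nblocks);
    memset(t->data[idx], 0xFF, BLOCK_SIZE);
}

/* Replay decision: without a checksum, the presence of the commit
 * record is the only evidence that the transaction is complete. */
bool toy_should_replay(const struct toy_txn *t, bool use_csum)
{
    if (!t->commit_on_disk)
        return false;
    if (use_csum && toy_csum(t) != t->commit_csum)
        return false;
    return true;
}
```

In this model, after toy_lose_block() the call
toy_should_replay(&t, false) still returns true, so the stale block
would be copied into the filesystem, while toy_should_replay(&t, true)
returns false and the incomplete transaction is skipped.
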
Have you seen any of my other emails on this topic? It would seem not...
Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.