[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAD+ocbwNYc6Fe8tmRebw5Hcg2jxZT5zTbmfUFfFP-9079cBukA@mail.gmail.com>
Date: Mon, 26 Oct 2020 09:34:03 -0700
From: harshad shirwadkar <harshadshirwadkar@...il.com>
To: Jan Kara <jack@...e.cz>
Cc: Ext4 Developers List <linux-ext4@...r.kernel.org>,
"Theodore Y. Ts'o" <tytso@....edu>
Subject: Re: [PATCH v10 4/9] jbd2: add fast commit machinery
On Mon, Oct 26, 2020 at 2:03 AM Jan Kara <jack@...e.cz> wrote:
>
> On Fri 23-10-20 10:17:18, harshad shirwadkar wrote:
> > Thanks Jan for reviewing the patches.
>
> You're welcome. Rather I'm sorry that I've got to that after so long time.
>
> > On Thu, Oct 22, 2020 at 3:16 AM Jan Kara <jack@...e.cz> wrote:
> > >
> > > On Thu 15-10-20 13:37:56, Harshad Shirwadkar wrote:
> > > > This functions adds necessary APIs needed in JBD2 layer for fast
> > > > commits.
> > > >
> > > > Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@...il.com>
> > > > ---
> > > > fs/ext4/fast_commit.c | 8 ++
> > > > fs/jbd2/commit.c | 44 ++++++++++
> > > > fs/jbd2/journal.c | 190 +++++++++++++++++++++++++++++++++++++++++-
> > > > include/linux/jbd2.h | 27 ++++++
> > > > 4 files changed, 268 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c
> > > > index 0dad8bdb1253..f2d11b4c6b62 100644
> > > > --- a/fs/ext4/fast_commit.c
> > > > +++ b/fs/ext4/fast_commit.c
> > > > @@ -8,11 +8,19 @@
> > > > * Ext4 fast commits routines.
> > > > */
> > > > #include "ext4_jbd2.h"
> > > > +/*
> > > > + * Fast commit cleanup routine. This is called after every fast commit and
> > > > + * full commit. full is true if we are called after a full commit.
> > > > + */
> > > > +static void ext4_fc_cleanup(journal_t *journal, int full)
> > > > +{
> > > > +}
> > > >
> > > > void ext4_fc_init(struct super_block *sb, journal_t *journal)
> > > > {
> > > > if (!test_opt2(sb, JOURNAL_FAST_COMMIT))
> > > > return;
> > > > + journal->j_fc_cleanup_callback = ext4_fc_cleanup;
> > > > if (jbd2_fc_init(journal, EXT4_NUM_FC_BLKS)) {
> > > > pr_warn("Error while enabling fast commits, turning off.");
> > > > ext4_clear_feature_fast_commit(sb);
> > > > diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
> > > > index 6252b4c50666..fa688e163a80 100644
> > > > --- a/fs/jbd2/commit.c
> > > > +++ b/fs/jbd2/commit.c
> > > > @@ -206,6 +206,30 @@ int jbd2_journal_submit_inode_data_buffers(struct jbd2_inode *jinode)
> > > > return generic_writepages(mapping, &wbc);
> > > > }
> > > >
> > > > +/* Send all the data buffers related to an inode */
> > > > +int jbd2_submit_inode_data(struct jbd2_inode *jinode)
> > > > +{
> > > > +
> > > > + if (!jinode || !(jinode->i_flags & JI_WRITE_DATA))
> > > > + return 0;
> > > > +
> > > > + trace_jbd2_submit_inode_data(jinode->i_vfs_inode);
> > > > + return jbd2_journal_submit_inode_data_buffers(jinode);
> > > > +
> > > > +}
> > > > +EXPORT_SYMBOL(jbd2_submit_inode_data);
> > > > +
> > > > +int jbd2_wait_inode_data(journal_t *journal, struct jbd2_inode *jinode)
> > > > +{
> > > > + if (!jinode || !(jinode->i_flags & JI_WAIT_DATA) ||
> > > > + !jinode->i_vfs_inode || !jinode->i_vfs_inode->i_mapping)
> > > > + return 0;
> > > > + return filemap_fdatawait_range_keep_errors(
> > > > + jinode->i_vfs_inode->i_mapping, jinode->i_dirty_start,
> > > > + jinode->i_dirty_end);
> > > > +}
> > > > +EXPORT_SYMBOL(jbd2_wait_inode_data);
> > > > +
> > > > /*
> > > > * Submit all the data buffers of inode associated with the transaction to
> > > > * disk.
> > > > @@ -415,6 +439,20 @@ void jbd2_journal_commit_transaction(journal_t *journal)
> > > > J_ASSERT(journal->j_running_transaction != NULL);
> > > > J_ASSERT(journal->j_committing_transaction == NULL);
> > > >
> > > > + write_lock(&journal->j_state_lock);
> > > > + journal->j_flags |= JBD2_FULL_COMMIT_ONGOING;
> > > > + while (journal->j_flags & JBD2_FAST_COMMIT_ONGOING) {
> > > > + DEFINE_WAIT(wait);
> > > > +
> > > > + prepare_to_wait(&journal->j_fc_wait, &wait,
> > > > + TASK_UNINTERRUPTIBLE);
> > > > + write_unlock(&journal->j_state_lock);
> > > > + schedule();
> > > > + write_lock(&journal->j_state_lock);
> > > > + finish_wait(&journal->j_fc_wait, &wait);
> > > > + }
> > > > + write_unlock(&journal->j_state_lock);
> > >
> > > Hum, I'd like to understand: Is there a reason to block fastcommits already
> > > when the running transaction is in T_LOCKED state? Strictly speaking it is
> > > necessary only once we get to T_FLUSH state AFAIU (because only then we
> > > start to write transaction to the journal). I guess there are both
> > > advantages and disadvantages to it - if we allowed fastcommits running in
> > > T_LOCKED state, we could lower fsync() latency more. OTOH it could increase
> > > commit latency because we'd have to wait for fastcommits after T_LOCKED
> > > state.
> > That's right. I thought given that the transaction is anyway entering
> > locked state, might as well wait for it to complete instead of writing
> > blocks that are going to be obsoleted immediately. Also note that this
> > full commit could have started due to fast commits being in ineligible
> > state. If that's the case, the fast commit code will realize that it
> > can't do much and it will again wait for a full commit. So, even
> > though there is a fsync latency benefit to waiting till T_FLUSH, I'd
> > still marginally prefer blocking fast commits once the transaction
> > enters T_LOCKED state.
>
> OK, makes sence. We can always change it if we find good performance
> reasons for other choice.
Sounds good!
>
> > > > +}
> > > > +EXPORT_SYMBOL(jbd2_fc_end_commit);
> > > > +
> > > > +int jbd2_fc_end_commit_fallback(journal_t *journal, tid_t tid)
> > > > +{
> > > > + return __jbd2_fc_end_commit(journal, tid, 1);
> > > > +}
> > > > +EXPORT_SYMBOL(jbd2_fc_end_commit_fallback);
> > > > +
> > >
> > > Is there a need for 'tid' here? Once jbd2_fc_begin_commit() sets
> > > JBD2_FAST_COMMIT_ONGOING normal commit cannot proceed so when we decide we
> > > cannot do fastcommit in the end, we know the transaction that needs to
> > > commit is the currently running transaction, so we can fetch its TID from
> > > the journal once we hold j_state_lock before clearing
> > > JBD2_FAST_COMMIT_ONGOING. Cannot we?
>
> Did you miss this comment?
Oops. Yeah, I did. Sorry for that. Yes, when we call this function, we
know that the TID that needs to be committed is the current running
transaction. I'll fix this.
>
> > > > /* Return 1 when transaction with given tid has already committed. */
> > > > int jbd2_transaction_committed(journal_t *journal, tid_t tid)
> > > > {
> > > > @@ -784,6 +855,110 @@ int jbd2_journal_next_log_block(journal_t *journal, unsigned long long *retp)
> > > > return jbd2_journal_bmap(journal, blocknr, retp);
> > > > }
> > > >
> > > > +/* Map one fast commit buffer for use by the file system */
> > > > +int jbd2_fc_get_buf(journal_t *journal, struct buffer_head **bh_out)
> > > > +{
> > > > + unsigned long long pblock;
> > > > + unsigned long blocknr;
> > > > + int ret = 0;
> > > > + struct buffer_head *bh;
> > > > + int fc_off;
> > > > +
> > > > + *bh_out = NULL;
> > > > + write_lock(&journal->j_state_lock);
> > > > +
> > > > + if (journal->j_fc_off + journal->j_fc_first < journal->j_fc_last) {
> > > > + fc_off = journal->j_fc_off;
> > > > + blocknr = journal->j_fc_first + fc_off;
> > > > + journal->j_fc_off++;
> > > > + } else {
> > > > + ret = -EINVAL;
> > > > + }
> > > > + write_unlock(&journal->j_state_lock);
> > >
> > > Is j_state_lock really needed here? There is always only one process doing
> > > fastcommit so nobody else should be touching j_fc_off and other fields. Or
> > > am I missing something?
> > You are right, there should only be one process calling
> > jbd2_fc_get_buf. I'll fix this.
>
> Maybe add a comment to j_fc_off & co. that they are not protected by any
> lock - only by the fact that there's always only a single process doing
> fastcommit.
Ack
Thanks,
Harshad
>
> > > > +
> > > > + /*
> > > > + * Wait in reverse order to minimize chances of us being woken up before
> > > > + * all IOs have completed
> > > > + */
> > > > + for (i = j_fc_off - 1; i >= j_fc_off - num_blks; i--) {
> > > > + bh = journal->j_fc_wbuf[i];
> > > > + wait_on_buffer(bh);
> > > > + put_bh(bh);
> > > > + journal->j_fc_wbuf[i] = NULL;
> > > > + if (unlikely(!buffer_uptodate(bh)))
> > > > + return -EIO;
> > > > + }
> > > > +
> > > > + return 0;
> > > > +}
> > > > +EXPORT_SYMBOL(jbd2_fc_wait_bufs);
> > > > +
> > > > +/*
> > > > + * Wait on fast commit buffers that were allocated by jbd2_fc_get_buf
> > > > + * for completion.
> > > > + */
> > > > +int jbd2_fc_release_bufs(journal_t *journal)
> > > > +{
> > > > + struct buffer_head *bh;
> > > > + int i, j_fc_off;
> > > > +
> > > > + read_lock(&journal->j_state_lock);
> > > > + j_fc_off = journal->j_fc_off;
> > > > + read_unlock(&journal->j_state_lock);
> > > > +
> > > > + /*
> > > > + * Wait in reverse order to minimize chances of us being woken up before
> > > > + * all IOs have completed
> > > > + */
> > > > + for (i = j_fc_off - 1; i >= 0; i--) {
> > > > + bh = journal->j_fc_wbuf[i];
> > > > + if (!bh)
> > > > + break;
> > > > + put_bh(bh);
> > > > + journal->j_fc_wbuf[i] = NULL;
> > > > + }
> > > > +
> > > > + return 0;
> > > > +}
> > > > +EXPORT_SYMBOL(jbd2_fc_release_bufs);
> > > > +
> > >
> > > I kind of wonder if releasing of buffers shouldn't be done automatically
> > > either as part of jbd2_fc_wait_bufs() or when ending fastcommit. But I
> > > don't have a strong opinion so this is just an idea for consideration.
> > So, that's what I do. The buffers get released in jbd2_fc_wait_bufs().
> > However, in case of errors or fallback to full commits, buffers may
> > not be submitted and thus won't be released. So this function is to
> > release all the unsubmitted buffers. This gets called from the cleanup
> > callback which is called after every successful or failed full commit
> > or fast commit.
>
> Aha, I missed that. Thanks for explanation.
>
> Honza
> --
> Jan Kara <jack@...e.com>
> SUSE Labs, CR
Powered by blists - more mailing lists