lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAD+ocbxzTnB9Jd0NNgY3JtgiZdNgkdLRPTpr9qJoZVk0qMXHsA@mail.gmail.com>
Date: Fri, 12 Jul 2024 18:38:21 -0700
From: harshad shirwadkar <harshadshirwadkar@...il.com>
To: Jan Kara <jack@...e.cz>
Cc: linux-ext4@...r.kernel.org, tytso@....edu, saukad@...gle.com, 
	harshads@...gle.com
Subject: Re: [PATCH v6 04/10] ext4: rework fast commit commit path

On Fri, Jun 28, 2024 at 6:43 AM Jan Kara <jack@...e.cz> wrote:
>
> On Wed 29-05-24 01:19:57, Harshad Shirwadkar wrote:
> > This patch reworks fast commit's commit path to remove locking the
> > journal for the entire duration of a fast commit. Instead, we only lock
> > the journal while marking all the eligible inodes as "committing". This
> > allows handles to make progress in parallel with the fast commit.
> >
> > Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@...il.com>
> ...
> > @@ -1124,6 +1119,20 @@ static int ext4_fc_perform_commit(journal_t *journal)
> >       int ret = 0;
> >       u32 crc = 0;
> >
> > +     /*
> > +      * Wait for all the handles of the current transaction to complete
> > +      * and then lock the journal. Since this is essentially the commit
> > +      * path, we don't need to wait for reserved handles.
> > +      */
>
> Here I'd expand the comment to explain better why this is safe. Like:
>
>         /*
>          * Wait for all the handles of the current transaction to complete
>          * and then lock the journal. We don't need to wait for reserved
>          * handles since we only need to set EXT4_STATE_FC_COMMITTING state
>          * while the journal is locked - in particular we don't depend on
>          * page writeback state so there's no risk of deadlocking reserved
>          * handles.
>          */
>
> > +     jbd2_journal_lock_updates_no_rsv(journal);
> > +     spin_lock(&sbi->s_fc_lock);
> > +     list_for_each_entry(iter, &sbi->s_fc_q[FC_Q_MAIN], i_fc_list) {
> > +             ext4_set_inode_state(&iter->vfs_inode,
> > +                                  EXT4_STATE_FC_COMMITTING);
> > +     }
> > +     spin_unlock(&sbi->s_fc_lock);
> > +     jbd2_journal_unlock_updates(journal);
> > +
> >       ret = ext4_fc_submit_inode_data_all(journal);
> >       if (ret)
> >               return ret;
> > @@ -1174,6 +1183,18 @@ static int ext4_fc_perform_commit(journal_t *journal)
> >               ret = ext4_fc_write_inode(inode, &crc);
> >               if (ret)
> >                       goto out;
> > +             ext4_clear_inode_state(inode, EXT4_STATE_FC_COMMITTING);
> > +             /*
> > +              * Make sure clearing of EXT4_STATE_FC_COMMITTING is
> > +              * visible before we send the wakeup. Pairs with implicit
> > +              * barrier in prepare_to_wait() in ext4_fc_track_inode().
> > +              */
> > +             smp_mb();
> > +#if (BITS_PER_LONG < 64)
> > +             wake_up_bit(&iter->i_state_flags, EXT4_STATE_FC_COMMITTING);
> > +#else
> > +             wake_up_bit(&iter->i_flags, EXT4_STATE_FC_COMMITTING);
> > +#endif
>
> Maybe create a helper function for clearing the EXT4_STATE_FC_COMMITTING
> bit and waking up the wait queue? It's a bit subtle and used in a few
> places.
>
> > diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
> > index cb0b8d6fc0c6..4361e5c56490 100644
> > --- a/fs/jbd2/transaction.c
> > +++ b/fs/jbd2/transaction.c
> > @@ -865,25 +865,15 @@ void jbd2_journal_wait_updates(journal_t *journal)
> >       }
> >  }
> >
> > -/**
> > - * jbd2_journal_lock_updates () - establish a transaction barrier.
> > - * @journal:  Journal to establish a barrier on.
> > - *
> > - * This locks out any further updates from being started, and blocks
> > - * until all existing updates have completed, returning only once the
> > - * journal is in a quiescent state with no updates running.
> > - *
> > - * The journal lock should not be held on entry.
> > - */
> > -void jbd2_journal_lock_updates(journal_t *journal)
> > +static void __jbd2_journal_lock_updates(journal_t *journal, bool wait_on_rsv)
> >  {
> >       jbd2_might_wait_for_commit(journal);
> >
> >       write_lock(&journal->j_state_lock);
> >       ++journal->j_barrier_count;
> >
> > -     /* Wait until there are no reserved handles */
> > -     if (atomic_read(&journal->j_reserved_credits)) {
> > +     if (wait_on_rsv && atomic_read(&journal->j_reserved_credits)) {
> > +             /* Wait until there are no reserved handles */
>
> So it is not as simple as this. start_this_handle() ignores
> journal->j_barrier_count for reserved handles so they would happily start
> while you have the journal locked with jbd2_journal_lock_updates_no_rsv()
> and then writeback code could mess with your fastcommit state. Or perhaps I
> miss some subtlety why this is fine - but that then deserves a good
> explanation in a comment or maybe a different API because currently
> jbd2_journal_lock_updates_no_rsv() doesn't do what one would naively
> expect.

AFAICT, jbd2_journal_commit_transaction() only calls
jbd2_journal_wait_updates(journal) which waits for
trasaction->t_updates to reach 0. But it doesn't wait for
journal->j_reserved_credits to reach 0. I saw a performance
improvement by removing waiting on reserved handles in FC commit code
as well. Given that JBD2 doesn't wait, I (perhaps incorrectly) thought
that it'd be okay to not wait in FC as well. Could you help me
understand why the JBD2 journal commit doesn't need to wait for
reserved handles?

- Harshad
>
>                                                                 Honza
> --
> Jan Kara <jack@...e.com>
> SUSE Labs, CR

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ