[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20201105103024.GA32718@quack2.suse.cz>
Date: Thu, 5 Nov 2020 11:30:24 +0100
From: Jan Kara <jack@...e.cz>
To: harshad shirwadkar <harshadshirwadkar@...il.com>
Cc: Jan Kara <jack@...e.cz>,
Ext4 Developers List <linux-ext4@...r.kernel.org>,
"Theodore Y. Ts'o" <tytso@....edu>
Subject: Re: [PATCH 04/10] ext4: clean up the JBD2 API that initializes fast
commits
On Wed 04-11-20 11:52:24, harshad shirwadkar wrote:
> On Tue, Nov 3, 2020 at 8:29 AM Jan Kara <jack@...e.cz> wrote:
> > > -int jbd2_fc_init(journal_t *journal, int num_fc_blks)
> > > +int jbd2_fc_init(journal_t *journal)
> > > {
> > > - journal->j_fc_wbufsize = num_fc_blks;
> > > - journal->j_fc_wbuf = kmalloc_array(journal->j_fc_wbufsize,
> > > - sizeof(struct buffer_head *), GFP_KERNEL);
> > > - if (!journal->j_fc_wbuf)
> > > - return -ENOMEM;
> > > + /*
> > > + * Only set j_fc_wbufsize here to indicate that the client file
> > > + * system is interested in using fast commits. The actual number of
> > > + * fast commit blocks is found inside jbd2_superblock and is only
> > > + * valid if j_fc_wbufsize is non-zero. The real value of j_fc_wbufsize
> > > + * gets set in journal_reset().
> > > + */
> > > + journal->j_fc_wbufsize = JBD2_MIN_FC_BLOCKS;
> > > return 0;
> > > }
> >
> > When looking at this, is there a reason why jbd2_fc_init() still exists? I
> > mean why not just make the rule that the journal has FC block number set
> > iff FC gets enabled? Anything else seems a bit confusing to me and also
> > dangerous - imagine we have fs with FC running, we write some FCs and then
> > crash. Then on system recovery we mount with no_fc mount option. We have
> > just lost data on the filesystem AFAIU... So I'd just remove all the mount
> > options related to fastcommits and leave everything to the journal setup
> > (which can be modified with e2fsprogs if needed) to keep things simple.
> The problem is whether or not to use fast commits is the file system's
> call. The JBD2 feature flag will be cleared on a clean unmount and if
> we rely solely on the JBD2 feature flag, fast commit will be turned
> off after a clean unmount. Whereas the FS compat flag is the source of
> truth about whether fast commit needs to be used or not. That's why we
> need an API for the file system to tell JBD2 to still do fast commits.
Yes, I meant the API could be just that the filesystem either calls
jbd2_journal_set_features() with FASTCOMMIT feature or it won't. Similarly
to how e.g. JBD2_FEATURE_INCOMPAT_64BIT is handled. No need for
jbd2_fc_init() function AFAICT.
> Mount options that override the feature flag in Ext4 were mainly meant
> for debugging purposes. So, perhaps there should be a clear warning
> message in the kernel if any of these options are used? Even if we get
> rid of the mount options, we still need the jbd2_fc_init() API for the
> FS to tell JBD2 that it wants to use fast commit. Note that even if
> jbd2_fc_init() is not called, JBD2 will still try to replay fast
> commit blocks.
> > > EXPORT_SYMBOL(jbd2_fc_init);
> > > @@ -1500,7 +1494,7 @@ static void journal_fail_superblock(journal_t *journal)
> > > static int journal_reset(journal_t *journal)
> > > {
> > > journal_superblock_t *sb = journal->j_superblock;
> > > - unsigned long long first, last;
> > > + unsigned long long first, last, num_fc_blocks;
> > >
> > > first = be32_to_cpu(sb->s_first);
> > > last = be32_to_cpu(sb->s_maxlen);
> > > @@ -1513,6 +1507,28 @@ static int journal_reset(journal_t *journal)
> > >
> > > journal->j_first = first;
> > >
> > > + /*
> > > + * At this point, fast commit recovery has finished. Now, we solely
> > > + * rely on the file system to decide whether it wants fast commits
> > > + * or not. File system that wishes to use fast commits must have
> > > + * already called jbd2_fc_init() before we get here.
> > > + */
> > > + if (journal->j_fc_wbufsize > 0)
> > > + jbd2_journal_set_features(journal, 0, 0,
> > > + JBD2_FEATURE_INCOMPAT_FAST_COMMIT);
> > > + else
> > > + jbd2_journal_clear_features(journal, 0, 0,
> > > + JBD2_FEATURE_INCOMPAT_FAST_COMMIT);
> > > +
> > > + /* If valid, prefer the value found in superblock over the default */
> > > + num_fc_blocks = be32_to_cpu(sb->s_num_fc_blks);
> > > + if (num_fc_blocks > 0 && num_fc_blocks < last)
> > > + journal->j_fc_wbufsize = num_fc_blocks;
> > > +
> > > + if (jbd2_has_feature_fast_commit(journal))
> > > + journal->j_fc_wbuf = kmalloc_array(journal->j_fc_wbufsize,
> > > + sizeof(struct buffer_head *), GFP_KERNEL);
> > > +
> > > if (jbd2_has_feature_fast_commit(journal) &&
> > > journal->j_fc_wbufsize > 0) {
> > > journal->j_fc_last = last;
> > > @@ -1531,7 +1547,8 @@ static int journal_reset(journal_t *journal)
> > > journal->j_commit_sequence = journal->j_transaction_sequence - 1;
> > > journal->j_commit_request = journal->j_commit_sequence;
> > >
> > > - journal->j_max_transaction_buffers = journal->j_maxlen / 4;
> > > + journal->j_max_transaction_buffers =
> > > + (journal->j_maxlen - journal->j_fc_wbufsize) / 4;
> > >
> > > /*
> > > * As a special case, if the on-disk copy is already marked as needing
> > > @@ -1872,6 +1889,7 @@ static int load_superblock(journal_t *journal)
> > > {
> > > int err;
> > > journal_superblock_t *sb;
> > > + int num_fc_blocks;
> > >
> > > err = journal_get_superblock(journal);
> > > if (err)
> > > @@ -1884,10 +1902,12 @@ static int load_superblock(journal_t *journal)
> > > journal->j_first = be32_to_cpu(sb->s_first);
> > > journal->j_errno = be32_to_cpu(sb->s_errno);
> > >
> > > - if (jbd2_has_feature_fast_commit(journal) &&
> > > - journal->j_fc_wbufsize > 0) {
> > > + if (jbd2_has_feature_fast_commit(journal)) {
> > > journal->j_fc_last = be32_to_cpu(sb->s_maxlen);
> > > - journal->j_last = journal->j_fc_last - journal->j_fc_wbufsize;
> > > + num_fc_blocks = be32_to_cpu(sb->s_num_fc_blks);
> > > + if (!num_fc_blocks || num_fc_blocks >= journal->j_fc_last)
> >
> > I think this needs to be stricter - we need the check that the journal is
> > at least JBD2_MIN_JOURNAL_BLOCKS long (which happens at the beginning of
> > journal_reset()) to happen after we've subtracted fastcommit blocks...
> So are you saying that with FC, the minimum journal size is
> JBD2_MIN_JOURNAL_BLOCKS + JBD2_MIN_FC_BLOCKS? I was assuming that we
Yes. JBD2_MIN_JOURNAL_BLOCKS is minimum number of blocks we need for normal
commits to get reasonable behavior. So as you say with fastcommits enabled,
the minimal journal size is JBD2_MIN_JOURNAL_BLOCKS + JBD2_MIN_FC_BLOCKS.
> will reserve JBD2_MIN_FC_BLOCKS (256) blocks out of the total journal
> size. That way the users who rely on the journal size to be 1024
> blocks, won't see a difference in journal size even after turning FC
> on. But I'm not sure if that's something we should care about.
Well, e2fsprogs need to check journal size when enabling fastcommits so
that we don't get invalid configurations.
Honza
--
Jan Kara <jack@...e.com>
SUSE Labs, CR
Powered by blists - more mailing lists