[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240828105452.z6tqwq47bcnmrevd@quack3>
Date: Wed, 28 Aug 2024 12:54:52 +0200
From: Jan Kara <jack@...e.cz>
To: Theodore Ts'o <tytso@....edu>
Cc: Vinicius Peixoto <vpeixoto@...amp.dev>,
syzbot+8512f3dbd96253ffbe27@...kaller.appspotmail.com,
jack@...e.com, jlbec@...lplan.org, joseph.qi@...ux.alibaba.com,
linux-ext4@...r.kernel.org, linux-kernel@...r.kernel.org,
mark@...heh.com, ocfs2-devel@...ts.linux.dev,
Julian Sun <sunjunchao2870@...il.com>,
syzkaller-bugs@...glegroups.com, ~lkcamp/discussion@...ts.sr.ht
Subject: Re: [syzbot] [ext4?] [ocfs2?] kernel BUG in jbd2_cleanup_journal_tail
On Mon 26-08-24 09:32:08, Theodore Ts'o wrote:
> On Mon, Aug 26, 2024 at 01:22:54AM -0300, Vinicius Peixoto wrote:
> > Since the disk data is bogus, journal_reset fails with -EINVAL ("JBD2:
> > Journal too short (blocks 2-1024)"); this leaves journal->j_head == NULL.
> > However, jbd2_journal_load clears the JBD2_ABORT flag right before calling
> > journal_reset. This leads to a problem later when ofcs2_mount_volume tries
> > to flush the journal as part of the cleanup when aborting the mount
> > operation:
> >
> > -> ofcs2_mount_volume (error; goto out_system_inodes)
> > -> ofcs2_journal_shutdown
> > -> jbd2_journal_flush
> > -> jbd2_cleanup_journal_tail (J_ASSERT fails)
> > ...
> >
> > I confirmed that setting the JBD2_ABORT flag in journal_reset before
> > returning -EINVAL fixes the problem:
> >
> > static int journal_reset(journal_t *journal)
> > journal_fail_superblock(journal);
> > + journal->j_flags |= JBD2_ABORT;
> > return -EINVAL;
> >
> > You can find a proper patch file + the syzbot re-test result in [1].
> > However, I'm not entirely sure whether this is the correct decision, and I
> > wanted to confirm that this is an appropriate solution before sending a
> > proper patch to the mailing list.
>
> The reason why this isn't an issue with ext4 is because the normal
> "right" way to do this is if jbd2_journal_load() returns an error, is
> to call jbd2_journal_destroy() to release the data structures with the
> journal --- and then don't try to use the journal afterwards.
Yep. OCFS2 guys are actually looking into fixing this in OCFS2
(https://lore.kernel.org/ocfs2-devel/20240819131120.746077-1-sunjunchao2870@gmail.com/)
Not sure what's the status though. Julian, did you send v2 of your fix?
Honza
> The weird thing is that there are two code paths in ocfs2 that calls
> jbd2_journal_load(). One is in ocfs2_replay_journal() which does what
> ext4 does. The other is ocfs2_load_journal() which does *not* do
> this, and this is the one which you saw in the syzkaller reproducer.
> It looks like one codepath is used to replay the ocfs2 for some other
> node, and the is to load (and presumably later, replay) the journal
> for the mounting node.
>
> There are also some other things which look very confusing, such as the
> fact that ocfs2_journal_shutdown calls igrab:
>
> /* need to inc inode use count - jbd2_journal_destroy will iput. */
> if (!igrab(inode))
> BUG();
>
> ... which is *weird*. Normaly the journal inode refcount is expected
> to be bumped before calling jbd2_journal_init_inode() --- which
> normally is done by an iget() function, and is needed to make sure the
> journal inode isn't released out from under the jbd2 layer. It looks
> like ocfs2 uses the journal inode for other stuff so get the journal
> inode without calling something like iget(). Which is OK, I guess,
> but it means that there are a whole bunch of places where it has to
> call !BUG_ON(igrab(journal->j_inode) before calling
> jbd2_journal_destroy(). It would seem like the *right* thing to do is
> to bump the refcount in ocfs2_journal_init(), and if for some reason
> the igrab fails, it can just return an error early, instead of having
> to resort to BUG_ON() later, and if you don't realize that you have to
> do this weird igrab() before calling jbd2_journal_destroy(), you'll
> end up leaking the journal inode.
>
> Anyway, I think the right thing to do is to restructure how ocfs2
> calls the jbd2 journal, but that's going to require a more careful
> examination of the code paths. Your patch is a bit of a blunt force
> hack that should be harmless, but it's probably not the best way to
> fix the problem, although doing it "right" would be more
> complicated....
>
> - Ted
--
Jan Kara <jack@...e.com>
SUSE Labs, CR
Powered by blists - more mailing lists