Message-ID: <vxsqorapn2flwqx6ipsye6wf6h5lvciqoywvwrd2w4nwxyuajz@l3mao3pmqikn>
Date: Tue, 21 Jan 2025 17:55:19 +0100
From: Jan Kara <jack@...e.cz>
To: Heming Zhao <heming.zhao@...e.com>
Cc: Jan Kara <jack@...e.cz>, Theodore Ts'o <tytso@....edu>,
li.kai4@....com, jack@...e.com, linux-ext4@...r.kernel.org,
linux-kernel@...r.kernel.org, syzkaller@...glegroups.com,
Joseph Qi <joseph.qi@...ux.alibaba.com>, ocfs2-devel@...ts.linux.dev,
Liebes Wang <wanghaichi0403@...il.com>, syzbot <syzbot+96ee12698391289383dd@...kaller.appspotmail.com>
Subject: Re: WARNING in jbd2_journal_update_sb_log_tail
On Wed 15-01-25 18:53:41, Jan Kara wrote:
> On Wed 15-01-25 13:00:23, Heming Zhao wrote:
> > Hello Jan,
> >
> > On 1/15/25 09:32, Liebes Wang wrote:
> > > The bisection log shows the first cause commit is a09decff5c32060639a685581c380f51b14e1fc2:
> > > a09decff5c32 jbd2: clear JBD2_ABORT flag before journal_reset to update log tail info when load journal
> > >
> > > The full bisection log is attached. Hope this helps.
> >
> > The bisected commit a09decff5c32 appears to be the root cause
> > of this issue. It fixed one issue but introduced another.
> >
> > Syzbot tested the patch with calling jbd2_journal_wipe() with 'write=1'.
> > The Syzbot test result [1] shows that the same WARN_ON() is triggered
> > in a subsequent routine – the classic whack-a-mole!
> >
> > Back to commit a09decff5c32: it opened the door for jbd2 to update the
> > sb regardless of whether the values of the sb fields are correct.
> >
> > To fix a09decff5c32, it seems that jbd2 needs more sanity checks
> > in a subroutine of jbd2_journal_load().
> >
> > btw, in my view, this is a jbd2 issue not ocfs2/ext4 issue.
> >
> > [1]: https://lore.kernel.org/ocfs2-devel/04a9ad29-51de-4b50-a5bb-56f91817639d@suse.com/T/#m86d01f83d808868bb5e6548d30f79b4f9f889b13
>
> Thanks for debugging this! So I'm not 100% convinced this is only a jbd2 bug
> because jbd2_journal_recover() was never intended to be called after
> jbd2_journal_skip_recovery() (called from jbd2_journal_wipe()). You're
> supposed to call either jbd2_journal_wipe() or jbd2_journal_recover() but
> not both. So IMO this needs fixing in OCFS2 code. That being said, you've
> also pointed at one bug in jbd2 code - the WARN_ON(!sb->s_sequence) in
> jbd2_journal_update_sb_log_tail() is indeed wrong. We were inconsistent
> inside jbd2 about whether TID 0 is considered valid. Relatively recently
> we decided to accept TID 0 as valid, but this place was left out. I'll
> send a fix for that.
OK, after checking again, OCFS2 is indeed fine here. I'm sorry for the
confusion. I'll send appropriate jbd2 fixes shortly.
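
To illustrate why TID 0 has to be accepted as valid, here is a minimal
userspace sketch. It assumes, as in include/linux/jbd2.h, that tid_t is a
32-bit unsigned counter and that ordering is decided by a wraparound-safe
signed difference (modelled on tid_gt()). The helpers next_tid() and
zero_tid_check_fires() are hypothetical names made up for this example;
they are not jbd2 functions, and this is not the actual fix.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: mirrors the kernel's 32-bit, wrapping transaction ID. */
typedef uint32_t tid_t;

/*
 * Wraparound-safe "x is later than y", modelled on jbd2's tid_gt():
 * the unsigned difference is reinterpreted as a signed value.
 */
static inline bool tid_gt(tid_t x, tid_t y)
{
	int32_t difference = (int32_t)(x - y);

	return difference > 0;
}

/* Hypothetical helper: the TID that follows 'tid' (wraps to 0). */
static inline tid_t next_tid(tid_t tid)
{
	return tid + 1;
}

/*
 * Hypothetical stand-in for a check of the form WARN_ON(!sequence):
 * it fires for TID 0 even though 0 is an ordinary successor value.
 */
static inline bool zero_tid_check_fires(tid_t sequence)
{
	return !sequence;
}
```

With these definitions, next_tid(UINT32_MAX) is 0 and tid_gt(0, UINT32_MAX)
is true, so TID 0 is simply the transaction after 0xffffffff. A check that
treats a zero sequence as corruption would therefore fire on a legitimate
value.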
Honza
--
Jan Kara <jack@...e.com>
SUSE Labs, CR