[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <19c0edeadad.6adfc20c802629.8728440093470268039@linux.beauty>
Date: Fri, 30 Jan 2026 20:26:40 +0800
From: Li Chen <me@...ux.beauty>
To: "Matthew Wilcox" <willy@...radead.org>
Cc: "Mark Fasheh" <mark@...heh.com>, "Joel Becker" <jlbec@...lplan.org>,
"Joseph Qi" <joseph.qi@...ux.alibaba.com>,
"ocfs2-devel" <ocfs2-devel@...ts.linux.dev>,
"linux-kernel" <linux-kernel@...r.kernel.org>,
"Jan Kara" <jack@...e.com>
Subject: Re: [PATCH 3/3] ocfs2: use READ_ONCE for lockless jinode reads
Hi Matthew,
> On Fri, Jan 30, 2026 at 11:12:32AM +0800, Li Chen wrote:
> > ocfs2 journal commit callback reads jbd2_inode dirty range fields without
> > holding journal->j_list_lock.
> >
> > Use READ_ONCE() for these reads to correct the concurrency assumptions.
>
> I don't think this is the right solution to the problem. If it is,
> there needs to be much better argumentation in the commit message.
>
> As I understand it, jbd2_journal_file_inode() initialises jinode,
> then adds it to the t_inode_list, then drops the j_list_lock. So the
> actual problem we need to address is that there's no memory barrier
> between the store to i_dirty_start and the list_add(). Once that's
> added, there's no need for a READ_ONCE here.
>
> Or have I misunderstood the problem?
Thanks for the review.
My understanding of your point is that you're worried about a missing
"publish" ordering in jbd2_journal_file_inode(): we store
jinode->i_dirty_start/end and then list_add() the jinode to
t_inode_list, and a core which observes the list entry might miss the prior
i_dirty_* stores. Is that the issue you had in mind?
If so, for the normal commit path where the list is walked under
journal->j_list_lock (e.g. journal_submit_data_buffers() in
fs/jbd2/commit.c), spin_lock()/spin_unlock() should already provide the
necessary ordering, since both the i_dirty_* updates and the list_add()
happen inside the same critical section.
The ocfs2 case I was aiming at is different: the filesystem callback is
invoked after unlocking journal->j_list_lock and may sleep, so it can't hold
j_list_lock but it still reads jinode->i_dirty_start/end while other
threads update these fields under the lock. Adding a barrier between the
stores and list_add() would not address that concurrent update window.
So the itent of READ_ONCE() in ocfs2 is to take a single snapshot of the
dirty range values from memory (avoid compiler to reuse a value kept in a
register or fold multiple reads). I'm not trying to claim any additional
memory ordering from this change.
I'll respin and adjust the commit message accordingly. The updated part will
say along the lines of:
"ocfs2 reads jinode->i_dirty_start/end without journal->j_list_lock
(callback may sleep); these fields are updated under j_list_lock in jbd2.
Use READ_ONCE() so the callback takes a single snapshot via actual loads
from the variable (i.e. don't let the compiler reuse a value kept in a register
or fold multiple reads)."
Does that match your understanding?
Regards,
Li
> > Suggested-by: Jan Kara <jack@...e.com>
> > Signed-off-by: Li Chen <me@...ux.beauty>
> > ---
> > fs/ocfs2/journal.c | 7 +++++--
> > 1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c
> > index 85239807dec7..7032284cdbd6 100644
> > --- a/fs/ocfs2/journal.c
> > +++ b/fs/ocfs2/journal.c
> > @@ -902,8 +902,11 @@ int ocfs2_journal_alloc(struct ocfs2_super *osb)
> >
> > static int ocfs2_journal_submit_inode_data_buffers(struct jbd2_inode *jinode)
> > {
> > - return filemap_fdatawrite_range(jinode->i_vfs_inode->i_mapping,
> > - jinode->i_dirty_start, jinode->i_dirty_end);
> > + struct address_space *mapping = jinode->i_vfs_inode->i_mapping;
> > + loff_t dirty_start = READ_ONCE(jinode->i_dirty_start);
> > + loff_t dirty_end = READ_ONCE(jinode->i_dirty_end);
> > +
> > + return filemap_fdatawrite_range(mapping, dirty_start, dirty_end);
> > }
> >
> > int ocfs2_journal_init(struct ocfs2_super *osb, int *dirty)
> > --
> > 2.52.0
> >
>
Powered by blists - more mailing lists