linux-ext4 - Re: [PATCH] ext4: Fix call trace when remounting to read only in data=journal mode

Message-ID: <CAMsNC+svSX9AiZbs1dd4qigZqBuOjuCHOjpyXzgaO0sNLUHDYA@mail.gmail.com>
Date: Thu, 5 Feb 2026 20:59:13 +0800
From: Gerald Yang <gerald.yang@...onical.com>
To: Jan Kara <jack@...e.cz>
Cc: tytso@....edu, adilger.kernel@...ger.ca, linux-ext4@...r.kernel.org, 
	gerald.yang.tw@...il.com
Subject: Re: [PATCH] ext4: Fix call trace when remounting to read only in
 data=journal mode

Thanks, Jan, for fixing this issue. I can confirm the patch works for me too.


On Thu, Feb 5, 2026 at 5:25 PM Jan Kara <jack@...e.cz> wrote:
>
> On Tue 03-02-26 15:50:43, Jan Kara wrote:
> > Hello,
> >
> > On Fri 30-01-26 19:38:55, Gerald Yang wrote:
> > > Thanks for sharing the findings. I'd also like to share some of mine.
> > > I tried to figure out why the buffer is dirty after calling sync_filesystem.
> > > In mpage_prepare_extent_to_map, I first printed folio_test_dirty(folio):
> > >
> > > while (index <= end)
> > >     ...
> > >     for (i = 0; i < nr_folios; i++) {
> > >         ...
> > >         (print if folio is dirty here)
> > >
> > > and actually all folios are clean:
> > > if (!folio_test_dirty(folio) ||
> > >     ...
> > >     folio_unlock(folio);
> > >     continue;       <==== continue here without writing anything
> > >
> > > The call trace is triggered before entering the while loop above:
> > >
> > > if (ext4_should_journal_data(mpd->inode)) {
> > >     handle = ext4_journal_start(mpd->inode, EXT4_HT_WRITE_PAGE,
> > >
> > > This checks whether the file system is read-only and dumps the call trace
> > > in ext4_journal_check_start, without checking whether any real writes
> > > will happen later in the loop.
> > >
> > > To confirm this, I first added two lines to the reproducer script before
> > > remounting read-only:
> > > sync      <==== calls ext4_sync_fs to flush all dirty data, the same as
> > >                 what is called during the read-only remount
> > > echo 1 > /proc/sys/vm/drop_caches       <==== drop clean page cache
> > > mount -o remount,ro ext4disk mnt
> > >
> > > Then I can no longer reproduce the call trace.
> >
> > OK, but ext4_do_writepages() has a check at the beginning:
> >
> >         if (!mapping->nrpages || !mapping_tagged(mapping, PAGECACHE_TAG_DIRTY))
> >                 goto out_writepages;
> >
> > So if there are no dirty pages, mapping_tagged(mapping, PAGECACHE_TAG_DIRTY)
> > should be false and so we shouldn't go further?
> >
> > It all looks like some kind of a race because I'm not always able to
> > reproduce the problem... I'll try to look more into this.
>
> OK, the race is with the checkpointing code writing the buffers while the
> flush worker tries to write back the pages. I've posted a patch which fixes
> the issue for me.
>
>                                                                 Honza
> --
> Jan Kara <jack@...e.com>
> SUSE Labs, CR
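
For readers following the thread, here is a toy Python model of the control
flow discussed above. It is not kernel code and does not describe the actual
race or the posted fix. It only shows why the read-only check can fire with
nothing to write: the journal handle is started before any folio is examined.
The Folio and Mapping classes, the mapping-level tagged_dirty flag (standing in
for PAGECACHE_TAG_DIRTY) and the do_writepages function are all hypothetical
names invented for this sketch. The "tag still set but every folio clean" state
is an assumption used to represent some race window, not a statement of how the
kernel reaches it.

```python
from dataclasses import dataclass, field


@dataclass
class Folio:
    dirty: bool = False


@dataclass
class Mapping:
    folios: list = field(default_factory=list)
    # Hypothetical stand-in for mapping_tagged(mapping, PAGECACHE_TAG_DIRTY).
    tagged_dirty: bool = False

    @property
    def nrpages(self):
        return len(self.folios)


def do_writepages(mapping, read_only, journal_data=True):
    """Simulate one writeback pass; return (warned, folios_written)."""
    # Early exit, modelled on the check quoted from ext4_do_writepages().
    if not mapping.nrpages or not mapping.tagged_dirty:
        return (False, 0)

    # In data=journal mode a handle is started before the folio loop, so the
    # read-only check happens regardless of what the loop finds later.
    warned = journal_data and read_only

    written = 0
    for folio in mapping.folios:
        if not folio.dirty:
            continue  # clean folio: skipped without writing anything
        folio.dirty = False
        written += 1

    mapping.tagged_dirty = False
    return (warned, written)


# No cached pages at all: early exit, no warning.
print(do_writepages(Mapping(), read_only=True))

# Clean folios and no dirty tag: early exit, no warning.
clean = Mapping(folios=[Folio(), Folio()], tagged_dirty=False)
print(do_writepages(clean, read_only=True))

# Assumed race window: tag still set although every folio is already clean.
stale = Mapping(folios=[Folio(), Folio()], tagged_dirty=True)
print(do_writepages(stale, read_only=True))
```

In this model the first two calls return (False, 0) and the last returns
(True, 0): a warning with zero folios written, which matches the observation
that all folios were clean when the call trace appeared.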