Message-ID: <aTJSglQznqeph5lM@casper.infradead.org>
Date: Fri, 5 Dec 2025 03:33:22 +0000
From: Matthew Wilcox <willy@...radead.org>
To: Theodore Tso <tytso@....edu>
Cc: Deepanshu Kartikey <kartikey406@...il.com>,
Zhang Yi <yi.zhang@...weicloud.com>, linux-ext4@...r.kernel.org,
linux-kernel@...r.kernel.org,
syzbot+b0a0670332b6b3230a0a@...kaller.appspotmail.com,
adilger.kernel@...ger.ca, djwong@...nel.org
Subject: Re: [PATCH v2] ext4: check folio uptodate state in ext4_page_mkwrite()

On Thu, Dec 04, 2025 at 09:18:18PM -0500, Theodore Tso wrote:
> On Thu, Dec 04, 2025 at 03:24:50PM +0530, Deepanshu Kartikey wrote:
> > Based on Matthew's earlier feedback that we need to "prevent !uptodate
> > folios from being referenced by the page tables," I believe the
> > correct fix is not in ext4_page_mkwrite() at all, but rather in
> > mpage_release_unused_pages().
> >
> > When we invalidate folios due to writeback failure, we should also
> > unmap them from page tables....
>
> Hmm.... if the page is mmap'ed into the user process, on a writeback
> failure, the page contents will suddenly and without any warning,
> *disappear*.

It sounds like I was confused -- I thought the folios being invalidated
in mpage_release_unused_pages() belonged to the block device, but from
what you're saying, they belong to a user-visible file?

Once we hit a writeback error (whether we're in a "device gave EIO" or
"filesystem is corrupted" situation), we're firmly outside what POSIX
speaks to, and so all that matters is quality of implementation.

Now, is the folio necessarily dirty at this point? I guess so if we're
in the writeback path. Darrick got rid of similar code in iomap a few
years ago; see commit e9c3a8e820ed. So it'd probably be good to have
ext4 behave the same way.

> So the other option is we could simply *not* invalidate the folio, but
> instead leave the folio dirty. In some cases, where a particular
> block group is corrupted, if we retry the block allocation, the
> corrupted block group will be busied out, and so when the write back
> is retried, it's possible that the data will actually be persisted.
>
> We do need to make sure we do the right thing when we unmount the
> filesystem, since at that point, we have no choice but to invalidate the
> page and
> the data will get lost when the file system is unmounted. So it's a
> more complicated approach. But if this is happening when the file
> system is corrupted, especially if it was maliciously corrupted, all
> bets are off anyway, so maybe it's not worth the complexity.

I'm generally in favour of doing anything we can to write dirty user
data back to storage ;-) Of course if the storage is throwing a wobbly,
that's beyond our abilities.
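To make the two options concrete, here is a toy userspace model of them. It is only a sketch, not kernel code: struct toy_folio, the toy_* functions and the policy enum are all made up for illustration, and "invalidate" is assumed to mean clearing the uptodate and dirty flags while leaving the mapping in place.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a folio; NOT the kernel's struct folio. */
struct toy_folio {
	bool uptodate;	/* contents are valid */
	bool dirty;	/* contents are newer than storage */
	bool mapped;	/* referenced by a (pretend) page table */
};

/* The two hypothetical policies discussed in the thread. */
enum toy_policy { TOY_INVALIDATE, TOY_KEEP_DIRTY };

/* Apply a failed writeback to the folio under the chosen policy. */
static void toy_writeback_failed(struct toy_folio *f, enum toy_policy p)
{
	assert(f->dirty);	/* the model only writes back dirty folios */

	if (p == TOY_INVALIDATE) {
		/* Assumed meaning of "invalidate": drop the contents. */
		f->uptodate = false;
		f->dirty = false;
	}
	/* TOY_KEEP_DIRTY: leave the state alone so a retry can happen. */
}

/* The state to avoid: a !uptodate folio still referenced by page tables. */
static bool toy_mapped_but_not_uptodate(const struct toy_folio *f)
{
	return f->mapped && !f->uptodate;
}

/* Retry writeback; returns true if the data reached storage this time. */
static bool toy_retry_writeback(struct toy_folio *f, bool storage_ok)
{
	if (!f->dirty || !storage_ok)
		return false;
	f->dirty = false;
	return true;
}
```

In this model, a mapped dirty folio that fails writeback under TOY_INVALIDATE ends up mapped but not uptodate, and a later retry has nothing left to write. Under TOY_KEEP_DIRTY the folio stays uptodate and dirty, so a retry can still persist the data once storage cooperates.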