linux-kernel - Re: [syzbot] [ext4?] WARNING in __folio_mark

Message-Id: <20251121111433.91bea9e742dd2a2e0a3ecfff@linux-foundation.org>
Date: Fri, 21 Nov 2025 11:14:33 -0800
From: Andrew Morton <akpm@...ux-foundation.org>
To: Matthew Wilcox <willy@...radead.org>
Cc: syzbot <syzbot+b0a0670332b6b3230a0a@...kaller.appspotmail.com>,
 linux-ext4@...r.kernel.org, linux-fsdevel@...r.kernel.org,
 linux-kernel@...r.kernel.org, linux-mm@...ck.org,
 syzkaller-bugs@...glegroups.com
Subject: Re: [syzbot] [ext4?] WARNING in __folio_mark_dirty (3)

On Fri, 21 Nov 2025 19:02:18 +0000 Matthew Wilcox <willy@...radead.org> wrote:

> > I'm guessing that ext4 permitted a non-uptodate folio to find its way
> > into the blockdev mapping then the pagefault code tried to modify it
> > and got upset.
> 
> I think you're right, but the reason it's upset is that it found a
> !uptodate folio that was mapped into userspace, and that's not supposed
> to happen!  Presumably it was uptodate at the point it was initially
> faulted in, then (perhaps when the error happened?) somebody cleared the
> uptodate flag without unmapping the folio.
> 
> Hm.  I wonder if we should do this to catch the offender:
> 
> @@ -831,7 +833,17 @@ static __always_inline void SetPageUptodate(struct page *page)
>         folio_mark_uptodate((struct folio *)page);
>  }
> 
> -CLEARPAGEFLAG(Uptodate, uptodate, PF_NO_TAIL)
> +static __always_inline void folio_clear_uptodate(struct folio *folio)
> +{
> +       VM_BUG_ON_FOLIO(folio_mapped(folio), folio);
> +       clear_bit(PG_uptodate, folio_flags(folio, 0));
> +}
> +
> +static __always_inline void ClearPageUptodate(struct page *page)
> +{
> +       VM_BUG_ON_PGFLAGS(PageTail(page), page);
> +       folio_clear_uptodate((struct folio *)page);
> +}
> 
>  void __folio_start_writeback(struct folio *folio, bool keep_write);
>  void set_page_writeback(struct page *page);

We have a reproducer, fortunately.

> ... it doesn't actually compile because folio_mapcount() is in mm.h
> so the declaration is out of order, but I can invest some effort into
> making that work if you think it's worth doing.

It's a shame to add more debug stuff into oft-called inline functions.

Maybe some hacky thing which uninlines these functions and adds the
debug?  I can slip that into -next until we fix the bug then throw the
debug patch away.
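
A rough userspace sketch of that idea, to show the shape rather than the patch: the flag-clearing helper is moved out of line so the mapped check lives in one function instead of being expanded into every caller. Everything here (struct mock_folio, MOCK_PG_UPTODATE, mock_folio_mapped(), mock_folio_clear_uptodate(), the caller string) is a hypothetical stand-in, not the kernel's folio API, and it reports the offender instead of using VM_BUG_ON_FOLIO().

```c
/*
 * Userspace mock-up only. struct mock_folio and all mock_* names are
 * hypothetical stand-ins for illustration; none of this is kernel code.
 */
#include <stdbool.h>
#include <stdio.h>

#define MOCK_PG_UPTODATE (1UL << 0)

struct mock_folio {
	unsigned long flags;
	int mapcount;		/* > 0 means mapped into userspace */
};

/* Number of times uptodate was cleared on a still-mapped folio. */
static int mock_violations;

static bool mock_folio_mapped(const struct mock_folio *folio)
{
	return folio->mapcount > 0;
}

/*
 * Deliberately not inline: the debug check is emitted once here rather
 * than in every caller. The caller string identifies the offender.
 */
__attribute__((noinline))
void mock_folio_clear_uptodate(struct mock_folio *folio, const char *caller)
{
	if (mock_folio_mapped(folio)) {
		mock_violations++;
		fprintf(stderr,
			"uptodate cleared on mapped folio, caller: %s\n",
			caller);
	}
	folio->flags &= ~MOCK_PG_UPTODATE;
}
```

In this mock-up the check only logs and counts, so a reproducer run would keep going after the first hit; a real debug patch could equally well stop at the first offender.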

Of course, there may be other filesystems which are tripped up by this.
Once we fully understand the failure we can decide whether it's worth
adding the extra debug to mainline?