linux-kernel - Re: [PATCH v2] ext4: check folio uptodate state in ext4_page_mkwrite()

Message-ID: <7b2fafab-a156-47e3-b504-469133b8793d@huaweicloud.com>
Date: Tue, 2 Dec 2025 20:24:24 +0800
From: Zhang Yi <yi.zhang@...weicloud.com>
To: Deepanshu Kartikey <kartikey406@...il.com>
Cc: linux-ext4@...r.kernel.org, linux-kernel@...r.kernel.org,
 syzbot+b0a0670332b6b3230a0a@...kaller.appspotmail.com, tytso@....edu,
 adilger.kernel@...ger.ca, djwong@...nel.org
Subject: Re: [PATCH v2] ext4: check folio uptodate state in
 ext4_page_mkwrite()

Hi Deepanshu!

On 11/30/2025 10:06 AM, Deepanshu Kartikey wrote:
> On Sat, Nov 22, 2025 at 7:27 AM Deepanshu Kartikey
> <kartikey406@...il.com> wrote:
>>
>> When delayed block allocation fails due to filesystem corruption,
>> ext4's writeback error handling invalidates affected folios by calling
>> mpage_release_unused_pages() with invalidate=true, which explicitly
>> clears the uptodate flag:
>>
>>     static void mpage_release_unused_pages(..., bool invalidate)
>>     {
>>         ...
>>         if (invalidate) {
>>             block_invalidate_folio(folio, 0, folio_size(folio));
>>             folio_clear_uptodate(folio);
>>         }
>>     }
>>
>> If ext4_page_mkwrite() is subsequently called on such a non-uptodate
>> folio, it can proceed to mark the folio dirty without checking its
>> state. This triggers a warning in __folio_mark_dirty():
>>
>>     WARNING: CPU: 0 PID: 5 at mm/page-writeback.c:2960
>>     __folio_mark_dirty+0x578/0x880
>>
>>     Call Trace:
>>      fault_dirty_shared_page+0x16e/0x2d0
>>      do_wp_page+0x38b/0xd20
>>      handle_pte_fault+0x1da/0x450
>>      __handle_mm_fault+0x652/0x13b0
>>      handle_mm_fault+0x22a/0x6f0
>>      do_user_addr_fault+0x200/0x8a0
>>      exc_page_fault+0x81/0x1b0
>>
>> This scenario occurs when:
>> 1. A write with delayed allocation marks a folio dirty (uptodate=1)
>> 2. Writeback attempts block allocation but detects filesystem corruption
>> 3. Error handling calls mpage_release_unused_pages(invalidate=true),
>>    which clears the uptodate flag via folio_clear_uptodate()
>> 4. A subsequent ftruncate() triggers ext4_truncate()
>> 5. ext4_block_truncate_page() attempts to zero the page tail
>> 6. This triggers a write fault on the mmap'd page
>> 7. ext4_page_mkwrite() is called with the non-uptodate folio
>> 8. Without checking uptodate, it proceeds to mark the folio dirty
>> 9. __folio_mark_dirty() triggers: WARN_ON_ONCE(!folio_test_uptodate())

Thank you for analyzing this issue and for the fix. While working
through it to understand the problem, I had one question: is the
following the code flow that triggers the warning?

wp_page_shared()
  do_page_mkwrite()
    ext4_page_mkwrite()
      block_page_mkwrite()   // the default delalloc path
        block_commit_write()
          mark_buffer_dirty()
            __folio_mark_dirty(0)  // 'warn' is false, so no warning here
        folio_mark_dirty()
          ext4_dirty_folio()
            block_dirty_folio()  // newly_dirty is false, so __folio_mark_dirty() is not called
  fault_dirty_shared_page()
    folio_mark_dirty()  // triggers the warning?

By the time fault_dirty_shared_page() runs, the folio has already been
marked dirty. So how was this warning triggered? Am I missing something?

Thanks,
Yi.
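
[Editorial sketch] The reasoning in the call flow above can be replayed
with a small user-space model. This is only a toy under stated
assumptions, not kernel code: every toy_* name is hypothetical, the
folio is reduced to three flags, and the per-buffer dirty bit, locking
and accounting are all left out. It models just two rules taken from
the trace: the mark_buffer_dirty() path passes warn = 0, and
block_dirty_folio() only reaches the warning check when the folio is
newly dirty.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct folio: only the state discussed in the thread. */
struct toy_folio {
    bool dirty;
    bool uptodate;
    bool has_mapping;
};

static int warnings; /* counts simulated WARN_ON_ONCE() hits */

/* Models the check in __folio_mark_dirty(): fires only when 'warn' is set
 * and the folio is not uptodate. */
static void toy_folio_mark_dirty_check(const struct toy_folio *f, bool warn)
{
    if (f->has_mapping && warn && !f->uptodate)
        warnings++;
}

/* Sets the dirty flag and returns its previous value. */
static bool toy_test_set_dirty(struct toy_folio *f)
{
    bool old = f->dirty;

    f->dirty = true;
    return old;
}

/* Models the mark_buffer_dirty() path: the check runs with warn = 0.
 * Simplification: the buffer-head dirty bit is ignored. */
static void toy_mark_buffer_dirty(struct toy_folio *f)
{
    if (!toy_test_set_dirty(f))
        toy_folio_mark_dirty_check(f, false);
}

/* Models block_dirty_folio(): the check runs with warn = 1, but only
 * when the folio was not already dirty. */
static bool toy_block_dirty_folio(struct toy_folio *f)
{
    bool newly_dirty = !toy_test_set_dirty(f);

    if (newly_dirty)
        toy_folio_mark_dirty_check(f, true);
    return newly_dirty;
}

/* Replays the traced flow on a non-uptodate folio and returns the number
 * of simulated warnings. */
static int toy_replay(bool dirty_at_fault_time)
{
    struct toy_folio f = {
        .dirty = dirty_at_fault_time,
        .uptodate = false,
        .has_mapping = true,
    };

    warnings = 0;
    toy_mark_buffer_dirty(&f);  /* block_commit_write() */
    toy_block_dirty_folio(&f);  /* folio_mark_dirty() in block_page_mkwrite() */
    toy_block_dirty_folio(&f);  /* folio_mark_dirty() in fault_dirty_shared_page() */
    assert(f.dirty);            /* the folio always ends up dirty */
    return warnings;
}
```

In this model toy_replay() reports no warning whether or not the folio
was dirty when the fault started, which matches the question raised
above. The simulated warning fires only when toy_block_dirty_folio() is
the first call to see a clean, non-uptodate folio. Whether the real
kernel can reach that state, and by which path, is the open question in
this thread; the model does not answer it.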

>>
>> Fix this by checking folio_test_uptodate() early in ext4_page_mkwrite()
>> and returning VM_FAULT_SIGBUS if the folio is not uptodate. This prevents
>> attempting to write to invalidated folios and properly signals the error
>> to userspace.
>>
>> The check is placed early, before the delalloc/journal/normal code paths,
>> as none of these paths should proceed with a non-uptodate folio.
>>
>> Reported-by: syzbot+b0a0670332b6b3230a0a@...kaller.appspotmail.com
>> Tested-by: syzbot+b0a0670332b6b3230a0a@...kaller.appspotmail.com
>> Closes: https://syzkaller.appspot.com/bug?extid=b0a0670332b6b3230a0a
>> Signed-off-by: Deepanshu Kartikey <kartikey406@...il.com>
>> ---
>>  fs/ext4/inode.c | 8 ++++++++
>>  1 file changed, 8 insertions(+)
>>
>> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
>> index e99306a8f47c..18a029362c1f 100644
>> --- a/fs/ext4/inode.c
>> +++ b/fs/ext4/inode.c
>> @@ -6688,6 +6688,14 @@ vm_fault_t ext4_page_mkwrite(struct vm_fault *vmf)
>>         if (err)
>>                 goto out_ret;
>>
>> +       folio_lock(folio);
>> +       if (!folio_test_uptodate(folio)) {
>> +               folio_unlock(folio);
>> +               ret = VM_FAULT_SIGBUS;
>> +               goto out;
>> +       }
>> +       folio_unlock(folio);
>> +
>>         /*
>>          * On data journalling we skip straight to the transaction handle:
>>          * there's no delalloc; page truncated will be checked later; the
>> --
>> 2.43.0
>>
> 
> Hi Ted and ext4 maintainers,
> 
> I wanted to follow up on this patch submitted a week ago. This fixes
> a syzbot-reported WARNING in __folio_mark_dirty() that occurs when
> ext4_page_mkwrite() is called with a non-uptodate folio after delayed
> allocation writeback failure.
> 
> Please let me know if there's any feedback or if I should make any
> changes.
> 
> Thanks,
> Deepanshu
>