linux-kernel - Re: [PATCH v2] ext4: check folio uptodate state in ext4_page

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aadd43df-3df4-4dcc-a0b3-e7bfded0dff8@huaweicloud.com>
Date: Wed, 3 Dec 2025 14:52:34 +0800
From: Zhang Yi <yi.zhang@...weicloud.com>
To: Deepanshu Kartikey <kartikey406@...il.com>
Cc: linux-ext4@...r.kernel.org, linux-kernel@...r.kernel.org,
 syzbot+b0a0670332b6b3230a0a@...kaller.appspotmail.com, tytso@....edu,
 adilger.kernel@...ger.ca, djwong@...nel.org
Subject: Re: [PATCH v2] ext4: check folio uptodate state in
 ext4_page_mkwrite()

On 12/3/2025 9:37 AM, Deepanshu Kartikey wrote:
> On Tue, Dec 2, 2025 at 5:54 PM Zhang Yi <yi.zhang@...weicloud.com> wrote:
>>
>> Hi Deepanshu!
>>
>> On 11/30/2025 10:06 AM, Deepanshu Kartikey wrote:
>>> On Sat, Nov 22, 2025 at 7:27 AM Deepanshu Kartikey
>>> <kartikey406@...il.com> wrote:
>>>>
>>>> When delayed block allocation fails due to filesystem corruption,
>>>> ext4's writeback error handling invalidates affected folios by calling
>>>> mpage_release_unused_pages() with invalidate=true, which explicitly
>>>> clears the uptodate flag:
>>>>
>>>>     static void mpage_release_unused_pages(..., bool invalidate)
>>>>     {
>>>>         ...
>>>>         if (invalidate) {
>>>>             block_invalidate_folio(folio, 0, folio_size(folio));
>>>>             folio_clear_uptodate(folio);
>>>>         }
>>>>     }
>>>>
>>>> If ext4_page_mkwrite() is subsequently called on such a non-uptodate
>>>> folio, it can proceed to mark the folio dirty without checking its
>>>> state. This triggers a warning in __folio_mark_dirty():
>>>>
>>>>     WARNING: CPU: 0 PID: 5 at mm/page-writeback.c:2960
>>>>     __folio_mark_dirty+0x578/0x880
>>>>
>>>>     Call Trace:
>>>>      fault_dirty_shared_page+0x16e/0x2d0
>>>>      do_wp_page+0x38b/0xd20
>>>>      handle_pte_fault+0x1da/0x450
>>>>      __handle_mm_fault+0x652/0x13b0
>>>>      handle_mm_fault+0x22a/0x6f0
>>>>      do_user_addr_fault+0x200/0x8a0
>>>>      exc_page_fault+0x81/0x1b0
>>>>
>>>> This scenario occurs when:
>>>> 1. A write with delayed allocation marks a folio dirty (uptodate=1)
>>>> 2. Writeback attempts block allocation but detects filesystem corruption
>>>> 3. Error handling calls mpage_release_unused_pages(invalidate=true),
>>>>    which clears the uptodate flag via folio_clear_uptodate()
>>>> 4. A subsequent ftruncate() triggers ext4_truncate()
>>>> 5. ext4_block_truncate_page() attempts to zero the page tail
>>>> 6. This triggers a write fault on the mmap'd page
>>>> 7. ext4_page_mkwrite() is called with the non-uptodate folio
>>>> 8. Without checking uptodate, it proceeds to mark the folio dirty
>>>> 9. __folio_mark_dirty() triggers: WARN_ON_ONCE(!folio_test_uptodate())
>>
>> Thank you a lot for analyzing this issue and the fix patch. As I was
>> going through the process of understanding this issue, I had one
>> question. Is the code flow that triggers the warning as follows?
>>
>> wp_page_shared()
>>   do_page_mkwrite()
>>     ext4_page_mkwrite()
>>       block_page_mkwrite()   //The default delalloc path
>>         block_commit_write()
>>           mark_buffer_dirty()
>>             __folio_mark_dirty(0)  //'warn' is false, doesn't trigger warning
>>         folio_mark_dirty()
>>           ext4_dirty_folio()
>>             block_dirty_folio  //newly_dirty is false, doesn't call __folio_mark_dirty()
>>   fault_dirty_shared_page()
>>     folio_mark_dirty()  //Trigger warning ?
>>
>> This folio has been marked as dirty. How was this warning triggered?
>> Am I missing something?
>>
>> Thanks,
>> Yi.
>>
> 
> Hi Yi,
> 
> Thank you for your question about the exact flow that triggers the warning.
> 

Thank you for the clarification, but there are still some details that
are unclear.

> You're correct that the code paths within ext4_page_mkwrite() and
> block_page_mkwrite() call __folio_mark_dirty() with warn=0, so no
> warning occurs there. The warning actually triggers later, in
> fault_dirty_shared_page() after page_mkwrite returns.
> 
> Here's the complete flow:
> 
>   wp_page_shared()
>     ↓
>     do_page_mkwrite()
>       ↓
>       ext4_page_mkwrite()
>         ↓
>         block_page_mkwrite()
>           ↓
>           mark_buffer_dirty() → __folio_mark_dirty(warn=0)  // No warning

             ↓
             if (!folio_test_set_dirty(folio))
                        //The folio will be mark as dirty --- 1

>         ↓
>         Returns success
>     ↓
>     fault_dirty_shared_page(vma, folio)  ← Warning triggers here
>       ↓
>       folio_mark_dirty(folio)
>         ↓
>         ext4_dirty_folio()
>           ↓
>           block_dirty_folio()
>             ↓
>             if (!folio_test_set_dirty(folio))  // Folio not already dirty

              This makes me confused, this folio should be set to dirty
              at position 1. If it is not dirty here, who cleared the dirty
              flag for this folio?

>               __folio_mark_dirty(folio, mapping, 1)  ← warn=1, triggers WARNING
> 
> The key is that the folio can become non-uptodate between when it's
> initially read and when wp_page_shared() is called. This happens when:
> 
> 1. Delayed block allocation fails due to filesystem corruption
> 2. Error handling in mpage_release_unused_pages() explicitly clears uptodate:
> 
>      if (invalidate) {
>          block_invalidate_folio(folio, 0, folio_size(folio));
>          folio_clear_uptodate(folio);
>      }
> 
> 3. A subsequent operation (like ftruncate) triggers ext4_block_truncate_page()
> 4. This causes a write fault on the mmap'd page
> 5. wp_page_shared() is called with the now-non-uptodate folio
> 
> From my debug logs with a test kernel:
> 
>   [22.387777] EXT4-fs error: lblock 0 mapped to illegal pblock 0
>   [22.389798] EXT4-fs: Delayed block allocation failed... error 117
>   [22.390401] EXT4-fs: This should not happen!! Data will be lost
> 
>   [22.399463] EXT4-fs error: Corrupt filesystem
> 
>   [22.400513] WP_PAGE_SHARED: ENTER folio=... uptodate=0 dirty=0
>   [22.401953] WP_PAGE_SHARED: page_mkwrite failed, returning 2
> 
> With my fix, ext4_page_mkwrite() detects the non-uptodate state and
> returns VM_FAULT_SIGBUS before block_page_mkwrite() is called,
> preventing wp_page_shared() from reaching fault_dirty_shared_page().
> 
> Without the fix, the sequence would be:
> - ext4_page_mkwrite() succeeds (doesn't check uptodate)
> - block_page_mkwrite() marks buffers dirty (warn=0, no warning)
> - Returns to wp_page_shared()
> - fault_dirty_shared_page() calls folio_mark_dirty()
> - block_dirty_folio() finds folio not dirty (uptodate=0, dirty=0)
> - Calls __folio_mark_dirty() with warn=1
> - WARNING triggers: WARN_ON_ONCE(warn && !folio_test_uptodate(folio)
> && !folio_test_dirty(folio))
> 
> The syzbot call trace confirms this:
> 
>   Call Trace:
>    fault_dirty_shared_page+0x16e/0x2d0
>    do_wp_page+0x38b/0xd20
>    handle_pte_fault+0x1da/0x450
> 
> Does this clarify the flow?
> 
> Best regards,
> Deepanshu
>