Message-ID: <ebe38d8f-0b09-47b8-9503-2d8e0585672a@huaweicloud.com>
Date: Mon, 20 Oct 2025 15:11:12 +0800
From: Zhang Yi <yi.zhang@...weicloud.com>
To: Karol Wachowski <karol.wachowski@...ux.intel.com>
Cc: linux-ext4@...r.kernel.org, tytso@....edu, adilger.kernel@...ger.ca
Subject: Re: Possible regression in pin_user_pages_fast() behavior after
commit 7ac67301e82f ("ext4: enable large folio for regular file")
Hi, Karol.
Thank you for the report! I am trying to figure out how this issue
occurred. Could you provide a way to reproduce it? It would also be
helpful if you could include the kernel configuration and hardware
environment information.
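
In the meantime, to make sure I understand the setup: my reading of
your description is a private (CoW) file mapping on ext4 that is then
longterm-pinned by your driver, roughly like the userspace sketch
below. The file path and the driver handoff are hypothetical
stand-ins, not your actual code, so please correct me if the pattern
differs:

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>

#define LEN (4UL << 20) /* 4 MiB, above the ~2 MiB threshold you saw */

int main(void)
{
        /* hypothetical path; any regular file on an ext4 mount */
        int fd = open("/mnt/ext4/testfile", O_RDWR);
        if (fd < 0) { perror("open"); return 1; }

        /*
         * MAP_PRIVATE together with PROT_WRITE yields VM_MAYWRITE
         * without VM_SHARED, i.e. the CoW mapping from your report.
         */
        void *buf = mmap(NULL, LEN, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE, fd, 0);
        if (buf == MAP_FAILED) { perror("mmap"); return 1; }

        /*
         * The buffer would then be handed to the driver that takes
         * the FOLL_LONGTERM pin (driver-specific, not shown here).
         */
        return 0;
}
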
Thanks,
Yi
On 10/17/2025 9:30 PM, Karol Wachowski wrote:
> Actually, the threshold after which it starts to hang is 2 megabytes.
>
> On 10/17/2025 3:24 PM, Karol Wachowski wrote:
>> Hi,
>>
>> I’m not entirely sure if this is the right way to report this.
>>
>> I’ve encountered what appears to be a regression (or at least a
>> behavioral change) related to pin_user_pages_fast() when used with
>> FOLL_LONGTERM on a Copy-on-Write (CoW) mapping (i.e. VM_MAYWRITE without
>> VM_SHARED). Specifically, the call never finishes when the requested
>> size exceeds 8 MB.
>>
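>> On our side the call pattern boils down to the following (a
>> simplified, illustrative sketch rather than our exact driver code):
>>
>> #include <linux/mm.h>
>>
>> /*
>>  * Longterm pin, without FOLL_WRITE, of a user range backed by the
>>  * CoW mapping described above.
>>  */
>> static int pin_user_buffer(unsigned long uaddr, int npages,
>>                            struct page **pages)
>> {
>>         /* Never returns once the range exceeds the threshold. */
>>         return pin_user_pages_fast(uaddr, npages, FOLL_LONGTERM,
>>                                    pages);
>> }
>>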
>> The same scenario works correctly prior to the following change:
>> commit 7ac67301e82f02b77a5c8e7377a1f414ef108b84
>> Author: Zhang Yi <yi.zhang@...wei.com>
>> Date: Mon May 12 14:33:19 2025 +0800
>>
>> ext4: enable large folio for regular file
>>
>> It seems the issue manifests when pin_user_pages_fast() falls back to
>> __gup_longterm_locked(). In that case, we end up calling
>> handle_mm_fault() with FAULT_FLAG_UNSHARE, which splits the PMD.
>> From ftrace, it looks like the kernel enters an apparent infinite loop
>> of handle_mm_fault() calls, each of which invokes filemap_map_pages()
>> from the ext4 ops.
>>
>> 1) 1.553 us | handle_mm_fault();
>> 1) 0.126 us | __cond_resched();
>> 1) 0.055 us | vma_pgtable_walk_begin();
>> 1) 0.057 us | _raw_spin_lock();
>> 1) 0.111 us | _raw_spin_unlock();
>> 1) 0.050 us | vma_pgtable_walk_end();
>> 1) 1.521 us | handle_mm_fault();
>> 1) 0.122 us | __cond_resched();
>> 1) 0.055 us | vma_pgtable_walk_begin();
>> 1) 0.288 us | _raw_spin_lock();
>> 1) 0.053 us | _raw_spin_unlock();
>> 1) 0.048 us | vma_pgtable_walk_end();
>> 1) 1.484 us | handle_mm_fault();
>> 1) 0.124 us | __cond_resched();
>> 1) 0.056 us | vma_pgtable_walk_begin();
>> 1) 0.272 us | _raw_spin_lock();
>> 1) 0.051 us | _raw_spin_unlock();
>> 1) 0.050 us | vma_pgtable_walk_end();
>> 1) 1.566 us | handle_mm_fault();
>> 1) 0.211 us | __cond_resched();
>> 1) 0.107 us | vma_pgtable_walk_begin();
>> 1) 0.054 us | _raw_spin_lock();
>> 1) 0.052 us | _raw_spin_unlock();
>> 1) 0.049 us | vma_pgtable_walk_end();
>>
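>> If I read the generic GUP slow path right, the trace above is
>> consistent with a retry loop of roughly this shape (simplified
>> pseudocode, not verbatim kernel source):
>>
>>         for (;;) {
>>                 page = follow_page_mask(vma, addr, foll_flags, &ctx);
>>                 if (page)
>>                         break;      /* progress: got a pinnable page */
>>                 ret = faultin_page(vma, addr, &foll_flags, unshare,
>>                                    &locked);
>>                 if (ret)
>>                         return ret; /* hard error: bail out */
>>                 /*
>>                  * Otherwise retry. Here the UNSHARE fault appears to
>>                  * "succeed" without ever producing a page that GUP
>>                  * will accept, so the loop never terminates.
>>                  */
>>         }
>>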
>> I haven’t been able to gather more detailed diagnostics yet, but I’d
>> appreciate any guidance on whether this is a known issue, or if
>> additional debugging information would be helpful.
>>
>> -
>> Karol
>>