Message-ID: <18e34ad4-82b1-42c3-b01d-ac6e5330c4e0@arm.com>
Date: Sat, 24 Jan 2026 12:18:22 +0530
From: Dev Jain <dev.jain@....com>
To: Vernon Yang <vernon2gm@...il.com>, david@...nel.org,
Lance Yang <lance.yang@...ux.dev>, baohua@...nel.org
Cc: lorenzo.stoakes@...cle.com, ziy@...dia.com, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, Vernon Yang <yanglincheng@...inos.cn>,
akpm@...ux-foundation.org
Subject: Re: [PATCH mm-new v5 4/5] mm: khugepaged: skip lazy-free folios
On 24/01/26 8:52 am, Vernon Yang wrote:
> On Sat, Jan 24, 2026 at 12:32 AM Lance Yang <lance.yang@...ux.dev> wrote:
>> On 2026/1/23 23:08, Vernon Yang wrote:
>>> On Fri, Jan 23, 2026 at 5:09 PM Lance Yang <lance.yang@...ux.dev> wrote:
>>>> On 2026/1/23 16:22, Vernon Yang wrote:
>>>>> From: Vernon Yang <yanglincheng@...inos.cn>
>>>>>
>> [...]
>>
>>>>> @@ -583,6 +584,11 @@ static enum scan_result __collapse_huge_page_isolate(struct vm_area_struct *vma,
>>>>> folio = page_folio(page);
>>>>> VM_BUG_ON_FOLIO(!folio_test_anon(folio), folio);
>>>>>
>>>>> + if (!pte_dirty(pteval) && folio_test_lazyfree(folio)) {
>>>> I'm wondering if we need "cc->is_khugepaged &&" as well here?
>>>>
>>>> We should allow users to enforce collapse via the madvise_collapse()
>>>> path even if pages are marked lazyfree, IMHO.
>>> $ man madvise
>>> MADV_COLLAPSE
>>> Perform a best-effort synchronous collapse of the native pages
>>> mapped by the memory range into Transparent Huge Pages (THPs).
>>>
>>> The semantics of MADV_COLLAPSE are best-effort and do not imply that
>>> collapse is enforced, so we don't need "cc->is_khugepaged" here.
>>>
>>> We can imagine that if a user simultaneously uses MADV_FREE and
>>> MADV_COLLAPSE, it indicates a misunderstanding of their semantics.
>>> As the kernel, we need to safeguard the baseline.
>> No. Afraid I don't think so.
>>
>> To be clear, what I meant by "enforce":
>>
>> Yep, MADV_COLLAPSE is best-effort - it can fail. But when users
>> call MADV_COLLAPSE, they're explicitly asking for collapse.
>>
>> Compared to khugepaged just scanning around, that's already "enforce"
>> - users are actively requesting it, not passively waiting for it.
>>
>> Note that you're *breaking* userspace. Users would not be able
>> to collapse the range where there are any lazyfree pages anymore,
>> even when they explicitly call MADV_COLLAPSE.
>>
>> For khugepaged, skipping lazyfree makes sense.
> I got your meaning, this is equivalent to two questions:
>
> 1. Does the semantics of best-effort imply any "enforce" meaning?
> 2. When a user applies both MADV_FREE and MADV_COLLAPSE to a range,
> do we want to collapse lazyfree folios?
>
> This is a semantic warning, and I'd like to hear others' opinions.
Lance is right. When a user calls MADV_COLLAPSE, the kernel needs to try
its best to collapse. It may not be in the user's best interest to
do MADV_FREE and then MADV_COLLAPSE, but that is something the user has
to fix; the kernel does not need to think about it.

Regarding "best-effort": it is best-effort in the sense that
madvise(MADV_COLLAPSE) is a syscall needed not for correctness
but for optimization. So it is not the end of the world
if the syscall fails. But since the user has decided to do an
expensive operation (a syscall), the kernel needs to try harder to
make sure those CPU cycles aren't wasted.
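
For illustration only, here is a minimal userspace sketch of the gating
Lance suggested: skip clean lazyfree folios only when the caller is
khugepaged, and let the MADV_COLLAPSE path proceed. The struct and
function names below (collapse_control_model, skip_lazyfree_folio) are
hypothetical stand-ins, not kernel code; the two booleans model
pte_dirty(pteval) and folio_test_lazyfree(folio) from the patch.

```c
#include <stdbool.h>

/*
 * Hypothetical userspace stand-in for the kernel's struct collapse_control.
 * Only the one field discussed in this thread is modelled.
 */
struct collapse_control_model {
	bool is_khugepaged;	/* true: khugepaged scan, false: MADV_COLLAPSE */
};

/*
 * Returns true when the scan should bail out with SCAN_PAGE_LAZYFREE.
 * pte_is_dirty and folio_is_lazyfree stand in for pte_dirty(pteval) and
 * folio_test_lazyfree(folio); this is an assumption-laden model of the
 * proposed "cc->is_khugepaged &&" condition, not the actual patch.
 */
static bool skip_lazyfree_folio(const struct collapse_control_model *cc,
				bool pte_is_dirty, bool folio_is_lazyfree)
{
	return cc->is_khugepaged && !pte_is_dirty && folio_is_lazyfree;
}
```

With this condition, a clean lazyfree folio is skipped for khugepaged
but is still eligible for collapse when the user explicitly calls
madvise(MADV_COLLAPSE).
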
>
>>>>> + result = SCAN_PAGE_LAZYFREE;
>>>>> + goto out;
>>>>> + }
>>>>> +
>>>>> /* See hpage_collapse_scan_pmd(). */
>>>>> if (folio_maybe_mapped_shared(folio)) {
>>>>> ++shared;
>>>>> @@ -1330,6 +1336,11 @@ static enum scan_result hpage_collapse_scan_pmd(struct mm_struct *mm,
>>>>> }
>>>>> folio = page_folio(page);
>>>>>
>>>>> + if (!pte_dirty(pteval) && folio_test_lazyfree(folio)) {
>>>> Ditto.
>>>>
>>>>> + result = SCAN_PAGE_LAZYFREE;
>>>>> + goto out_unmap;
>>>>> + }
>>>>> +
>>>>> if (!folio_test_anon(folio)) {
>>>>> result = SCAN_PAGE_ANON;
>>>>> goto out_unmap;