Message-ID: <24ed11d1-4761-458f-900b-8fa79379ace2@amd.com>
Date: Sun, 28 Dec 2025 23:28:12 +0530
From: "Garg, Shivank" <shivankg@....com>
To: Wei Yang <richard.weiyang@...il.com>, Lance Yang <lance.yang@...ux.dev>
Cc: Zi Yan <ziy@...dia.com>, Andrew Morton <akpm@...ux-foundation.org>,
Baolin Wang <baolin.wang@...ux.alibaba.com>,
"Liam R . Howlett" <Liam.Howlett@...cle.com>, Nico Pache
<npache@...hat.com>, Ryan Roberts <ryan.roberts@....com>,
Dev Jain <dev.jain@....com>, Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
David Hildenbrand <david@...nel.org>, Barry Song <baohua@...nel.org>,
linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH V2 2/5] mm/khugepaged: count small VMAs towards scan limit
On 12/24/2025 8:19 PM, Wei Yang wrote:
> On Wed, Dec 24, 2025 at 07:51:36PM +0800, Lance Yang wrote:
>>
>>
>> On 2025/12/24 19:13, Shivank Garg wrote:
>>> khugepaged_scan_mm_slot() uses a 'progress' counter to limit the
>>> amount of work performed per call. The counter is advanced by three
>>> events:
>>> 1. Transitioning to a new mm (+1).
>
> Hmm... maybe not only a new mm: the +1 is also charged when we resume
> the scan from the last mm.
>
> Since the default khugepaged_pages_to_scan is only 8 PMDs, resuming
> mid-mm looks very likely.
>
That makes sense, I will correct this.
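
For reference, the accounting at the top of khugepaged_scan_mm_slot()
looks roughly like this (a simplified sketch; locking and error paths
are elided):

	if (khugepaged_scan.mm_slot) {
		/* Resume from the mm we stopped in on the last call. */
		slot = khugepaged_scan.mm_slot;
	} else {
		/* Advance to the next mm on the list. */
		slot = list_entry(khugepaged_scan.mm_head.next, ...);
		khugepaged_scan.address = 0;
		khugepaged_scan.mm_slot = slot;
	}
	...
	progress++;

So the +1 is charged in both cases, not only when switching to a new mm.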
>>> 2. Skipping an unsuitable VMA (+1).
>>> 3. Scanning a PMD-sized range (+HPAGE_PMD_NR).
>>>
>>> Consider a 1MB VMA sitting between two 2MB VMAs and straddling a
>>> 2MB alignment boundary:
>>>
>>>     vma1     vma2     vma3
>>> +----------+------+----------+
>>> |    2M    |  1M  |    2M    |
>>> +----------+------+----------+
>>>            ^      ^
>>>          start   end
>>>               ^
>>>          hstart,hend
>>>
>>> In this case, for vma2:
>>> hstart = round_up(start, HPAGE_PMD_SIZE)  -> next 2MB boundary
>>> hend = round_down(end, HPAGE_PMD_SIZE)    -> previous 2MB boundary
>>>
>>> Currently, when `hend <= hstart` (the VMA is too small or too
>>> unaligned to contain a hugepage), the VMA is skipped without
>>> incrementing 'progress'. A process containing a large number of such
>>> VMAs therefore consumes unfairly more CPU cycles before yielding than
>>> a process with fewer, larger, or better-aligned VMAs.
>>>
>>> Fix this by incrementing progress when the `hend <= hstart` condition
>>> is met.
>>>
>>> Additionally, change 'progress' type to `unsigned int` to match both
>>> the 'pages' type and the function return value.
>>>
>>> Suggested-by: Wei Yang <richard.weiyang@...il.com>
>>> Signed-off-by: Shivank Garg <shivankg@....com>
>>> ---
>>> mm/khugepaged.c | 4 ++--
>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>>> index 107146f012b1..0b549c3250f9 100644
>>> --- a/mm/khugepaged.c
>>> +++ b/mm/khugepaged.c
>>> @@ -2403,7 +2403,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
>>> struct mm_slot *slot;
>>> struct mm_struct *mm;
>>> struct vm_area_struct *vma;
>>> - int progress = 0;
>>> + unsigned int progress = 0;
>>> VM_BUG_ON(!pages);
>>> lockdep_assert_held(&khugepaged_mm_lock);
>>> @@ -2447,7 +2447,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
>>> }
>>> hstart = round_up(vma->vm_start, HPAGE_PMD_SIZE);
>>> hend = round_down(vma->vm_end, HPAGE_PMD_SIZE);
>>> - if (khugepaged_scan.address > hend) {
>>
>> Maybe add a short comment explaining why we increment progress for small VMAs
>> ;)
>>
>> Something like this:
>>
>> /* Count small VMAs that can't hold a hugepage towards scan limit */
I'll add an explanation along these lines; see the updated hunk at the
end of this mail.
>>> + if (khugepaged_scan.address > hend || hend <= hstart) {
>>> progress++;
>>> continue;
>>> }
>>
>> Otherwise, looks good to me.
>>
>> Reviewed-by: Lance Yang <lance.yang@...ux.dev>
>>
>
> The code change LGTM.
>
> Reviewed-by: Wei Yang <richard.weiyang@...il.com>
>
Thanks Lance and Wei. I have made the suggested changes.
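
For completeness, with the suggested comment folded in, the hunk should
now read roughly like this (reconstructed from the diff above plus
Lance's suggestion):

 		hstart = round_up(vma->vm_start, HPAGE_PMD_SIZE);
 		hend = round_down(vma->vm_end, HPAGE_PMD_SIZE);
-		if (khugepaged_scan.address > hend) {
+		/*
+		 * Count small VMAs that can't hold a hugepage towards
+		 * the scan limit.
+		 */
+		if (khugepaged_scan.address > hend || hend <= hstart) {
 			progress++;
 			continue;
 		}

Taking vma2 from the diagram as an example, suppose it spans
[1.5M, 2.5M): hstart = round_up(1.5M, 2M) = 2M and
hend = round_down(2.5M, 2M) = 2M, so hend <= hstart holds and the VMA
is now charged to 'progress'.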
[Attachment: "0002-mm-khugepaged-count-small-VMAs-towards-scan-limit.patch" (text/plain, 2550 bytes)]