Message-ID: <24ed11d1-4761-458f-900b-8fa79379ace2@amd.com>
Date: Sun, 28 Dec 2025 23:28:12 +0530
From: "Garg, Shivank" <shivankg@....com>
To: Wei Yang <richard.weiyang@...il.com>, Lance Yang <lance.yang@...ux.dev>
Cc: Zi Yan <ziy@...dia.com>, Andrew Morton <akpm@...ux-foundation.org>,
 Baolin Wang <baolin.wang@...ux.alibaba.com>,
 "Liam R . Howlett" <Liam.Howlett@...cle.com>, Nico Pache
 <npache@...hat.com>, Ryan Roberts <ryan.roberts@....com>,
 Dev Jain <dev.jain@....com>, Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
 David Hildenbrand <david@...nel.org>, Barry Song <baohua@...nel.org>,
 linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH V2 2/5] mm/khugepaged: count small VMAs towards scan limit



On 12/24/2025 8:19 PM, Wei Yang wrote:
> On Wed, Dec 24, 2025 at 07:51:36PM +0800, Lance Yang wrote:
>>
>>
>> On 2025/12/24 19:13, Shivank Garg wrote:
>>> khugepaged_scan_mm_slot() uses a 'progress' counter to limit the
>>> amount of work performed; the counter is advanced by three kinds of
>>> events:
>>> 1. Transitioning to a new mm (+1).
> 
> Hmm... maybe it is not only a new mm, but also when we resume the scan from the last mm.
> 
> Since the default khugepaged_pages_to_scan is 8 PMDs, that seems quite likely.
> 
That makes sense; I will correct this.
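
For anyone following along, the flow at the top of khugepaged_scan_mm_slot() is roughly the following (a paraphrased sketch, not the exact upstream code), and the first progress++ is charged in both cases, whether we picked a new mm or resumed the one we stopped in last time:

	if (khugepaged_scan.mm_slot) {
		/* resume from where the previous pass stopped */
		mm_slot = khugepaged_scan.mm_slot;
	} else {
		/* otherwise advance to the next mm on the scan list */
		mm_slot = <next entry from khugepaged_scan.mm_head>;
		khugepaged_scan.address = 0;
		khugepaged_scan.mm_slot = mm_slot;
	}
	/* ... after taking the mmap lock for this mm ... */
	progress++;	/* +1 whether the mm is new or resumed */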

>>> 2. Skipping an unsuitable VMA (+1).
>>> 3. Scanning a PMD-sized range (+HPAGE_PMD_NR).
>>>
>>> Consider a 1MB VMA sitting between two 2MB alignment boundaries:
>>>
>>>       vma1       vma2   vma3
>>>      +----------+------+----------+
>>>      |2M        |1M    |2M        |
>>>      +----------+------+----------+
>>>                 ^      ^
>>>                 start  end
>>>                 ^
>>>            hstart,hend
>>>
>>> In this case, for vma2:
>>>    hstart = round_up(start, HPAGE_PMD_SIZE)   -> next 2MB boundary
>>>    hend   = round_down(end, HPAGE_PMD_SIZE)   -> previous 2MB boundary
>>>
>>> Currently, since `hend <= hstart`, VMAs that are too small or unaligned
>>> to contain a hugepage are skipped without incrementing 'progress'.
>>> A process containing a large number of such small VMAs will unfairly
>>> consume more CPU cycles before yielding compared to a process with
>>> fewer, larger, or aligned VMAs.
>>>
>>> Fix this by incrementing progress when the `hend <= hstart` condition
>>> is met.
>>>
>>> Additionally, change 'progress' type to `unsigned int` to match both
>>> the 'pages' type and the function return value.
>>>
>>> Suggested-by: Wei Yang <richard.weiyang@...il.com>
>>> Signed-off-by: Shivank Garg <shivankg@....com>
>>> ---
>>>   mm/khugepaged.c | 4 ++--
>>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>>> index 107146f012b1..0b549c3250f9 100644
>>> --- a/mm/khugepaged.c
>>> +++ b/mm/khugepaged.c
>>> @@ -2403,7 +2403,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
>>>   	struct mm_slot *slot;
>>>   	struct mm_struct *mm;
>>>   	struct vm_area_struct *vma;
>>> -	int progress = 0;
>>> +	unsigned int progress = 0;
>>>   	VM_BUG_ON(!pages);
>>>   	lockdep_assert_held(&khugepaged_mm_lock);
>>> @@ -2447,7 +2447,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
>>>   		}
>>>   		hstart = round_up(vma->vm_start, HPAGE_PMD_SIZE);
>>>   		hend = round_down(vma->vm_end, HPAGE_PMD_SIZE);
>>> -		if (khugepaged_scan.address > hend) {
>>
>> Maybe add a short comment explaining why we increment progress for small VMAs
>> ;)
>>
>> Something like this:
>>
>> 		/* Count small VMAs that can't hold a hugepage towards scan limit */

I'll add an explanation.
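
With the suggested comment folded in, the updated check would read roughly like this (a sketch; the exact comment wording may still change in the respin):

	hstart = round_up(vma->vm_start, HPAGE_PMD_SIZE);
	hend = round_down(vma->vm_end, HPAGE_PMD_SIZE);
	/* Count small VMAs that can't hold a hugepage towards scan limit */
	if (khugepaged_scan.address > hend || hend <= hstart) {
		progress++;
		continue;
	}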

>>> +		if (khugepaged_scan.address > hend || hend <= hstart) {
>>>   			progress++;
>>>   			continue;
>>>   		}
>>
>> Otherwise, looks good to me.
>>
>> Reviewed-by: Lance Yang <lance.yang@...ux.dev>
>>
> 
> The code change LGTM.
> 
> Reviewed-by: Wei Yang <richard.weiyang@...il.com>
> 

Thanks, Lance and Wei. I have made the suggested changes.
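
For completeness, here is a tiny standalone userspace demo of the rounding from the commit message; the addresses are made up to match the 2M/1M/2M diagram above, and the round_up()/round_down() macros are simplified userspace stand-ins for the kernel helpers:

	#include <stdio.h>

	#define HPAGE_PMD_SIZE	(2UL * 1024 * 1024)	/* assume a 2 MiB PMD size */

	/* simplified userspace stand-ins for the kernel's round_up()/round_down() */
	#define round_up(x, y)		((((x) + (y) - 1) / (y)) * (y))
	#define round_down(x, y)	(((x) / (y)) * (y))

	int main(void)
	{
		/* vma2 from the diagram: a 1 MiB VMA at [2 MiB, 3 MiB) */
		unsigned long start = 2UL * 1024 * 1024;
		unsigned long end   = 3UL * 1024 * 1024;

		unsigned long hstart = round_up(start, HPAGE_PMD_SIZE);
		unsigned long hend   = round_down(end, HPAGE_PMD_SIZE);

		/* prints hstart == hend == 2 MiB, so hend <= hstart: no room for a hugepage */
		printf("hstart=%lu hend=%lu skip_as_small=%d\n",
		       hstart, hend, hend <= hstart);
		return 0;
	}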


View attachment "0002-mm-khugepaged-count-small-VMAs-towards-scan-limit.patch" of type "text/plain" (2550 bytes)
