Message-ID: <ad93f480-431a-4f9b-9225-136d8c6c37df@linux.alibaba.com>
Date: Tue, 29 Apr 2025 15:16:08 +0800
From: Baolin Wang <baolin.wang@...ux.alibaba.com>
To: Nico Pache <npache@...hat.com>
Cc: linux-mm@...ck.org, linux-doc@...r.kernel.org,
 linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org,
 akpm@...ux-foundation.org, corbet@....net, rostedt@...dmis.org,
 mhiramat@...nel.org, mathieu.desnoyers@...icios.com, david@...hat.com,
 baohua@...nel.org, ryan.roberts@....com, willy@...radead.org,
 peterx@...hat.com, ziy@...dia.com, wangkefeng.wang@...wei.com,
 usamaarif642@...il.com, sunnanyong@...wei.com, vishal.moola@...il.com,
 thomas.hellstrom@...ux.intel.com, yang@...amperecomputing.com,
 kirill.shutemov@...ux.intel.com, aarcange@...hat.com, raquini@...hat.com,
 dev.jain@....com, anshuman.khandual@....com, catalin.marinas@....com,
 tiwai@...e.de, will@...nel.org, dave.hansen@...ux.intel.com, jack@...e.cz,
 cl@...two.org, jglisse@...gle.com, surenb@...gle.com, zokeefe@...gle.com,
 hannes@...xchg.org, rientjes@...gle.com, mhocko@...e.com,
 rdunlap@...radead.org
Subject: Re: [PATCH v4 06/12] khugepaged: introduce khugepaged_scan_bitmap for
 mTHP support



On 2025/4/28 22:47, Nico Pache wrote:
> On Sat, Apr 26, 2025 at 8:52 PM Baolin Wang
> <baolin.wang@...ux.alibaba.com> wrote:
>>
>>
>>
>> On 2025/4/17 08:02, Nico Pache wrote:
>>> khugepaged scans PMD ranges for potential collapse to a hugepage. To add
>>> mTHP support we use this scan to instead record chunks of utilized
>>> sections of the PMD.
>>>
>>> khugepaged_scan_bitmap uses a stack struct to recursively scan a bitmap
>>> that represents chunks of utilized regions. We can then determine what
>>> mTHP size fits best and in the following patch, we set this bitmap while
>>> scanning the PMD.
>>>
>>> max_ptes_none is used as a scale to determine how "full" an order must
>>> be before being considered for collapse.
>>>
>>> When attempting to collapse an order that is set to "always",
>>> let's always collapse to that order in a greedy manner without
>>> considering the number of bits set.
>>>
>>> Signed-off-by: Nico Pache <npache@...hat.com>
>>> ---
>>>    include/linux/khugepaged.h |  4 ++
>>>    mm/khugepaged.c            | 94 ++++++++++++++++++++++++++++++++++----
>>>    2 files changed, 89 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/include/linux/khugepaged.h b/include/linux/khugepaged.h
>>> index 1f46046080f5..18fe6eb5051d 100644
>>> --- a/include/linux/khugepaged.h
>>> +++ b/include/linux/khugepaged.h
>>> @@ -1,6 +1,10 @@
>>>    /* SPDX-License-Identifier: GPL-2.0 */
>>>    #ifndef _LINUX_KHUGEPAGED_H
>>>    #define _LINUX_KHUGEPAGED_H
>>> +#define KHUGEPAGED_MIN_MTHP_ORDER    2
>>
>> Why is the minimum mTHP order set to 2? IMO, file large folios can
>> support order 1, so don't we expect to collapse small exec file folios
>> to order 1 if possible?
> I should have been more specific in the patch notes, but this affects
> anonymous only. I'll go over my commit messages and make sure this is
> reflected in the next version.

OK. I am looking into how to support shmem mTHP collapse based on your 
patch series.

>> (PS: I need more time to understand your logic in this patch, and any
>> additional explanation would be helpful:) )
> 
> We are currently scanning ptes in a PMD. The core principle/reasoning
> behind the bitmap is to keep the PMD scan while saving its state. We
> then use this bitmap to determine which chunks of the PMD are active
> and are the best candidates for mTHP collapse. We start at the PMD
> level, and recursively break down the bitmap to find the appropriate
> sizes for the bitmap.
> 
> Looking at a simplified example: we scan a PMD and get the following
> bitmap, 1111101101101011 (in this case MIN_MTHP_ORDER = 5, so each bit
> == 32 ptes; in the actual set each bit == 4 ptes).
> We would first attempt a PMD collapse, while checking the number of
> bits set vs the max_ptes_none tunable. If those conditions aren't
> met, we will try the next enabled mTHP order, for each half of
> the bitmap.
> 
> i.e., order 8 attempt on 11111011 and order 8 attempt on 01101011.
> 
> If a collapse succeeds we don't keep recursing on that portion of the
> bitmap. If not, we continue attempting lower orders.
> 
> Hopefully that helps you understand my logic here! Let me know if you
> need more clarification.

Thanks for your explanation. That's pretty much how I understand it. :)
I'll test your new version.
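
For anyone following along, the recursion described above can be sketched
in user space roughly as follows. This is only an illustration, not the
code from the patch: the struct and function names are made up, the bitmap
is a plain string rather than a kernel bitmap, the scan is recursive instead
of using the stack struct, every attempted collapse is assumed to succeed,
and the "full enough" rule (max_ptes_none scaled down by the order) is an
assumption about how the tunable might be applied.

```c
#include <assert.h>

#define PMD_ORDER   9   /* 512 PTEs per PMD, as in the example */
#define MIN_ORDER   5   /* example value from the mail: one bit == 32 PTEs */
#define MAX_RESULTS 16

/* One recorded collapse: offset in bitmap bits, and the order used. */
struct result { int offset; int order; };

/* Hypothetical scan state; none of these names come from the patch. */
struct scan {
    const char *bits;        /* '1' = utilized chunk, '0' = empty chunk */
    unsigned long enabled;   /* bit N set => order N may be collapsed */
    int max_ptes_none;       /* tunable, in PTEs, relative to a full PMD */
    struct result res[MAX_RESULTS];
    int nres;
};

/* Count the set bits in bits[off .. off + n). */
static int weight(const char *bits, int off, int n)
{
    int set = 0;

    for (int i = 0; i < n; i++)
        set += bits[off + i] == '1';
    return set;
}

/* Assumed fullness rule: scale max_ptes_none down to the given order. */
static int full_enough(const struct scan *s, int off, int order)
{
    int nbits = 1 << (order - MIN_ORDER);
    int used_ptes = weight(s->bits, off, nbits) << MIN_ORDER;
    int max_none = s->max_ptes_none >> (PMD_ORDER - order);

    return used_ptes >= (1 << order) - max_none;
}

/* Try to collapse at this order; on failure, retry on each half. */
static void scan_region(struct scan *s, int off, int order)
{
    if (order < MIN_ORDER)
        return;

    if ((s->enabled & (1UL << order)) && full_enough(s, off, order)) {
        assert(s->nres < MAX_RESULTS);
        s->res[s->nres].offset = off;
        s->res[s->nres].order = order;
        s->nres++;
        return;              /* collapsed: stop recursing on this part */
    }

    if (order == MIN_ORDER)
        return;              /* a single bit cannot be split further */

    int half = 1 << (order - 1 - MIN_ORDER);

    scan_region(s, off, order - 1);
    scan_region(s, off + half, order - 1);
}
```

Calling scan_region(&s, 0, PMD_ORDER) with bits set to "1111101101101011",
orders 5 through 9 enabled, and max_ptes_none = 64 first rejects the PMD
attempt (12 of 16 bits set), then accepts an order 8 collapse on the first
half and keeps splitting the second half into lower orders.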

> 
> I gave a presentation on this that might help too:
> https://docs.google.com/presentation/d/1w9NYLuC2kRcMAwhcashU1LWTfmI5TIZRaTWuZq-CHEg/edit?usp=sharing&resourcekey=0-nBAGld8cP1kW26XE6i0Bpg

Unfortunately, this link requires access permission.
