Message-ID: <540a0f2e-31e0-6396-be14-d9baec608b87@bytedance.com>
Date: Wed, 2 Aug 2023 11:05:35 +0100
From: Usama Arif <usama.arif@...edance.com>
To: Mike Kravetz <mike.kravetz@...cle.com>
Cc: linux-mm@...ck.org, muchun.song@...ux.dev, rppt@...nel.org,
linux-kernel@...r.kernel.org, fam.zheng@...edance.com,
liangma@...ngbit.com, simon.evans@...edance.com,
punit.agrawal@...edance.com, Muchun Song <songmuchun@...edance.com>
Subject: Re: [External] Re: [v2 1/6] mm: hugetlb: Skip prep of tail pages when
HVO is enabled
On 01/08/2023 00:18, Mike Kravetz wrote:
> On 07/30/23 16:16, Usama Arif wrote:
>> When vmemmap is optimizable, it will free all the duplicated tail
>> pages in hugetlb_vmemmap_optimize while preparing the new hugepage.
>> Hence, there is no need to prepare them.
>>
>> For 1G x86 hugepages, it avoids preparing
>> 262144 - 64 = 262080 struct pages per hugepage.
>>
>> The indirection of using __prep_compound_gigantic_folio is also removed,
>> as it just creates extra functions to indicate demote which can be done
>> with the argument.
>>
>> Signed-off-by: Usama Arif <usama.arif@...edance.com>
>> ---
>> mm/hugetlb.c | 32 ++++++++++++++------------------
>> mm/hugetlb_vmemmap.c | 2 +-
>> mm/hugetlb_vmemmap.h | 15 +++++++++++----
>> 3 files changed, 26 insertions(+), 23 deletions(-)
>
> Thanks,
>
> I just started looking at this series. Adding Muchun on Cc:
>
>>
>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>> index 64a3239b6407..541c07b6d60f 100644
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -1942,14 +1942,23 @@ static void prep_new_hugetlb_folio(struct hstate *h, struct folio *folio, int ni
>> spin_unlock_irq(&hugetlb_lock);
>> }
>>
>> -static bool __prep_compound_gigantic_folio(struct folio *folio,
>> - unsigned int order, bool demote)
>> +static bool prep_compound_gigantic_folio(struct folio *folio, struct hstate *h, bool demote)
>> {
>> int i, j;
>> + int order = huge_page_order(h);
>> int nr_pages = 1 << order;
>> struct page *p;
>>
>> __folio_clear_reserved(folio);
>> +
>> + /*
>> + * No need to prep pages that will be freed later by hugetlb_vmemmap_optimize.
>> + * Hence, reduce nr_pages to the pages that will be kept.
>> + */
>> + if (IS_ENABLED(CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP) &&
>> + vmemmap_should_optimize(h, &folio->page))
>
> IIUC, vmemmap_optimize_enabled (checked in vmemmap_should_optimize) can be
> modified at runtime via sysctl. If so, what prevents it from being changed
> between this check and the later call to hugetlb_vmemmap_optimize()?
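
To make that window concrete, here is a toy userspace sketch (not the kernel
code paths; all names are made up). The first read stands in for the check in
prep_compound_gigantic_folio() above and the second for the later check on the
hugetlb_vmemmap_optimize() side; with nothing pinning the flag in between, the
two decisions can disagree, leaving un-prepped tail struct pages in use:

#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

/* Stand-in for vmemmap_optimize_enabled; flipped by the "sysctl" thread. */
static atomic_bool optimize_enabled = true;

static void *sysctl_writer(void *arg)
{
	(void)arg;
	usleep(1000);			/* land between the two reads below */
	atomic_store(&optimize_enabled, false);
	return NULL;
}

int main(void)
{
	pthread_t tid;

	pthread_create(&tid, NULL, sysctl_writer, NULL);

	/* prep-time check: decide to skip prepping the tail pages */
	bool seen_at_prep = atomic_load(&optimize_enabled);
	usleep(2000);			/* work done between the two checks */
	/* optimize-time check: decide whether to free the vmemmap */
	bool seen_at_optimize = atomic_load(&optimize_enabled);

	printf("prep saw %d, optimize saw %d\n", seen_at_prep, seen_at_optimize);
	pthread_join(tid, NULL);
	return 0;
}
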
Hi,

Thanks for the review.

Yes, that's a good catch. The solution for this issue would be to turn
hugetlb_free_vmemmap into a callback core_param and take a lock around
the write and around its uses in gather_bootmem_prealloc, etc.

But the bigger issue at runtime is what Muchun pointed out: the struct
page refcount is not frozen to 0.

My main use case (and maybe for others as well?) is reserving these
gigantic pages at boot time. I thought the runtime improvement might
come for free with it, but it doesn't look like it. Both issues can be
solved by limiting this to boot time: vmemmap_optimize_enabled cannot
be changed during boot, and nothing else can take a reference to those
pages either (they aren't even initialized by memblock after patch 6).
So I will include the diff below to solve both issues in the next
revision.
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 8434100f60ae..790842a6f978 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1956,7 +1956,8 @@ static bool prep_compound_gigantic_folio(struct folio *folio, struct hstate *h,
 	 * Hence, reduce nr_pages to the pages that will be kept.
 	 */
 	if (IS_ENABLED(CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP) &&
-	    vmemmap_should_optimize(h, &folio->page))
+	    vmemmap_should_optimize(h, &folio->page) &&
+	    system_state == SYSTEM_BOOTING)
 		nr_pages = HUGETLB_VMEMMAP_RESERVE_SIZE / sizeof(struct page);

 	for (i = 0; i < nr_pages; i++) {
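
For reference, a back-of-the-envelope check of the numbers in the commit
message (my own sketch, not kernel code; it assumes 4 KiB base pages, a
64-byte struct page, and HUGETLB_VMEMMAP_RESERVE_SIZE being one base page,
as on x86_64):

#include <stdio.h>

int main(void)
{
	long base_page = 4096;                  /* 4 KiB base page                 */
	long huge_page = 1024L * 1024 * 1024;   /* 1 GiB gigantic page             */
	long total = huge_page / base_page;     /* struct pages backing it: 262144 */
	long kept  = base_page / 64;            /* one base page worth of 64-byte  */
						/* struct pages kept by HVO: 64    */

	printf("total=%ld kept=%ld skipped=%ld\n", total, kept, total - kept);
	/* prints: total=262144 kept=64 skipped=262080 */
	return 0;
}

With the system_state == SYSTEM_BOOTING check added, the clamp only applies
to the boot-time reservation path; runtime-allocated gigantic pages keep
prepping all tail pages as before.
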
Thanks,
Usama