Message-ID: <b2f6f122-fad5-4ed2-8c24-2cf4226a60d1@126.com>
Date: Tue, 18 Feb 2025 17:06:06 +0800
From: Ge Yang <yangge1116@....com>
To: David Hildenbrand <david@...hat.com>, Muchun Song <muchun.song@...ux.dev>
Cc: akpm@...ux-foundation.org, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, 21cnbao@...il.com,
baolin.wang@...ux.alibaba.com, osalvador@...e.de, liuzixing@...on.cn
Subject: Re: [PATCH V2] mm/hugetlb: wait for hugepage folios to be freed
On 2025/2/18 16:52, David Hildenbrand wrote:
> On 18.02.25 07:52, Muchun Song wrote:
>>
>>
>>> On Feb 15, 2025, at 15:20, yangge1116@....com wrote:
>>>
>>> From: Ge Yang <yangge1116@....com>
>>>
>>> Since the introduction of commit b65d4adbc0f0 ("mm: hugetlb: defer freeing
>>> of HugeTLB pages"), which supports deferring the freeing of HugeTLB pages,
>>> the allocation of contiguous memory through cma_alloc() may fail
>>> probabilistically.
>>>
>>> In the CMA allocation process, if it is found that the CMA area is occupied
>>> by in-use hugepage folios, these in-use hugepage folios need to be migrated
>>> to another location. When there are no available hugepage folios in the
>>> free HugeTLB pool during the migration of in-use HugeTLB pages, new folios
>>> are allocated from the buddy system. A temporary state is set on the newly
>>> allocated folios. Upon completion of the hugepage folio migration, the
>>> temporary state is transferred from the new folios to the old folios.
>>> Normally, when the old folios with the temporary state are freed, they are
>>> directly released back to the buddy system. However, due to the deferred
>>> freeing of HugeTLB pages, the PageBuddy() check fails, ultimately leading
>>> to the failure of cma_alloc().
>>>
>>> Here is a simplified call trace illustrating the process:
>>> cma_alloc()
>>>     ->__alloc_contig_migrate_range() // Migrate in-use hugepage
>>>         ->unmap_and_move_huge_page()
>>>             ->folio_putback_hugetlb() // Free old folios
>>>     ->test_pages_isolated()
>>>         ->__test_page_isolated_in_pageblock()
>>>             ->PageBuddy(page) // Check if the page is in buddy
>>>
>>> To resolve this issue, we have implemented a function named
>>> wait_for_hugepage_folios_freed(). This function ensures that the hugepage
>>> folios are properly released back to the buddy system after their migration
>>> is completed. By invoking wait_for_hugepage_folios_freed() before calling
>>> PageBuddy(), we ensure that PageBuddy() will succeed.
>>>
>>> Fixes: b65d4adbc0f0 ("mm: hugetlb: defer freeing of HugeTLB pages")
>>> Signed-off-by: Ge Yang <yangge1116@....com>
>>> ---
>>>
>>> V2:
>>> - flush all folios at once, as suggested by David
>>>
>>> include/linux/hugetlb.h | 5 +++++
>>> mm/hugetlb.c | 8 ++++++++
>>> mm/page_isolation.c | 10 ++++++++++
>>> 3 files changed, 23 insertions(+)
>>>
>>> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
>>> index 6c6546b..04708b0 100644
>>> --- a/include/linux/hugetlb.h
>>> +++ b/include/linux/hugetlb.h
>>> @@ -697,6 +697,7 @@ bool hugetlb_bootmem_page_zones_valid(int nid, struct huge_bootmem_page *m);
>>>
>>> int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list);
>>> int replace_free_hugepage_folios(unsigned long start_pfn, unsigned long end_pfn);
>>> +void wait_for_hugepage_folios_freed(void);
>>> struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
>>>                 unsigned long addr, bool cow_from_owner);
>>> struct folio *alloc_hugetlb_folio_nodemask(struct hstate *h, int preferred_nid,
>>> @@ -1092,6 +1093,10 @@ static inline int replace_free_hugepage_folios(unsigned long start_pfn,
>>> return 0;
>>> }
>>>
>>> +static inline void wait_for_hugepage_folios_freed(void)
>>> +{
>>> +}
>>> +
>>> static inline struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
>>>                 unsigned long addr,
>>>                 bool cow_from_owner)
>>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>>> index 30bc34d..36dd3e4 100644
>>> --- a/mm/hugetlb.c
>>> +++ b/mm/hugetlb.c
>>> @@ -2955,6 +2955,14 @@ int replace_free_hugepage_folios(unsigned long start_pfn, unsigned long end_pfn)
>>> return ret;
>>> }
>>>
>>> +void wait_for_hugepage_folios_freed(void)
>>
>> We usually use the term "hugetlb" now instead of "huge_page" to
>> differentiate it from THP. So I suggest naming it
>> wait_for_hugetlb_folios_freed().
>
> Maybe "wait_for_freed_hugetlb_folios" or "hugetlb_wait_for_freed_folios".
>
> In general, LGTM
>
Ok, thanks.
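
The ordering problem described in the commit message can be modelled in userspace. What follows is a minimal sketch under stated assumptions, not kernel code: every name in it (model_folio, model_defer_free, model_wait_for_freed_folios, model_range_isolated) is hypothetical and only stands in for the deferred free list, the proposed wait function, and the PageBuddy() based isolation check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_folio {
    bool in_buddy;             /* stands in for the PageBuddy() state */
    struct model_folio *next;  /* link on the deferred-free list */
};

/* Stands in for the list of folios whose freeing has been deferred. */
static struct model_folio *deferred_list;

/* Queue a folio for deferred freeing; it is not yet back in the buddy system. */
static void model_defer_free(struct model_folio *f)
{
    f->in_buddy = false;
    f->next = deferred_list;
    deferred_list = f;
}

/* Drain the deferred list, which is the guarantee the wait function provides. */
static void model_wait_for_freed_folios(void)
{
    while (deferred_list) {
        struct model_folio *f = deferred_list;

        deferred_list = f->next;
        f->next = NULL;
        f->in_buddy = true;    /* the folio is now back in the buddy system */
    }
}

/* Models the isolation test: every folio in the range must be in buddy. */
static bool model_range_isolated(const struct model_folio *folios, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (!folios[i].in_buddy)
            return false;
    return true;
}
```

In this model, calling model_range_isolated() right after model_defer_free() reports failure, which corresponds to the cma_alloc() failure in the report. Calling model_wait_for_freed_folios() first makes the same check pass, which is the ordering the patch enforces.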