linux-kernel - Re: [PATCH] mm: hugetlb_vmemmap: provide stronger vmemmap allocaction gurantees


Message-ID: <63736432-5cef-f67c-c809-cc19b236a7f4@google.com>
Date:   Wed, 12 Apr 2023 10:54:52 -0700 (PDT)
From:   David Rientjes <rientjes@...gle.com>
To:     Pasha Tatashin <pasha.tatashin@...een.com>
cc:     linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        akpm@...ux-foundation.org, mike.kravetz@...cle.com,
        muchun.song@...ux.dev, souravpanda@...gle.com
Subject: Re: [PATCH] mm: hugetlb_vmemmap: provide stronger vmemmap allocaction
 gurantees

On Wed, 12 Apr 2023, Pasha Tatashin wrote:

> HugeTLB pages have a struct page optimization where struct pages for tail
> pages are freed. However, when HugeTLB pages are destroyed, the memory for
> struct pages (vmemmap) needs to be allocated again.
> 
> Currently, the __GFP_NORETRY flag is used to allocate the memory for vmemmap,
> but given that this flag makes very little effort to actually reclaim
> memory, returning huge pages back to the system can be a problem. Let's
> use __GFP_RETRY_MAYFAIL instead. This flag also performs graceful
> reclaim without causing OOMs, but at least it may perform a few retries,
> and will fail only when there is genuinely little unused memory
> in the system.
> 

Thanks Pasha, this definitely makes sense.  We want to free the hugetlb 
page back to the system so it would be a shame to have to strand it in the 
hugetlb pool because we can't allocate the tail pages (we want to free 
more memory than we're allocating).

> Signed-off-by: Pasha Tatashin <pasha.tatashin@...een.com>
> Suggested-by: David Rientjes <rientjes@...gle.com>
> ---
>  mm/hugetlb_vmemmap.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
> index a559037cce00..c4226d2af7cc 100644
> --- a/mm/hugetlb_vmemmap.c
> +++ b/mm/hugetlb_vmemmap.c
> @@ -475,9 +475,12 @@ int hugetlb_vmemmap_restore(const struct hstate *h, struct page *head)
>  	 * the range is mapped to the page which @vmemmap_reuse is mapped to.
>  	 * When a HugeTLB page is freed to the buddy allocator, previously
>  	 * discarded vmemmap pages must be allocated and remapping.
> +	 *
> +	 * Use __GFP_RETRY_MAYFAIL to fail only when there is genuinely little
> +	 * unused memory in the system.
>  	 */
>  	ret = vmemmap_remap_alloc(vmemmap_start, vmemmap_end, vmemmap_reuse,
> -				  GFP_KERNEL | __GFP_NORETRY | __GFP_THISNODE);
> +				  GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_THISNODE);
>  	if (!ret) {
>  		ClearHPageVmemmapOptimized(head);
>  		static_branch_dec(&hugetlb_optimize_vmemmap_key);

The behavior of __GFP_RETRY_MAYFAIL is different for high-order allocations 
(at least those larger than PAGE_ALLOC_COSTLY_ORDER).  The order that we're 
allocating depends on the implementation of alloc_vmemmap_page_list(), so 
it is likely best to move the gfp mask into that function.
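
To make the suggestion concrete, here is a rough userspace sketch of what 
"the helper owns the gfp mask" could look like.  It is not kernel code and 
not a proposed patch: the SKETCH_* flag values are illustrative stand-ins 
rather than the kernel's real bits, the sketch_* function names are 
hypothetical, and the policy chosen for costly orders is only a placeholder.  
The one value taken from the kernel is PAGE_ALLOC_COSTLY_ORDER, which is 3.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins only; these are NOT the kernel's real gfp bits. */
#define SKETCH_GFP_KERNEL        0x1u
#define SKETCH_GFP_NORETRY       0x2u
#define SKETCH_GFP_RETRY_MAYFAIL 0x4u
#define SKETCH_GFP_THISNODE      0x8u

/* Mirrors the kernel's PAGE_ALLOC_COSTLY_ORDER, which is defined as 3. */
#define SKETCH_COSTLY_ORDER 3u

/* An order counts as "costly" when it is strictly larger than the threshold. */
static bool sketch_order_is_costly(unsigned int order)
{
	return order > SKETCH_COSTLY_ORDER;
}

/*
 * Hypothetical helper: build the mask next to the code that knows which
 * order it allocates, instead of having the caller pass a fixed mask.
 * The costly-order branch (keep NORETRY) is a placeholder policy for this
 * sketch, not something decided in this thread.
 */
static unsigned int sketch_vmemmap_gfp(unsigned int order)
{
	unsigned int gfp = SKETCH_GFP_KERNEL | SKETCH_GFP_THISNODE;

	if (sketch_order_is_costly(order))
		gfp |= SKETCH_GFP_NORETRY;
	else
		gfp |= SKETCH_GFP_RETRY_MAYFAIL;

	/* Requesting both retry modifiers at once would be contradictory. */
	assert(!((gfp & SKETCH_GFP_NORETRY) && (gfp & SKETCH_GFP_RETRY_MAYFAIL)));
	return gfp;
}
```

With this shape, a caller like hugetlb_vmemmap_restore() would no longer 
hard-code the retry behavior, and a later change to the allocation order 
inside the helper could adjust the mask in the same place.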