Message-ID: <CA+CK2bBEjNjB1+G-WD-jzGqMVB8L99uLUyOXbedyYx+RLK5JKA@mail.gmail.com>
Date:   Wed, 12 Apr 2023 15:57:42 -0400
From:   Pasha Tatashin <pasha.tatashin@...een.com>
To:     David Rientjes <rientjes@...gle.com>
Cc:     linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        akpm@...ux-foundation.org, mike.kravetz@...cle.com,
        muchun.song@...ux.dev, souravpanda@...gle.com
Subject: Re: [PATCH] mm: hugetlb_vmemmap: provide stronger vmemmap allocation guarantees

On Wed, Apr 12, 2023 at 1:54 PM David Rientjes <rientjes@...gle.com> wrote:
>
> On Wed, 12 Apr 2023, Pasha Tatashin wrote:
>
> > HugeTLB pages have a struct page optimization where the struct pages for
> > tail pages are freed. However, when HugeTLB pages are destroyed, the memory
> > for those struct pages (vmemmap) needs to be allocated again.
> >
> > Currently, the __GFP_NORETRY flag is used to allocate the memory for
> > vmemmap, but because this flag makes very little effort to actually
> > reclaim memory, returning huge pages to the system can be a problem. Let's
> > use __GFP_RETRY_MAYFAIL instead. This flag also performs graceful reclaim
> > without causing OOMs, but it may at least perform a few retries, and it
> > will fail only when there is genuinely little unused memory in the
> > system.
> >
>
> Thanks Pasha, this definitely makes sense.  We want to free the hugetlb
> page back to the system so it would be a shame to have to strand it in the
> hugetlb pool because we can't allocate the tail pages (we want to free
> more memory than we're allocating).
>
> > Signed-off-by: Pasha Tatashin <pasha.tatashin@...een.com>
> > Suggested-by: David Rientjes <rientjes@...gle.com>
> > ---
> >  mm/hugetlb_vmemmap.c | 5 ++++-
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
> > index a559037cce00..c4226d2af7cc 100644
> > --- a/mm/hugetlb_vmemmap.c
> > +++ b/mm/hugetlb_vmemmap.c
> > @@ -475,9 +475,12 @@ int hugetlb_vmemmap_restore(const struct hstate *h, struct page *head)
> >        * the range is mapped to the page which @vmemmap_reuse is mapped to.
> >        * When a HugeTLB page is freed to the buddy allocator, previously
> >        * discarded vmemmap pages must be allocated and remapping.
> > +      *
> > +      * Use __GFP_RETRY_MAYFAIL to fail only when there is genuinely little
> > +      * unused memory in the system.
> >        */
> >       ret = vmemmap_remap_alloc(vmemmap_start, vmemmap_end, vmemmap_reuse,
> > -                               GFP_KERNEL | __GFP_NORETRY | __GFP_THISNODE);
> > +                               GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_THISNODE);
> >       if (!ret) {
> >               ClearHPageVmemmapOptimized(head);
> >               static_branch_dec(&hugetlb_optimize_vmemmap_key);
>
> The behavior of __GFP_RETRY_MAYFAIL is different for high-order memory (at
> least larger than PAGE_ALLOC_COSTLY_ORDER).  The order that we're
> allocating would depend on the implementation of alloc_vmemmap_page_list()
> so likely best to move the gfp mask to that function.

Thank you David. This makes sense, I will send the 2nd version soon.

Pasha
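
As a rough illustration of David's suggestion, here is a userspace sketch in
which the allocation helper, not its caller, chooses the gfp mask. The helper
is the only place that knows it allocates single (order-0) pages, so it can
apply the retry policy without running into the different high-order behavior
of __GFP_RETRY_MAYFAIL. Everything prefixed with sketch_ or SKETCH_ is
hypothetical: the flag values are not the kernel's, malloc() stands in for the
page allocator, and this is not the code from the 2nd version of the patch.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for kernel gfp flags; these are NOT the real values. */
#define SKETCH_GFP_KERNEL        0x1u
#define SKETCH_GFP_RETRY_MAYFAIL 0x2u
#define SKETCH_GFP_THISNODE      0x4u

#define SKETCH_PAGE_SIZE 4096u

/* Toy model of one vmemmap page on a singly linked list. */
struct sketch_page {
	struct sketch_page *next;
	unsigned int gfp_used;	/* mask this page was "allocated" with */
	void *mem;
};

/* Stand-in for a single order-0 page allocation. */
static struct sketch_page *sketch_alloc_page(unsigned int gfp_mask)
{
	struct sketch_page *page = malloc(sizeof(*page));

	if (!page)
		return NULL;
	page->mem = malloc(SKETCH_PAGE_SIZE);
	if (!page->mem) {
		free(page);
		return NULL;
	}
	page->gfp_used = gfp_mask;
	page->next = NULL;
	return page;
}

/* Free every page on the list. */
static void sketch_free_page_list(struct sketch_page *list)
{
	while (list) {
		struct sketch_page *next = list->next;

		free(list->mem);
		free(list);
		list = next;
	}
}

/*
 * The helper owns the mask: it allocates one order-0 page per iteration,
 * so the caller no longer passes a gfp mask at all.
 * Returns 0 on success, -1 if any page could not be allocated.
 */
static int sketch_alloc_vmemmap_page_list(size_t nr_pages,
					  struct sketch_page **list)
{
	const unsigned int gfp_mask = SKETCH_GFP_KERNEL |
				      SKETCH_GFP_RETRY_MAYFAIL |
				      SKETCH_GFP_THISNODE;
	struct sketch_page *head = NULL;
	size_t i;

	for (i = 0; i < nr_pages; i++) {
		struct sketch_page *page = sketch_alloc_page(gfp_mask);

		if (!page) {
			/* Undo the partial allocation before failing. */
			sketch_free_page_list(head);
			return -1;
		}
		page->next = head;
		head = page;
	}
	*list = head;
	return 0;
}
```

A caller would only pass a page count and a list pointer, for example
sketch_alloc_vmemmap_page_list(7, &list), and release the pages with
sketch_free_page_list(list). The count 7 is just an example value.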
