Message-ID: <CAA1CXcBZF4orgFh+S7xBuLun-zB55m_DF+aHG3WQuYZaJ+0fVw@mail.gmail.com>
Date: Fri, 10 Jan 2025 12:37:58 -0700
From: Nico Pache <npache@...hat.com>
To: Dev Jain <dev.jain@....com>
Cc: linux-kernel@...r.kernel.org, linux-mm@...ck.org, ryan.roberts@....com, 
	anshuman.khandual@....com, catalin.marinas@....com, cl@...two.org, 
	vbabka@...e.cz, mhocko@...e.com, apopple@...dia.com, 
	dave.hansen@...ux.intel.com, will@...nel.org, baohua@...nel.org, jack@...e.cz, 
	srivatsa@...il.mit.edu, haowenchao22@...il.com, hughd@...gle.com, 
	aneesh.kumar@...nel.org, yang@...amperecomputing.com, peterx@...hat.com, 
	ioworker0@...il.com, wangkefeng.wang@...wei.com, ziy@...dia.com, 
	jglisse@...gle.com, surenb@...gle.com, vishal.moola@...il.com, 
	zokeefe@...gle.com, zhengqi.arch@...edance.com, jhubbard@...dia.com, 
	21cnbao@...il.com, willy@...radead.org, kirill.shutemov@...ux.intel.com, 
	david@...hat.com, aarcange@...hat.com, raquini@...hat.com, 
	sunnanyong@...wei.com, usamaarif642@...il.com, audra@...hat.com, 
	akpm@...ux-foundation.org
Subject: Re: [RFC 03/11] khugepaged: Don't allocate khugepaged mm_slot early

On Thu, Jan 9, 2025 at 11:11 PM Dev Jain <dev.jain@....com> wrote:
>
>
>
> On 09/01/25 5:01 am, Nico Pache wrote:
> > We should only "enter"/allocate the khugepaged mm_slot if we succeed at
> > allocating the PMD sized folio. Move the khugepaged_enter_vma call until
> > after we know the vma_alloc_folio was successful.
>
> Why? We have the appropriate checks from thp_vma_allowable_orders() and
> friends, so the VMA should be registered with khugepaged irrespective of
> whether during fault time we are able to allocate a PMD-THP or not. If
> we fail at fault time, it is the job of khugepaged to try to collapse it
> later.

That's a fair point. This was written a while back when I first
started looking into khugepaged. I believe the current scheme for
khugepaged_enter_vma is to register the mm only when the VMA has a
mapping large enough for khugepaged to work on. I'd like to remove
this restriction in the future to simplify the entry points of
khugepaged; currently we need to add these khugepaged_enter_vma calls
all over the place, when ideally we'd just register everything with
khugepaged.
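
For context, the gate in khugepaged_enter_vma today looks roughly like
this (paraphrased from memory of mm/khugepaged.c, so the exact helpers
and flag names may differ slightly in this tree). The point is that it
registers the mm based only on whether a PMD-sized order is allowable
for the VMA, not on whether a fault-time allocation succeeded:

void khugepaged_enter_vma(struct vm_area_struct *vma,
			  unsigned long vm_flags)
{
	/* Register the mm once, and only if a PMD-sized THP is allowable. */
	if (!test_bit(MMF_VM_HUGEPAGE, &vma->vm_mm->flags) &&
	    hugepage_pmd_enabled() &&
	    thp_vma_allowable_order(vma, vm_flags, TVA_ENFORCE_SYSFS,
				    PMD_ORDER))
		__khugepaged_enter(vma->vm_mm);
}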

Either way, you are correct: even if we return FALLBACK, the mapping
would still be eligible for promotion in the future.

I'll drop this patch. Thanks!

> >
> > Signed-off-by: Nico Pache <npache@...hat.com>
> > ---
> >   mm/huge_memory.c | 3 +--
> >   1 file changed, 1 insertion(+), 2 deletions(-)
> >
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index e53d83b3e5cf..635c65e7ef63 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -1323,7 +1323,6 @@ vm_fault_t do_huge_pmd_anonymous_page(struct vm_fault *vmf)
> >       ret = vmf_anon_prepare(vmf);
> >       if (ret)
> >               return ret;
> > -     khugepaged_enter_vma(vma, vma->vm_flags);
> >
> >       if (!(vmf->flags & FAULT_FLAG_WRITE) &&
> >                       !mm_forbids_zeropage(vma->vm_mm) &&
> > @@ -1365,7 +1364,7 @@ vm_fault_t do_huge_pmd_anonymous_page(struct vm_fault *vmf)
> >               }
> >               return ret;
> >       }
> > -
> > +     khugepaged_enter_vma(vma, vma->vm_flags);
> >       return __do_huge_pmd_anonymous_page(vmf);
> >   }
> >
>
> In any case, you are not achieving what you described in the patch
> description: you have moved khugepaged_enter_vma() after the read fault
> logic, what you want to do is to move it after
> vma_alloc_anon_folio_pmd() in __do_huge_pmd_anonymous_page().
Good catch! This was a byproduct of a rebase... back when I wrote
this, the vma_alloc_folio call was still in the
do_huge_pmd_anonymous_page function.
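
For the record, the placement you're describing would have looked
something like the below (untested sketch; context lines paraphrased
from memory of __do_huge_pmd_anonymous_page), together with the
corresponding removal from do_huge_pmd_anonymous_page:

--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf)
 	folio = vma_alloc_anon_folio_pmd(vma, vmf->address);
 	if (unlikely(!folio))
 		return VM_FAULT_FALLBACK;
+	/* Only register with khugepaged once the PMD folio allocation succeeded. */
+	khugepaged_enter_vma(vma, vma->vm_flags);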
>

