Message-ID: <CAHbLzkpsdHDK3KPS4M-TCMWt7svL40nkrgE3G43FO+b=s+tD_Q@mail.gmail.com>
Date: Thu, 1 Feb 2024 10:56:19 -0800
From: Yang Shi <shy828301@...il.com>
To: Lance Yang <ioworker0@...il.com>
Cc: akpm@...ux-foundation.org, mhocko@...e.com, zokeefe@...gle.com, 
	david@...hat.com, songmuchun@...edance.com, peterx@...hat.com, 
	minchan@...nel.org, linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/1] mm/khugepaged: bypassing unnecessary scans with
 MMF_DISABLE_THP check

On Wed, Jan 31, 2024 at 5:13 PM Lance Yang <ioworker0@...il.com> wrote:
>
> Hey Yang,
>
> Thank you for the clarification.
>
> You're correct. If the daemon calls prctl with
> MMF_DISABLE_THP before fork, the child
> mm won't be on the hash list.
>
> What I meant is that the daemon mm might
> already be on the hash list before fork.
> Therefore, khugepaged might still scan the
> address space for the daemon.

OK, I thought you didn't care about the daemon itself, since you
mentioned the daemon calls prctl to disable or enable THP for different
children, so the daemon's own THP preference may change constantly or
not matter at all.

So the actual cost is traversing the maple tree for the daemon's mm.
Does the daemon have an excessive number of vmas? I'm not sure whether
the improvement is noticeable.
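
(As a side note, not part of the patch: one rough way to check whether a
daemon has an excessive number of vmas is to count the lines of
/proc/<pid>/maps, since each line is one mapping. A minimal userspace
sketch, with the pid taken from argv:)

#include <stdio.h>

int main(int argc, char **argv)
{
	FILE *fp;
	char path[64];
	long vmas = 0;
	int c;

	if (argc != 2) {
		fprintf(stderr, "usage: %s <pid>\n", argv[0]);
		return 1;
	}
	snprintf(path, sizeof(path), "/proc/%s/maps", argv[1]);
	fp = fopen(path, "r");
	if (!fp) {
		perror("fopen");
		return 1;
	}
	while ((c = fgetc(fp)) != EOF)
		if (c == '\n')
			vmas++;	/* one mapping per line */
	fclose(fp);
	printf("%ld vmas\n", vmas);
	return 0;
}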

>
> Thanks,
> Lance
>
> On Thu, Feb 1, 2024 at 4:06 AM Yang Shi <shy828301@...il.com> wrote:
> >
> > On Wed, Jan 31, 2024 at 1:30 AM Lance Yang <ioworker0@...il.com> wrote:
> > >
> > > Updating the change log.
> > >
> > > khugepaged scans the entire address space in the
> > > background for each given mm, looking for
> > > opportunities to merge sequences of basic pages
> > > into huge pages. However, when an mm is inserted
> > > into the mm_slots list, and the MMF_DISABLE_THP
> > > flag is set later, this scanning process becomes
> > > unnecessary for that mm and can be skipped to
> > > avoid redundant operations, especially in scenarios
> > > with a large address space.
> > >
> > > This commit introduces a check before each scanning
> > > process to test the MMF_DISABLE_THP flag for the
> > > given mm; if the flag is set, the scanning process is
> > > bypassed, thereby improving the efficiency of khugepaged.
> > >
> > > This optimization is not a correctness issue but rather an
> > > enhancement to avoid expensive checks on each VMA
> > > when userspace cannot call prctl on itself before
> > > spawning the new process.
> >
> > If this is an optimization, you'd better show some real numbers to help justify it.
> >
> > >
> > > On some servers within our company, we deploy a
> > > daemon responsible for monitoring and updating local
> > > applications. Some applications prefer not to use THP,
> > > so the daemon calls prctl to disable THP before fork/exec.
> > > Conversely, for other applications, the daemon calls prctl
> > > to enable THP before fork/exec.
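
(For reference, that disable-before-spawn pattern looks roughly like the
sketch below; spawn_no_thp() and its arguments are made up for
illustration, not taken from the daemon's actual code:)

#include <sys/prctl.h>
#include <unistd.h>

/* Spawn a child with THP disabled for its whole address space. */
static pid_t spawn_no_thp(const char *path, char *const argv[])
{
	/*
	 * PR_SET_THP_DISABLE sets MMF_DISABLE_THP on the current mm;
	 * the setting is inherited across fork and preserved across
	 * execve, so the child runs without THP.
	 */
	if (prctl(PR_SET_THP_DISABLE, 1, 0, 0, 0))
		return -1;

	pid_t pid = fork();
	if (pid == 0) {
		execv(path, argv);
		_exit(127);
	}

	/* Clear the flag again in the daemon before the next spawn. */
	prctl(PR_SET_THP_DISABLE, 0, 0, 0, 0);
	return pid;
}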
> >
> > If your daemon calls prctl with MMF_DISABLE_THP before fork but you
> > still end up having the child mm on the hash list in the first place,
> > then I think that is a bug in khugepaged_fork() IIUC.
> > khugepaged_fork() should check this flag and bail out if it is set.
> > Did I miss something?
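
(The premise here, that the child inherits the prctl setting across
fork, can be checked from userspace with PR_GET_THP_DISABLE; a minimal
standalone sketch:)

#include <sys/prctl.h>
#include <sys/wait.h>
#include <unistd.h>
#include <stdio.h>

int main(void)
{
	/* Disable THP for this mm, as the daemon does before fork. */
	prctl(PR_SET_THP_DISABLE, 1, 0, 0, 0);

	pid_t pid = fork();
	if (pid == 0) {
		/* Should print 1: the child inherited the flag. */
		printf("child THP disabled: %d\n",
		       (int)prctl(PR_GET_THP_DISABLE, 0, 0, 0, 0));
		_exit(0);
	}
	waitpid(pid, NULL, 0);
	return 0;
}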
> >
> > >
> > > Ideally, the daemon should invoke prctl after the fork,
> > > but its current implementation follows the described
> > > approach. In the Go standard library, there is no direct
> > > encapsulation of the fork system call; instead, fork and
> > > execve are combined into one through syscall.ForkExec.
> > >
> > > Thanks,
> > > Lance
> > >
> > > On Mon, Jan 29, 2024 at 1:46 PM Lance Yang <ioworker0@...il.com> wrote:
> > > >
> > > > khugepaged scans the entire address space in the
> > > > background for each given mm, looking for
> > > > opportunities to merge sequences of basic pages
> > > > into huge pages. However, when an mm is inserted
> > > > into the mm_slots list, and the MMF_DISABLE_THP flag
> > > > is set later, this scanning process becomes
> > > > unnecessary for that mm and can be skipped to avoid
> > > > redundant operations, especially in scenarios with
> > > > a large address space.
> > > >
> > > > This commit introduces a check before each scanning
> > > > process to test the MMF_DISABLE_THP flag for the
> > > > given mm; if the flag is set, the scanning process
> > > > is bypassed, thereby improving the efficiency of
> > > > khugepaged.
> > > >
> > > > Signed-off-by: Lance Yang <ioworker0@...il.com>
> > > > ---
> > > >  mm/khugepaged.c | 18 ++++++++++++------
> > > >  1 file changed, 12 insertions(+), 6 deletions(-)
> > > >
> > > > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > > > index 2b219acb528e..d6a700834edc 100644
> > > > --- a/mm/khugepaged.c
> > > > +++ b/mm/khugepaged.c
> > > > @@ -410,6 +410,12 @@ static inline int hpage_collapse_test_exit(struct mm_struct *mm)
> > > >         return atomic_read(&mm->mm_users) == 0;
> > > >  }
> > > >
> > > > +static inline int hpage_collapse_test_exit_or_disable(struct mm_struct *mm)
> > > > +{
> > > > +       return hpage_collapse_test_exit(mm) ||
> > > > +              test_bit(MMF_DISABLE_THP, &mm->flags);
> > > > +}
> > > > +
> > > >  void __khugepaged_enter(struct mm_struct *mm)
> > > >  {
> > > >         struct khugepaged_mm_slot *mm_slot;
> > > > @@ -1422,7 +1428,7 @@ static void collect_mm_slot(struct khugepaged_mm_slot *mm_slot)
> > > >
> > > >         lockdep_assert_held(&khugepaged_mm_lock);
> > > >
> > > > -       if (hpage_collapse_test_exit(mm)) {
> > > > +       if (hpage_collapse_test_exit_or_disable(mm)) {
> > > >                 /* free mm_slot */
> > > >                 hash_del(&slot->hash);
> > > >                 list_del(&slot->mm_node);
> > > > @@ -2360,7 +2366,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
> > > >                 goto breakouterloop_mmap_lock;
> > > >
> > > >         progress++;
> > > > -       if (unlikely(hpage_collapse_test_exit(mm)))
> > > > +       if (unlikely(hpage_collapse_test_exit_or_disable(mm)))
> > > >                 goto breakouterloop;
> > > >
> > > >         vma_iter_init(&vmi, mm, khugepaged_scan.address);
> > > > @@ -2368,7 +2374,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
> > > >                 unsigned long hstart, hend;
> > > >
> > > >                 cond_resched();
> > > > -               if (unlikely(hpage_collapse_test_exit(mm))) {
> > > > +               if (unlikely(hpage_collapse_test_exit_or_disable(mm))) {
> > > >                         progress++;
> > > >                         break;
> > > >                 }
> > > > @@ -2390,7 +2396,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
> > > >                         bool mmap_locked = true;
> > > >
> > > >                         cond_resched();
> > > > -                       if (unlikely(hpage_collapse_test_exit(mm)))
> > > > +                       if (unlikely(hpage_collapse_test_exit_or_disable(mm)))
> > > >                                 goto breakouterloop;
> > > >
> > > >                         VM_BUG_ON(khugepaged_scan.address < hstart ||
> > > > @@ -2408,7 +2414,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
> > > >                                 fput(file);
> > > >                                 if (*result == SCAN_PTE_MAPPED_HUGEPAGE) {
> > > >                                         mmap_read_lock(mm);
> > > > -                                       if (hpage_collapse_test_exit(mm))
> > > > +                                       if (hpage_collapse_test_exit_or_disable(mm))
> > > >                                                 goto breakouterloop;
> > > >                                         *result = collapse_pte_mapped_thp(mm,
> > > >                                                 khugepaged_scan.address, false);
> > > > @@ -2450,7 +2456,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
> > > >          * Release the current mm_slot if this mm is about to die, or
> > > >          * if we scanned all vmas of this mm.
> > > >          */
> > > > -       if (hpage_collapse_test_exit(mm) || !vma) {
> > > > +       if (hpage_collapse_test_exit_or_disable(mm) || !vma) {
> > > >                 /*
> > > >                  * Make sure that if mm_users is reaching zero while
> > > >                  * khugepaged runs here, khugepaged_exit will find
> > > > --
> > > > 2.33.1
> > > >
