Message-ID: <CAHbLzkr42VJhwj5EELw4oOXxm8bHzB7AEUBa0JqWYpvNXw6oNg@mail.gmail.com>
Date: Thu, 9 Jun 2022 17:08:38 -0700
From: Yang Shi <shy828301@...il.com>
To: "Zach O'Keefe" <zokeefe@...gle.com>
Cc: Vlastimil Babka <vbabka@...e.cz>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
Matthew Wilcox <willy@...radead.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Linux MM <linux-mm@...ck.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [v3 PATCH 2/7] mm: thp: introduce transhuge_vma_size_ok() helper
On Thu, Jun 9, 2022 at 3:21 PM Zach O'Keefe <zokeefe@...gle.com> wrote:
>
> On Mon, Jun 6, 2022 at 2:44 PM Yang Shi <shy828301@...il.com> wrote:
> >
> > There are a couple of places that check whether the vma size is ok for
> > THP or not. They are open coded and duplicated; introduce the
> > transhuge_vma_size_ok() helper to do the job.
> >
> > Signed-off-by: Yang Shi <shy828301@...il.com>
> > ---
> > include/linux/huge_mm.h | 17 +++++++++++++++++
> > mm/huge_memory.c | 5 +----
> > mm/khugepaged.c | 12 ++++++------
> > 3 files changed, 24 insertions(+), 10 deletions(-)
> >
> > diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> > index 648cb3ce7099..a8f61db47f2a 100644
> > --- a/include/linux/huge_mm.h
> > +++ b/include/linux/huge_mm.h
> > @@ -116,6 +116,18 @@ extern struct kobj_attribute shmem_enabled_attr;
> >
> > extern unsigned long transparent_hugepage_flags;
> >
> > +/*
> > + * The vma size has to be large enough to hold an aligned HPAGE_PMD_SIZE area.
> > + */
> > +static inline bool transhuge_vma_size_ok(struct vm_area_struct *vma)
> > +{
> > + if (round_up(vma->vm_start, HPAGE_PMD_SIZE) <
> > + (vma->vm_end & HPAGE_PMD_MASK))
> > + return true;
> > +
> > + return false;
> > +}
>
> First time coming across round_up() - thanks for that - but for
> symmetry, maybe also use round_down() for the end? No strong opinion -
> just a suggestion given I've just discovered it.
Yeah, round_down is fine too.
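Something like this, e.g. (just an untested sketch):

static inline bool transhuge_vma_size_ok(struct vm_area_struct *vma)
{
	return round_up(vma->vm_start, HPAGE_PMD_SIZE) <
	       round_down(vma->vm_end, HPAGE_PMD_SIZE);
}

Since HPAGE_PMD_SIZE is a power of two, round_down(vma->vm_end,
HPAGE_PMD_SIZE) is the same as vma->vm_end & HPAGE_PMD_MASK, so this
doesn't change the behavior.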
>
> > static inline bool transhuge_vma_suitable(struct vm_area_struct *vma,
> > unsigned long addr)
> > {
> > @@ -345,6 +357,11 @@ static inline bool transparent_hugepage_active(struct vm_area_struct *vma)
> > return false;
> > }
> >
> > +static inline bool transhuge_vma_size_ok(struct vm_area_struct *vma)
> > +{
> > + return false;
> > +}
> > +
> > static inline bool transhuge_vma_suitable(struct vm_area_struct *vma,
> > unsigned long addr)
> > {
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index 48182c8fe151..36ada544e494 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -71,10 +71,7 @@ unsigned long huge_zero_pfn __read_mostly = ~0UL;
> >
> > bool transparent_hugepage_active(struct vm_area_struct *vma)
> > {
> > - /* The addr is used to check if the vma size fits */
> > - unsigned long addr = (vma->vm_end & HPAGE_PMD_MASK) - HPAGE_PMD_SIZE;
> > -
> > - if (!transhuge_vma_suitable(vma, addr))
> > + if (!transhuge_vma_size_ok(vma))
> > return false;
> > if (vma_is_anonymous(vma))
> > return __transparent_hugepage_enabled(vma);
>
> Do we need a check for vma->vm_pgoff alignment here, after
> !vma_is_anonymous(), and now that we don't call
> transhuge_vma_suitable()?
Actually I was thinking about this too. But the THPeligible bit shown
by smaps is a little bit ambiguous for file VMAs. The documentation says:

"THPeligible" indicates whether the mapping is eligible for allocating
THP pages - 1 if true, 0 otherwise.

Even though the mapping doesn't fulfill the alignment requirement, it is
still possible to get THP allocated; it just can't be PMD-mapped. So the
old behavior of THPeligible for file VMAs seems problematic, or at least
doesn't match the documentation.

I should elaborate on this in the commit log.
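For reference, the vm_pgoff alignment check you mean is roughly what
transhuge_vma_suitable() does for file-backed VMAs (a sketch from
memory, not the exact code):

	if (!vma_is_anonymous(vma)) {
		/*
		 * Reject file VMAs whose pgoff is not HPAGE_PMD_NR
		 * aligned relative to vm_start; they can't be PMD-mapped.
		 */
		if (!IS_ALIGNED((vma->vm_start >> PAGE_SHIFT) - vma->vm_pgoff,
				HPAGE_PMD_NR))
			return false;
	}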
>
> > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > index 84b9cf4b9be9..d0f8020164fc 100644
> > --- a/mm/khugepaged.c
> > +++ b/mm/khugepaged.c
> > @@ -454,6 +454,9 @@ bool hugepage_vma_check(struct vm_area_struct *vma,
> > vma->vm_pgoff, HPAGE_PMD_NR))
> > return false;
> >
> > + if (!transhuge_vma_size_ok(vma))
> > + return false;
> > +
> > /* Enabled via shmem mount options or sysfs settings. */
> > if (shmem_file(vma->vm_file))
> > return shmem_huge_enabled(vma);
> > @@ -512,9 +515,7 @@ void khugepaged_enter_vma(struct vm_area_struct *vma,
> > unsigned long vm_flags)
> > {
> > if (!test_bit(MMF_VM_HUGEPAGE, &vma->vm_mm->flags) &&
> > - khugepaged_enabled() &&
> > - (((vma->vm_start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK) <
> > - (vma->vm_end & HPAGE_PMD_MASK))) {
> > + khugepaged_enabled()) {
> > if (hugepage_vma_check(vma, vm_flags))
> > __khugepaged_enter(vma->vm_mm);
> > }
> > @@ -2142,10 +2143,9 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages,
> > progress++;
> > continue;
> > }
> > - hstart = (vma->vm_start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK;
> > +
> > + hstart = round_up(vma->vm_start, HPAGE_PMD_SIZE);
> > hend = vma->vm_end & HPAGE_PMD_MASK;
> > - if (hstart >= hend)
> > - goto skip;
> > if (khugepaged_scan.address > hend)
> > goto skip;
> > if (khugepaged_scan.address < hstart)
>
> Likewise, could do round_down() here (just a suggestion)
Fine by me.
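So the scan loop would end up looking something like (sketch only):

	hstart = round_up(vma->vm_start, HPAGE_PMD_SIZE);
	hend = round_down(vma->vm_end, HPAGE_PMD_SIZE);

The old "if (hstart >= hend) goto skip;" should remain unnecessary
since hugepage_vma_check() already rejects too-small VMAs via
transhuge_vma_size_ok() before we get here.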
>
> > --
> > 2.26.3
> >
> >