Message-ID: <CAHbLzkq12j+KSLegxbepzjAkOz1SE-7w5OuKwxarp_Lh+d0MOQ@mail.gmail.com>
Date: Tue, 16 Jan 2024 12:57:41 -0800
From: Yang Shi <shy828301@...il.com>
To: "Zach O'Keefe" <zokeefe@...gle.com>
Cc: Yin Fengwei <fengwei.yin@...el.com>, oliver.sang@...el.com, riel@...riel.com,
willy@...radead.org, cl@...ux.com, ying.huang@...el.com,
akpm@...ux-foundation.org, linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH 2/2] mm: mmap: map MAP_STACK to VM_NOHUGEPAGE
On Tue, Jan 16, 2024 at 11:22 AM Zach O'Keefe <zokeefe@...gle.com> wrote:
>
> Thanks Yang,
>
> Should this be marked for stable? Given how easy it is for pthreads
> to allocate hugepages w/o this change, it can easily cause memory
> bloat on larger systems and/or for users with high thread counts. I don't
> think that will be welcomed, and it seems odd that just 6.7 should suffer
> this.
Thanks for the suggestion, that's fine with me.
>
> Thanks,
> Zach
>
> On Tue, Jan 9, 2024 at 5:36 PM Yin Fengwei <fengwei.yin@...el.com> wrote:
> >
> >
> >
> > On 2023/12/21 14:59, Yang Shi wrote:
> > > From: Yang Shi <yang@...amperecomputing.com>
> > >
> > > > Commit efa7df3e3bb5 ("mm: align larger anonymous mappings on THP
> > > > boundaries") incurred a regression in the stress-ng pthread benchmark [1].
> > > > This is because THPs are now much more likely to be allocated for pthread
> > > > stack areas than before. A pthread stack area is allocated by mmap without
> > > > the VM_GROWSDOWN or VM_GROWSUP flag, so the kernel can't tell whether it is
> > > > a stack area or not.
> > > >
> > > > The MAP_STACK flag is used to mark the stack area, but it is a no-op on
> > > > Linux. Map MAP_STACK to VM_NOHUGEPAGE to prevent THP allocation for such
> > > > stack areas.
> > >
> > > With this change the stack area looks like:
> > >
> > > fffd18e10000-fffd19610000 rw-p 00000000 00:00 0
> > > Size: 8192 kB
> > > KernelPageSize: 4 kB
> > > MMUPageSize: 4 kB
> > > Rss: 12 kB
> > > Pss: 12 kB
> > > Pss_Dirty: 12 kB
> > > Shared_Clean: 0 kB
> > > Shared_Dirty: 0 kB
> > > Private_Clean: 0 kB
> > > Private_Dirty: 12 kB
> > > Referenced: 12 kB
> > > Anonymous: 12 kB
> > > KSM: 0 kB
> > > LazyFree: 0 kB
> > > AnonHugePages: 0 kB
> > > ShmemPmdMapped: 0 kB
> > > FilePmdMapped: 0 kB
> > > Shared_Hugetlb: 0 kB
> > > Private_Hugetlb: 0 kB
> > > Swap: 0 kB
> > > SwapPss: 0 kB
> > > Locked: 0 kB
> > > THPeligible: 0
> > > VmFlags: rd wr mr mw me ac nh
> > >
> > > The "nh" flag is set.
> > >
> > > [1] https://lore.kernel.org/linux-mm/202312192310.56367035-oliver.sang@intel.com/
> > >
> > > Reported-by: kernel test robot <oliver.sang@...el.com>
> > > Tested-by: Oliver Sang <oliver.sang@...el.com>
> > > Cc: Yin Fengwei <fengwei.yin@...el.com>
> > > Cc: Rik van Riel <riel@...riel.com>
> > > Cc: Matthew Wilcox <willy@...radead.org>
> > > Cc: Christopher Lameter <cl@...ux.com>
> > > Cc: Huang, Ying <ying.huang@...el.com>
> > > Signed-off-by: Yang Shi <yang@...amperecomputing.com>
> >
> > Reviewed-by: Yin Fengwei <fengwei.yin@...el.com>
> >
> > > ---
> > > include/linux/mman.h | 1 +
> > > 1 file changed, 1 insertion(+)
> > >
> > > diff --git a/include/linux/mman.h b/include/linux/mman.h
> > > index 40d94411d492..dc7048824be8 100644
> > > --- a/include/linux/mman.h
> > > +++ b/include/linux/mman.h
> > > @@ -156,6 +156,7 @@ calc_vm_flag_bits(unsigned long flags)
> > > >         return _calc_vm_trans(flags, MAP_GROWSDOWN,  VM_GROWSDOWN ) |
> > > >                _calc_vm_trans(flags, MAP_LOCKED,     VM_LOCKED    ) |
> > > >                _calc_vm_trans(flags, MAP_SYNC,       VM_SYNC      ) |
> > > > +              _calc_vm_trans(flags, MAP_STACK,      VM_NOHUGEPAGE) |
> > > >                arch_calc_vm_flag_bits(flags);
> > > >  }
> > >
> >
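
For readers following the one-line diff above, here is a minimal user-space sketch of the bit translation that calc_vm_flag_bits() performs. It is not the kernel code: the DEMO_VM_* values and the demo_* function names are assumptions made for illustration, and the helper is only modeled on the idea behind _calc_vm_trans (move one flag bit from its MAP_* position to its VM_* position).

```c
#define _GNU_SOURCE
#include <assert.h>     /* static_assert */
#include <sys/mman.h>   /* MAP_STACK, MAP_GROWSDOWN, MAP_LOCKED */

/* Hypothetical stand-ins for the kernel-internal VM_* bits (illustration only). */
#define DEMO_VM_GROWSDOWN  0x00000100UL
#define DEMO_VM_LOCKED     0x00002000UL
#define DEMO_VM_NOHUGEPAGE 0x40000000UL

/* The translation only works for single-bit flags. */
static_assert((DEMO_VM_NOHUGEPAGE & (DEMO_VM_NOHUGEPAGE - 1)) == 0,
              "target flag must be a single bit");

/*
 * If bit `from` is set in x, return bit `to`; otherwise return 0.
 * Both `from` and `to` are expected to be single-bit masks, so the
 * scaling below is an exact shift expressed as multiply or divide.
 */
static unsigned long demo_vm_trans(unsigned long x,
                                   unsigned long from, unsigned long to)
{
    if (!from || !to)
        return 0;
    if (from <= to)
        return (x & from) * (to / from);
    return (x & from) / (from / to);
}

/* Sketch of the flag translation with the MAP_STACK line from the patch added. */
static unsigned long demo_calc_vm_flag_bits(unsigned long flags)
{
    return demo_vm_trans(flags, MAP_GROWSDOWN, DEMO_VM_GROWSDOWN) |
           demo_vm_trans(flags, MAP_LOCKED,    DEMO_VM_LOCKED)    |
           demo_vm_trans(flags, MAP_STACK,     DEMO_VM_NOHUGEPAGE);
}
```

Under these assumptions, demo_calc_vm_flag_bits(MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK) yields only DEMO_VM_NOHUGEPAGE, which corresponds to the "nh" entry in the VmFlags line quoted in the commit message. On a kernel carrying the patch, the real effect can be checked by looking for "nh" in the VmFlags line of the mapping in /proc/<pid>/smaps.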