Date: Mon, 1 Jul 2024 19:00:04 +0800
From: Lance Yang <ioworker0@...il.com>
To: Ryan Roberts <ryan.roberts@....com>
Cc: Barry Song <baohua@...nel.org>, akpm@...ux-foundation.org, david@...hat.com, 
	baolin.wang@...ux.alibaba.com, linux-kernel@...r.kernel.org, 
	linux-mm@...ck.org
Subject: Re: [PATCH 1/2] mm: add per-order mTHP split counters

On Mon, Jul 1, 2024 at 4:16 PM Ryan Roberts <ryan.roberts@....com> wrote:
>
> On 30/06/2024 12:34, Lance Yang wrote:
> > Hi Barry,
> >
> > Thanks for following up!
> >
> > On Sun, Jun 30, 2024 at 5:48 PM Barry Song <baohua@...nel.org> wrote:
> >>
> >> On Thu, Apr 25, 2024 at 3:41 AM Ryan Roberts <ryan.roberts@....com> wrote:
> >>>
> >>> + Barry
> >>>
> >>> On 24/04/2024 14:51, Lance Yang wrote:
> >>>> At present, the split counters in THP statistics no longer include
> >>>> PTE-mapped mTHP. Therefore, this commit introduces per-order mTHP split
> >>>> counters to monitor the frequency of mTHP splits. This will assist
> >>>> developers in better analyzing and optimizing system performance.
> >>>>
> >>>> /sys/kernel/mm/transparent_hugepage/hugepages-<size>/stats
> >>>>         split_page
> >>>>         split_page_failed
> >>>>         deferred_split_page
> >>>>
> >>>> Signed-off-by: Lance Yang <ioworker0@...il.com>
> >>>> ---
> >>>>  include/linux/huge_mm.h |  3 +++
> >>>>  mm/huge_memory.c        | 14 ++++++++++++--
> >>>>  2 files changed, 15 insertions(+), 2 deletions(-)
> >>>>
> >>>> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> >>>> index 56c7ea73090b..7b9c6590e1f7 100644
> >>>> --- a/include/linux/huge_mm.h
> >>>> +++ b/include/linux/huge_mm.h
> >>>> @@ -272,6 +272,9 @@ enum mthp_stat_item {
> >>>>       MTHP_STAT_ANON_FAULT_FALLBACK_CHARGE,
> >>>>       MTHP_STAT_ANON_SWPOUT,
> >>>>       MTHP_STAT_ANON_SWPOUT_FALLBACK,
> >>>> +     MTHP_STAT_SPLIT_PAGE,
> >>>> +     MTHP_STAT_SPLIT_PAGE_FAILED,
> >>>> +     MTHP_STAT_DEFERRED_SPLIT_PAGE,
> >>>>       __MTHP_STAT_COUNT
> >>>>  };
> >>>>
> >>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> >>>> index 055df5aac7c3..52db888e47a6 100644
> >>>> --- a/mm/huge_memory.c
> >>>> +++ b/mm/huge_memory.c
> >>>> @@ -557,6 +557,9 @@ DEFINE_MTHP_STAT_ATTR(anon_fault_fallback, MTHP_STAT_ANON_FAULT_FALLBACK);
> >>>>  DEFINE_MTHP_STAT_ATTR(anon_fault_fallback_charge, MTHP_STAT_ANON_FAULT_FALLBACK_CHARGE);
> >>>>  DEFINE_MTHP_STAT_ATTR(anon_swpout, MTHP_STAT_ANON_SWPOUT);
> >>>>  DEFINE_MTHP_STAT_ATTR(anon_swpout_fallback, MTHP_STAT_ANON_SWPOUT_FALLBACK);
> >>>> +DEFINE_MTHP_STAT_ATTR(split_page, MTHP_STAT_SPLIT_PAGE);
> >>>> +DEFINE_MTHP_STAT_ATTR(split_page_failed, MTHP_STAT_SPLIT_PAGE_FAILED);
> >>>> +DEFINE_MTHP_STAT_ATTR(deferred_split_page, MTHP_STAT_DEFERRED_SPLIT_PAGE);
> >>>>
> >>>>  static struct attribute *stats_attrs[] = {
> >>>>       &anon_fault_alloc_attr.attr,
> >>>> @@ -564,6 +567,9 @@ static struct attribute *stats_attrs[] = {
> >>>>       &anon_fault_fallback_charge_attr.attr,
> >>>>       &anon_swpout_attr.attr,
> >>>>       &anon_swpout_fallback_attr.attr,
> >>>> +     &split_page_attr.attr,
> >>>> +     &split_page_failed_attr.attr,
> >>>> +     &deferred_split_page_attr.attr,
> >>>>       NULL,
> >>>>  };
> >>>>
> >>>> @@ -3083,7 +3089,7 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
> >>>>       XA_STATE_ORDER(xas, &folio->mapping->i_pages, folio->index, new_order);
> >>>>       struct anon_vma *anon_vma = NULL;
> >>>>       struct address_space *mapping = NULL;
> >>>> -     bool is_thp = folio_test_pmd_mappable(folio);
> >>>> +     int order = folio_order(folio);
> >>>>       int extra_pins, ret;
> >>>>       pgoff_t end;
> >>>>       bool is_hzp;
> >>>> @@ -3262,8 +3268,10 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
> >>>>               i_mmap_unlock_read(mapping);
> >>>>  out:
> >>>>       xas_destroy(&xas);
> >>>> -     if (is_thp)
> >>>> +     if (order >= HPAGE_PMD_ORDER)
> >>>>               count_vm_event(!ret ? THP_SPLIT_PAGE : THP_SPLIT_PAGE_FAILED);
> >>>> +     count_mthp_stat(order, !ret ? MTHP_STAT_SPLIT_PAGE :
> >>>> +                                   MTHP_STAT_SPLIT_PAGE_FAILED);
> >>>>       return ret;
> >>>>  }
> >>>>
> >>>> @@ -3327,6 +3335,8 @@ void deferred_split_folio(struct folio *folio)
> >>>>       if (list_empty(&folio->_deferred_list)) {
> >>>>               if (folio_test_pmd_mappable(folio))
> >>>>                       count_vm_event(THP_DEFERRED_SPLIT_PAGE);
> >>>> +             count_mthp_stat(folio_order(folio),
> >>>> +                             MTHP_STAT_DEFERRED_SPLIT_PAGE);
> >>>
> >>> There is a very long conversation with Barry about adding a 'global "mTHP became
> >>> partially mapped 1 or more processes" counter (inc only)', which terminates at
> >>> [1]. There is a lot of discussion about the required semantics around the need
> >>> for partial map to cover alignment and contiguity as well as whether all pages
> >>> are mapped, and to trigger once it becomes partial in at least 1 process.
> >>>
> >>> MTHP_STAT_DEFERRED_SPLIT_PAGE is giving much simpler semantics, but less
> >>> information as a result. Barry, what's your view here? I'm guessing this doesn't
> >>> quite solve what you are looking for?
> >>
> >> This doesn't quite solve what I am looking for but I still think the
> >> patch has its value.
> >>
> >> I'm looking for a solution that can:
> >>
> >>   *  Count the amount of memory in the system for each mTHP size.
> >>   *  Determine how much memory for each mTHP size is partially unmapped.
> >>
> >> For example, in a system with 16GB of memory, we might find that we have 3GB
> >> of 64KB mTHP, and within that, 512MB is partially unmapped, potentially wasting
> >> memory at this moment.  I'm uncertain whether Lance is interested in
> >> this job :-)
> >
> > Nice, that's an interesting/valuable job for me ;)
> >
> > Let's do it separately, as 'split' and friends probably can't be the
> > solution you mentioned above, IMHO.
> >
> > Hmm... I don't have a good idea about the solution for now, but will
> > think it over and come back to discuss it here.
>
> I have a grad starting in a couple of weeks and I had been planning to initially
> ask him to look at this to help him get up to speed on mTHP/mm stuff. But I have
> plenty of other things for him to do if Lance wants to take this :)

I'm very happy to do that, but it doesn't have to be just me - anyone
with a better idea can take it on ;)

Thanks,
Lance

>
> >
> >>
> >> Counting deferred_split remains valuable as it can signal whether the system is
> >> experiencing significant partial unmapping.
> >
> > Have a nice weekend!
> > Lance
> >
> >>
> >>>
> >>> [1] https://lore.kernel.org/linux-mm/6cc7d781-884f-4d8f-a175-8609732b87eb@arm.com/
> >>>
> >>> Thanks,
> >>> Ryan
> >>>
> >>>>               list_add_tail(&folio->_deferred_list, &ds_queue->split_queue);
> >>>>               ds_queue->split_queue_len++;
> >>>>  #ifdef CONFIG_MEMCG
> >>>
> >>
> >> Thanks
> >> Barry
>
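For readers following the quoted diff, here is a small userspace model of
the accounting it adds to split_huge_page_to_list_to_order(): the legacy
THP_SPLIT_PAGE / THP_SPLIT_PAGE_FAILED events are still bumped only for
PMD-sized folios, while the new per-order counter is bumped for every
order. This is only a sketch, not kernel code. All model_* names are
hypothetical, the PMD order of 9 assumes 4 KiB base pages with 2 MiB THP,
and the bounds check in model_count_mthp_stat() is an assumption about how
the per-order helper guards its array.

```c
/* Assumption: PMD order 9 (2 MiB THP on 4 KiB base pages). */
#define MODEL_PMD_ORDER 9

/* Hypothetical stand-ins for the two split items added to mthp_stat_item. */
enum model_stat {
	MODEL_SPLIT_PAGE,
	MODEL_SPLIT_PAGE_FAILED,
	MODEL_STAT_COUNT
};

struct model_counters {
	unsigned long thp_split_page;        /* legacy global event */
	unsigned long thp_split_page_failed; /* legacy global event */
	unsigned long mthp[MODEL_PMD_ORDER + 1][MODEL_STAT_COUNT]; /* per order */
};

/* Per-order increment; out-of-range orders are ignored (assumed guard). */
static void model_count_mthp_stat(struct model_counters *c, int order,
				  enum model_stat item)
{
	if (order <= 0 || order > MODEL_PMD_ORDER)
		return;
	c->mthp[order][item]++;
}

/* Mirrors the "out:" path of the diff: ret == 0 means the split succeeded. */
static void model_account_split(struct model_counters *c, int order, int ret)
{
	if (order >= MODEL_PMD_ORDER) {
		if (!ret)
			c->thp_split_page++;
		else
			c->thp_split_page_failed++;
	}
	model_count_mthp_stat(c, order, !ret ? MODEL_SPLIT_PAGE :
					       MODEL_SPLIT_PAGE_FAILED);
}
```

With a zero-initialised struct model_counters, a successful split of an
order-4 folio touches only mthp[4][MODEL_SPLIT_PAGE], whereas a split of
an order-9 folio updates both the legacy counter and the order-9 slot.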
