Message-ID: <CAOUHufb7_sbDG7Cs_n63gySe-c5syNUPz6kYmxQvgcFim9JZ=w@mail.gmail.com>
Date: Sat, 3 Aug 2024 16:07:55 -0600
From: Yu Zhao <yuzhao@...gle.com>
To: Oliver Sang <oliver.sang@...el.com>, Muchun Song <muchun.song@...ux.dev>
Cc: Janosch Frank <frankja@...ux.ibm.com>, oe-lkp@...ts.linux.dev, lkp@...el.com,
Linux Memory Management List <linux-mm@...ck.org>, Andrew Morton <akpm@...ux-foundation.org>,
David Hildenbrand <david@...hat.com>, Frank van der Linden <fvdl@...gle.com>, Matthew Wilcox <willy@...radead.org>,
Peter Xu <peterx@...hat.com>, Yang Shi <yang@...amperecomputing.com>,
linux-kernel@...r.kernel.org, ying.huang@...el.com, feng.tang@...el.com,
fengwei.yin@...el.com, Christian Borntraeger <borntraeger@...ux.ibm.com>,
Claudio Imbrenda <imbrenda@...ux.ibm.com>, Marc Hartmayer <mhartmay@...ux.ibm.com>,
Heiko Carstens <hca@...ux.ibm.com>, Yosry Ahmed <yosryahmed@...gle.com>
Subject: Re: [linux-next:master] [mm/hugetlb_vmemmap] 875fa64577:
vm-scalability.throughput -34.3% regression
Hi Oliver,
On Fri, Jul 19, 2024 at 10:06 AM Yu Zhao <yuzhao@...gle.com> wrote:
>
> On Fri, Jul 19, 2024 at 2:44 AM Oliver Sang <oliver.sang@...el.com> wrote:
> >
> > hi, Yu Zhao,
> >
> > On Wed, Jul 17, 2024 at 09:44:33AM -0600, Yu Zhao wrote:
> > > On Wed, Jul 17, 2024 at 2:36 AM Yu Zhao <yuzhao@...gle.com> wrote:
> > > >
> > > > Hi Janosch and Oliver,
> > > >
> > > > On Wed, Jul 17, 2024 at 1:57 AM Janosch Frank <frankja@...ux.ibm.com> wrote:
> > > > >
> > > > > On 7/9/24 07:11, kernel test robot wrote:
> > > > > > Hello,
> > > > > >
> > > > > > kernel test robot noticed a -34.3% regression of vm-scalability.throughput on:
> > > > > >
> > > > > >
> > > > > > commit: 875fa64577da9bc8e9963ee14fef8433f20653e7 ("mm/hugetlb_vmemmap: fix race with speculative PFN walkers")
> > > > > > https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
> > > > > >
> > > > > > [still regression on linux-next/master 0b58e108042b0ed28a71cd7edf5175999955b233]
> > > > > >
> > > > > This has hit s390 huge page backed KVM guests as well.
> > > > > Our simple start/stop test case went from ~5 to over 50 seconds of runtime.
> > > >
> > > > Could you try the attached patch please? Thank you.
> > >
> > > Thanks, Yosry, for spotting the following typo:
> > > flags &= VMEMMAP_SYNCHRONIZE_RCU;
> > > It's supposed to be:
> > > flags &= ~VMEMMAP_SYNCHRONIZE_RCU;
> > >
> > > Reattaching v2 with the above typo fixed. Please let me know, Janosch & Oliver.
> >
> > since the commit is in mainline now, I directly applied your v2 patch on top of
> > bd225530a4c71 ("mm/hugetlb_vmemmap: fix race with speculative PFN walkers")
> >
> > in our tests, your v2 patch not only resolves the performance regression,
>
> Thanks for verifying the fix!
>
> > it even shows a +13.7% performance improvement over 5a4d8944d6b1e (the parent
> > of bd225530a4c71)
>
> Glad to hear!
>
> (The original patch improved and regressed the performance at the same
> time, but the regression was bigger. The fix removed the regression and
> surfaced the improvement.)
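For clarity on the typo quoted above, here is a minimal userspace sketch of
why "flags &= VMEMMAP_SYNCHRONIZE_RCU;" and "flags &= ~VMEMMAP_SYNCHRONIZE_RCU;"
behave differently. This is illustration only, not the attached patch; the
flag values below are made-up stand-ins, not the kernel definitions.

/*
 * Minimal userspace sketch (illustration only, not the attached patch).
 * The flag values are made-up stand-ins for the kernel definitions.
 */
#include <stdio.h>

#define VMEMMAP_REMAP_NO_TLB_FLUSH	0x1	/* stand-in value */
#define VMEMMAP_SYNCHRONIZE_RCU		0x2	/* stand-in value */

int main(void)
{
	unsigned long flags = VMEMMAP_REMAP_NO_TLB_FLUSH | VMEMMAP_SYNCHRONIZE_RCU;

	/* Typo: keeps only VMEMMAP_SYNCHRONIZE_RCU and drops every other flag. */
	unsigned long buggy = flags & VMEMMAP_SYNCHRONIZE_RCU;

	/* Fix: clears VMEMMAP_SYNCHRONIZE_RCU and preserves the other flags. */
	unsigned long fixed = flags & ~VMEMMAP_SYNCHRONIZE_RCU;

	printf("buggy: 0x%lx, fixed: 0x%lx\n", buggy, fixed);	/* 0x2 vs 0x1 */
	return 0;
}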
Can you please run the benchmark again with the attached patch on top
of the last fix?
I spotted something else worth optimizing last time, and with the
attached patch I was able to measure significant improvements in
1GB hugeTLB allocation and free times, e.g., when allocating and then
freeing 700 1GB hugeTLB pages:
Before:
# time echo 700 >/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
real 0m13.500s
user 0m0.000s
sys 0m13.311s
# time echo 0 >/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
real 0m11.269s
user 0m0.000s
sys 0m11.187s
After:
# time echo 700 >/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
real 0m10.643s
user 0m0.001s
sys 0m10.487s
# time echo 0 >/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
real 0m1.541s
user 0m0.000s
sys 0m1.528s
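For reference, the same measurement can be scripted. The following is a
minimal sketch (not the vm-scalability benchmark and not the attached patch)
that writes to the 1GB hugeTLB nr_hugepages knob and times each step; it
assumes root and a kernel with 1GB hugeTLB pages available.

/*
 * Minimal sketch mirroring the "time echo N > .../nr_hugepages" runs above:
 * write a hugepage count to the 1GB hugetlb sysfs knob and report the
 * elapsed wall time. Needs root and 1GB hugetlb support.
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

#define NR_1G "/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages"

static double set_nr_hugepages(const char *count)
{
	struct timespec start, end;
	int fd = open(NR_1G, O_WRONLY);

	if (fd < 0) {
		perror("open " NR_1G);
		return -1.0;
	}

	clock_gettime(CLOCK_MONOTONIC, &start);
	if (write(fd, count, strlen(count)) < 0)
		perror("write");
	clock_gettime(CLOCK_MONOTONIC, &end);
	close(fd);

	return (end.tv_sec - start.tv_sec) +
	       (end.tv_nsec - start.tv_nsec) / 1e9;
}

int main(void)
{
	/* Allocate 700 1GB pages, then free them, timing each step. */
	printf("alloc: %.3fs\n", set_nr_hugepages("700"));
	printf("free:  %.3fs\n", set_nr_hugepages("0"));
	return 0;
}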
Thanks!
[Attachment: "hugetlb.patch" (application/octet-stream, 22480 bytes)]