Message-ID: <CAOUHufb7_sbDG7Cs_n63gySe-c5syNUPz6kYmxQvgcFim9JZ=w@mail.gmail.com>
Date: Sat, 3 Aug 2024 16:07:55 -0600
From: Yu Zhao <yuzhao@...gle.com>
To: Oliver Sang <oliver.sang@...el.com>, Muchun Song <muchun.song@...ux.dev>
Cc: Janosch Frank <frankja@...ux.ibm.com>, oe-lkp@...ts.linux.dev, lkp@...el.com,
Linux Memory Management List <linux-mm@...ck.org>, Andrew Morton <akpm@...ux-foundation.org>,
David Hildenbrand <david@...hat.com>, Frank van der Linden <fvdl@...gle.com>, Matthew Wilcox <willy@...radead.org>,
Peter Xu <peterx@...hat.com>, Yang Shi <yang@...amperecomputing.com>,
linux-kernel@...r.kernel.org, ying.huang@...el.com, feng.tang@...el.com,
fengwei.yin@...el.com, Christian Borntraeger <borntraeger@...ux.ibm.com>,
Claudio Imbrenda <imbrenda@...ux.ibm.com>, Marc Hartmayer <mhartmay@...ux.ibm.com>,
Heiko Carstens <hca@...ux.ibm.com>, Yosry Ahmed <yosryahmed@...gle.com>
Subject: Re: [linux-next:master] [mm/hugetlb_vmemmap] 875fa64577:
vm-scalability.throughput -34.3% regression
Hi Oliver,
On Fri, Jul 19, 2024 at 10:06 AM Yu Zhao <yuzhao@...gle.com> wrote:
>
> On Fri, Jul 19, 2024 at 2:44 AM Oliver Sang <oliver.sang@...el.com> wrote:
> >
> > hi, Yu Zhao,
> >
> > On Wed, Jul 17, 2024 at 09:44:33AM -0600, Yu Zhao wrote:
> > > On Wed, Jul 17, 2024 at 2:36 AM Yu Zhao <yuzhao@...gle.com> wrote:
> > > >
> > > > Hi Janosch and Oliver,
> > > >
> > > > On Wed, Jul 17, 2024 at 1:57 AM Janosch Frank <frankja@...ux.ibm.com> wrote:
> > > > >
> > > > > On 7/9/24 07:11, kernel test robot wrote:
> > > > > > Hello,
> > > > > >
> > > > > > kernel test robot noticed a -34.3% regression of vm-scalability.throughput on:
> > > > > >
> > > > > >
> > > > > > commit: 875fa64577da9bc8e9963ee14fef8433f20653e7 ("mm/hugetlb_vmemmap: fix race with speculative PFN walkers")
> > > > > > https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
> > > > > >
> > > > > > [still regression on linux-next/master 0b58e108042b0ed28a71cd7edf5175999955b233]
> > > > > >
> > > > > This has hit s390 huge page backed KVM guests as well.
> > > > > Our simple start/stop test case went from ~5 to over 50 seconds of runtime.
> > > >
> > > > Could you try the attached patch please? Thank you.
> > >
> > > Thanks, Yosry, for spotting the following typo:
> > > flags &= VMEMMAP_SYNCHRONIZE_RCU;
> > > It's supposed to be:
> > > flags &= ~VMEMMAP_SYNCHRONIZE_RCU;
> > >
> > > Reattaching v2 with the above typo fixed. Please let me know, Janosch & Oliver.
> >
> > since the commit is in mainline now, I directly applied your v2 patch on top of
> > bd225530a4c71 ("mm/hugetlb_vmemmap: fix race with speculative PFN walkers")
> >
> > in our tests, your v2 patch not only resolves the performance regression,
>
> Thanks for verifying the fix!
>
> > it even shows a +13.7% performance improvement over 5a4d8944d6b1e (the parent
> > of bd225530a4c71)
>
> Glad to hear!
>
> (The original patch improved and regressed the performance at the same
> time, but the regression was bigger. The fix removed the regression and
> surfaced the improvement.)
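For clarity on the typo quoted above, here is a minimal userspace sketch of
why "flags &= VMEMMAP_SYNCHRONIZE_RCU;" and "flags &= ~VMEMMAP_SYNCHRONIZE_RCU;"
behave differently. This is illustration only, not the attached patch; the
flag values below are made-up stand-ins, not the kernel definitions.

/*
 * Minimal userspace sketch (illustration only, not the attached patch).
 * The flag values are made-up stand-ins for the kernel definitions.
 */
#include <stdio.h>

#define VMEMMAP_REMAP_NO_TLB_FLUSH	0x1	/* stand-in value */
#define VMEMMAP_SYNCHRONIZE_RCU		0x2	/* stand-in value */

int main(void)
{
	unsigned long flags = VMEMMAP_REMAP_NO_TLB_FLUSH | VMEMMAP_SYNCHRONIZE_RCU;

	/* Typo: keeps only VMEMMAP_SYNCHRONIZE_RCU and drops every other flag. */
	unsigned long buggy = flags & VMEMMAP_SYNCHRONIZE_RCU;

	/* Fix: clears VMEMMAP_SYNCHRONIZE_RCU and preserves the other flags. */
	unsigned long fixed = flags & ~VMEMMAP_SYNCHRONIZE_RCU;

	printf("buggy: 0x%lx, fixed: 0x%lx\n", buggy, fixed);	/* 0x2 vs 0x1 */
	return 0;
}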
Can you please run the benchmark again with the attached patch on top
of the last fix?
I spotted something else worth optimizing last time, and with the
attached patch I was able to measure significant improvements in
1GB hugeTLB allocation and free times, e.g., when allocating and then
freeing 700 1GB hugeTLB pages:
Before:
# time echo 700 >/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
real 0m13.500s
user 0m0.000s
sys 0m13.311s
# time echo 0 >/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
real 0m11.269s
user 0m0.000s
sys 0m11.187s
After:
# time echo 700 >/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
real 0m10.643s
user 0m0.001s
sys 0m10.487s
# time echo 0 >/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
real 0m1.541s
user 0m0.000s
sys 0m1.528s
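For reference, the same measurement can be scripted. The following is a
minimal sketch (not the vm-scalability benchmark and not the attached patch)
that writes to the 1GB hugeTLB nr_hugepages knob and times each step; it
assumes root and a kernel with 1GB hugeTLB pages available.

/*
 * Minimal sketch mirroring the "time echo N > .../nr_hugepages" runs above:
 * write a hugepage count to the 1GB hugetlb sysfs knob and report the
 * elapsed wall time. Needs root and 1GB hugetlb support.
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

#define NR_1G "/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages"

static double set_nr_hugepages(const char *count)
{
	struct timespec start, end;
	int fd = open(NR_1G, O_WRONLY);

	if (fd < 0) {
		perror("open " NR_1G);
		return -1.0;
	}

	clock_gettime(CLOCK_MONOTONIC, &start);
	if (write(fd, count, strlen(count)) < 0)
		perror("write");
	clock_gettime(CLOCK_MONOTONIC, &end);
	close(fd);

	return (end.tv_sec - start.tv_sec) +
	       (end.tv_nsec - start.tv_nsec) / 1e9;
}

int main(void)
{
	/* Allocate 700 1GB pages, then free them, timing each step. */
	printf("alloc: %.3fs\n", set_nr_hugepages("700"));
	printf("free:  %.3fs\n", set_nr_hugepages("0"));
	return 0;
}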
Thanks!
[Attachment: "hugetlb.patch" (application/octet-stream, 22480 bytes)]