Message-ID: <aM1xlYbvp13rOs6r@milan>
Date: Fri, 19 Sep 2025 17:07:01 +0200
From: Uladzislau Rezki <urezki@...il.com>
To: "Liam R. Howlett" <Liam.Howlett@...cle.com>,
	Uladzislau Rezki <urezki@...il.com>,
	Suren Baghdasaryan <surenb@...gle.com>,
	Vlastimil Babka <vbabka@...e.cz>, paulmck@...nel.org,
	Jan Engelhardt <ej@...i.de>,
	Sudarsan Mahendran <sudarsanm@...gle.com>, cl@...two.org,
	harry.yoo@...cle.com, howlett@...il.com,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	maple-tree@...ts.infradead.org, rcu@...r.kernel.org,
	rientjes@...gle.com, roman.gushchin@...ux.dev
Subject: Re: Benchmarking [PATCH v5 00/14] SLUB percpu sheaves

On Thu, Sep 18, 2025 at 11:29:14AM -0400, Liam R. Howlett wrote:
> * Uladzislau Rezki <urezki@...il.com> [250918 07:50]:
> > On Wed, Sep 17, 2025 at 04:59:41PM -0700, Suren Baghdasaryan wrote:
> > > On Wed, Sep 17, 2025 at 9:14 AM Suren Baghdasaryan <surenb@...gle.com> wrote:
> > > >
> > > > On Tue, Sep 16, 2025 at 10:19 PM Uladzislau Rezki <urezki@...il.com> wrote:
> > > > >
> > > > > On Tue, Sep 16, 2025 at 10:09:18AM -0700, Suren Baghdasaryan wrote:
> > > > > > On Mon, Sep 15, 2025 at 8:22 AM Vlastimil Babka <vbabka@...e.cz> wrote:
> > > > > > >
> > > > > > > On 9/15/25 14:13, Paul E. McKenney wrote:
> > > > > > > > On Mon, Sep 15, 2025 at 09:51:25AM +0200, Jan Engelhardt wrote:
> > > > > > > >>
> > > > > > > >> On Saturday 2025-09-13 02:09, Sudarsan Mahendran wrote:
> > > > > > > >> >
> > > > > > > >> >Summary of the results:
> > > > > > >
> > > > > > > In any case, thanks a lot for the results!
> > > > > > >
> > > > > > > >> >- Significant change (meaning >10% difference
> > > > > > > >> >  between base and experiment) on will-it-scale
> > > > > > > >> >  tests in AMD.
> > > > > > > >> >
> > > > > > > >> >Summary of AMD will-it-scale test changes:
> > > > > > > >> >
> > > > > > > >> >Number of runs : 15
> > > > > > > >> >Direction      : + is good
> > > > > > > >>
> > > > > > > >> If STDDEV grows more than mean, there is more jitter,
> > > > > > > >> which is not "good".
> > > > > > > >
> > > > > > > > This is true.  On the other hand, the mean grew way more in absolute
> > > > > > > > terms than did STDDEV.  So might this be a reasonable tradeoff?
> > > > > > >
> > > > > > > Also I'd point out that MIN of TEST is better than MAX of BASE, which means
> > > > > > > there's always an improvement for this config. So jitter here means it's
> > > > > > > changing between better and more better :) and not between worse and (more)
> > > > > > > better.
> > > > > > >
> > > > > > > The annoying part of course is that for other configs it's consistently the
> > > > > > > opposite.
> > > > > >
> > > > > > Hi Vlastimil,
> > > > > > I ran my mmap stress test that runs 20000 cycles of mmapping 50 VMAs,
> > > > > > faulting them in then unmapping and timing only mmap and munmap calls.
> > > > > > This is not a realistic scenario but works well for A/B comparison.
> > > > > >
> > > > > > The numbers are below with sheaves showing a clear improvement:
> > > > > >
> > > > > > Baseline
> > > > > >             avg             stdev
> > > > > > mmap        2.621073        0.2525161631
> > > > > > munmap      2.292965        0.008831973052
> > > > > > total       4.914038        0.2572620923
> > > > > >
> > > > > > Sheaves
> > > > > >             avg            stdev           avg_diff        stdev_diff
> > > > > > mmap        1.561220667    0.07748897037   -40.44%        -69.31%
> > > > > > munmap      2.042071       0.03603083448   -10.94%        307.96%
> > > > > > total       3.603291667    0.113209047     -26.67%        -55.99%
> > > > > >
> > > > > Could you run your test with the below patch dropped?
> > > >
> > > > Sure, will try later today and report.
> > > 
> > > Sheaves with [04/23] patch reverted:
> > > 
> > >             avg             avg_diff
> > > mmap        2.143948        -18.20%
> > > munmap      2.343707         2.21%
> > > total       4.487655        -8.68%
> > > 
> > With offloading to sheaves, mmap/munmap is faster; I assume it is
> > because the same objects are reused from the sheaves after reclaim,
> > whereas in kvfree_rcu() we just free them.
> 
> Sorry, I am having trouble following where you think the speed up is
> coming from.
> 
> Can you clarify what you mean by offloading and reclaim in this context?
> 
[1] <Sheaves series>
             avg            stdev           avg_diff        stdev_diff
 mmap        1.561220667    0.07748897037   -40.44%        -69.31%
 munmap      2.042071       0.03603083448   -10.94%        307.96%
 total       3.603291667    0.113209047     -26.67%        -55.99%
[1] <Sheaves series>

[2] <Sheaves series but with [04/23] patch reverted>
             avg             avg_diff
 mmap        2.143948        -18.20%
 munmap      2.343707         2.21%
 total       4.487655        -8.68%
[2] <Sheaves series but with [04/23] patch reverted>
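(The avg_diff figures in these tables are simply the relative change of
each test average against the corresponding baseline average posted
earlier in the thread, i.e. (test - base) / base. A small sketch, using
the numbers from the thread:)

```python
# Averages reported earlier in the thread (seconds)
baseline = {"mmap": 2.621073, "munmap": 2.292965, "total": 4.914038}
sheaves  = {"mmap": 1.561220667, "munmap": 2.042071, "total": 3.603291667}

# avg_diff as a percentage; negative means the test run was faster
for op in baseline:
    diff = (sheaves[op] - baseline[op]) / baseline[op] * 100
    print(f"{op:8s} avg_diff = {diff:+.2f}%")
```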

I meant those two sets of results. It is a comparison of freeing to
"sheaves" versus without them in the kvfree_rcu() path.
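(The test program itself is not posted in this thread. As a rough
illustration only, the loop Suren describes, cycles of mapping 50
anonymous regions, faulting them in, then unmapping, while timing only
the map/unmap calls, could be approximated as below; the cycle count and
per-mapping size here are arbitrary and much smaller than the 20000
cycles used in the actual test.)

```python
import mmap
import time

CYCLES = 50            # the thread describes 20000 cycles; reduced here
NVMAS = 50             # 50 mappings per cycle, as in the described test
SIZE = 64 * 1024       # per-mapping size is not stated; arbitrary choice

buf = b"x" * SIZE
map_t = unmap_t = 0.0

for _ in range(CYCLES):
    t0 = time.perf_counter()
    maps = [mmap.mmap(-1, SIZE) for _ in range(NVMAS)]  # anonymous mappings
    map_t += time.perf_counter() - t0

    for m in maps:     # fault the pages in by writing; not timed
        m.write(buf)

    t0 = time.perf_counter()
    for m in maps:
        m.close()      # munmap
    unmap_t += time.perf_counter() - t0

print(f"mmap {map_t:.6f}s  munmap {unmap_t:.6f}s  total {map_t + unmap_t:.6f}s")
```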

--
Uladzislau Rezki
