[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <pdmjsvpkl5nsntiwfwguplajq27ak3xpboq3ab77zrbu763pq7@la3hyiqigpir>
Date: Fri, 30 Jan 2026 12:50:14 +0800
From: Hao Li <hao.li@...ux.dev>
To: Vlastimil Babka <vbabka@...e.cz>
Cc: Harry Yoo <harry.yoo@...cle.com>, Petr Tesarik <ptesarik@...e.com>,
Christoph Lameter <cl@...two.org>, David Rientjes <rientjes@...gle.com>,
Roman Gushchin <roman.gushchin@...ux.dev>, Andrew Morton <akpm@...ux-foundation.org>,
Uladzislau Rezki <urezki@...il.com>, "Liam R. Howlett" <Liam.Howlett@...cle.com>,
Suren Baghdasaryan <surenb@...gle.com>, Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
Alexei Starovoitov <ast@...nel.org>, linux-mm@...ck.org, linux-kernel@...r.kernel.org,
linux-rt-devel@...ts.linux.dev, bpf@...r.kernel.org, kasan-dev@...glegroups.com,
kernel test robot <oliver.sang@...el.com>, stable@...r.kernel.org, "Paul E. McKenney" <paulmck@...nel.org>
Subject: Re: [PATCH v4 00/22] slab: replace cpu (partial) slabs with sheaves
On Thu, Jan 29, 2026 at 04:28:01PM +0100, Vlastimil Babka wrote:
>
> So previously those would become kind of double
> cached by both sheaves and cpu (partial) slabs (and thus hopefully benefited
> more than they should) since sheaves introduction in 6.18, and now they are
> not double cached anymore?
>
I've conducted new tests, and here are the details of three scenarios:
1. Checked out commit 9d4e6ab865c4, which represents the state before the
introduction of the sheaves mechanism.
2. Tested with 6.19-rc5, which includes sheaves but does not yet apply the
"sheaves for all" patchset.
3. Applied the "sheaves for all" patchset and also included the "avoid
list_lock contention" patch.
Results:
For scenario 2 (with sheaves but without "sheaves for all"), there is a
noticeable performance improvement compared to scenario 1:
will-it-scale.128.processes +34.3%
will-it-scale.192.processes +35.4%
will-it-scale.64.processes +31.5%
will-it-scale.per_process_ops +33.7%
For scenario 3 (after applying "sheaves for all"), performance slightly
regressed compared to scenario 1:
will-it-scale.128.processes -1.3%
will-it-scale.192.processes -4.2%
will-it-scale.64.processes -1.2%
will-it-scale.per_process_ops -2.1%
Analysis:
So when the sheaf size for maple nodes is set to 32 by default, the performance
of fully adopting the sheaves mechanism roughly matches the performance of the
previous approach that relied solely on the percpu slab partial list.
The performance regression observed with the "sheaves for all" patchset can
actually be explained as follows: moving from scenario 1 to scenario 2
introduces an additional cache layer, which boosts performance temporarily.
When moving from scenario 2 to scenario 3, this additional cache layer is
removed, then performance reverted to its original level.
So I think the performance of the percpu partial list and the sheaves mechanism
is roughly the same, which is consistent with our expectations.
--
Thanks,
Hao
Powered by blists - more mailing lists