Message-ID: <f7c33974-e520-387e-9e2f-1e523bfe1545@gentwo.org>
Date: Tue, 4 Nov 2025 14:11:18 -0800 (PST)
From: "Christoph Lameter (Ampere)" <cl@...two.org>
To: Vlastimil Babka <vbabka@...e.cz>
cc: Andrew Morton <akpm@...ux-foundation.org>, 
    David Rientjes <rientjes@...gle.com>, 
    Roman Gushchin <roman.gushchin@...ux.dev>, 
    Harry Yoo <harry.yoo@...cle.com>, Uladzislau Rezki <urezki@...il.com>, 
    "Liam R. Howlett" <Liam.Howlett@...cle.com>, 
    Suren Baghdasaryan <surenb@...gle.com>, 
    Sebastian Andrzej Siewior <bigeasy@...utronix.de>, 
    Alexei Starovoitov <ast@...nel.org>, linux-mm@...ck.org, 
    linux-kernel@...r.kernel.org, linux-rt-devel@...ts.linux.dev, 
    bpf@...r.kernel.org, kasan-dev@...glegroups.com, 
    Alexander Potapenko <glider@...gle.com>, Marco Elver <elver@...gle.com>, 
    Dmitry Vyukov <dvyukov@...gle.com>
Subject: Re: [PATCH RFC 00/19] slab: replace cpu (partial) slabs with
 sheaves

On Thu, 23 Oct 2025, Vlastimil Babka wrote:

> Besides (hopefully) improved performance, this removes the rather
> complicated code related to the lockless fastpaths (using
> this_cpu_try_cmpxchg128/64) and its complications with PREEMPT_RT or
> kmalloc_nolock().

Going back to a strict LIFO scheme for alloc/free removes the following
performance features:

1. Objects are served randomly from a variety of slab pages instead of
serving all available objects from a single slab page and then from the
next. This means that the objects require a larger set of TLB entries to
cover. TLB pressure will increase.

2. The number of partial slabs will increase since the free objects in a
partial page are not used up before moving on to the next. Instead, free
objects from random slab pages are used.

Spatial object locality is reduced. Temporal object hotness increases.
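
Points 1 and 2 can be illustrated with a toy user-space simulation. This is
only a sketch under stated assumptions: it is not kernel code, it does not
model per-cpu state, sheaf capacity or NUMA, and every name and parameter in
it (PAGES, OBJS_PER_PAGE, ALLOCS, simulate) is hypothetical. It frees half of
all objects in random order, then serves the same number of allocations either
from one strict LIFO stack or by draining one slab's free objects before
moving on to the next, and counts the distinct pages touched and the partial
slabs left over.

```python
import random

PAGES = 64          # hypothetical number of slab pages
OBJS_PER_PAGE = 32  # hypothetical objects per slab page
ALLOCS = 64         # allocations served after the frees


def simulate(seed=0):
    rng = random.Random(seed)
    # Every object is identified by (page, index); start fully allocated.
    objects = [(p, i) for p in range(PAGES) for i in range(OBJS_PER_PAGE)]
    # Free half of all objects in random order.
    freed = rng.sample(objects, len(objects) // 2)

    # Model A: strict LIFO. Frees are pushed on one stack, allocations pop.
    stack = list(freed)
    lifo_served = [stack.pop() for _ in range(ALLOCS)]

    # Model B: per-slab freelists. Drain one slab's free objects
    # completely before moving on to the next slab.
    freelists = {}
    for page, idx in freed:
        freelists.setdefault(page, []).append(idx)
    slab_served = []
    for page in list(freelists):
        while freelists[page] and len(slab_served) < ALLOCS:
            slab_served.append((page, freelists[page].pop()))
        if len(slab_served) == ALLOCS:
            break

    def pages_touched(served):
        # Distinct pages are a rough stand-in for TLB entries needed.
        return len({page for page, _ in served})

    def partial_slabs(served):
        # A slab is partial if it has some, but not all, objects free.
        free = {}
        for page, _ in freed:
            free[page] = free.get(page, 0) + 1
        for page, _ in served:
            free[page] -= 1
        return sum(1 for n in free.values() if 0 < n < OBJS_PER_PAGE)

    return {
        "lifo_pages": pages_touched(lifo_served),
        "slab_pages": pages_touched(slab_served),
        "lifo_partial": partial_slabs(lifo_served),
        "slab_partial": partial_slabs(slab_served),
    }


if __name__ == "__main__":
    print(simulate())
```

With these parameters the per-slab model should touch far fewer pages for the
same number of allocations, and it turns drained slabs back into full ones
instead of leaving them partial. Real workloads do not free in uniformly
random order, so the size of the effect here says nothing about actual
kernel behavior.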

> The lockless slab freelist+counters update operation using
> try_cmpxchg128/64 remains and is crucial for freeing remote NUMA objects
> without repeating the "alien" array flushing of SLAB, and to allow
> flushing objects from sheaves to slabs mostly without the node
> list_lock.

Hmm... So potentially cache-hot objects are lost that way and reused on
another node next. The role of the alien caches in SLAB was to cover that
case, and we saw performance regressions without these caches.

The method of freeing still reduces the number of remote partial slabs
that have to be managed and increases the locality of the objects.
