Message-ID: <4560a13c-a5cc-41ad-ae5e-ac40a0396286@suse.cz>
Date: Thu, 22 Jan 2026 09:37:18 +0100
From: Vlastimil Babka <vbabka@...e.cz>
To: Harry Yoo <harry.yoo@...cle.com>
Cc: Petr Tesarik <ptesarik@...e.com>, Christoph Lameter <cl@...two.org>,
 David Rientjes <rientjes@...gle.com>,
 Roman Gushchin <roman.gushchin@...ux.dev>, Hao Li <hao.li@...ux.dev>,
 Andrew Morton <akpm@...ux-foundation.org>,
 Uladzislau Rezki <urezki@...il.com>,
 "Liam R. Howlett" <Liam.Howlett@...cle.com>,
 Suren Baghdasaryan <surenb@...gle.com>,
 Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
 Alexei Starovoitov <ast@...nel.org>, linux-mm@...ck.org,
 linux-kernel@...r.kernel.org, linux-rt-devel@...ts.linux.dev,
 bpf@...r.kernel.org, kasan-dev@...glegroups.com
Subject: Re: [PATCH v3 17/21] slab: refill sheaves from all nodes

On 1/22/26 05:44, Harry Yoo wrote:
> On Fri, Jan 16, 2026 at 03:40:37PM +0100, Vlastimil Babka wrote:
>> __refill_objects() currently only attempts to get partial slabs from the
>> local node and then allocates new slab(s). Expand it to also try other
>> nodes, while observing the remote node defrag ratio, similarly to
>> get_any_partial().
>> 
>> This will prevent allocating new slabs on a node while other nodes have
>> many free slabs. It does mean sheaves will contain non-local objects in
>> that case. Allocations that care about a specific node will still be
>> served appropriately, but might take a slowpath allocation.
>> 
>> Like get_any_partial(), we do observe cpuset_zone_allowed(), although we
>> might be refilling a sheaf that will then be used from a different
>> allocation context.
>> 
>> We can also use the resulting refill_objects() in
>> __kmem_cache_alloc_bulk() for non-debug caches. This means
>> kmem_cache_alloc_bulk() will get better performance when sheaves are
>> exhausted. kmem_cache_alloc_bulk() cannot indicate a preferred node, so
>> it's compatible with the sheaf refill's preference for the local node.
>> Its users also pass gfp flags that allow spinning, so document that
>> as a requirement.
>> 
>> Reviewed-by: Suren Baghdasaryan <surenb@...gle.com>
>> Signed-off-by: Vlastimil Babka <vbabka@...e.cz>
>> ---
> 
> Could this cause strict_numa to not work as intended when
> the policy is MPOL_BIND?

Hm, I guess it could be optimized differently later. I assume people running
strict_numa would also tune remote_node_defrag_ratio accordingly and wouldn't
run into this often.
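
For reference, the gate that get_any_partial() applies before scanning
remote nodes looks roughly like this (a simplified, from-memory sketch of
the mm/slub.c logic, not the exact code; the full zonelist walk and
locking are elided):

	/*
	 * Skip the remote scan most of the time, proportionally to
	 * remote_node_defrag_ratio (the sysfs value 0..100 is stored
	 * multiplied by 10 and compared against a 0..1023 sample).
	 */
	if (!s->remote_node_defrag_ratio ||
	    get_cycles() % 1024 > s->remote_node_defrag_ratio)
		return NULL;

	/*
	 * Otherwise walk the zonelist, checking cpuset_zone_allowed()
	 * and n->nr_partial > s->min_partial for each candidate node.
	 */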

> alloc_from_pcs() has:
>> #ifdef CONFIG_NUMA
>>         if (static_branch_unlikely(&strict_numa) &&
>>                          node == NUMA_NO_NODE) {
>>
>>                 struct mempolicy *mpol = current->mempolicy;
>>
>>                 if (mpol) {
>>                         /*
>>                          * Special BIND rule support. If the local node
>>                          * is in permitted set then do not redirect
>>                          * to a particular node.
>>                          * Otherwise we apply the memory policy to get
>>                          * the node we need to allocate on.
>>                          */
>>                         if (mpol->mode != MPOL_BIND ||
>>                                         !node_isset(numa_mem_id(), mpol->nodes))
> 
> This assumes the sheaves contain (mostly, although it wasn't strictly
> guaranteed) objects from the local node, and this change breaks that
> assumption.
> 
> So... perhaps remove "Special BIND rule support"?

Ideally we would check whether the object in the sheaf is from the permitted
nodes instead of picking the local one, in a way that doesn't make systems
with strict_numa disabled slower :)
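
Something like this, perhaps (completely untested sketch, helper name made
up; the virt_to_slab() lookup on every hit is exactly the kind of overhead
we'd need to keep away from the strict_numa-disabled fast path):

	/*
	 * Check whether an object already sitting in the sheaf comes
	 * from a node permitted by the MPOL_BIND nodemask, instead of
	 * unconditionally redirecting to mempolicy_slab_node().
	 */
	static inline bool sheaf_object_allowed(void *object,
						struct mempolicy *mpol)
	{
		int nid = slab_nid(virt_to_slab(object));

		return node_isset(nid, mpol->nodes);
	}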

>>
>>                                 node = mempolicy_slab_node(); 
>>                 }
>>         }
>> #endif
> 
> Otherwise LGTM.
> 

