Message-ID: <20251126171513.GC135004@cmpxchg.org>
Date: Wed, 26 Nov 2025 12:15:13 -0500
From: Johannes Weiner <hannes@...xchg.org>
To: Chen Ridong <chenridong@...weicloud.com>
Cc: akpm@...ux-foundation.org, david@...nel.org, lorenzo.stoakes@...cle.com,
Liam.Howlett@...cle.com, vbabka@...e.cz, rppt@...nel.org,
surenb@...gle.com, mhocko@...e.com, axelrasmussen@...gle.com,
yuanchu@...gle.com, weixugc@...gle.com, zhengqi.arch@...edance.com,
shakeel.butt@...ux.dev, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, lujialin4@...wei.com,
chenridong@...wei.com
Subject: Re: [RFC -next] memcg: Optimize creation performance when LRU_GEN is
enabled

On Wed, Nov 19, 2025 at 08:37:22AM +0000, Chen Ridong wrote:
> From: Chen Ridong <chenridong@...wei.com>
>
> With LRU_GEN=y and LRU_GEN_ENABLED=n, a performance regression occurs
> when creating a large number of memory cgroups (memcgs):
>
> # time mkdir testcg_{1..10000}
>
> real 0m7.167s
> user 0m0.037s
> sys 0m6.773s
>
> # time mkdir testcg_{1..20000}
>
> real 0m27.158s
> user 0m0.079s
> sys 0m26.270s
>
> In contrast, with LRU_GEN=n, creation of the same number of memcgs
> performs better:
>
> # time mkdir testcg_{1..10000}
>
> real 0m3.386s
> user 0m0.044s
> sys 0m3.009s
>
> # time mkdir testcg_{1..20000}
>
> real 0m6.876s
> user 0m0.075s
> sys 0m6.121s
>
> The root cause is that lru_gen node onlining uses hlist_nulls_add_tail_rcu,
> which traverses the entire list to find the tail. This traversal scales
> with the number of memcgs, even when LRU_GEN is runtime-disabled.
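
For reference, hlist_nulls_add_tail_rcu() looks roughly like this
(paraphrased from include/linux/rculist_nulls.h; check the tree for the
exact code):

	static inline void hlist_nulls_add_tail_rcu(struct hlist_nulls_node *n,
						    struct hlist_nulls_head *h)
	{
		struct hlist_nulls_node *i, *last = NULL;

		/*
		 * Write side; a nulls hlist has no tail pointer, so
		 * every node must be walked to find the last one.
		 * With N memcgs already on the list, each insert is
		 * O(N), so creating N memcgs is O(N^2) overall.
		 */
		for (i = h->first; !is_a_nulls(i); i = i->next)
			last = i;

		if (last) {
			n->next = last->next;
			n->pprev = &last->next;
			rcu_assign_pointer(hlist_nulls_next_rcu(last), n);
		} else {
			hlist_nulls_add_head_rcu(n, h);
		}
	}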

Can you please look into removing the memcg LRU instead?

Use mem_cgroup_iter() with a reclaim cookie in shrink_many(), like we
do in shrink_node_memcgs().
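
FWIW, a minimal sketch of the pattern I mean - hand-written here, not
actual shrink_many() code, and assuming the mem_cgroup_iter() /
mem_cgroup_reclaim_cookie API as used by mm/vmscan.c:

	static void shrink_many(struct pglist_data *pgdat, struct scan_control *sc)
	{
		/*
		 * The cookie remembers where the previous reclaimer
		 * stopped in the hierarchy, so successive global
		 * reclaim passes round-robin over the memcgs instead
		 * of always starting from the first one - the fairness
		 * the memcg LRU provides, without a separate list.
		 */
		struct mem_cgroup_reclaim_cookie reclaim = {
			.pgdat = pgdat,
		};
		struct mem_cgroup *memcg;

		memcg = mem_cgroup_iter(NULL, NULL, &reclaim);
		do {
			struct lruvec *lruvec = mem_cgroup_lruvec(memcg, pgdat);

			shrink_lruvec(lruvec, sc);

			/* Stop early once the reclaim target is met. */
			if (sc->nr_reclaimed >= sc->nr_to_reclaim) {
				mem_cgroup_iter_break(NULL, memcg);
				break;
			}
		} while ((memcg = mem_cgroup_iter(NULL, memcg, &reclaim)));
	}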

The memcg LRU is complicated, and it only works for global reclaim; if
you have a subtree with a memory.max at the top, it'll go through
shrink_node_memcgs() already anyway.