Message-ID: <c8c2d782-e67d-4f4f-917d-6cb198fe1175@huaweicloud.com>
Date: Tue, 6 Jan 2026 21:27:42 +0800
From: Chen Ridong <chenridong@...weicloud.com>
To: Johannes Weiner <hannes@...xchg.org>
Cc: akpm@...ux-foundation.org, david@...nel.org, lorenzo.stoakes@...cle.com,
Liam.Howlett@...cle.com, vbabka@...e.cz, rppt@...nel.org, surenb@...gle.com,
mhocko@...e.com, axelrasmussen@...gle.com, yuanchu@...gle.com,
weixugc@...gle.com, zhengqi.arch@...edance.com, shakeel.butt@...ux.dev,
linux-mm@...ck.org, linux-kernel@...r.kernel.org, lujialin4@...wei.com,
chenridong@...wei.com
Subject: Re: [RFC -next] memcg: Optimize creation performance when LRU_GEN is
enabled

On 2025/11/27 1:15, Johannes Weiner wrote:
> On Wed, Nov 19, 2025 at 08:37:22AM +0000, Chen Ridong wrote:
>> From: Chen Ridong <chenridong@...wei.com>
>>
>> With LRU_GEN=y and LRU_GEN_ENABLED=n, a performance regression occurs
>> when creating a large number of memory cgroups (memcgs):
>>
>> # time mkdir testcg_{1..10000}
>>
>> real 0m7.167s
>> user 0m0.037s
>> sys 0m6.773s
>>
>> # time mkdir testcg_{1..20000}
>>
>> real 0m27.158s
>> user 0m0.079s
>> sys 0m26.270s
>>
>> In contrast, with LRU_GEN=n, creation of the same number of memcgs
>> performs better:
>>
>> # time mkdir testcg_{1..10000}
>>
>> real 0m3.386s
>> user 0m0.044s
>> sys 0m3.009s
>>
>> # time mkdir testcg_{1..20000}
>>
>> real 0m6.876s
>> user 0m0.075s
>> sys 0m6.121s
>>
>> The root cause is that lru_gen node onlining uses hlist_nulls_add_tail_rcu,
>> which traverses the entire list to find the tail. This traversal scales
>> with the number of memcgs, even when LRU_GEN is runtime-disabled.
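
To make the cost concrete: a simplified sketch of that insertion path,
paraphrased from my reading of include/linux/rculist_nulls.h. Treat it
as illustrative rather than the exact upstream code:

/* Simplified from hlist_nulls_add_tail_rcu(); assumes <linux/rculist_nulls.h>. */
static inline void add_tail_sketch(struct hlist_nulls_node *n,
				   struct hlist_nulls_head *h)
{
	struct hlist_nulls_node *i, *last = NULL;

	/* Linear walk to the current tail: cost scales with list length. */
	for (i = h->first; !is_a_nulls(i); i = i->next)
		last = i;

	if (last) {
		/* Splice the new node in after the current tail. */
		n->next = last->next;
		n->pprev = &last->next;
		rcu_assign_pointer(hlist_nulls_next_rcu(last), n);
	} else {
		/* Empty chain: the new node becomes the head. */
		hlist_nulls_add_head_rcu(n, h);
	}
}

So onlining the Nth memcg pays an O(N) walk, which matches the
super-linear mkdir times above.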
>
> Can you please look into removing the memcg LRU instead?
>
> Use mem_cgroup_iter() with a reclaim cookie in shrink_many(), like we
> do in shrink_node_memcgs().
>
> The memcg LRU is complicated, and it only works for global reclaim; if
> you have a subtree with a memory.max at the top, it'll go through
> shrink_node_memcgs() already anyway.

Hi all,

I previously attempted to remove the memcg LRU [1], but that change
introduced a regression that significantly increased kswapd overhead.

Circling back to this issue now: does anyone have suggestions on how to
address it effectively?

[1] https://lore.kernel.org/cgroups/0b8ea26f-71f7-4f6d-b0d6-7d42e087a7ed@huaweicloud.com/T/#t
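
For discussion, here is roughly how I read Johannes's suggestion: drop
the memcg LRU and have shrink_many() walk memcgs with mem_cgroup_iter()
and the shared reclaim cookie, the way shrink_node_memcgs() does. This
is only a sketch to frame the question, not a tested patch; shrink_one()
stands in for the existing per-memcg aging/eviction, and its return
value handling is elided:

/* Sketch only; loosely modeled on shrink_node_memcgs() in mm/vmscan.c. */
static void shrink_many_sketch(struct lruvec *lruvec, struct scan_control *sc)
{
	struct pglist_data *pgdat = lruvec_pgdat(lruvec);
	struct mem_cgroup_reclaim_cookie reclaim = {
		.pgdat = pgdat,
	};
	struct mem_cgroup *memcg;

	/* Resume the hierarchy walk where the last global pass left off. */
	memcg = mem_cgroup_iter(NULL, NULL, &reclaim);
	do {
		struct lruvec *lv = mem_cgroup_lruvec(memcg, pgdat);

		shrink_one(lv, sc);

		if (sc->nr_reclaimed >= sc->nr_to_reclaim) {
			/* Release the iterator reference before bailing out. */
			mem_cgroup_iter_break(NULL, memcg);
			break;
		}
	} while ((memcg = mem_cgroup_iter(NULL, memcg, &reclaim)));
}

The regression I hit in [1] suggests a plain walk like this loses
whatever ordering benefit the memcg LRU gives kswapd, so the open
question is how to keep that benefit without the O(N) tail insertion.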
--
Best regards,
Ridong