Message-ID: <c8c2d782-e67d-4f4f-917d-6cb198fe1175@huaweicloud.com>
Date: Tue, 6 Jan 2026 21:27:42 +0800
From: Chen Ridong <chenridong@...weicloud.com>
To: Johannes Weiner <hannes@...xchg.org>
Cc: akpm@...ux-foundation.org, david@...nel.org, lorenzo.stoakes@...cle.com,
Liam.Howlett@...cle.com, vbabka@...e.cz, rppt@...nel.org, surenb@...gle.com,
mhocko@...e.com, axelrasmussen@...gle.com, yuanchu@...gle.com,
weixugc@...gle.com, zhengqi.arch@...edance.com, shakeel.butt@...ux.dev,
linux-mm@...ck.org, linux-kernel@...r.kernel.org, lujialin4@...wei.com,
chenridong@...wei.com
Subject: Re: [RFC -next] memcg: Optimize creation performance when LRU_GEN is
enabled

On 2025/11/27 1:15, Johannes Weiner wrote:
> On Wed, Nov 19, 2025 at 08:37:22AM +0000, Chen Ridong wrote:
>> From: Chen Ridong <chenridong@...wei.com>
>>
>> With LRU_GEN=y and LRU_GEN_ENABLED=n, a performance regression occurs
>> when creating a large number of memory cgroups (memcgs):
>>
>> # time mkdir testcg_{1..10000}
>>
>> real 0m7.167s
>> user 0m0.037s
>> sys 0m6.773s
>>
>> # time mkdir testcg_{1..20000}
>>
>> real 0m27.158s
>> user 0m0.079s
>> sys 0m26.270s
>>
>> In contrast, with LRU_GEN=n, creation of the same number of memcgs
>> performs better:
>>
>> # time mkdir testcg_{1..10000}
>>
>> real 0m3.386s
>> user 0m0.044s
>> sys 0m3.009s
>>
>> # time mkdir testcg_{1..20000}
>>
>> real 0m6.876s
>> user 0m0.075s
>> sys 0m6.121s
>>
>> The root cause is that lru_gen node onlining uses hlist_nulls_add_tail_rcu,
>> which traverses the entire list to find the tail. This traversal scales
>> with the number of memcgs, even when LRU_GEN is runtime-disabled.
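
To make the cost concrete: a simplified sketch of that insertion path,
paraphrased from my reading of include/linux/rculist_nulls.h. Treat it
as illustrative rather than the exact upstream code:

/* Simplified from hlist_nulls_add_tail_rcu(); assumes <linux/rculist_nulls.h>. */
static inline void add_tail_sketch(struct hlist_nulls_node *n,
				   struct hlist_nulls_head *h)
{
	struct hlist_nulls_node *i, *last = NULL;

	/* Linear walk to the current tail: cost scales with list length. */
	for (i = h->first; !is_a_nulls(i); i = i->next)
		last = i;

	if (last) {
		/* Splice the new node in after the current tail. */
		n->next = last->next;
		n->pprev = &last->next;
		rcu_assign_pointer(hlist_nulls_next_rcu(last), n);
	} else {
		/* Empty chain: the new node becomes the head. */
		hlist_nulls_add_head_rcu(n, h);
	}
}

So onlining the Nth memcg pays an O(N) walk, which matches the
super-linear mkdir times above.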
>
> Can you please look into removing the memcg LRU instead?
>
> Use mem_cgroup_iter() with a reclaim cookie in shrink_many(), like we
> do in shrink_node_memcgs().
>
> The memcg LRU is complicated, and it only works for global reclaim; if
> you have a subtree with a memory.max at the top, it'll go through
> shrink_node_memcgs() already anyway.

Hi all,

I previously attempted to remove the memcg LRU [1], but that change
introduced a regression that significantly increased kswapd overhead.

Circling back to this issue now: does anyone have suggestions on how to
address it effectively?

[1] https://lore.kernel.org/cgroups/0b8ea26f-71f7-4f6d-b0d6-7d42e087a7ed@huaweicloud.com/T/#t
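
For discussion, here is roughly how I read Johannes's suggestion: drop
the memcg LRU and have shrink_many() walk memcgs with mem_cgroup_iter()
and the shared reclaim cookie, the way shrink_node_memcgs() does. This
is only a sketch to frame the question, not a tested patch; shrink_one()
stands in for the existing per-memcg aging/eviction, and its return
value handling is elided:

/* Sketch only; loosely modeled on shrink_node_memcgs() in mm/vmscan.c. */
static void shrink_many_sketch(struct lruvec *lruvec, struct scan_control *sc)
{
	struct pglist_data *pgdat = lruvec_pgdat(lruvec);
	struct mem_cgroup_reclaim_cookie reclaim = {
		.pgdat = pgdat,
	};
	struct mem_cgroup *memcg;

	/* Resume the hierarchy walk where the last global pass left off. */
	memcg = mem_cgroup_iter(NULL, NULL, &reclaim);
	do {
		struct lruvec *lv = mem_cgroup_lruvec(memcg, pgdat);

		shrink_one(lv, sc);

		if (sc->nr_reclaimed >= sc->nr_to_reclaim) {
			/* Release the iterator reference before bailing out. */
			mem_cgroup_iter_break(NULL, memcg);
			break;
		}
	} while ((memcg = mem_cgroup_iter(NULL, memcg, &reclaim)));
}

The regression I hit in [1] suggests a plain walk like this loses
whatever ordering benefit the memcg LRU gives kswapd, so the open
question is how to keep that benefit without the O(N) tail insertion.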
--
Best regards,
Ridong