Message-ID: <20251205025727.8324-1-zhongjinji@honor.com>
Date: Fri, 5 Dec 2025 10:57:27 +0800
From: zhongjinji <zhongjinji@...or.com>
To: <hannes@...xchg.org>
CC: <Liam.Howlett@...cle.com>, <akpm@...ux-foundation.org>,
<axelrasmussen@...gle.com>, <cgroups@...r.kernel.org>,
<chenridong@...wei.com>, <chenridong@...weicloud.com>, <corbet@....net>,
<david@...nel.org>, <linux-doc@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, <linux-mm@...ck.org>,
<lorenzo.stoakes@...cle.com>, <lujialin4@...wei.com>, <mhocko@...e.com>,
<muchun.song@...ux.dev>, <roman.gushchin@...ux.dev>, <rppt@...nel.org>,
<shakeel.butt@...ux.dev>, <surenb@...gle.com>, <vbabka@...e.cz>,
<weixugc@...gle.com>, <yuanchu@...gle.com>, <yuzhao@...gle.com>,
<zhengqi.arch@...edance.com>
Subject: Re: [RFC PATCH -next 1/2] mm/mglru: use mem_cgroup_iter for global reclaim
> From: Chen Ridong <chenridong@...wei.com>
>
> The memcg LRU was originally introduced for global reclaim to enhance
> scalability. However, its implementation complexity has led to performance
> regressions when dealing with a large number of memory cgroups [1].
>
> As suggested by Johannes [1], this patch adopts mem_cgroup_iter with
> cookie-based iteration for global reclaim, aligning with the approach
> already used in shrink_node_memcgs. This simplification removes the
> dedicated memcg LRU tracking while maintaining the core functionality.
>
> It performed a stress test based on Zhao Yu's methodology [2] on a
> 1 TB, 4-node NUMA system. The results are summarized below:
>
>                                     memcg LRU    memcg iter
> stddev(pgsteal) / mean(pgsteal)     91.2%        75.7%
> sum(pgsteal) / sum(requested)       216.4%       230.5%

Is more data available, for example the kswapd load or the refault counts?
I am concerned about these two data points because Yu Zhao's
implementation controls the fairness of aging through the memcg
generation (get_memcg_gen). This reduces excessive aging of particular
cgroups, which benefits kswapd's power consumption. In addition, pages
that aged earlier can be considered colder across the whole system, so
reclaiming them first should also lower the refault counts.

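For reference, the two ratios in the table above could be computed from
per-memcg counters roughly as sketched below. This is only an
illustration: the function names and the sample values are hypothetical,
not taken from the test, and it assumes the population standard
deviation, which the patch description does not specify.

```python
from statistics import mean, pstdev


def fairness_cv(pgsteal):
    """stddev(pgsteal) / mean(pgsteal); lower means more even reclaim."""
    # Assumption: population stddev. The original test may use sample stddev.
    return pstdev(pgsteal) / mean(pgsteal)


def reclaim_ratio(pgsteal, requested):
    """sum(pgsteal) / sum(requested); above 100% means over-reclaim."""
    return sum(pgsteal) / sum(requested)


# Hypothetical per-memcg page counts, for illustration only.
pgsteal = [100, 200, 300]
requested = [100, 100, 100]

print(f"stddev/mean:   {fairness_cv(pgsteal):.1%}")
print(f"steal/request: {reclaim_ratio(pgsteal, requested):.1%}")
```

Neither ratio says anything about which pages were stolen, which is why
the kswapd load and refault counts would be useful alongside them.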
> The new implementation demonstrates a significant improvement in
> fairness, reducing the standard deviation relative to the mean by
> 15.5 percentage points. While the reclaim accuracy shows a slight
> increase in overscan (from 85086871 to 90633890, 6.5%).
>
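The two deltas quoted above can be rechecked from the reported numbers.
A quick sketch, using only the values given in the patch description
(the variable names are mine):

```python
# Values copied from the quoted patch description.
cv_lru, cv_iter = 91.2, 75.7                        # stddev/mean, in percent
overscan_lru, overscan_iter = 85086871, 90633890    # overscan figures

# Fairness change, in percentage points.
cv_drop_pp = round(cv_lru - cv_iter, 1)

# Relative increase in overscan.
overscan_increase = (overscan_iter - overscan_lru) / overscan_lru

print(cv_drop_pp)                  # 15.5
print(f"{overscan_increase:.1%}")  # 6.5%
```

Both match the description: a 15.5 percentage point drop in
stddev/mean and a 6.5% relative increase in overscan.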
> The primary benefits of this change are:
> 1. Simplified codebase by removing custom memcg LRU infrastructure
> 2. Improved fairness in memory reclaim across multiple cgroups
> 3. Better performance when creating many memory cgroups
>
> [1] https://lore.kernel.org/r/20251126171513.GC135004@cmpxchg.org
> [2] https://lore.kernel.org/r/20221222041905.2431096-7-yuzhao@google.com
> Signed-off-by: Chen Ridong <chenridong@...wei.com>