Message-ID: <aC_OT5IQVYk2wFU_@harry>
Date: Fri, 23 May 2025 10:24:31 +0900
From: Harry Yoo <harry.yoo@...cle.com>
To: Muchun Song <muchun.song@...ux.dev>
Cc: Muchun Song <songmuchun@...edance.com>, hannes@...xchg.org,
mhocko@...nel.org, roman.gushchin@...ux.dev, shakeel.butt@...ux.dev,
akpm@...ux-foundation.org, david@...morbit.com,
zhengqi.arch@...edance.com, yosry.ahmed@...ux.dev, nphamcs@...il.com,
chengming.zhou@...ux.dev, linux-kernel@...r.kernel.org,
cgroups@...r.kernel.org, linux-mm@...ck.org,
hamzamahfooz@...ux.microsoft.com, apais@...ux.microsoft.com
Subject: Re: [PATCH RFC 27/28] mm: memcontrol: eliminate the problem of dying
memory cgroup for LRU folios
On Thu, May 22, 2025 at 10:31:20AM +0800, Muchun Song wrote:
>
>
> > On May 20, 2025, at 19:27, Harry Yoo <harry.yoo@...cle.com> wrote:
> >
> > On Tue, Apr 15, 2025 at 10:45:31AM +0800, Muchun Song wrote:
> >> Pagecache pages are charged at allocation time and hold a reference
> >> to the original memory cgroup until reclaimed. Depending on memory
> >> pressure, page sharing patterns between different cgroups and cgroup
> >> creation/destruction rates, many dying memory cgroups can be pinned
> >> by pagecache pages, reducing page reclaim efficiency and wasting
> >> memory. Converting LRU folios and most other raw memory cgroup pins
> >> to the object cgroup direction can fix this long-standing problem.
> >>
> >> Finally, folio->memcg_data of LRU folios and kmem folios will always
> >> point to an object cgroup pointer. The folio->memcg_data of slab
> >> folios will point to a vector of object cgroups.
> >>
> >> Signed-off-by: Muchun Song <songmuchun@...edance.com>
> >> ---
> >> include/linux/memcontrol.h | 78 +++++--------
> >> mm/huge_memory.c | 33 ++++++
> >> mm/memcontrol-v1.c | 15 ++-
> >> mm/memcontrol.c | 228 +++++++++++++++++++++++++------------
> >> 4 files changed, 222 insertions(+), 132 deletions(-)
> >
> > [...]
> >
> >> +static void lruvec_reparent_lru(struct lruvec *src, struct lruvec *dst,
> >> + enum lru_list lru)
> >> +{
> >> + int zid;
> >> + struct mem_cgroup_per_node *mz_src, *mz_dst;
> >> +
> >> + mz_src = container_of(src, struct mem_cgroup_per_node, lruvec);
> >> + mz_dst = container_of(dst, struct mem_cgroup_per_node, lruvec);
> >> +
> >> + if (lru != LRU_UNEVICTABLE)
> >> + list_splice_tail_init(&src->lists[lru], &dst->lists[lru]);
> >> +
> >> + for (zid = 0; zid < MAX_NR_ZONES; zid++) {
> >> + mz_dst->lru_zone_size[zid][lru] += mz_src->lru_zone_size[zid][lru];
> >> + mz_src->lru_zone_size[zid][lru] = 0;
> >> + }
> >> +}
> >
> > I think this function should also update memcg and lruvec stats of
> > parent memcg? Or is it intentional?
>
> Hi Harry,
>
> No, there is no need, because the statistics are accounted hierarchically.
>
> Thanks.
Oh, you are absolutely right. I was missing that.
Thanks!
--
Cheers,
Harry / Hyeonggon