Message-ID: <20251201120105.23338-1-zhongjinji@honor.com>
Date: Mon, 1 Dec 2025 20:01:05 +0800
From: zhongjinji <zhongjinji@...or.com>
To: <ryncsn@...il.com>
CC: <21cnbao@...il.com>, <Liam.Howlett@...cle.com>,
<akpm@...ux-foundation.org>, <axelrasmussen@...gle.com>, <corbet@....net>,
<david@...hat.com>, <hannes@...xchg.org>, <linux-doc@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, <linux-mm@...ck.org>,
<lorenzo.stoakes@...cle.com>, <mhocko@...nel.org>, <mhocko@...e.com>,
<rppt@...nel.org>, <shakeel.butt@...ux.dev>, <surenb@...gle.com>,
<tao.wangtao@...or.com>, <vbabka@...e.cz>, <wangzhen5@...or.com>,
<wangzicheng@...or.com>, <weixugc@...gle.com>, <willy@...radead.org>,
<yuanchu@...gle.com>, <zhengqi.arch@...edance.com>, <zhongjinji@...or.com>
Subject: Re: [PATCH 0/3] mm/lru_gen: move lru_gen control interface from debugfs to procfs
> > I strongly recommend separating this from your patchset. Avoid including
> > unrelated changes in a single patchset.
> >
> > MGLRU has a mechanism to ensure that file and anon pages can keep pace
> > with each other. In the newest kernel, the minimum generation is 2. For
> > example, if anon has only 2 generations left and we decide to reclaim
> > anon folios, we will fall back to reclaiming file pages. Sometimes,
> > this means that anon reclamation is insufficient while file pages are
> > over-reclaimed.
> >
> > static int scan_folios(unsigned long nr_to_scan, struct lruvec *lruvec,
> >                        struct scan_control *sc, int type, int tier,
> >                        struct list_head *list)
> > {
> >         ...
> >         if (get_nr_gens(lruvec, type) == MIN_NR_GENS)
> >                 return 0;
> >         ...
> > }
> >
> > This is probably not a bug, but this design can sometimes work
> > suboptimally.
> >
> > Regarding this issue, both Kairui (from the Linux server side, cc-ed) and I
> > (from the Android side) have observed it. This should be addressed in
> > MGLRU's code, and we already have kernel code for that. It is unrelated
> > to your patchset, so you shouldn’t include so many unrelated changes in
> > a single patchset.
>
> Thanks for including me in the discussion.
>
> Right, we are seeing similar problems on our server too. To work around
> it we force an age iteration before reclaiming when it happens, which
> isn't the best choice. When the LRU is long and the opposite type of
> the folios we want to reclaim is piling up in the oldest gen, a forced
> age will have to move all these folios, which leads to long tailing
> issues. Let's work on a reasonable solution for that.

We have hit the same issue on Android. When an app is frozen (which
usually means it will not be used for a long time), we want to reclaim
its anonymous pages. Once all inactive anonymous pages have been
reclaimed, reclaim cannot make further progress. If we force aging of
anonymous pages at that point, file pages are aged as well, and the
number of inactive file pages may become very large.

To address this, I tried giving anonymous and file pages separate
max_seq values, so that reclaiming anonymous pages through
memory.reclaim ages only the anonymous pages. However, this requires
extensive code changes and does not seem worth implementing.
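
To make the per-type max_seq idea concrete, here is a rough user-space
model. It is only a sketch, not the kernel code and not the patch I
tried: struct toy_lrugen and all toy_* helpers are hypothetical names
made up for illustration, and only MIN_NR_GENS/MAX_NR_GENS are meant to
mirror the kernel constants.

```c
#include <assert.h>
#include <stdbool.h>

#define MIN_NR_GENS 2	/* mirrors the kernel constant */
#define MAX_NR_GENS 4	/* mirrors the kernel constant */

enum { TYPE_ANON, TYPE_FILE, NR_TYPES };

/*
 * Hypothetical toy state: one max_seq per type. In the real MGLRU
 * code max_seq is shared by anon and file; only min_seq is per type.
 */
struct toy_lrugen {
	unsigned long max_seq[NR_TYPES];
	unsigned long min_seq[NR_TYPES];
};

/* Number of generations currently held by one type. */
static int toy_nr_gens(const struct toy_lrugen *g, int type)
{
	assert(type >= 0 && type < NR_TYPES);
	return (int)(g->max_seq[type] - g->min_seq[type] + 1);
}

/*
 * Same condition as the scan_folios() check quoted above: a type
 * that is down to MIN_NR_GENS generations cannot be evicted from.
 */
static bool toy_can_evict(const struct toy_lrugen *g, int type)
{
	return toy_nr_gens(g, type) > MIN_NR_GENS;
}

/*
 * Age a single type by creating a new youngest generation for it.
 * The other type's max_seq is left alone, so its folios do not
 * drift toward the oldest generation as a side effect.
 */
static bool toy_age_type(struct toy_lrugen *g, int type)
{
	if (toy_nr_gens(g, type) >= MAX_NR_GENS)
		return false;	/* no free generation slot */
	g->max_seq[type]++;
	return true;
}
```

Starting from anon at MIN_NR_GENS (max_seq 3, min_seq 2) and file at
MAX_NR_GENS (max_seq 3, min_seq 0), toy_age_type(&g, TYPE_ANON) makes
anon evictable again while the file generations stay where they were.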