Message-ID: <CAGsJ_4w1jEej+ROuLta3MSuo4pKuA5yq7=6HS5yzgK39-4SLoA@mail.gmail.com>
Date: Mon, 1 Dec 2025 15:45:56 +0800
From: Barry Song <21cnbao@...il.com>
To: wangzicheng <wangzicheng@...or.com>
Cc: "Liam R. Howlett" <Liam.Howlett@...cle.com>, Matthew Wilcox <willy@...radead.org>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>, "hannes@...xchg.org" <hannes@...xchg.org>,
"david@...hat.com" <david@...hat.com>, "axelrasmussen@...gle.com" <axelrasmussen@...gle.com>,
"yuanchu@...gle.com" <yuanchu@...gle.com>, "mhocko@...nel.org" <mhocko@...nel.org>,
"zhengqi.arch@...edance.com" <zhengqi.arch@...edance.com>,
"shakeel.butt@...ux.dev" <shakeel.butt@...ux.dev>,
"lorenzo.stoakes@...cle.com" <lorenzo.stoakes@...cle.com>, "weixugc@...gle.com" <weixugc@...gle.com>,
"vbabka@...e.cz" <vbabka@...e.cz>, "rppt@...nel.org" <rppt@...nel.org>,
"surenb@...gle.com" <surenb@...gle.com>, "mhocko@...e.com" <mhocko@...e.com>, "corbet@....net" <corbet@....net>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, wangtao <tao.wangtao@...or.com>,
wangzhen 00021541 <wangzhen5@...or.com>, zhongjinji 00025326 <zhongjinji@...or.com>,
Kairui Song <ryncsn@...il.com>
Subject: Re: [PATCH 0/3] mm/lru_gen: move lru_gen control interface from
debugfs to procfs
On Mon, Dec 1, 2025 at 2:50 PM wangzicheng <wangzicheng@...or.com> wrote:
>
> Hi Barry,
>
> > Hi Liam,
> >
> > I saw you mentioned me, so I just wanted to join in :-)
> >
> > On Sat, Nov 29, 2025 at 12:16 AM Liam R. Howlett <Liam.Howlett@...cle.com>
> > wrote:
> > >
> > > * Matthew Wilcox <willy@...radead.org> [251128 10:16]:
> > > > On Fri, Nov 28, 2025 at 10:53:12AM +0800, Zicheng Wang wrote:
> > > > > Case study:
> > > > > A widely observed issue on Android is that after application
> > > > > launch,
> > >
> > > What do you mean by application launch? What does this mean in the
> > > kernel context?
> >
> > I think there are two cases. First, a cold start: a new process is forked to
> > launch the app. Second, when the app switches from background to
> > foreground, for example when we bring it back to the screen after it has
> > been running in the background.
> >
> > In the first case, you reboot your phone and tap the YouTube icon to start
> > the app (cold launch). In the second case, you are watching a video in
> > YouTube, then switch to Facebook, and later tap the YouTube icon again to
> > bring it from background to foreground.
> >
> Thanks for the explanation, that's exactly what I meant.
>
> The Android lifecycle model isn't obvious outside the Android context. I’ll make that
> clearer in the next version.
> > >
> > > > > the oldest anon generation often becomes empty, and file pages are
> > > > > over-reclaimed.
> > > >
> > > > You should fix the bug, not move the debug interface to procfs. NACK.
> > >
> > > Barry recently sent an RFC [1] to affect LRU in the exit path for
> > > Android. This was proven incorrect by Johannes, iirc, in another
> > > thread I cannot find (destroys performance of calling the same command).
> >
> > My understanding is that affecting the LRU in the exit path is not generally
> > correct, but it still highlights a requirement: Linux LRU needs a way to
> > understand app-cycling behavior in an Android-like system.
> >
> > >
> > > These ideas seem both related as it points to a suboptimal LRU in the
> > > Android ecosystem, at least. It seems to stem from Androids life
> > > (cycle) choices :)
> > >
> > > I strongly agree with Willy. We don't want another userspace daemon
> > > and/or interface, but this time to play with the LRU to avoid trying
> > > to define and fix the problem.
> > >
> > > Do you know if this affects others or why it is android specific?
> >
> > The behavior Zicheng probably wants is a proactive memory reclamation
> > interface. For example, since each app may be in a different memcg, if an
> > app has been in the background for a long time, he wants to reclaim its
> > memory proactively rather than waiting until kswapd hits the watermarks.
> >
> > This may help a newly launched app obtain memory more quickly, avoiding
> > delays from reclamation, since a new app typically requires a substantial
> > amount of memory.
> >
> > Zicheng, please let me know if I’m misunderstanding anything.
>
> Yes, but that is not all.
>
> 1. Proactive memory reclaim: yes, that's what we are after.
> When an app is swiped away, kept in the background, and not used for a while,
> proactively reclaiming its memcg can help new foreground apps get memory
> faster (instead of paying the cost of direct reclaim).
>
> 2. Anon vs. file: *bias more towards anonymous* pages for background apps.
> With MGLRU, however, the oldest generations often contain almost no anon pages,
> so simply tuning swappiness cannot achieve that -- reclaim will still clear file cache
> in the old generations first.
> To some extent, file caches are `over-reclaimed` in such a scenario, leading to a disaster
> when user-interaction threads get stuck in direct reclaim of anon pages.

I strongly recommend separating this from your patchset. Avoid including
unrelated changes in a single patchset.

MGLRU has a mechanism to ensure that file and anon pages can keep pace
with each other. In the newest kernel, the minimum number of generations
(MIN_NR_GENS) is 2. For example, if anon has only 2 generations left and
we decide to reclaim anon folios, we will fall back to reclaiming file
pages. Sometimes, this means that anon reclamation is insufficient while
file pages are over-reclaimed.

static int scan_folios(unsigned long nr_to_scan, struct lruvec *lruvec,
		       struct scan_control *sc, int type, int tier,
		       struct list_head *list)
{
	...
	if (get_nr_gens(lruvec, type) == MIN_NR_GENS)
		return 0;
	...
}

This is probably not a bug, but this design can sometimes work
suboptimally.
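
To make the fallback concrete, here is a minimal userspace sketch of the behavior described above. It is a toy model, not kernel code: `toy_lruvec`, `toy_scan()` and `toy_isolate()` are hypothetical names, and the caller-side fallback is written from the description in this mail rather than copied from the kernel. Only the `MIN_NR_GENS` check mirrors the quoted `scan_folios()` snippet.

```c
#define MIN_NR_GENS 2	/* the minimum cited above for recent kernels */

enum toy_type { TOY_ANON = 0, TOY_FILE = 1 };

/* Hypothetical stand-in for an lruvec: per-type generation count and
 * number of pages in the oldest generation. */
struct toy_lruvec {
	int nr_gens[2];
	long oldest_pages[2];
};

/* Mirrors the quoted check: a type already at MIN_NR_GENS yields nothing. */
static long toy_scan(const struct toy_lruvec *v, enum toy_type type,
		     long nr_to_scan)
{
	if (v->nr_gens[type] == MIN_NR_GENS)
		return 0;
	return nr_to_scan < v->oldest_pages[type] ?
	       nr_to_scan : v->oldest_pages[type];
}

/*
 * Try the preferred type first and fall back to the other type when the
 * scan comes back empty. Stores the type actually scanned in *scanned_type
 * and returns the number of pages scanned (0 if neither type yields any).
 */
static long toy_isolate(const struct toy_lruvec *v, enum toy_type preferred,
			long nr_to_scan, enum toy_type *scanned_type)
{
	enum toy_type type = preferred;

	for (int i = 0; i < 2; i++) {
		long scanned = toy_scan(v, type, nr_to_scan);

		if (scanned) {
			*scanned_type = type;
			return scanned;
		}
		/* Nothing scanned for this type: flip to the other one. */
		type = (type == TOY_ANON) ? TOY_FILE : TOY_ANON;
	}
	*scanned_type = preferred;
	return 0;
}
```

With anon at 2 generations and file at 4, a request that prefers anon ends up scanning file pages, which is the over-reclaim pattern discussed in this thread.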

Regarding this issue, both Kairui (from the Linux server side, cc-ed) and I
(from the Android side) have observed it. This should be addressed in
MGLRU's code, and we already have kernel code for that. It is unrelated
to your patchset, so you shouldn’t include so many unrelated changes in
a single patchset.

Please keep your patchset focused solely on whether the MGLRU proactive
reclamation interface should be promoted to sysfs (LRU_GEN already has a
folder in sysfs) instead of debugfs, if there is a v2.

The following is quoted from
`Documentation/admin-guide/mm/multigen_lru.rst`.

Proactive reclaim
-----------------
Proactive reclaim induces page reclaim when there is no memory
pressure. It usually targets cold pages only. E.g., when a new job
comes in, the job scheduler wants to proactively reclaim cold pages on
the server it selected, to improve the chance of successfully landing
this new job.

Users can write the following command to ``lru_gen`` to evict
generations less than or equal to ``min_gen_nr``.

    ``- memcg_id node_id min_gen_nr [swappiness [nr_to_reclaim]]``
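
As an illustration of the command syntax quoted above, the following sketch builds the eviction string and writes it to a control file. The helper names `format_evict_cmd()` and `write_lru_gen_cmd()` are made up for this example, and using a negative value to omit an optional field is a convention of the sketch only, not part of the interface.

```c
#include <stdio.h>

/*
 * Format an eviction command following the documented syntax:
 *   "- memcg_id node_id min_gen_nr [swappiness [nr_to_reclaim]]"
 * A negative swappiness omits both optional fields; a negative
 * nr_to_reclaim omits only the last one (sketch convention only).
 * Returns the snprintf() result.
 */
static int format_evict_cmd(char *buf, size_t size, unsigned int memcg_id,
			    unsigned int node_id, unsigned long min_gen_nr,
			    int swappiness, long nr_to_reclaim)
{
	if (swappiness < 0)
		return snprintf(buf, size, "- %u %u %lu",
				memcg_id, node_id, min_gen_nr);
	if (nr_to_reclaim < 0)
		return snprintf(buf, size, "- %u %u %lu %d",
				memcg_id, node_id, min_gen_nr, swappiness);
	return snprintf(buf, size, "- %u %u %lu %d %ld",
			memcg_id, node_id, min_gen_nr, swappiness,
			nr_to_reclaim);
}

/* Write one command to the control file; returns 0 on success, -1 on error. */
static int write_lru_gen_cmd(const char *path, const char *cmd)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;
	ok = fputs(cmd, f) >= 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

For example, `format_evict_cmd(buf, sizeof(buf), 54, 0, 1, -1, -1)` produces `- 54 0 1`, which could then be passed to `write_lru_gen_cmd("/sys/kernel/debug/lru_gen", buf)` on a kernel that exposes the debugfs file. The IDs here are illustrative, and writing to the file needs sufficient privileges.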
>
> See the case in the cover letter.
> ```
> memcg 54 /apps/some_app
> node 0
> 1 119804 0 85461
> 2 119804 0 5
> 3 119804 181719 18667
> 4 1752 392 244
> ```
>
>
> Since the semantic gap between user and kernel space will always exist,
> it would be of great benefit to leave some APIs for user hints, just like
> madvise/userfaultfd/para-virtualization.

Nope. This is just an internal detail of MGLRU and shouldn’t be exposed
as an interface.

Hopefully, Kairui or I will send a patchset soon to address the balance
issue between file and anon pages. For now, you can use `swappiness=201`
as a temporary workaround. Take a look at ByteDance's patchset [1].

> Exposing such hints to the kernel can help improve overall system performance.

[1] https://lore.kernel.org/linux-mm/cover.1744169302.git.hezhongkun.hzk@bytedance.com/

Thanks
Barry