Message-ID: <CAMgjq7DOPMp3Eq9_SxmxNhY7S5--3uf0PByNAJOAEne9hb+T9Q@mail.gmail.com>
Date: Mon, 1 Dec 2025 17:00:36 +0800
From: Kairui Song <ryncsn@...il.com>
To: Barry Song <21cnbao@...il.com>
Cc: wangzicheng <wangzicheng@...or.com>, "Liam R. Howlett" <Liam.Howlett@...cle.com>, 
	Matthew Wilcox <willy@...radead.org>, 
	"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>, "hannes@...xchg.org" <hannes@...xchg.org>, 
	"david@...hat.com" <david@...hat.com>, "axelrasmussen@...gle.com" <axelrasmussen@...gle.com>, 
	"yuanchu@...gle.com" <yuanchu@...gle.com>, "mhocko@...nel.org" <mhocko@...nel.org>, 
	"zhengqi.arch@...edance.com" <zhengqi.arch@...edance.com>, 
	"shakeel.butt@...ux.dev" <shakeel.butt@...ux.dev>, 
	"lorenzo.stoakes@...cle.com" <lorenzo.stoakes@...cle.com>, "weixugc@...gle.com" <weixugc@...gle.com>, 
	"vbabka@...e.cz" <vbabka@...e.cz>, "rppt@...nel.org" <rppt@...nel.org>, 
	"surenb@...gle.com" <surenb@...gle.com>, "mhocko@...e.com" <mhocko@...e.com>, "corbet@....net" <corbet@....net>, 
	"linux-mm@...ck.org" <linux-mm@...ck.org>, 
	"linux-doc@...r.kernel.org" <linux-doc@...r.kernel.org>, 
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, wangtao <tao.wangtao@...or.com>, 
	wangzhen 00021541 <wangzhen5@...or.com>, zhongjinji 00025326 <zhongjinji@...or.com>
Subject: Re: [PATCH 0/3] mm/lru_gen: move lru_gen control interface from
 debugfs to procfs

On Mon, Dec 1, 2025 at 3:46 PM Barry Song <21cnbao@...il.com> wrote:
>
> On Mon, Dec 1, 2025 at 2:50 PM wangzicheng <wangzicheng@...or.com> wrote:
> >
> > Hi Barry,
> >
> > > Hi Liam,
> > >
> > > I saw you mentioned me, so I just wanted to join in :-)
> > >
> > > On Sat, Nov 29, 2025 at 12:16 AM Liam R. Howlett <Liam.Howlett@...cle.com>
> > > wrote:
> > > >
> > > > * Matthew Wilcox <willy@...radead.org> [251128 10:16]:
> > > > > On Fri, Nov 28, 2025 at 10:53:12AM +0800, Zicheng Wang wrote:
> > > > > > Case study:
> > > > > > A widely observed issue on Android is that after application
> > > > > > launch,
> > > >
> > > > What do you mean by application launch?  What does this mean in the
> > > > kernel context?
> > >
> > > I think there are two cases. First, a cold start: a new process is forked to
> > > launch the app. Second, when the app switches from background to
> > > foreground, for example when we bring it back to the screen after it has
> > > been running in the background.
> > >
> > > In the first case, you reboot your phone and tap the YouTube icon to start
> > > the app (cold launch). In the second case, you are watching a video in
> > > YouTube, then switch to Facebook, and later tap the YouTube icon again to
> > > bring it from background to foreground.
> > >
> > Thanks for the explanation, that's exactly what I meant.
> >
> > The Android lifecycle model isn't obvious outside the Android context. I’ll make that
> > clearer in the next version.
> > > >
> > > > > > the oldest anon generation often becomes empty, and file pages are
> > > > > > over-reclaimed.
> > > > >
> > > > > You should fix the bug, not move the debug interface to procfs.  NACK.
> > > >
> > > > Barry recently sent an RFC [1] to affect LRU in the exit path for
> > > > Android.  This was proven incorrect by Johannes, iirc, in another
> > > > thread I cannot find (destroys performance of calling the same command).
> > >
> > > My understanding is that affecting the LRU in the exit path is not generally
> > > correct, but it still highlights a requirement: Linux LRU needs a way to
> > > understand app-cycling behavior in an Android-like system.
> > >
> > > >
> > > > Both ideas seem related, as they point to a suboptimal LRU in the
> > > > Android ecosystem, at least.  It seems to stem from Android's life
> > > > (cycle) choices :)
> > > >
> > > > I strongly agree with Willy.  We don't want another userspace daemon
> > > > and/or interface, but this time to play with the LRU to avoid trying
> > > > to define and fix the problem.
> > > >
> > > > Do you know if this affects others or why it is android specific?
> > >
> > > The behavior Zicheng probably wants is a proactive memory reclamation
> > > interface. For example, since each app may be in a different memcg, if an
> > > app has been in the background for a long time, he wants to reclaim its
> > > memory proactively rather than waiting until kswapd hits the watermarks.
> > >
> > > This may help a newly launched app obtain memory more quickly, avoiding
> > > delays from reclamation, since a new app typically requires a substantial
> > > amount of memory.
> > >
> > > Zicheng, please let me know if I’m misunderstanding anything.
> >
> > Yes, but that's not all.
> >
> > 1. Proactive memory reclaim: yes, that's what we are after.
> > When an app is swiped away, kept in the background and not used for a while,
> > proactively reclaiming its memcg can help new foreground apps get memory
> > faster (instead of paying the cost of direct reclaim).
> >
> > 2. Anon vs. file: *bias more towards anonymous* pages for background apps.
> > With MGLRU, however, the oldest generations often contain almost no anon pages,
> > so simply tuning swappiness cannot achieve that -- reclaim will still clear the file cache
> > in the old generations first.
> > To some extent, file caches are `over-reclaimed` in such a scenario, leading to a disaster
> > when user-interaction threads get stuck in direct reclaim of anon pages.
> I strongly recommend separating this from your patchset. Avoid including
> unrelated changes in a single patchset.
>
> MGLRU has a mechanism to ensure that file and anon pages can keep pace
> with each other. In the newest kernel, the minimum number of generations
> is 2 (MIN_NR_GENS). For example, if anon has only 2 generations left and
> we decide to reclaim anon folios, we will fall back to reclaiming file
> pages. Sometimes, this means that anon reclamation is insufficient while
> file pages are over-reclaimed.
>
> static int scan_folios(unsigned long nr_to_scan, struct lruvec *lruvec,
>                        struct scan_control *sc, int type, int tier,
>                        struct list_head *list)
> {
>         ...
>         if (get_nr_gens(lruvec, type) == MIN_NR_GENS)
>                 return 0;
>         ...
> }
>
> This is probably not a bug, but this design can sometimes work
> suboptimally.
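
As an illustration only (not kernel code), the fallback described above can be modelled in userspace. The names `toy_lruvec`, `toy_scan` and `toy_pick_type` are hypothetical; only the `MIN_NR_GENS` check is taken from the quoted snippet, and the "try the other type" loop is a simplified assumption about how the caller reacts when a scan returns 0.

```c
#include <assert.h>

#define MIN_NR_GENS 2 /* minimum number of generations, as stated in the thread */

enum { TYPE_ANON = 0, TYPE_FILE = 1, NR_TYPES = 2 };

/* Hypothetical, heavily simplified stand-in for an lruvec. */
struct toy_lruvec {
	int nr_gens[NR_TYPES]; /* generations currently held per type */
};

/* Mirrors the quoted check: a type already at the minimum is not scanned. */
int toy_scan(const struct toy_lruvec *v, int type)
{
	if (v->nr_gens[type] == MIN_NR_GENS)
		return 0;
	return 1; /* pretend at least one folio was scanned */
}

/*
 * Try the preferred type first, then the other one.
 * Returns the type that was scanned, or -1 if neither could be.
 */
int toy_pick_type(const struct toy_lruvec *v, int preferred)
{
	int type = preferred;

	assert(preferred == TYPE_ANON || preferred == TYPE_FILE);
	for (int i = 0; i < NR_TYPES; i++) {
		if (toy_scan(v, type))
			return type;
		type = !type; /* fall back to the opposite type */
	}
	return -1;
}
```

With anon at 2 generations and file at 4, `toy_pick_type(&v, TYPE_ANON)` returns `TYPE_FILE`: the request to reclaim anon ends up reclaiming file pages, which is the over-reclaim pattern being discussed.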
>
> Regarding this issue, both Kairui (from the Linux server side, cc-ed) and I
> (from the Android side) have observed it. This should be addressed in
> MGLRU's code, and we already have kernel code for that. It is unrelated
> to your patchset, so you shouldn’t include so many unrelated changes in
> a single patchset.

Thanks for including me in the discussion.

Right, we are seeing similar problems on our servers too. To work around
it, we force an aging iteration before reclaiming when it happens, which
isn't the best choice. When the LRU is long and folios of the opposite
type to the one we want to reclaim are piling up in the oldest gen, a
forced aging pass has to move all of these folios, which leads to
long-tail latency issues. Let's work on a reasonable solution for that.

>
> Please keep your patchset focused solely on whether the MGLRU proactive
> reclamation interface should be promoted to sysfs (LRU_GEN already has a
> folder in sysfs) instead of debugfs, if there is a v2.
>
> The following is quoted from
> `Documentation/admin-guide/mm/multigen_lru.rst`.
>
> Proactive reclaim
> -----------------
> Proactive reclaim induces page reclaim when there is no memory
> pressure. It usually targets cold pages only. E.g., when a new job
> comes in, the job scheduler wants to proactively reclaim cold pages on
> the server it selected, to improve the chance of successfully landing
> this new job.
>
> Users can write the following command to ``lru_gen`` to evict
> generations less than or equal to ``min_gen_nr``.
>
>     ``- memcg_id node_id min_gen_nr [swappiness [nr_to_reclaim]]``
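
A minimal userspace sketch of building that command string follows. The helper names `format_evict_cmd` and `write_lru_gen_cmd` are hypothetical, as is the convention of passing a negative swappiness or a zero `nr_to_reclaim` to omit the optional fields. The target file is left as a parameter; with debugfs mounted in the usual place it would presumably be `/sys/kernel/debug/lru_gen`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format "- memcg_id node_id min_gen_nr [swappiness [nr_to_reclaim]]".
 * Assumed convention (not from the kernel): swappiness < 0 omits both
 * optional fields, nr_to_reclaim == 0 omits only the last one.
 * Returns the string length, or -1 if the buffer is too small.
 */
int format_evict_cmd(char *buf, size_t len, unsigned int memcg_id,
		     unsigned int node_id, unsigned long min_gen_nr,
		     int swappiness, unsigned long nr_to_reclaim)
{
	int n;

	assert(buf && len > 0);
	if (swappiness < 0)
		n = snprintf(buf, len, "- %u %u %lu",
			     memcg_id, node_id, min_gen_nr);
	else if (nr_to_reclaim == 0)
		n = snprintf(buf, len, "- %u %u %lu %d",
			     memcg_id, node_id, min_gen_nr, swappiness);
	else
		n = snprintf(buf, len, "- %u %u %lu %d %lu",
			     memcg_id, node_id, min_gen_nr, swappiness,
			     nr_to_reclaim);
	return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/* Write one command to the lru_gen file at `path`; 0 on success, -1 on error. */
int write_lru_gen_cmd(const char *path, const char *cmd)
{
	FILE *f = fopen(path, "w");
	size_t n = strlen(cmd);
	int ok;

	if (!f)
		return -1;
	ok = fwrite(cmd, 1, n, f) == n;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

For the memcg shown in the cover letter (memcg 54, node 0), `format_evict_cmd(buf, sizeof(buf), 54, 0, 1, -1, 0)` yields the string `- 54 0 1`, which asks for generations up to and including 1 to be evicted.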
>
>
> >
> > See the case in the cover letter.
> > ```
> > memcg    54 /apps/some_app
> > node     0
> > 1     119804          0       85461
> > 2     119804          0           5
> > 3     119804     181719       18667
> > 4       1752        392         244
> > ```
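
To make the pattern in this dump explicit, here is a small parsing sketch. It assumes the four columns of each generation row are generation number, age in milliseconds, anon pages and file pages; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* One generation row: "min_gen_nr age_in_ms nr_anon_pages nr_file_pages". */
struct gen_row {
	unsigned long gen;
	unsigned long age_ms;
	unsigned long nr_anon;
	unsigned long nr_file;
};

/* Parse one row; returns 1 on success, 0 for anything else (e.g. headers). */
int parse_gen_row(const char *line, struct gen_row *row)
{
	assert(line && row);
	return sscanf(line, "%lu %lu %lu %lu", &row->gen, &row->age_ms,
		      &row->nr_anon, &row->nr_file) == 4;
}

/* Count how many of the oldest generations (rows[0] first) hold no anon pages. */
size_t count_anon_empty_oldest(const struct gen_row *rows, size_t n)
{
	size_t i = 0;

	while (i < n && rows[i].nr_anon == 0)
		i++;
	return i;
}
```

Fed the four rows above, `count_anon_empty_oldest()` returns 2: generations 1 and 2 hold only file pages, so evicting the oldest generations cannot touch anon memory no matter how swappiness is set.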
> >
> >
> > The semantic gap between user space and kernel space will always exist,
> > so it would be very beneficial to leave some APIs for user hints, just like
> > madvise/userfaultfd/paravirtualization.
>
> Nope. This is just an internal detail of MGLRU and shouldn’t be exposed
> as an interface.
> Hopefully, Kairui or I will send a patchset soon to address the balance
> issue between file and anon pages. For now, you can use `swappiness=201`
> as a temporary workaround. Take a look at ByteDance's patchset.[1]

Agreed, thanks!
