Message-ID: <e6eydzdvuiktmalhcmoiwsgzjbw5v7t4532fkbroylwr5cqetx@v6pgjaoxgmyz>
Date: Thu, 15 Jan 2026 17:00:04 +0000
From: Yosry Ahmed <yosry.ahmed@...ux.dev>
To: Nhat Pham <nphamcs@...il.com>
Cc: Gregory Price <gourry@...rry.net>, linux-mm@...ck.org,
cgroups@...r.kernel.org, linux-cxl@...r.kernel.org, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org, kernel-team@...a.com,
longman@...hat.com, tj@...nel.org, hannes@...xchg.org, mkoutny@...e.com,
corbet@....net, gregkh@...uxfoundation.org, rafael@...nel.org, dakr@...nel.org,
dave@...olabs.net, jonathan.cameron@...wei.com, dave.jiang@...el.com,
alison.schofield@...el.com, vishal.l.verma@...el.com, ira.weiny@...el.com,
dan.j.williams@...el.com, akpm@...ux-foundation.org, vbabka@...e.cz, surenb@...gle.com,
mhocko@...e.com, jackmanb@...gle.com, ziy@...dia.com, david@...nel.org,
lorenzo.stoakes@...cle.com, Liam.Howlett@...cle.com, rppt@...nel.org,
axelrasmussen@...gle.com, yuanchu@...gle.com, weixugc@...gle.com, yury.norov@...il.com,
linux@...musvillemoes.dk, rientjes@...gle.com, shakeel.butt@...ux.dev, chrisl@...nel.org,
kasong@...cent.com, shikemeng@...weicloud.com, bhe@...hat.com, baohua@...nel.org,
chengming.zhou@...ux.dev, roman.gushchin@...ux.dev, muchun.song@...ux.dev,
osalvador@...e.de, matthew.brost@...el.com, joshua.hahnjy@...il.com,
rakie.kim@...com, byungchul@...com, ying.huang@...ux.alibaba.com,
apopple@...dia.com, cl@...two.org, harry.yoo@...cle.com, zhengqi.arch@...edance.com
Subject: Re: [RFC PATCH v3 7/8] mm/zswap: compressed ram direct integration

On Tue, Jan 13, 2026 at 04:49:20PM +0900, Nhat Pham wrote:
> On Tue, Jan 13, 2026 at 4:35 PM Nhat Pham <nphamcs@...il.com> wrote:
> >
> > > This part needs more thought. Zswap cannot charge a full page because
> > > then from the memcg perspective reclaim is not making any progress.
> > > OTOH, as you mention, from the system perspective we just consumed a
> > > full page, so not charging that would be inconsistent.
> > >
> > > This is not a zswap-specific thing though, even with cram.c we have to
> > > figure out how to charge memory on the compressed node to the memcg.
> > > It's perhaps not as much of a problem as with zswap because we are not
> > > dealing with reclaim not making progress.
> > >
> > > Maybe the memcg limits need to be "enlightened" about different tiers?
> > > We did have such discussions in the past outside the context of
> > > compressed memory, for memory tiering in general.
> >
> > What if we add a reclaim flag that says "hey, we are hitting actual
> > memory limit and need to make memory reclaim forward progress".
> >
> > Then, we can have zswap skip compressed cxl backend and fall back to
> > real compression.
> >
> > (Maybe also demotion, which only moves memory from one node to another,
> > as well as the new cram.c stuff? This will technically also save some
> > wasted work, as in the status quo we will need to do a demotion pass
> > first, before having to reclaim memory from the bottom tier anyway?
> > But not sure if we want this).
>
> Some more thoughts - right now demotion is kinda similar, right? We
> move pages from one node (fast tier) to another (slow tier). This
> frees up space in the fast tier, but it actually doesn't change the
> memcg memory usage. So we are not making "forward progress" with this
> either.
>
> I suppose this is fine-ish, because reclaim subsystem can then proceed
> by reclaiming from the bottom tier, which will now go to disk swap,
> zswap, etc.
>
> Can we achieve the same effect by making pages in
> zswap-backed-by-compressed-cxl reclaimable:
>
> 1. Recompression - take them off compressed cxl and store them in
> zswap proper (i.e. in-memory compression).

I think the whole point of using compressed cxl with zswap is saving
memory in the top-tier, so this would be counter-productive (probably
even if we use slightly less memory in the top-tier).

>
> 2. Just enable zswap shrinker and have memory reclaim move these pages
> into disk swap. This will have much more drastic performance
> implications though :)

I think what you're getting at is that we can still make forward
progress after memory lands in compressed cxl. But moving memory to
compressed cxl is already forward progress that reclaim won't capture if
we charge memory as a full page. I think this is the crux of the issue.

We need to figure out how to make accounting work such that moving
memory to compressed cxl is forward progress, but make sure we don't
break the overall accounting consistency. If we only charge the actual
compressed size, then from the system perspective there is a page that
is only partially charged and the rest of it is more-or-less leaked.
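
To make the trade-off concrete, here is a toy model of the two charging
options (not kernel code, just arithmetic). Everything in it is an
assumption for illustration: the 4 KiB PAGE_SIZE, the policy names
"full_page" and "compressed_size", and the helper
move_to_compressed_cxl() are all hypothetical. It also assumes, as
discussed above, that storing one page on the compressed node consumes
a full page from the system perspective.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed page size in bytes


@dataclass
class Outcome:
    memcg_charge: int  # bytes charged to the memcg after the move
    reclaimed: int     # bytes of progress that memcg reclaim observes
    uncharged: int     # bytes consumed on the node but charged to nobody


def move_to_compressed_cxl(memcg_charge, compressed_size, policy):
    """Model moving one resident page to the compressed CXL node.

    Hypothetical helper: the page was charged PAGE_SIZE while resident,
    and the policy decides what it is charged once it has moved.
    """
    if not 0 < compressed_size <= PAGE_SIZE:
        raise ValueError("compressed_size must be in (0, PAGE_SIZE]")
    if memcg_charge < PAGE_SIZE:
        raise ValueError("memcg must hold at least one charged page")

    if policy == "full_page":
        new_charge = PAGE_SIZE
    elif policy == "compressed_size":
        new_charge = compressed_size
    else:
        raise ValueError(f"unknown policy: {policy}")

    # Progress seen by memcg reclaim is the drop in the charge.
    reclaimed = PAGE_SIZE - new_charge
    # The node still gave up a full page, so the gap is charged to no one.
    uncharged = PAGE_SIZE - new_charge
    return Outcome(memcg_charge - reclaimed, reclaimed, uncharged)


if __name__ == "__main__":
    for policy in ("full_page", "compressed_size"):
        print(policy, move_to_compressed_cxl(10 * PAGE_SIZE, 1024, policy))
```

In this model the two numbers are always equal: whatever progress reclaim
gets credited with is exactly the amount that ends up uncharged. Charging
a full page gives zero progress and zero leak, charging the compressed
size gives both.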