Message-ID: <5p4iyah6zlrnxpbsis32c4m5lrjj3pq7xwcugq35d2entwfai2@n2r6y3ga2ie5>
Date: Tue, 6 Jan 2026 13:20:32 +0900
From: Sergey Senozhatsky <senozhatsky@...omium.org>
To: Yosry Ahmed <yosry.ahmed@...ux.dev>
Cc: Sergey Senozhatsky <senozhatsky@...omium.org>,
Andrew Morton <akpm@...ux-foundation.org>, Nhat Pham <nphamcs@...il.com>, Minchan Kim <minchan@...nel.org>,
Johannes Weiner <hannes@...xchg.org>, Brian Geffon <bgeffon@...gle.com>, linux-kernel@...r.kernel.org,
Herbert Xu <herbert@...dor.apana.org.au>, linux-mm@...ck.org
Subject: Re: [RFC PATCH 2/2] zsmalloc: chain-length configuration should consider other metrics
On (26/01/05 15:58), Yosry Ahmed wrote:
> On Mon, Jan 05, 2026 at 10:42:51AM +0900, Sergey Senozhatsky wrote:
> > On (26/01/02 18:29), Yosry Ahmed wrote:
> > > On Thu, Jan 01, 2026 at 10:38:14AM +0900, Sergey Senozhatsky wrote:
> > [..]
> > >
> > > I worry that the heuristics are too hand-wavy
> >
> > I don't disagree. Am not super excited about the heuristics either.
> >
> > > and I wonder if the memcpy savings actually show up as perf improvements
> > > in any real life workload. Do we have data about this?
> >
> > I don't have real life 16K PAGE_SIZE devices. However, on 16K PAGE_SIZE
> > systems we have "normal" size-classes up to a very large size, and normal
> > class means chaining of 0-order physical pages, and chaining means spanning.
> > So on 16K memcpy overhead is expected to be somewhat noticeable.
>
> I don't disagree that it could be a problem, I am just against
> optimizations without data. It makes it hard to modify these heuristics
> later or remove them, since we don't really know what effect they had in
> the first place.
>
> We also don't know if the 0.5% increase in memory usage is actually
> offset by CPU gains.
Sure, we are on the same page here.
Another area where we could potentially apply similar heuristics
is the size-classes merge logic: the mere fact that two size-classes have
similar objects per zspage and pages per zspage does not necessarily
mean that merging them will be beneficial. E.g. the padding between
class->size and the smallest possible object, when multiplied by the
number of objects per zspage, can add up to a large enough amount of
wasted space.
But again, heuristics are hard. I'm fine with us dropping that idea
for the time being.
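
To make the padding argument above concrete, here is a rough sketch of the arithmetic (not zsmalloc code). The function names, the percentage threshold and the numbers in the usage note are all hypothetical; the only thing taken from the discussion is the formula itself: per-object padding times objects per zspage.

```python
def merge_padding_waste(class_size, smallest_obj_size, objs_per_zspage):
    """Worst-case padding (bytes) in one full zspage of a merged class.

    Assumes every slot holds the smallest object that still lands in the
    merged class, so each slot wastes (class_size - smallest_obj_size).
    """
    if smallest_obj_size > class_size:
        raise ValueError("object cannot be larger than its size class")
    return (class_size - smallest_obj_size) * objs_per_zspage


def merge_is_wasteful(class_size, smallest_obj_size, objs_per_zspage,
                      pages_per_zspage, page_size, max_waste_pct):
    """Hypothetical heuristic: reject a merge if the worst-case padding
    exceeds max_waste_pct percent of the zspage size."""
    waste = merge_padding_waste(class_size, smallest_obj_size,
                                objs_per_zspage)
    zspage_bytes = pages_per_zspage * page_size
    # Integer comparison, avoids floating point.
    return waste * 100 > max_waste_pct * zspage_bytes
```

With made-up numbers (class size 2048, smallest object 1984, 8 objects per zspage, one 16K page), the worst-case padding is 64 * 8 = 512 bytes, or about 3.1% of the zspage. Whether that is "large enough" is exactly the kind of threshold that needs data.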
> > > I also vaguely recall discussions about other ways to avoid the memcpy
> > > using scatterlists, so I am wondering if this is the right metric to
> > > optimize.
> >
> > As far as I understand SG-list based approach is that it will require
> > implementing split-data handling on the compression algorithms side,
> > which is not trivial (especially if the only reason to do that is
> > zsmalloc).
>
> I am not sure tbh, adding Herbert here. I remember looking at the code
> in scomp_acomp_comp_decomp() at some point, and I think it will take
> care of non-contiguous SG-lists. Not sure if that's the correct place to
> look tho.
Ah, so it does kmap under the hood. I suppose that can work.
> > Alternatively, we maybe can try to vmap spanning objects:
>
> Using vmap makes sense in theory, but in practice (at least for zswap)
> it doesn't help
OK.
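
For reference, a small sketch of what "chaining means spanning" amounts to in numbers. It assumes objects are packed back to back across the chained 0-order pages, so an object spans when its first and last bytes fall on different pages. The function name and the example sizes are hypothetical, and this is a model of the layout, not the zsmalloc implementation.

```python
def count_spanning_objects(obj_size, page_size, pages_per_zspage):
    """Return (spanning, total) object counts for one zspage.

    Assumes object i occupies bytes [i * obj_size, (i + 1) * obj_size)
    of the zspage, with no per-page alignment.
    """
    zspage_bytes = page_size * pages_per_zspage
    total = zspage_bytes // obj_size
    spanning = 0
    for i in range(total):
        first_page = (i * obj_size) // page_size
        last_page = ((i + 1) * obj_size - 1) // page_size
        if first_page != last_page:
            # Reading this object needs a memcpy (or some mapping trick).
            spanning += 1
    return spanning, total
```

For example, count_spanning_objects(12288, 16384, 3) gives (2, 4): with 12K objects in a chain of three 16K pages, half of the objects cross a page boundary.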