Message-ID: <e764d05a-6a83-4563-9f28-3f1a3e28727b@konsulko.se>
Date: Wed, 23 Apr 2025 21:53:48 +0200
From: Vitaly Wool <vitaly.wool@...sulko.se>
To: Yosry Ahmed <yosry.ahmed@...ux.dev>
Cc: linux-mm@...ck.org, akpm@...ux-foundation.org,
linux-kernel@...r.kernel.org, Nhat Pham <nphamcs@...il.com>,
Shakeel Butt <shakeel.butt@...ux.dev>, Johannes Weiner <hannes@...xchg.org>,
Igor Belousov <igor.b@...dev.am>, Minchan Kim <minchan@...nel.org>,
Sergey Senozhatsky <senozhatsky@...omium.org>
Subject: Re: [PATCH v4] mm: add zblock allocator
On 4/22/25 12:46, Yosry Ahmed wrote:
> I didn't look too closely but I generally agree that we should improve
> zsmalloc where possible rather than add a new allocator. We are trying
> not to repeat the zbud/z3fold or slub/slob stories here. Zsmalloc is
> getting a lot of mileage from both zswap and zram, and is more-or-less
> battle-tested. Let's work toward building upon that instead of starting
> over.
The thing here is, zblock uses a very different approach to small
object allocation. The idea is: we have an array of descriptors which
correspond to multi-page blocks divided into chunks of equal size
(block_size[i]). For each object of size x we find the descriptor n such that:
block_size[n-1] < x <= block_size[n]
and then we store that object in an empty slot in one of the blocks.
Thus, the density is high, the search is fast (rbtree based) and no
object spans 2 pages, so no extra memcpy is involved.
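To make the descriptor lookup concrete, here is a minimal sketch, not the actual zblock code. The slot sizes in the table and the name find_desc() are made up for illustration, and a binary search over a sorted array stands in for the rbtree-based search; only the selection rule (the smallest slot size that still fits x) is taken from the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical slot sizes in bytes, sorted ascending; NOT zblock's real table. */
static const size_t block_size[] = { 64, 128, 256, 512, 1024, 2048, 4096 };
#define NR_DESC (sizeof(block_size) / sizeof(block_size[0]))

/*
 * Return the smallest n such that x <= block_size[n], which for n > 0
 * means block_size[n-1] < x <= block_size[n]. Returns -1 if x is 0 or
 * larger than the biggest slot. Binary search is an assumption here,
 * used in place of the rbtree lookup.
 */
static int find_desc(size_t x)
{
	size_t lo = 0, hi = NR_DESC;

	if (x == 0 || x > block_size[NR_DESC - 1])
		return -1;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (block_size[mid] < x)
			lo = mid + 1;	/* slot too small, look at larger ones */
		else
			hi = mid;	/* slot fits, try to find a smaller one */
	}

	assert(lo < NR_DESC);
	return (int)lo;
}
```

With this table a 65-byte object maps to descriptor 1 (128-byte slots), while a 64-byte object still fits descriptor 0. Since every slot lies inside a single block, an object never has to be stitched together from two places.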
And with the latest zblock, we see that it has a clear advantage in
performance over zsmalloc, retaining roughly the same allocation density
for 4K pages and scoring better on 16K pages. E.g. on a kernel compilation:
* zsmalloc/zstd/make -j32 bzImage
real 8m0.594s
user 39m37.783s
sys 8m24.262s
Zswap: 200600 kB <-- after build completion
Zswapped: 854072 kB <-- after build completion
zswpin 309774
zswpout 1538332
* zblock/zstd/make -j32 bzImage
real 7m35.546s
user 38m03.475s
sys 7m47.407s
Zswap: 250940 kB <-- after build completion
Zswapped: 870660 kB <-- after build completion
zswpin 248606
zswpout 1277319
So what we see here is that zblock is definitely faster and at least not
worse with regard to allocation density under heavy load. It has
slightly worse _idle_ allocation density, but since it quickly catches
up under load this is not really important. What is important is that its
characteristics don't deteriorate over time. Overall, zblock is simple
and efficient and there is a /raison d'etre/ for it.
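For anyone who wants to redo the arithmetic on the numbers above, a small sketch follows. The helper names rel_reduction() and stored_ratio() are made up for illustration; the inputs are the real/sys times converted to seconds and the Zswap/Zswapped values exactly as quoted.

```c
#include <assert.h>

/* Fraction by which 'other' is lower than 'base', e.g. 0.05 for 5%. */
static double rel_reduction(double base, double other)
{
	assert(base > 0.0);
	return (base - other) / base;
}

/* Uncompressed kB held per kB of pool memory: Zswapped / Zswap. */
static double stored_ratio(double zswapped_kb, double zswap_kb)
{
	assert(zswap_kb > 0.0);
	return zswapped_kb / zswap_kb;
}
```

rel_reduction(480.594, 455.546) gives about 0.052, i.e. roughly 5% less wall-clock time, and rel_reduction(504.262, 467.407) gives about 0.073 for sys time. stored_ratio(854072, 200600) is about 4.26 for zsmalloc and stored_ratio(870660, 250940) is about 3.47 for zblock; both are snapshots taken after build completion, so they reflect the idle case rather than density under load.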
Now, it is indeed possible to partially rework zsmalloc using zblock's
algorithm, but this would be a rather substantial change, equal to or
greater in effort than implementing the approach described above from
scratch (which is what we did). With such drastic changes most of the
testing that has been done with zsmalloc would be invalidated, and we'd
be out in the wild anyway. So even though I see your point, I don't
think it applies in this particular case.
~Vitaly