linux-kernel - Re: [PATCH v2] mm: add zblock allocator


Message-ID: <20250408195533.GA99052@cmpxchg.org>
Date: Tue, 8 Apr 2025 15:55:33 -0400
From: Johannes Weiner <hannes@...xchg.org>
To: Igor Belousov <igor.b@...dev.am>
Cc: Nhat Pham <nphamcs@...il.com>, vitaly.wool@...sulko.se,
	linux-mm@...ck.org, akpm@...ux-foundation.org,
	linux-kernel@...r.kernel.org, Shakeel Butt <shakeel.butt@...ux.dev>,
	Yosry Ahmed <yosryahmed@...gle.com>
Subject: Re: [PATCH v2] mm: add zblock allocator

On Tue, Apr 08, 2025 at 01:20:11PM +0400, Igor Belousov wrote:
> >> >> Now what's funny is that when I tried to compare how 32 threaded build
> >> >> would behave on a 8-core VM I couldn't do it because it OOMs with
> >> >> zsmalloc as zswap backend. With zblock it doesn't, though, and the
> >> >> results are:
> >> >> real    12m14.012s
> >> >> user    39m37.777s
> >> >> sys     14m6.923s
> >> >> Zswap:            440148 kB
> >> >> Zswapped:         924452 kB
> >> >> zswpin 594812
> >> >> zswpout 2802454
> >> >> zswpwb 10878
> >>
> >> It's LZ4 for all the test runs.
> > 
> > Can you try zstd and let me know how it goes :)
> 
> Sure. zstd/8 cores/make -j32:
> 
> zsmalloc:
> real	7m36.413s
> user	38m0.481s
> sys	7m19.108s
> Zswap:            211028 kB
> Zswapped:         925904 kB
> zswpin 397851
> zswpout 1625707
> zswpwb 5126
> 
> zblock:
> real	7m55.009s
> user	39m23.147s
> sys	7m44.004s
> Zswap:            253068 kB
> Zswapped:         919956 kB
> zswpin 456843
> zswpout 2058963
> zswpwb 3921

So zstd results in nearly double the compression ratio, which in turn
cuts total execution time *almost in half*.
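For reference, the ratios can be recomputed from the quoted figures. The sketch below is not part of the original message. It assumes the compression ratio is Zswapped (original size of the stored data) divided by Zswap (compressed pool size), and it uses only the three runs quoted above; the LZ4 figures are for zblock only, since the LZ4 zsmalloc run hit OOM and produced no numbers. The helper names are made up for illustration.

```python
import re

def parse_time(text):
    """Convert a `time`-style string such as '12m14.012s' to seconds."""
    m = re.fullmatch(r"(\d+)m([\d.]+)s", text)
    if m is None:
        raise ValueError(f"unrecognized time format: {text!r}")
    return int(m.group(1)) * 60 + float(m.group(2))

# Figures copied from the quoted test runs (8 cores, make -j32).
RUNS = {
    "lz4/zblock":    {"real": "12m14.012s", "zswap_kb": 440148, "zswapped_kb": 924452},
    "zstd/zsmalloc": {"real": "7m36.413s",  "zswap_kb": 211028, "zswapped_kb": 925904},
    "zstd/zblock":   {"real": "7m55.009s",  "zswap_kb": 253068, "zswapped_kb": 919956},
}

def compression_ratio(run):
    """Original bytes stored per byte of compressed pool (assumed definition)."""
    return run["zswapped_kb"] / run["zswap_kb"]

def summarize(runs):
    """Map each run name to (compression ratio, wall-clock seconds)."""
    return {
        name: (compression_ratio(run), parse_time(run["real"]))
        for name, run in runs.items()
    }

if __name__ == "__main__":
    for name, (ratio, secs) in summarize(RUNS).items():
        print(f"{name:14s} ratio {ratio:.2f}  real {secs:.0f}s")
```

Run as a script, this prints one line per configuration, so the storage density and wall-clock time of each run can be compared side by side.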

The numbers speak for themselves. Compression efficiency >>> allocator
speed, because compression efficiency ultimately drives the continuous
*rate* at which allocations need to occur. You're trying to optimize a
constant coefficient at the expense of a higher-order one, which is a
losing proposition.

This is a general NAK from me on any new allocators that cannot match
or outdo zsmalloc storage density in common scenarios. I'm sorry, but
I really don't see any reason to do this.

We also should probably make zstd the zswap default.
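
Until the default changes, the compressor can be selected at runtime through the zswap module parameter in sysfs. The following is a minimal sketch, not something from this thread: it assumes zswap is built in or loaded, that zstd is available in the running kernel, and that the caller is root when writing. The function names are hypothetical.

```python
from pathlib import Path

# Runtime knob exposed by the zswap module.
ZSWAP_COMPRESSOR = Path("/sys/module/zswap/parameters/compressor")

def get_compressor(path=ZSWAP_COMPRESSOR):
    """Return the name of the compressor zswap is currently using."""
    return Path(path).read_text().strip()

def set_compressor(name, path=ZSWAP_COMPRESSOR):
    """Ask zswap to switch compressors, then read back what is in effect."""
    Path(path).write_text(name + "\n")
    return get_compressor(path)

if __name__ == "__main__":
    print("zswap compressor:", get_compressor())
```

The `path` argument exists only so the helpers can be exercised against an ordinary file without touching the live system.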