Message-ID: <CAKEwX=PRzZEYOuTECjjqYbUDXUjMzOc-R5s14-iX8qevDxGBpA@mail.gmail.com>
Date: Tue, 11 Feb 2025 09:52:32 -0800
From: Nhat Pham <nphamcs@...il.com>
To: Eric Biggers <ebiggers@...nel.org>
Cc: Kanchana P Sridhar <kanchana.p.sridhar@...el.com>, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, hannes@...xchg.org, yosry.ahmed@...ux.dev,
chengming.zhou@...ux.dev, usamaarif642@...il.com, ryan.roberts@....com,
21cnbao@...il.com, akpm@...ux-foundation.org, linux-crypto@...r.kernel.org,
herbert@...dor.apana.org.au, davem@...emloft.net, clabbe@...libre.com,
ardb@...nel.org, surenb@...gle.com, kristen.c.accardi@...el.com,
wajdi.k.feghali@...el.com, vinodh.gopal@...el.com
Subject: Re: [PATCH v6 00/16] zswap IAA compress batching

On Tue, Feb 11, 2025 at 9:05 AM Eric Biggers <ebiggers@...nel.org> wrote:
>
> On Wed, Feb 05, 2025 at 11:20:46PM -0800, Kanchana P Sridhar wrote:
>
> So, zswap is passed a large folio to swap out, and it divides it into 4K pages
> and compresses each independently. The performance improvement in this patchset
> comes entirely from compressing the folio's pages in parallel, synchronously,
> using IAA.
>
> Before even considering IAA and going through all the pain of supporting
> batching with an off-CPU offload, wouldn't it make a lot more sense to try just
> compressing each folio in software as a single unit? Compared to the existing
> approach of compressing the folio in 4K chunks, that should be much faster and
> produce a much better compression ratio. Compression algorithms are very much
> designed for larger amounts of data, so that they can find more matches.
>
> It looks like the mm subsystem used to always break up folios when swapping them
> out, but that has now been fixed. It looks like zswap just hasn't been updated
> to do otherwise yet?
>
> FWIW, here are some speed and compression ratio results I collected in a
> compression benchmark module that tests feeding vmlinux (uncompressed_size:
> 26624 KiB) through zstd in 4 KiB page or 2 MiB folio-sized chunks:
>
> zstd level 3, 4K chunks: 86 ms; compressed_size 9429 KiB
> zstd level 3, 2M chunks: 57 ms; compressed_size 8251 KiB
> zstd level 1, 4K chunks: 65 ms; compressed_size 9806 KiB
> zstd level 1, 2M chunks: 34 ms; compressed_size 8878 KiB
>
> The current zswap parameterization is "zstd level 3, 4K chunks". I would
> recommend "zstd level 1, 2M chunks", which would be 2.5 times as fast and give a
> 6% better compression ratio.
>
> What is preventing zswap from compressing whole folios?

Thanks for the input, Eric! That is one of the directions we have been
exploring for zswap and zram. Here's what's going on:

The first issue is that zsmalloc, the backend memory allocator for
zswap, currently does not support object sizes larger than 4K. Barry
Song is working on this:

https://lore.kernel.org/linux-mm/20241121222521.83458-1-21cnbao@gmail.com/
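
To make that size problem concrete, here is a rough userspace sketch of
the two schemes Eric compared, done directly against libzstd. This is
not the zswap code path (zswap goes through the crypto acomp API and
stores each compressed page as one zsmalloc object), and the data,
sizes and levels below are made up for illustration. The point is just
that compressing a 2M folio as a single unit yields one object well
over 4K, which is exactly what zsmalloc cannot store today:

/*
 * Rough userspace illustration only -- NOT the zswap code path.
 * Build with: gcc -O2 example.c -lzstd
 */
#include <stdio.h>
#include <stdlib.h>
#include <zstd.h>

#define CHUNK_4K	(4UL * 1024)
#define FOLIO_2M	(2UL * 1024 * 1024)

/* Compress src in fixed-size chunks; each chunk would be one stored object. */
static size_t compress_chunked(const char *src, size_t len, size_t chunk,
			       int level, size_t *largest_obj)
{
	size_t cap = ZSTD_compressBound(chunk);
	char *dst = malloc(cap);
	size_t total = 0;

	*largest_obj = 0;
	for (size_t off = 0; off < len; off += chunk) {
		size_t n = len - off < chunk ? len - off : chunk;
		size_t ret = ZSTD_compress(dst, cap, src + off, n, level);

		if (ZSTD_isError(ret)) {
			fprintf(stderr, "zstd: %s\n", ZSTD_getErrorName(ret));
			exit(1);
		}
		total += ret;
		if (ret > *largest_obj)
			*largest_obj = ret;
	}
	free(dst);
	return total;
}

int main(void)
{
	/* Stand-in data; real anonymous pages compress differently. */
	char *folio = malloc(FOLIO_2M);
	size_t largest;

	for (size_t i = 0; i < FOLIO_2M; i++)
		folio[i] = (char)(i % 251);

	size_t per_page = compress_chunked(folio, FOLIO_2M, CHUNK_4K, 3, &largest);
	printf("4K chunks, level 3:   %zu bytes total, largest object %zu\n",
	       per_page, largest);

	size_t whole = compress_chunked(folio, FOLIO_2M, FOLIO_2M, 1, &largest);
	printf("one 2M unit, level 1: %zu bytes total, largest object %zu\n",
	       whole, largest);

	free(folio);
	return 0;
}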

Performance-wise, compressing whole folios also means that at swap-in
time you have to decompress and load the entire folio/chunk, even when
only a single page is being faulted in. This can create extra memory
pressure (you have to allocate either a huge page or multiple small
pages for the folio/chunk), which is particularly bad when the system
is already in trouble :) I believe that is one of the blockers for the
above patch series as well.
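
To sketch that swap-in asymmetry, again in userspace with libzstd and
made-up sizes (the real path is a fault serviced by zswap_load(), not
this): with per-page compression, a fault on one page costs a single 4K
decompression into one freshly allocated page, while whole-folio
compression has to allocate and decompress the full 2M unit before it
can hand back even one page:

/*
 * Illustration only; error handling omitted for brevity.
 * Build with: gcc -O2 example.c -lzstd
 */
#include <stdio.h>
#include <stdlib.h>
#include <zstd.h>

#define PAGE_SZ		(4UL * 1024)
#define FOLIO_SZ	(2UL * 1024 * 1024)

int main(void)
{
	char *folio = malloc(FOLIO_SZ);

	for (size_t i = 0; i < FOLIO_SZ; i++)
		folio[i] = (char)(i % 251);

	/* Per-page scheme: one compressed object per 4K page. */
	size_t page_cap = ZSTD_compressBound(PAGE_SZ);
	char *cpage = malloc(page_cap);
	size_t cpage_len = ZSTD_compress(cpage, page_cap, folio, PAGE_SZ, 3);

	/* Whole-folio scheme: one compressed object for the whole 2M unit. */
	size_t folio_cap = ZSTD_compressBound(FOLIO_SZ);
	char *cfolio = malloc(folio_cap);
	size_t cfolio_len = ZSTD_compress(cfolio, folio_cap, folio, FOLIO_SZ, 1);

	/*
	 * Fault on a single page.  The per-page scheme allocates 4K and does
	 * one small decompression; the whole-folio scheme must allocate and
	 * fill the entire 2M, even though only one page was asked for --
	 * exactly when the system is short on memory.
	 */
	char *one_page = malloc(PAGE_SZ);
	ZSTD_decompress(one_page, PAGE_SZ, cpage, cpage_len);

	char *whole = malloc(FOLIO_SZ);
	ZSTD_decompress(whole, FOLIO_SZ, cfolio, cfolio_len);

	printf("per-page fault touched %lu bytes, whole-folio fault %lu bytes\n",
	       PAGE_SZ, FOLIO_SZ);

	free(one_page); free(whole); free(cpage); free(cfolio); free(folio);
	return 0;
}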