Message-ID: <CAKEwX=PqWQ_39BuApc_bT1WKQMJyNPDs+Gv0JAU5cTa1KNDj9g@mail.gmail.com>
Date: Sat, 31 Jan 2026 16:48:43 -0800
From: Nhat Pham <nphamcs@...il.com>
To: "Sridhar, Kanchana P" <kanchana.p.sridhar@...el.com>
Cc: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, "linux-mm@...ck.org" <linux-mm@...ck.org>,
"hannes@...xchg.org" <hannes@...xchg.org>, "yosry.ahmed@...ux.dev" <yosry.ahmed@...ux.dev>,
"chengming.zhou@...ux.dev" <chengming.zhou@...ux.dev>,
"usamaarif642@...il.com" <usamaarif642@...il.com>, "ryan.roberts@....com" <ryan.roberts@....com>,
"21cnbao@...il.com" <21cnbao@...il.com>,
"ying.huang@...ux.alibaba.com" <ying.huang@...ux.alibaba.com>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"senozhatsky@...omium.org" <senozhatsky@...omium.org>, "sj@...nel.org" <sj@...nel.org>,
"kasong@...cent.com" <kasong@...cent.com>,
"linux-crypto@...r.kernel.org" <linux-crypto@...r.kernel.org>,
"herbert@...dor.apana.org.au" <herbert@...dor.apana.org.au>, "davem@...emloft.net" <davem@...emloft.net>,
"clabbe@...libre.com" <clabbe@...libre.com>, "ardb@...nel.org" <ardb@...nel.org>,
"ebiggers@...gle.com" <ebiggers@...gle.com>, "surenb@...gle.com" <surenb@...gle.com>,
"Accardi, Kristen C" <kristen.c.accardi@...el.com>, "Gomes, Vinicius" <vinicius.gomes@...el.com>,
"Cabiddu, Giovanni" <giovanni.cabiddu@...el.com>, "Feghali, Wajdi K" <wajdi.k.feghali@...el.com>
Subject: Re: [PATCH v14 26/26] mm: zswap: Batched zswap_compress() for
compress batching of large folios.
On Sat, Jan 31, 2026 at 12:32 PM Sridhar, Kanchana P
<kanchana.p.sridhar@...el.com> wrote:
>
>
> > -----Original Message-----
> > From: Nhat Pham <nphamcs@...il.com>
> > Sent: Friday, January 30, 2026 5:13 PM
> > To: Sridhar, Kanchana P <kanchana.p.sridhar@...el.com>
> > Cc: linux-kernel@...r.kernel.org; linux-mm@...ck.org;
> > hannes@...xchg.org; yosry.ahmed@...ux.dev; chengming.zhou@...ux.dev;
> > usamaarif642@...il.com; ryan.roberts@....com; 21cnbao@...il.com;
> > ying.huang@...ux.alibaba.com; akpm@...ux-foundation.org;
> > senozhatsky@...omium.org; sj@...nel.org; kasong@...cent.com; linux-
> > crypto@...r.kernel.org; herbert@...dor.apana.org.au;
> > davem@...emloft.net; clabbe@...libre.com; ardb@...nel.org;
> > ebiggers@...gle.com; surenb@...gle.com; Accardi, Kristen C
> > <kristen.c.accardi@...el.com>; Gomes, Vinicius <vinicius.gomes@...el.com>;
> > Cabiddu, Giovanni <giovanni.cabiddu@...el.com>; Feghali, Wajdi K
> > <wajdi.k.feghali@...el.com>
> > Subject: Re: [PATCH v14 26/26] mm: zswap: Batched zswap_compress() for
> > compress batching of large folios.
> >
> > On Sat, Jan 24, 2026 at 7:36 PM Kanchana P Sridhar
> > <kanchana.p.sridhar@...el.com> wrote:
> > >
> > > We introduce a new batching implementation of zswap_compress() that
> > > handles both compressors that support batching and compressors that
> > > do not. This eliminates code duplication and keeps the code
> > > maintainable as compress batching is introduced.
> > >
> > > The earlier approach in zswap_store_pages(), which called
> > > zswap_compress() sequentially, one page at a time, is replaced with
> > > this new version of zswap_compress() that accepts multiple pages to
> > > compress as a batch.
> > >
> > > If the compressor does not support batching, each page in the batch
> > > is compressed and stored sequentially. If the compressor supports
> > > batching, e.g., 'deflate-iaa' (the Intel IAA hardware accelerator),
> > > the batch is compressed in parallel in hardware.
> > >
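> > > As a rough sketch of the dispatch (zswap_compress_batch(),
> > > zswap_compress_page() and the compr_supports_batching field are
> > > illustrative names, not the actual helpers in this patch):
> > >
> > >     if (pool->compr_supports_batching) {
> > >             /* e.g. deflate-iaa: compress the whole batch in parallel. */
> > >             err = zswap_compress_batch(pages, nr_pages, pool);
> > >     } else {
> > >             /* Software compressors: compress one page at a time. */
> > >             for (i = 0; i < nr_pages; i++) {
> > >                     err = zswap_compress_page(pages[i], pool);
> > >                     if (err)
> > >                             break;
> > >             }
> > >     }
> > >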
> > > If the batch is compressed without errors, the compressed buffers for
> > > the batch are stored in zsmalloc. In case of compression errors, the
> > > current behavior, which depends on whether the folio is enabled for
> > > zswap writeback, is preserved.
> > >
> > > The batched zswap_compress() incorporates Herbert's suggestion for
> > > SG lists to represent the batch's inputs/outputs to interface with the
> > > crypto API [1].
> > >
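> > > As a minimal sketch of that idea (ZSWAP_MAX_BATCH_SIZE, dst_bufs and
> > > the one-SG-entry-per-page semantics are assumptions for illustration,
> > > not necessarily the exact interface used here):
> > >
> > >     struct scatterlist inputs[ZSWAP_MAX_BATCH_SIZE];
> > >     struct scatterlist outputs[ZSWAP_MAX_BATCH_SIZE];
> > >
> > >     sg_init_table(inputs, nr_pages);
> > >     sg_init_table(outputs, nr_pages);
> > >     for (i = 0; i < nr_pages; i++) {
> > >             /* Each page of the batch becomes one SG entry. */
> > >             sg_set_page(&inputs[i], pages[i], PAGE_SIZE, 0);
> > >             sg_set_buf(&outputs[i], dst_bufs[i], PAGE_SIZE);
> > >     }
> > >     acomp_request_set_params(req, inputs, outputs,
> > >                              nr_pages * PAGE_SIZE,
> > >                              nr_pages * PAGE_SIZE);
> > >     err = crypto_acomp_compress(req);
> > >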
> > > Performance data:
> > > =================
> > > As suggested by Barry, this is the performance data gathered on Intel
> > > Sapphire Rapids with two workloads:
> > >
> > > 1) 30 usemem processes in a 150 GB memory-limited cgroup, each
> > > allocating 10 GB, i.e., effectively running at 50% memory pressure.
> > > 2) kernel_compilation "defconfig", 32 threads, cgroup memory limit set
> > > to 1.7 GiB (50% memory pressure, since baseline memory usage is 3.4
> > > GiB): data averaged across 10 runs.
> > >
> > > To keep comparisons simple, all testing was done without the
> > > zswap shrinker.
> > >
> > >
> > > =========================================================================
> > > IAA mm-unstable-1-23-2026 v14
> > >
> > > =========================================================================
> > > zswap compressor deflate-iaa deflate-iaa IAA Batching
> > > vs.
> > > IAA Sequential
> > >
> > > =========================================================================
> > > usemem30, 64K folios:
> > >
> > > Total throughput (KB/s) 6,226,967 10,551,714 69%
> > > Average throughput (KB/s) 207,565 351,723 69%
> > > elapsed time (sec) 99.19 67.45 -32%
> > > sys time (sec) 2,356.19 1,580.47 -33%
> > >
> > > usemem30, PMD folios:
> > >
> > > Total throughput (KB/s) 6,347,201 11,315,500 78%
> > > Average throughput (KB/s) 211,573 377,183 78%
> > > elapsed time (sec) 88.14 63.37 -28%
> > > sys time (sec) 2,025.53 1,455.23 -28%
> > >
> > > kernel_compilation, 64K folios:
> > >
> > > elapsed time (sec) 100.10 98.74 -1.4%
> > > sys time (sec) 308.72 301.23 -2%
> > >
> > > kernel_compilation, PMD folios:
> > >
> > > elapsed time (sec) 95.29 93.44 -1.9%
> > > sys time (sec) 346.21 344.48 -0.5%
> > >
> > > =========================================================================
> > >
> > >
> > > =========================================================================
> > > ZSTD mm-unstable-1-23-2026 v14
> > >
> > > =========================================================================
> > > zswap compressor zstd zstd v14 ZSTD
> > > Improvement
> > >
> > > =========================================================================
> > > usemem30, 64K folios:
> > >
> > > Total throughput (KB/s) 6,032,326 6,047,448 0.3%
> > > Average throughput (KB/s) 201,077 201,581 0.3%
> > > elapsed time (sec) 97.52 95.33 -2.2%
> > > sys time (sec) 2,415.40 2,328.38 -4%
> > >
> > > usemem30, PMD folios:
> > >
> > > Total throughput (KB/s) 6,570,404 6,623,962 0.8%
> > > Average throughput (KB/s) 219,013 220,798 0.8%
> > > elapsed time (sec) 89.17 88.25 -1%
> > > sys time (sec) 2,126.69 2,043.08 -4%
> > >
> > > kernel_compilation, 64K folios:
> > >
> > > elapsed time (sec) 100.89 99.98 -0.9%
> > > sys time (sec) 417.49 414.62 -0.7%
> > >
> > > kernel_compilation, PMD folios:
> > >
> > > elapsed time (sec) 98.26 97.38 -0.9%
> > > sys time (sec) 487.14 473.16 -2.9%
> > >
> > > =========================================================================
> >
> > The rest of the patch changelog (architectural and future
> > considerations) can stay in the cover letter. Let's not duplicate
> > information :)
> >
> > Keep the patch changelog limited to only the changes in the patch
> > itself (unless we need some clarifications imminently relevant).
>
> Hi Nhat,
>
> Thanks for this comment. Yosry had also pointed this out in [1]. I have
> been including the architectural and future considerations in this
> changelog since Andrew asked me to do so. I hope this is OK?
Ah hmmmmm. For some reason I was under the assumption that Andrew
would usually concatenate the patch cover letter and the patch
changelog before merging. Oh well.
If Andrew prefers including that here then I'm fine with it.
>
> [1]: https://patchwork.kernel.org/comment/26706240/
>
> >
> > I'll review the remainder of the patch later :)
>
> Sure.
>
> Thanks,
> Kanchana