lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <SA3PR11MB81201972CB6D308FB4659E7DC93AA@SA3PR11MB8120.namprd11.prod.outlook.com>
Date: Fri, 29 Aug 2025 18:39:30 +0000
From: "Sridhar, Kanchana P" <kanchana.p.sridhar@...el.com>
To: Barry Song <21cnbao@...il.com>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>, "hannes@...xchg.org"
	<hannes@...xchg.org>, "yosry.ahmed@...ux.dev" <yosry.ahmed@...ux.dev>,
	"nphamcs@...il.com" <nphamcs@...il.com>, "chengming.zhou@...ux.dev"
	<chengming.zhou@...ux.dev>, "usamaarif642@...il.com"
	<usamaarif642@...il.com>, "ryan.roberts@....com" <ryan.roberts@....com>,
	"ying.huang@...ux.alibaba.com" <ying.huang@...ux.alibaba.com>,
	"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
	"senozhatsky@...omium.org" <senozhatsky@...omium.org>,
	"linux-crypto@...r.kernel.org" <linux-crypto@...r.kernel.org>,
	"herbert@...dor.apana.org.au" <herbert@...dor.apana.org.au>,
	"davem@...emloft.net" <davem@...emloft.net>, "clabbe@...libre.com"
	<clabbe@...libre.com>, "ardb@...nel.org" <ardb@...nel.org>,
	"ebiggers@...gle.com" <ebiggers@...gle.com>, "surenb@...gle.com"
	<surenb@...gle.com>, "Accardi, Kristen C" <kristen.c.accardi@...el.com>,
	"Gomes, Vinicius" <vinicius.gomes@...el.com>, "Feghali, Wajdi K"
	<wajdi.k.feghali@...el.com>, "Gopal, Vinodh" <vinodh.gopal@...el.com>,
	"Sridhar, Kanchana P" <kanchana.p.sridhar@...el.com>
Subject: RE: [PATCH v11 22/24] mm: zswap: Allocate pool batching resources if
 the compressor supports batching.


> -----Original Message-----
> From: Barry Song <21cnbao@...il.com>
> Sent: Thursday, August 28, 2025 8:42 PM
> To: Sridhar, Kanchana P <kanchana.p.sridhar@...el.com>
> Cc: linux-kernel@...r.kernel.org; linux-mm@...ck.org;
> hannes@...xchg.org; yosry.ahmed@...ux.dev; nphamcs@...il.com;
> chengming.zhou@...ux.dev; usamaarif642@...il.com;
> ryan.roberts@....com; ying.huang@...ux.alibaba.com; akpm@...ux-
> foundation.org; senozhatsky@...omium.org; linux-crypto@...r.kernel.org;
> herbert@...dor.apana.org.au; davem@...emloft.net;
> clabbe@...libre.com; ardb@...nel.org; ebiggers@...gle.com;
> surenb@...gle.com; Accardi, Kristen C <kristen.c.accardi@...el.com>;
> Gomes, Vinicius <vinicius.gomes@...el.com>; Feghali, Wajdi K
> <wajdi.k.feghali@...el.com>; Gopal, Vinodh <vinodh.gopal@...el.com>
> Subject: Re: [PATCH v11 22/24] mm: zswap: Allocate pool batching resources
> if the compressor supports batching.
> 
> On Fri, Aug 29, 2025 at 10:57 AM Sridhar, Kanchana P
> <kanchana.p.sridhar@...el.com> wrote:
> >
> >
> > > -----Original Message-----
> > > From: Barry Song <21cnbao@...il.com>
> > > Sent: Thursday, August 28, 2025 4:29 PM
> > > To: Sridhar, Kanchana P <kanchana.p.sridhar@...el.com>
> > > Cc: linux-kernel@...r.kernel.org; linux-mm@...ck.org;
> > > hannes@...xchg.org; yosry.ahmed@...ux.dev; nphamcs@...il.com;
> > > chengming.zhou@...ux.dev; usamaarif642@...il.com;
> > > ryan.roberts@....com; ying.huang@...ux.alibaba.com; akpm@...ux-
> > > foundation.org; senozhatsky@...omium.org; linux-
> crypto@...r.kernel.org;
> > > herbert@...dor.apana.org.au; davem@...emloft.net;
> > > clabbe@...libre.com; ardb@...nel.org; ebiggers@...gle.com;
> > > surenb@...gle.com; Accardi, Kristen C <kristen.c.accardi@...el.com>;
> > > Gomes, Vinicius <vinicius.gomes@...el.com>; Feghali, Wajdi K
> > > <wajdi.k.feghali@...el.com>; Gopal, Vinodh <vinodh.gopal@...el.com>
> > > Subject: Re: [PATCH v11 22/24] mm: zswap: Allocate pool batching
> resources
> > > if the compressor supports batching.
> > >
> > > > >
> > > > > If ZSWAP_MAX_BATCH_SIZE is set to 8 and there is no hardware
> batching,
> > > > > compression is done with a step size of 1. If the hardware step size is
> 4,
> > > > > compression occurs in two steps. If the hardware step size is 6, the
> first
> > > > > compression uses a step size of 6, and the second uses a step size of 2.
> > > > > Do you think this will work?
> > > >
> > > > Hi Barry,
> > > >
> > > > This would be non-optimal from code simplicity and latency
> perspectives.
> > > > One of the benefits of using the hardware accelerator's "batch
> parallelism"
> > > > is cost amortization across the batch. We might lose this benefit if we
> make
> > > > multiple calls to zswap_compress() to ask the hardware accelerator to
> > > > batch compress in smaller batches. Compression throughput would also
> > > > be sub-optimal.
> > >
> > > I guess it wouldn’t be an issue if both ZSWAP_MAX_BATCH_SIZE and
> > > pool->compr_batch_size are powers of two. As you mentioned, we still
> > > gain improvement with ZSWAP_MAX_BATCH_SIZE batching even when
> > > pool->compr_batch_size == 1, by compressing pages one by one but
> > > batching other work such as zswap_entries_cache_alloc_batch() ?
> > >
> > > >
> > > > In my patch-series, the rule is simple: if an algorithm has specified a
> > > > batch-size, carve out the folio by that "batch-size" # of pages to be
> > > > compressed as a batch in zswap_compress(). This custom batch-size
> > > > is capped at ZSWAP_MAX_BATCH_SIZE.
> > > >
> > > > If an algorithm has not specified a batch-size, the default batch-size
> > > > is 1. In this case, carve out the folio by ZSWAP_MAX_BATCH_SIZE
> > > > # of pages to be compressed as a batch in zswap_compress().
> > >
> > > Yes, I understand your rule. However, having two global variables is still
> > > somewhat confusing. It might be clearer to use a single variable with a
> > > comment, since one variable can clearly determine the value of the other.
> > >
> > > Can we get the batch_size at runtime based on pool->compr_batch_size?
> > >
> > > /*
> > >  * If hardware compression supports batching, we use the hardware step
> size.
> > >  * Otherwise, we use ZSWAP_MAX_BATCH_SIZE for batching, but still
> > > compress
> > >  * one page at a time.
> > >  */
> > > batch_size = pool->compr_batch_size > 1 ? pool->compr_batch_size :
> > >              ZSWAP_MAX_BATCH_SIZE;
> > >
> > > We probably don’t need this if both pool->compr_batch_size and
> > > ZSWAP_MAX_BATCH_SIZE are powers of two?
> >
> > I am not sure I understand this rationale, but I do want to reiterate
> > that the patch-set implements a simple set of rules/design choices
> > to provide a batching framework for software and hardware compressors,
> > that has shown good performance improvements with both, while
> > unifying zswap_store()/zswap_compress() code paths for both.
> 
> I’m really curious: if ZSWAP_MAX_BATCH_SIZE = 8 and
> compr_batch_size = 4, why wouldn’t batch_size = 8 and
> compr_batch_size = 4 perform better than batch_size = 4 and
> compr_batch_size = 4?
> 
> In short, I’d like the case of compr_batch_size == 1 to be treated the same
> as compr_batch_size == 2, 4, etc., since you can still see performance
> improvements when ZSWAP_MAX_BATCH_SIZE = 8 and compr_batch_size ==
> 1,
> as batching occurs even outside compression.
> 
> Therefore, I would expect batch_size == 8 and compr_batch_size == 2 to
> perform better than when both are 2.
> 
> The only thing preventing this from happening is that compr_batch_size
> might be 5, 6, or 7, which are not powers of two?

It would be interesting to see if a generalization of pool->compr_batch_size
being a factor "N" (where N > 1) of ZSWAP_MAX_BATCH_SIZE yields better
performance than the current set of rules. However, we would still need to
handle the case where it is not, as you mention, which might still necessitate
the use of a distinct pool->batch_size to avoid re-calculating this dynamically,
when this information doesn't change after pool creation.

The current implementation gives preference to the algorithm to determine
not just the batch compression step-size, but also the working-set size for
other zswap processing for the batch, i.e., bulk allocation of entries,
zpool writes, etc. The algorithm's batch-size is what zswap uses for the latter
(the zswap_store_pages() in my patch-set). This has been shown to work
well.

To change this design to be driven instead by ZSWAP_MAX_BATCH_SIZE
always (while handling non-factor pool->compr_batch_size) requires more
data gathering. I am inclined to keep the existing implementation and
we can continue to improve upon this if its Ok with you.

> 
> >
> > As explained before, keeping the two variables as distinct u8 members
> > of struct zswap_pool is a design choice with these benefits:
> >
> > 1) Saves computes by avoiding computing this in performance-critical
> >     zswap_store() code. I have verified that dynamically computing the
> >     batch_size based on pool->compr_batch_size impacts latency.
> 
> Ok, I’m a bit surprised, since this small computation wouldn’t
> cause a branch misprediction at all.
> 
> In any case, if you want to keep both variables, that’s fine.
> But can we at least rename one of them? For example, use
> pool->store_batch_size and pool->compr_batch_size instead of
> pool->batch_size and pool->compr_batch_size, since pool->batch_size
> generally has a broader semantic scope than compr_batch_size.

Sure. I will change pool->batch_size to be pool->store_batch_size.

Thanks,
Kanchana

> 
> Thanks
> Barry

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ