Message-ID: <SJ2PR11MB8472EB3D482C1455BD5A8EFFC9C8A@SJ2PR11MB8472.namprd11.prod.outlook.com>
Date: Sun, 16 Nov 2025 18:53:08 +0000
From: "Sridhar, Kanchana P" <kanchana.p.sridhar@...el.com>
To: Herbert Xu <herbert@...dor.apana.org.au>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>, "hannes@...xchg.org"
<hannes@...xchg.org>, "yosry.ahmed@...ux.dev" <yosry.ahmed@...ux.dev>,
"nphamcs@...il.com" <nphamcs@...il.com>, "chengming.zhou@...ux.dev"
<chengming.zhou@...ux.dev>, "usamaarif642@...il.com"
<usamaarif642@...il.com>, "ryan.roberts@....com" <ryan.roberts@....com>,
"21cnbao@...il.com" <21cnbao@...il.com>, "ying.huang@...ux.alibaba.com"
<ying.huang@...ux.alibaba.com>, "akpm@...ux-foundation.org"
<akpm@...ux-foundation.org>, "senozhatsky@...omium.org"
<senozhatsky@...omium.org>, "sj@...nel.org" <sj@...nel.org>,
"kasong@...cent.com" <kasong@...cent.com>, "linux-crypto@...r.kernel.org"
<linux-crypto@...r.kernel.org>, "davem@...emloft.net" <davem@...emloft.net>,
"clabbe@...libre.com" <clabbe@...libre.com>, "ardb@...nel.org"
<ardb@...nel.org>, "ebiggers@...gle.com" <ebiggers@...gle.com>,
"surenb@...gle.com" <surenb@...gle.com>, "Accardi, Kristen C"
<kristen.c.accardi@...el.com>, "Gomes, Vinicius" <vinicius.gomes@...el.com>,
"Feghali, Wajdi K" <wajdi.k.feghali@...el.com>, "Gopal, Vinodh"
<vinodh.gopal@...el.com>, "Sridhar, Kanchana P"
<kanchana.p.sridhar@...el.com>
Subject: RE: [PATCH v13 13/22] crypto: iaa - IAA Batching for parallel
compressions/decompressions.
> -----Original Message-----
> From: Herbert Xu <herbert@...dor.apana.org.au>
> Sent: Friday, November 14, 2025 1:59 AM
> To: Sridhar, Kanchana P <kanchana.p.sridhar@...el.com>
> Cc: linux-kernel@...r.kernel.org; linux-mm@...ck.org;
> hannes@...xchg.org; yosry.ahmed@...ux.dev; nphamcs@...il.com;
> chengming.zhou@...ux.dev; usamaarif642@...il.com;
> ryan.roberts@....com; 21cnbao@...il.com;
> ying.huang@...ux.alibaba.com; akpm@...ux-foundation.org;
> senozhatsky@...omium.org; sj@...nel.org; kasong@...cent.com; linux-
> crypto@...r.kernel.org; davem@...emloft.net; clabbe@...libre.com;
> ardb@...nel.org; ebiggers@...gle.com; surenb@...gle.com; Accardi,
> Kristen C <kristen.c.accardi@...el.com>; Gomes, Vinicius
> <vinicius.gomes@...el.com>; Feghali, Wajdi K <wajdi.k.feghali@...el.com>;
> Gopal, Vinodh <vinodh.gopal@...el.com>
> Subject: Re: [PATCH v13 13/22] crypto: iaa - IAA Batching for parallel
> compressions/decompressions.
>
> On Tue, Nov 04, 2025 at 01:12:26AM -0800, Kanchana P Sridhar wrote:
> >
> > +/**
> > + * This API provides IAA compress batching functionality for use by swap
> > + * modules.
> > + *
> > + * @ctx: compression ctx for the requested IAA mode (fixed/dynamic).
> > + * @parent_req: The "parent" iaa_req that contains SG lists for the batch's
> > + * inputs and outputs.
> > + * @unit_size: The unit size to apply to @parent_req->slen to get the number of
> > + * scatterlists it contains.
> > + *
> > + * The caller should check the individual sg->lengths in the @parent_req for
> > + * errors, including incompressible page errors.
> > + *
> > + * Returns 0 if all compress requests in the batch complete successfully,
> > + * -EINVAL otherwise.
> > + */
> > +static int iaa_comp_acompress_batch(
> > +	struct iaa_compression_ctx *ctx,
> > +	struct iaa_req *parent_req,
> > +	unsigned int unit_size)
> > +{
> > +	struct iaa_batch_ctx *cpu_ctx = raw_cpu_ptr(iaa_batch_ctx);
> > +	int nr_reqs = parent_req->slen / unit_size;
> > +	int errors[IAA_CRYPTO_MAX_BATCH_SIZE];
> > +	int *dlens[IAA_CRYPTO_MAX_BATCH_SIZE];
> > +	bool compressions_done = false;
> > +	struct sg_page_iter sgiter;
> > +	struct scatterlist *sg;
> > +	struct iaa_req **reqs;
> > +	int i, err = 0;
> > +
> > +	mutex_lock(&cpu_ctx->mutex);
> > +
> > +	reqs = cpu_ctx->reqs;
> > +
> > +	__sg_page_iter_start(&sgiter, parent_req->src, nr_reqs,
> > +			     parent_req->src->offset/unit_size);
> > +
> > +	for (i = 0; i < nr_reqs; ++i, ++sgiter.sg_pgoffset) {
> > +		sg_set_page(reqs[i]->src, sg_page_iter_page(&sgiter), PAGE_SIZE, 0);
> > +		reqs[i]->slen = PAGE_SIZE;
> > +	}
> > +
> > +	for_each_sg(parent_req->dst, sg, nr_reqs, i) {
> > +		sg->length = PAGE_SIZE;
> > +		dlens[i] = &sg->length;
> > +		reqs[i]->dst = sg;
> > +		reqs[i]->dlen = PAGE_SIZE;
> > +	}
> > +
> > +	iaa_set_req_poll(reqs, nr_reqs, true);
> > +
> > +	/*
> > +	 * Prepare and submit the batch of iaa_reqs to IAA. IAA will process
> > +	 * these compress jobs in parallel.
> > +	 */
> > +	for (i = 0; i < nr_reqs; ++i) {
> > +		errors[i] = iaa_comp_acompress(ctx, reqs[i]);
> > +
> > +		if (likely(errors[i] == -EINPROGRESS)) {
> > +			errors[i] = -EAGAIN;
> > +		} else if (unlikely(errors[i])) {
> > +			*dlens[i] = errors[i];
> > +			err = -EINVAL;
> > +		} else {
> > +			*dlens[i] = reqs[i]->dlen;
> > +		}
> > +	}
> > +
> > +	/*
> > +	 * Asynchronously poll for and process IAA compress job completions.
> > +	 */
> > +	while (!compressions_done) {
> > +		compressions_done = true;
> > +
> > +		for (i = 0; i < nr_reqs; ++i) {
> > +			/*
> > +			 * Skip, if the compression has already completed
> > +			 * successfully or with an error.
> > +			 */
> > +			if (errors[i] != -EAGAIN)
> > +				continue;
> > +
> > +			errors[i] = iaa_comp_poll(ctx, reqs[i]);
> > +
> > +			if (errors[i]) {
> > +				if (likely(errors[i] == -EAGAIN)) {
> > +					compressions_done = false;
> > +				} else {
> > +					*dlens[i] = errors[i];
> > +					err = -EINVAL;
> > +				}
> > +			} else {
> > +				*dlens[i] = reqs[i]->dlen;
> > +			}
> > +		}
> > +	}
>
> Why is this polling necessary?
>
> The crypto_acomp interface is async, even if the only user that
> you're proposing is synchronous.
>
> IOW the driver shouldn't care about synchronous polling at all.
> Just invoke the callback once all the sub-requests are complete
> and the wait call in zswap will take care of the rest.

Hi Herbert,

This is a simple, low-overhead implementation that exploits hardware
parallelism by submitting multiple compress/decompress jobs to the
accelerator. Each job runs independently of the others from the driver's
perspective: for example, the driver makes no assumptions about
submission order vis-à-vis completion order, and completions can occur
asynchronously.
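
To make the structure explicit, the submission phase boils down to the
following (a simplified restatement of the code quoted above, with error
handling elided; not new driver code):

	/*
	 * Simplified restatement of the submission loop above. Each
	 * sub-request is handed to the hardware right away; -EINPROGRESS
	 * just means it is now in flight, independent of the other
	 * sub-requests in the batch.
	 */
	for (i = 0; i < nr_reqs; ++i) {
		errors[i] = iaa_comp_acompress(ctx, reqs[i]);
		if (errors[i] == -EINPROGRESS)
			errors[i] = -EAGAIN;	/* picked up by the poll loop */
	}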

The polling is intended for exactly the purpose you mention, namely to
know when all the sub-requests are complete and to set sg->length as
each sub-request completes. Please let me know if I have understood your
question correctly.
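
If it helps confirm that I've understood your suggestion, below is a
rough sketch of how I read it (the names are illustrative only, not code
from this patch series): each sub-request would drop a shared pending
count from its completion path, and the last one to finish would invoke
the parent's callback so that the wait in zswap resumes.

	/*
	 * Illustrative sketch only -- not part of this patch. A shared
	 * completion context for the batch, along the lines you describe:
	 * every sub-request decrements the pending count when it finishes,
	 * and the last one to complete invokes the parent's callback.
	 */
	struct iaa_parent_completion {
		atomic_t pending;			/* sub-requests still in flight */
		int err;				/* first error seen, if any */
		void (*complete)(void *data, int err);	/* parent request's callback */
		void *data;
	};

	static void iaa_sub_req_done(struct iaa_parent_completion *pc, int err)
	{
		if (err)
			pc->err = err;	/* simplified; a real version needs more care here */

		/* Last completion fires the parent callback; zswap's wait then resumes. */
		if (atomic_dec_and_test(&pc->pending))
			pc->complete(pc->data, pc->err);
	}
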
Thanks,
Kanchana
>
> Cheers,
> --
> Email: Herbert Xu <herbert@...dor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt