Message-ID: <SJ2PR11MB8472EB3D482C1455BD5A8EFFC9C8A@SJ2PR11MB8472.namprd11.prod.outlook.com>
Date: Sun, 16 Nov 2025 18:53:08 +0000
From: "Sridhar, Kanchana P" <kanchana.p.sridhar@...el.com>
To: Herbert Xu <herbert@...dor.apana.org.au>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>, "hannes@...xchg.org"
<hannes@...xchg.org>, "yosry.ahmed@...ux.dev" <yosry.ahmed@...ux.dev>,
"nphamcs@...il.com" <nphamcs@...il.com>, "chengming.zhou@...ux.dev"
<chengming.zhou@...ux.dev>, "usamaarif642@...il.com"
<usamaarif642@...il.com>, "ryan.roberts@....com" <ryan.roberts@....com>,
"21cnbao@...il.com" <21cnbao@...il.com>, "ying.huang@...ux.alibaba.com"
<ying.huang@...ux.alibaba.com>, "akpm@...ux-foundation.org"
<akpm@...ux-foundation.org>, "senozhatsky@...omium.org"
<senozhatsky@...omium.org>, "sj@...nel.org" <sj@...nel.org>,
"kasong@...cent.com" <kasong@...cent.com>, "linux-crypto@...r.kernel.org"
<linux-crypto@...r.kernel.org>, "davem@...emloft.net" <davem@...emloft.net>,
"clabbe@...libre.com" <clabbe@...libre.com>, "ardb@...nel.org"
<ardb@...nel.org>, "ebiggers@...gle.com" <ebiggers@...gle.com>,
"surenb@...gle.com" <surenb@...gle.com>, "Accardi, Kristen C"
<kristen.c.accardi@...el.com>, "Gomes, Vinicius" <vinicius.gomes@...el.com>,
"Feghali, Wajdi K" <wajdi.k.feghali@...el.com>, "Gopal, Vinodh"
<vinodh.gopal@...el.com>, "Sridhar, Kanchana P"
<kanchana.p.sridhar@...el.com>
Subject: RE: [PATCH v13 13/22] crypto: iaa - IAA Batching for parallel
compressions/decompressions.
> -----Original Message-----
> From: Herbert Xu <herbert@...dor.apana.org.au>
> Sent: Friday, November 14, 2025 1:59 AM
> To: Sridhar, Kanchana P <kanchana.p.sridhar@...el.com>
> Cc: linux-kernel@...r.kernel.org; linux-mm@...ck.org;
> hannes@...xchg.org; yosry.ahmed@...ux.dev; nphamcs@...il.com;
> chengming.zhou@...ux.dev; usamaarif642@...il.com;
> ryan.roberts@....com; 21cnbao@...il.com;
> ying.huang@...ux.alibaba.com; akpm@...ux-foundation.org;
> senozhatsky@...omium.org; sj@...nel.org; kasong@...cent.com; linux-
> crypto@...r.kernel.org; davem@...emloft.net; clabbe@...libre.com;
> ardb@...nel.org; ebiggers@...gle.com; surenb@...gle.com; Accardi,
> Kristen C <kristen.c.accardi@...el.com>; Gomes, Vinicius
> <vinicius.gomes@...el.com>; Feghali, Wajdi K <wajdi.k.feghali@...el.com>;
> Gopal, Vinodh <vinodh.gopal@...el.com>
> Subject: Re: [PATCH v13 13/22] crypto: iaa - IAA Batching for parallel
> compressions/decompressions.
>
> On Tue, Nov 04, 2025 at 01:12:26AM -0800, Kanchana P Sridhar wrote:
> >
> > +/**
> > + * This API provides IAA compress batching functionality for use by swap
> > + * modules.
> > + *
> > + * @ctx: compression ctx for the requested IAA mode (fixed/dynamic).
> > + * @parent_req: The "parent" iaa_req that contains SG lists for the batch's
> > + * inputs and outputs.
> > + * @unit_size: The unit size to apply to @parent_req->slen to get the number of
> > + * scatterlists it contains.
> > + *
> > + * The caller should check the individual sg->lengths in the @parent_req for
> > + * errors, including incompressible page errors.
> > + *
> > + * Returns 0 if all compress requests in the batch complete successfully,
> > + * -EINVAL otherwise.
> > + */
> > +static int iaa_comp_acompress_batch(
> > +	struct iaa_compression_ctx *ctx,
> > +	struct iaa_req *parent_req,
> > +	unsigned int unit_size)
> > +{
> > +	struct iaa_batch_ctx *cpu_ctx = raw_cpu_ptr(iaa_batch_ctx);
> > +	int nr_reqs = parent_req->slen / unit_size;
> > +	int errors[IAA_CRYPTO_MAX_BATCH_SIZE];
> > +	int *dlens[IAA_CRYPTO_MAX_BATCH_SIZE];
> > +	bool compressions_done = false;
> > +	struct sg_page_iter sgiter;
> > +	struct scatterlist *sg;
> > +	struct iaa_req **reqs;
> > +	int i, err = 0;
> > +
> > +	mutex_lock(&cpu_ctx->mutex);
> > +
> > +	reqs = cpu_ctx->reqs;
> > +
> > +	__sg_page_iter_start(&sgiter, parent_req->src, nr_reqs,
> > +			     parent_req->src->offset/unit_size);
> > +
> > +	for (i = 0; i < nr_reqs; ++i, ++sgiter.sg_pgoffset) {
> > +		sg_set_page(reqs[i]->src, sg_page_iter_page(&sgiter), PAGE_SIZE, 0);
> > +		reqs[i]->slen = PAGE_SIZE;
> > +	}
> > +
> > +	for_each_sg(parent_req->dst, sg, nr_reqs, i) {
> > +		sg->length = PAGE_SIZE;
> > +		dlens[i] = &sg->length;
> > +		reqs[i]->dst = sg;
> > +		reqs[i]->dlen = PAGE_SIZE;
> > +	}
> > +
> > +	iaa_set_req_poll(reqs, nr_reqs, true);
> > +
> > +	/*
> > +	 * Prepare and submit the batch of iaa_reqs to IAA. IAA will process
> > +	 * these compress jobs in parallel.
> > +	 */
> > +	for (i = 0; i < nr_reqs; ++i) {
> > +		errors[i] = iaa_comp_acompress(ctx, reqs[i]);
> > +
> > +		if (likely(errors[i] == -EINPROGRESS)) {
> > +			errors[i] = -EAGAIN;
> > +		} else if (unlikely(errors[i])) {
> > +			*dlens[i] = errors[i];
> > +			err = -EINVAL;
> > +		} else {
> > +			*dlens[i] = reqs[i]->dlen;
> > +		}
> > +	}
> > +
> > +	/*
> > +	 * Asynchronously poll for and process IAA compress job completions.
> > +	 */
> > +	while (!compressions_done) {
> > +		compressions_done = true;
> > +
> > +		for (i = 0; i < nr_reqs; ++i) {
> > +			/*
> > +			 * Skip, if the compression has already completed
> > +			 * successfully or with an error.
> > +			 */
> > +			if (errors[i] != -EAGAIN)
> > +				continue;
> > +
> > +			errors[i] = iaa_comp_poll(ctx, reqs[i]);
> > +
> > +			if (errors[i]) {
> > +				if (likely(errors[i] == -EAGAIN)) {
> > +					compressions_done = false;
> > +				} else {
> > +					*dlens[i] = errors[i];
> > +					err = -EINVAL;
> > +				}
> > +			} else {
> > +				*dlens[i] = reqs[i]->dlen;
> > +			}
> > +		}
> > +	}
>
> Why is this polling necessary?
>
> The crypto_acomp interface is async, even if the only user that
> you're proposing is synchronous.
>
> IOW the driver shouldn't care about synchronous polling at all.
> Just invoke the callback once all the sub-requests are complete
> and the wait call in zswap will take care of the rest.

Hi Herbert,

This is a simple, low-overhead implementation that exploits hardware
parallelism by submitting multiple compress/decompress jobs to the
accelerator. Each job runs independently of the others from the driver's
perspective: for example, the driver makes no assumptions about
submission order vis-à-vis completion order, and completions can occur
asynchronously.
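
To make the structure explicit, the submission phase boils down to the
following (a simplified restatement of the code quoted above, with error
handling elided; not new driver code):

	/*
	 * Simplified restatement of the submission loop above. Each
	 * sub-request is handed to the hardware right away; -EINPROGRESS
	 * just means it is now in flight, independent of the other
	 * sub-requests in the batch.
	 */
	for (i = 0; i < nr_reqs; ++i) {
		errors[i] = iaa_comp_acompress(ctx, reqs[i]);
		if (errors[i] == -EINPROGRESS)
			errors[i] = -EAGAIN;	/* picked up by the poll loop */
	}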

The polling is intended for exactly the purpose you mention, namely to
know when all the sub-requests are complete and to set sg->length as
each sub-request completes. Please let me know if I have understood your
question correctly.
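
If it helps confirm that I've understood your suggestion, below is a
rough sketch of how I read it (the names are illustrative only, not code
from this patch series): each sub-request would drop a shared pending
count from its completion path, and the last one to finish would invoke
the parent's callback so that the wait in zswap resumes.

	/*
	 * Illustrative sketch only -- not part of this patch. A shared
	 * completion context for the batch, along the lines you describe:
	 * every sub-request decrements the pending count when it finishes,
	 * and the last one to complete invokes the parent's callback.
	 */
	struct iaa_parent_completion {
		atomic_t pending;			/* sub-requests still in flight */
		int err;				/* first error seen, if any */
		void (*complete)(void *data, int err);	/* parent request's callback */
		void *data;
	};

	static void iaa_sub_req_done(struct iaa_parent_completion *pc, int err)
	{
		if (err)
			pc->err = err;	/* simplified; a real version needs more care here */

		/* Last completion fires the parent callback; zswap's wait then resumes. */
		if (atomic_dec_and_test(&pc->pending))
			pc->complete(pc->data, pc->err);
	}
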
Thanks,
Kanchana
>
> Cheers,
> --
> Email: Herbert Xu <herbert@...dor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt