Message-ID: <aBLRKuhJOSF8kGZv@gondor.apana.org.au>
Date: Thu, 1 May 2025 09:40:58 +0800
From: Herbert Xu <herbert@...dor.apana.org.au>
To: Kanchana P Sridhar <kanchana.p.sridhar@...el.com>
Cc: linux-kernel@...r.kernel.org, linux-mm@...ck.org, hannes@...xchg.org,
yosry.ahmed@...ux.dev, nphamcs@...il.com, chengming.zhou@...ux.dev,
usamaarif642@...il.com, ryan.roberts@....com, 21cnbao@...il.com,
ying.huang@...ux.alibaba.com, akpm@...ux-foundation.org,
linux-crypto@...r.kernel.org, davem@...emloft.net,
clabbe@...libre.com, ardb@...nel.org, ebiggers@...gle.com,
surenb@...gle.com, kristen.c.accardi@...el.com,
wajdi.k.feghali@...el.com, vinodh.gopal@...el.com
Subject: Re: [PATCH v9 10/19] crypto: acomp - New interfaces to facilitate
batching support in acomp & drivers.
On Wed, Apr 30, 2025 at 01:52:56PM -0700, Kanchana P Sridhar wrote:
>
> @@ -127,6 +131,22 @@ struct acomp_req {
> struct crypto_acomp {
> int (*compress)(struct acomp_req *req);
> int (*decompress)(struct acomp_req *req);
> + unsigned int (*get_batch_size)(void);
> + bool (*batch_compress)(
> + struct acomp_req *reqs[],
> + struct page *pages[],
> + u8 *dsts[],
> + unsigned int dlens[],
> + int errors[],
> + int nr_reqs);
> + bool (*batch_decompress)(
> + struct acomp_req *reqs[],
> + u8 *srcs[],
> + struct page *pages[],
> + unsigned int slens[],
> + unsigned int dlens[],
> + int errors[],
> + int nr_reqs);
I shelved request chaining because allocating one request per page
is actively harmful to performance. So we should not add any
interface that is based on one request per page.
My plan is to supply a whole folio through acomp_request_set_src_folio
and mark it as a batch request with a data unit size of 4K, e.g.:
acomp_request_set_src_folio(req, folio, 0, len);
acomp_request_set_data_unit(req, 4096);
Then the algorithm can dice it up in whatever way it sees fit. For
algorithms that don't support batching, the acompress API should dice
it up and feed it to the algorithm piecemeal.
IOW the folio loop in zswap_store would be moved into the Crypto API.
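As a rough illustration of the piecemeal fallback described above, here
is a minimal userspace model (not kernel code — the names
compress_piecemeal/compress_one are hypothetical stand-ins): the API
walks the source in data-unit-sized chunks and feeds each one to a
non-batching single-shot compress:

```c
#include <assert.h>
#include <stddef.h>

#define DATA_UNIT 4096

/* Stand-in for a single-shot compress call; here it just tallies bytes. */
static int compress_one(const unsigned char *src, size_t slen, size_t *out)
{
	*out += slen;
	return 0;
}

/*
 * Model of the proposed fallback: walk the source buffer in
 * data-unit-sized pieces and feed each piece to the algorithm.
 * A short tail (less than one unit) is passed through as-is.
 */
static int compress_piecemeal(const unsigned char *src, size_t len,
			      size_t unit, size_t *total)
{
	size_t off;

	for (off = 0; off < len; off += unit) {
		size_t chunk = len - off < unit ? len - off : unit;
		int ret = compress_one(src + off, chunk, total);

		if (ret)
			return ret;
	}
	return 0;
}
```

In the real API the loop body would submit one data unit of the folio
to the algorithm; the sketch only shows the chunking arithmetic.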
This is contingent on one API change, bringing back NULL dst support
to acompress. This way zswap does not need to worry about allocating
memory that might not even be needed (when pages compress well).
This won't look like the useless NULL dst support we had before, which
simply pre-allocated memory rather than allocating it on demand.
What acompress should do is allocate one dst page at a time; once that
is filled up, allocate one more. The pages should be chained up in an
SG list. Pages that do not compress can be marked as zero-length
entries in the SG list.
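A toy model of that dst layout (struct sg_entry and fill_dst_sg are
hypothetical stand-ins for struct scatterlist and the real chaining
code): one entry per source page, with length zero marking a page
whose compressed size was no better than PAGE_SIZE:

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096

/* Stand-in for struct scatterlist; only the length field matters here. */
struct sg_entry {
	size_t length;
};

/*
 * Model of the proposed dst convention: given the compressed size of
 * each source page, emit one SG entry per page, using a zero-length
 * entry to flag a page that did not compress (>= PAGE_SIZE).
 */
static void fill_dst_sg(const size_t *clen, int n, struct sg_entry *sg)
{
	int i;

	for (i = 0; i < n; i++)
		sg[i].length = clen[i] >= PAGE_SIZE ? 0 : clen[i];
}
```

zswap would then store the zero-length pages uncompressed and the rest
from the chained dst pages.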
If an allocation fails at any point, simply stop the batching there
and return the SG list of what has been compressed so far. After
processing the returned pages, zswap can then call acompress again
with an offset into the folio to continue compression.
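The resume-with-offset flow might look like the following userspace
model (compress_partial is a hypothetical stand-in; "alloc_budget"
models how many dst pages were allocated before a failure). Each call
advances by at most that many data units and returns the new offset,
so the caller just loops until the folio is done:

```c
#include <assert.h>
#include <stddef.h>

#define DATA_UNIT 4096

/*
 * Model of partial progress: compress at most alloc_budget data units
 * starting at off, and return the offset reached. With a budget of at
 * least one (the pre-allocated seed page), forward progress is
 * guaranteed on every call.
 */
static size_t compress_partial(size_t off, size_t len, size_t alloc_budget)
{
	size_t units = (len - off + DATA_UNIT - 1) / DATA_UNIT;
	size_t end;

	if (units > alloc_budget)
		units = alloc_budget;
	end = off + units * DATA_UNIT;
	return end > len ? len : end;
}
```

The caller-side loop is then just: process the returned SG list, then
call again with the returned offset until it reaches len.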
To prevent pathological cases of zero progress, zswap can provide
one pre-allocated page to seed the process. For iaa, it should
just allocate as many pages as it needs for batching, and if that
fails, simply fall back to no batching and do things one page at
a time (or with however many pages it manages to allocate).
I'll whip up a quick POC and we can work on top of it.
Cheers,
--
Email: Herbert Xu <herbert@...dor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt