lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <keys236tojsj7a4lx6tyqtr3hbhvtjtkbpb73zejgzxmegjwrb@i2xkzvgp5ake>
Date: Fri, 14 Nov 2025 19:44:14 +0000
From: Yosry Ahmed <yosry.ahmed@...ux.dev>
To: "Sridhar, Kanchana P" <kanchana.p.sridhar@...el.com>
Cc: SeongJae Park <sj@...nel.org>, 
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, "linux-mm@...ck.org" <linux-mm@...ck.org>, 
	"hannes@...xchg.org" <hannes@...xchg.org>, "nphamcs@...il.com" <nphamcs@...il.com>, 
	"chengming.zhou@...ux.dev" <chengming.zhou@...ux.dev>, "usamaarif642@...il.com" <usamaarif642@...il.com>, 
	"ryan.roberts@....com" <ryan.roberts@....com>, "21cnbao@...il.com" <21cnbao@...il.com>, 
	"ying.huang@...ux.alibaba.com" <ying.huang@...ux.alibaba.com>, "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>, 
	"senozhatsky@...omium.org" <senozhatsky@...omium.org>, "kasong@...cent.com" <kasong@...cent.com>, 
	"linux-crypto@...r.kernel.org" <linux-crypto@...r.kernel.org>, "herbert@...dor.apana.org.au" <herbert@...dor.apana.org.au>, 
	"davem@...emloft.net" <davem@...emloft.net>, "clabbe@...libre.com" <clabbe@...libre.com>, 
	"ardb@...nel.org" <ardb@...nel.org>, "ebiggers@...gle.com" <ebiggers@...gle.com>, 
	"surenb@...gle.com" <surenb@...gle.com>, "Accardi, Kristen C" <kristen.c.accardi@...el.com>, 
	"Gomes, Vinicius" <vinicius.gomes@...el.com>, "Feghali, Wajdi K" <wajdi.k.feghali@...el.com>, 
	"Gopal, Vinodh" <vinodh.gopal@...el.com>
Subject: Re: [PATCH v13 22/22] mm: zswap: Batched zswap_compress() with
 compress batching of large folios.

On Fri, Nov 14, 2025 at 07:23:42PM +0000, Sridhar, Kanchana P wrote:
[..]
 > > > > >
> > > > > > > +		if (err && !wb_enabled)
> > > > > > > +			goto compress_error;
> > > > > > > +
> > > > > > > +		for_each_sg(acomp_ctx->sg_outputs->sgl, sg,
> > nr_comps, k) {
> > > > > > > +			j = k + i;
> > > > > >
[..]
> > > > >
> > > > > >
> > > > > > > +			dst = acomp_ctx->buffers[k];
> > > > > > > +			dlen = sg->length | *errp;
> > > > > >
> > > > > > Why are we doing this?
> > > > > >
> > > > > > > +
> > > > > > > +			if (dlen < 0) {
> > > > > >
> > > > > > We should do the incompressible page handling also if dlen is
> > PAGE_SIZE,
> > > > > > or if the compression failed (I guess that's the intention of bit OR'ing
> > > > > > with *errp?)
> > > > >
> > > > > Yes, indeed: that's the intention of bit OR'ing with *errp.
> > > >
> > > > ..and you never really answered my question. In the exising code we
> > > > store the page as incompressible if writeback is enabled AND
> > > > crypto_wait_req() fails or dlen is zero or PAGE_SIZE. We check above
> > > > if crypto_wait_req() fails and writeback is disabled, but what about the
> > > > rest?
> > >
> > > Let me explain this some more. The new code only relies on the assumption
> > > that if dlen is zero or >= PAGE_SIZE, the compressor will not return a 0
> > > ("successful status"). In other words, the compressor will return an error
> > status
> > > in this case, which is expected to be a negative error code.
> > 
> > I am not sure if all compressors do that, especially for the case where
> > dlen >= PAGE_SIZE. The existing code does not assume that there will be
> > an error code in these cases.
> > 
> > For the dlen == 0 case, the check was introduced recently by commit
> > dca4437a5861 ("mm/zswap: store <PAGE_SIZE compression failed page
> > as-is"). Looking through the history it seems like it was introduced in
> > v4 of that patch but I don't see the reasoning.
> 
> The existing code did not check for dlen == 0 and dlen >= PAGE_SIZE
> prior to commit dca4437a5861 ("mm/zswap: store <PAGE_SIZE compression
> failed page as-is") either. We need SeongJae or Herbert to clarify whether
> this check is needed, or if it is sufficient to rely on comp_ret, the return from
> crypto_wait_req().
> 
> > 
> > SeongJae, did you observe any compressors returning dlen == 0 but no
> > error code? What was the reasoning behind the dlen == 0 check?
> > 
> > >
> > > Under these (hopefully valid) assumptions, the code handles the simple case
> > > of an error compression return status and writeback is disabled, by the
> > > "goto compress_error".
> > >
> > > The rest is handled by these:
> > >
> > > 1) First, I need to adapt the use of sg_outputs->sgl->length to represent the
> > > compress length for software compressors, so I do this after
> > crypto_wait_req()
> > > returns:
> > >
> > >                 acomp_ctx->sg_outputs->sgl->length = acomp_ctx->req->dlen;
> > 
> > For SW compressors, why is acomp_ctx->sg_outputs->sgl->length not set?
> > IIUC we are using the same API for SW and HW compressors, why is the
> > output length in different places for each of them?
> 
> This is to first implement the SG lists batching interface in iaa_crypto, while
> maintaining backward compatibility for SW compressors with the new API.
> I believe we may want to adapt the crypto API to SW compressors
> at a later point. I also believe this would be outside the scope of this patch.
> It would be nice if Herbert can share his vision on this aspect.
> 
> > 
> > >
> > > I did not want to propose any changes to crypto software compressors
> > protocols.
> > >
> > > 2) After the check for the "if (err && !wb_enabled)" case, the new code has
> > this:
> > >
> > >                 for_each_sg(acomp_ctx->sg_outputs->sgl, sg, nr_comps, k) {
> > >                         j = k + i;
> > >                         dst = acomp_ctx->buffers[k];
> > >                         dlen = sg->length | *errp;
> > >
> > >                         if (dlen < 0) {
> > >                                 dlen = PAGE_SIZE;
> > >                                 dst = kmap_local_page(folio_page(folio, start + j));
> > >                         }
> > >
> > > For batching compressors, namely, iaa_crypto, the individual output SG
> > > lists sg->length follows the requirements from Herbert: each sg->length
> > > is the compressed length or the error status (a negative error code).
> > >
> > > Then all I need to know whether to store the page as incompressible
> > > is to either directly test if sg->length is negative (for batching compressors),
> > > or sg->length bit-OR'ed with the crypto_wait_req() return status ("err")
> > > is negative. This is accomplished by the "dlen = sg->length | *errp;".
> > >
> > > I believe this maintains backward compatibility with the existing code.
> > > Please let me know if you agree.
> > 
> > For batching compressors, will 'err' be set as well, or just sg->length?
> > If it's just sg->length, then we need to check again if WB is enabled
> > here before storing the page uncompressed. Right?
> 
> iaa_crypto will set 'err' and set the sg->length as per the batching interface
> spec from Herbert.

So both 'err' and sg->length will contain the same error? In this case
why do we need to check if dlen < 0? Shouldn't checking 'err' be
sufficient? and it would work for both SW and HW and we wouldn't need
errp. Did I miss something?

> 
> > 
> > >
> > > >
> > > > We don't check again if writeback is enabled before storing the page is
> > > > incompressible, and we do not check if dlen is zero or PAGE_SIZE. Are
> > > > these cases no longer possible?
> > >
> > > Hope the above explanation clarifies things some more? These case
> > > are possible, and as long as they return an error status, they should be
> > > correctly handled by the new code.
> > 
> > As mentioned above, I am not sure if that's correct for dlen >=
> > PAGE_SIZE.
> 
> We need to get clarity on this from SeongJae/Herbert.
> 
> > 
> > >
> > > >
> > > > Also, why use errp, why not explicitly use the appropriate error code?
> > > > It's also unclear to me why the error code is always zero with HW
> > > > compression?
> > >
> > > This is because of the sg->length requirements (compressed length/error)
> > > for the batching interface suggested by Herbert. Hence, I upfront define
> > > err_sg to 0, and, set errp to &err_sg for batching compressors. For software
> > > compressors, errp is set to &err, namely, the above check will always apply
> > > the software compressor's error status to the compressed length via
> > > the bit-OR to determine if the page needs to be stored uncompressed.
> > 
> > Thanks for the clarification. I understand that the error code has
> > different sources for SW and HW compressors, but I do not like using
> > errp as an indirection. It makes the code unclear. I would rather we
> > explicitly check err for SW compressors and dlen for HW compressors.
> > 
> > That being said, I thought what Herbert suggested was that the same API
> > is used for both SW and HW compressors. IOW, either way we submit a
> > batch of pages (8 pages for SW compressors), and then the crypto API
> > would either give the entire batch to the compressor if it supports
> > batching, or loop over them internally and hand them page-by-page to
> > the compressor.
> 
> That was not how I understood Herbert's suggestion for the batching interface.
> He did suggest the following:
> 
> "Before the call to acomp, the destination SG list should contain as
> many elements as the number of units.  On return, the dst lengths
> should be stored in each destination SG entry."
> 
> I have incorporated this suggestion in the iaa_crypto driver. For SW
> compressors, I have tried not to propose any API changes, while making
> sure that the zswap changes for the SG lists batching API work as expected
> for SW without too much special-casing code.
> 
> I suppose I always assumed that we would update SW compressors later,
> and not as part of this patch-set.

I am not sure I understand what changes lie in the crypto layer and what
changes lie in the SW compressors. I am not suggesting we do any
modification to the SW compressors.

I imagined that the crypto layer would present a uniform API regardless
of whether or not the compressor supports batching. Ideally zswap would
pass in a batch to crypto and it would figure out if it needs to break
them down or not. Then the output length and errors would be presented
uniformly to the caller.

That being said, I am not at all familiar with how crypto works and how
straightforward that would be. Herbert, WDYT?

> 
> > 
> > This would simplify usage as we do not have to handle the differences in
> > zswap.
> 
> That's the nice thing about SG lists - I think the zswap_compress() calls to
> the new batching API appears agnostic to SW and HW compressors.
> Other than the upfront "errp = (pool->compr_batch_size == 1) ? &err : &err_sg;"
> the logical code organization of the new zswap_compress() is quite similar to
> the existing code. The post-compress "dlen = sg->length | *errp;" handles the rest.

It would be even nicer if the batches are also abstracted by SG lists.

Also, I don't like how the error codes and output lengths are presented
differently for HW and SW compressors.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ