linux-kernel - Re: [RFC PATCH] mm/zswap: store compression failed page as-is

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <20250731170922.15509-1-sj@kernel.org>
Date: Thu, 31 Jul 2025 10:09:22 -0700
From: SeongJae Park <sj@...nel.org>
To: Johannes Weiner <hannes@...xchg.org>
Cc: SeongJae Park <sj@...nel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Chengming Zhou <chengming.zhou@...ux.dev>,
	Nhat Pham <nphamcs@...il.com>,
	Takero Funaki <flintglass@...il.com>,
	Yosry Ahmed <yosry.ahmed@...ux.dev>,
	kernel-team@...a.com,
	linux-kernel@...r.kernel.org,
	linux-mm@...ck.org
Subject: Re: [RFC PATCH] mm/zswap: store compression failed page as-is

On Thu, 31 Jul 2025 11:27:01 -0400 Johannes Weiner <hannes@...xchg.org> wrote:

> Hi SJ,
> 
> On Wed, Jul 30, 2025 at 04:40:59PM -0700, SeongJae Park wrote:
> > When zswap writeback is enabled and it fails compressing a given page,
> > zswap lets the page be swapped out to the backing swap device.  This
> > behavior breaks the zswap's writeback LRU order, and hence users can
> > experience unexpected latency spikes.
> 
> +1 Thanks for working on this!

My pleasure :)

[...]
> >     config            0       1-1     1-2      1-3      2-1     2-2      2-3
> >     perf_normalized   1.0000  0.0060  0.0230   0.0310   0.0030  0.0116   0.0003
> >     perf_stdev_ratio  0.06    0.04    0.11     0.14     0.03    0.05     0.26
> >     zswpin            0       0       1701702  6982188  0       2479848  815742
> >     zswpout           0       0       1725260  7048517  0       2535744  931420
> >     zswpwb            0       0       0        0        0       0        0
> >     pswpin            0       468612  481270   0        476434  535772   0
> >     pswpout           0       574634  689625   0        612683  591881   0
> 
> zswpwb being zero across the board suggests the zswap shrinker was not
> enabled. Can you double check that?

You're correct, I didn't.

> 
> We should only take on incompressible pages to maintain LRU order on
> their way to disk. If we don't try to move them out, then it's better
> to reject them and avoid the metadata overhead.

I agree.  I will update the test to explore the writeback effect, and share the
results, by the posting of the next version of this patch.

[...]
> > +/*
> > + * If the compression is failed, try saving the content as is without
> > + * compression, to keep the LRU order.  This can increase memory overhead from
> > + * metadata, but in common zswap use cases where there are sufficient amount of
> > + * compressible pages, the overhead should be not ciritical, and can be
> > + * mitigated by the writeback.  Also, the decompression overhead is optimized.
> > + *
> > + * When the writeback is disabled, however, the additional overhead could be
> > + * problematic.  For the case, just return the failure.  swap_writeout() will
> > + * put the page back to the active LRU list in the case.
> > + */
> > +static int zswap_handle_compression_failure(int comp_ret, struct page *page,
> > +		struct zswap_entry *entry)
> > +{
> > +	if (!zswap_save_incompressible_pages)
> > +		return comp_ret;
> > +	if (!mem_cgroup_zswap_writeback_enabled(
> > +				folio_memcg(page_folio(page))))
> > +		return comp_ret;
> > +
> > +	entry->orig_data = kmalloc_node(PAGE_SIZE, GFP_NOWAIT | __GFP_NORETRY |
> > +			__GFP_HIGHMEM | __GFP_MOVABLE, page_to_nid(page));
> > +	if (!entry->orig_data)
> > +		return -ENOMEM;
> > +	memcpy_from_page(entry->orig_data, page, 0, PAGE_SIZE);
> > +	entry->length = PAGE_SIZE;
> > +	atomic_long_inc(&zswap_stored_uncompressed_pages);
> > +	return 0;
> > +}
> 
> Better to still use the backend (zsmalloc) for storage. It'll give you
> migratability, highmem handling, stats etc.

Nhat also pointed out the migratability.  Thank you for further letting me know
even more benefits from zsmalloc reuse.  As I also replied to Nhat, I agree
those are important benefits, and I will rework on the next version to use the
backend.

> 
> So if compression fails, still do zpool_malloc(), but for PAGE_SIZE
> and copy over the uncompressed page contents.
> 
> struct zswap_entry has a hole after bool referenced, so you can add a
> flag to mark those uncompressed entries at no extra cost.
> 
> Then you can detect this case in zswap_decompress() and handle the
> uncompressed copy into @folio accordingly.

I think we could still use 'zswap_entry->length == PAGE_SIZE' as the indicator,
As long as we ensure that always means the content is incompressed, following
Nhat's suggestion[1].

Please let me know if I'm missing something.

[1] https://lore.kernel.org/CAKEwX=Py+yvxtR5zt-1DtskhGWWHkRP_h8kneEHSrcQ947=m9Q@mail.gmail.com


Thanks,
SJ