linux-kernel - Re: [PATCH v2] mm/zswap: store <PAGE_SIZE compression failed page as-is


Message-ID: <20250813204844.GC115258@cmpxchg.org>
Date: Wed, 13 Aug 2025 16:48:44 -0400
From: Johannes Weiner <hannes@...xchg.org>
To: Shakeel Butt <shakeel.butt@...ux.dev>
Cc: Chris Li <chrisl@...nel.org>, SeongJae Park <sj@...nel.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Chengming Zhou <chengming.zhou@...ux.dev>,
	David Hildenbrand <david@...hat.com>, Nhat Pham <nphamcs@...il.com>,
	Yosry Ahmed <yosry.ahmed@...ux.dev>, linux-kernel@...r.kernel.org,
	linux-mm@...ck.org, Takero Funaki <flintglass@...il.com>,
	Hugh Dickins <hughd@...gle.com>
Subject: Re: [PATCH v2] mm/zswap: store <PAGE_SIZE compression failed page
 as-is

On Wed, Aug 13, 2025 at 12:42:32PM -0700, Shakeel Butt wrote:
> On Wed, Aug 13, 2025 at 10:07:18AM -0700, Chris Li wrote:
> > 
> > If you store uncompressed data in the zpool, zpool has metadata
> > overhead, e.g. allocating the entry->handle for uncompressed pages.
> > If the page is not compressed, another idea is to skip the zpool and
> > store it as a page in the zswap entry. We can make a union of
> > entry->handle and entry->incompressible_page. If entry->length ==
> > PAGE_SIZE, use entry->incompressible_page as a page.
> 
> The main problem being solved here is to avoid the scenario where the
> incompressible pages are being rotated in LRUs and zswapped multiple
> times and wasting CPU on compressing incompressible pages. SJ's approach
> solves the issue but with some memory overhead (zswap entry). With your
> suggestion and to solve the mentioned issue, we will need to change some
> core parts of reclaim (__remove_mapping()), LRU handling (swap cache
> pages not in LRUs) and refault (putting such pages back in LRU and
> whether it should handle read and write faults differently). So, the
> downside of that approach is more complex code.

What Chris is proposing would also fix that, even for configurations
without writeback. So I'm not opposed to it.
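
For illustration only, the union Chris describes could look roughly like
the userspace sketch below. This is not the kernel's struct zswap_entry:
the type sketch_entry, the field raw_page, the sketch_* helpers and
SKETCH_PAGE_SIZE are all hypothetical names assumed for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for this sketch */

/* Hypothetical stand-in for a zswap entry; not the kernel definition. */
struct sketch_entry {
	unsigned int length;		/* stored size in bytes */
	union {
		unsigned long handle;	/* pool handle when compressed */
		void *raw_page;		/* whole page when length == SKETCH_PAGE_SIZE */
	};
};

/* length == SKETCH_PAGE_SIZE is the only discriminator for the union. */
static bool sketch_entry_is_raw(const struct sketch_entry *e)
{
	return e->length == SKETCH_PAGE_SIZE;
}

/* Keep an incompressible page as is, bypassing the pool. 0 on success. */
static int sketch_store_raw(struct sketch_entry *e, const void *page)
{
	void *copy = malloc(SKETCH_PAGE_SIZE);

	if (!copy)
		return -1;
	memcpy(copy, page, SKETCH_PAGE_SIZE);
	e->length = SKETCH_PAGE_SIZE;
	e->raw_page = copy;
	return 0;
}

/* Record a compressed object that lives in the pool under 'handle'. */
static void sketch_store_compressed(struct sketch_entry *e,
				    unsigned long handle, unsigned int len)
{
	/* A compressed object must be smaller than a page, or the
	 * length check above could not tell the two cases apart. */
	assert(len > 0 && len < SKETCH_PAGE_SIZE);
	e->length = len;
	e->handle = handle;
}

/* Free the raw copy if there is one; pool objects are not owned here. */
static void sketch_entry_release(struct sketch_entry *e)
{
	if (sketch_entry_is_raw(e))
		free(e->raw_page);
	e->length = 0;
	e->raw_page = NULL;
}
```

The sketch only shows how one length check can select the union member.
It says nothing about the reclaim, LRU and refault changes that Shakeel
lists above, which is where the real complexity would be.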

However, for deployments where writeback *is* enabled, this code is an
improvement over the status quo. And it's not in conflict with a
broader fix for !writeback setups, so it's not an either-or scenario.

Specifically for the writeback case, the metadata overhead is not much
of a concern: we can just write back more zswap tail to make up for
it; the more important thing is that we can now do so in LRU order.

The premise being that writing an additional cold page from zswap to
disk to make room for a slightly inefficiently stored warm page is
better than rejecting and sending the *warm* page to disk instead.
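
As a toy model of that trade-off (not zswap code), the sketch below keeps
entries in LRU order, stores an incompressible page at full page size,
and writes back from the cold end until the new entry fits. The
sketch_pool type, its fields, the sketch_* helpers, and the convention
that compressed_len == 0 means "compression failed" are all assumptions
made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE   4096u	/* assumed page size */
#define SKETCH_MAX_ENTRIES 8		/* arbitrary bound for the toy pool */

/* Toy pool: index 0 is the coldest entry (LRU tail), the last is warmest. */
struct sketch_pool {
	unsigned int id[SKETCH_MAX_ENTRIES];
	unsigned int len[SKETCH_MAX_ENTRIES];
	size_t count;			/* entries currently stored */
	size_t used;			/* bytes currently stored */
	size_t limit;			/* byte budget of the pool */
	unsigned int written_back;	/* entries sent to "disk" so far */
	unsigned int last_written_id;	/* id of the most recent writeback */
};

/* Send the coldest entry to disk and drop it from the pool. */
static void sketch_writeback_coldest(struct sketch_pool *p)
{
	if (p->count == 0)
		return;
	assert(p->used >= p->len[0]);
	p->used -= p->len[0];
	p->last_written_id = p->id[0];
	p->written_back++;
	memmove(&p->id[0], &p->id[1], (p->count - 1) * sizeof(p->id[0]));
	memmove(&p->len[0], &p->len[1], (p->count - 1) * sizeof(p->len[0]));
	p->count--;
}

/*
 * Store a page at the warm end. compressed_len == 0 models a compression
 * failure, in which case the page is kept as is at SKETCH_PAGE_SIZE.
 * Cold entries are written back until the new one fits. 0 on success.
 */
static int sketch_store(struct sketch_pool *p, unsigned int id,
			unsigned int compressed_len)
{
	unsigned int len = compressed_len ? compressed_len : SKETCH_PAGE_SIZE;

	if (len > p->limit)
		return -1;
	while (p->count > 0 &&
	       (p->used + len > p->limit || p->count == SKETCH_MAX_ENTRIES))
		sketch_writeback_coldest(p);
	p->id[p->count] = id;
	p->len[p->count] = len;
	p->count++;
	p->used += len;
	return 0;
}
```

With a two-page budget and two 3000-byte entries already stored, adding
an incompressible page pushes the oldest entry out to disk and keeps the
new, warm page in the pool, so writeback stays in LRU order.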

So I agree with you, Chris. But I also think that's follow-up work for
somebody who cares about the !writeback case.