Message-ID: <CAKEwX=ObLVcbR9q7ZvR3WE2hhmxLpk1bSuvcbWZwo5o5vPCDRA@mail.gmail.com>
Date: Tue, 17 Jun 2025 07:21:47 -0700
From: Nhat Pham <nphamcs@...il.com>
To: Barry Song <21cnbao@...il.com>
Cc: hannes@...xchg.org, akpm@...ux-foundation.org, andrew.yang@...iatek.com,
angelogioacchino.delregno@...labora.com, casper.li@...iatek.com,
chinwen.chang@...iatek.com, james.hsu@...iatek.com,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
linux-mediatek@...ts.infradead.org, linux-mm@...ck.org,
matthias.bgg@...il.com, minchan@...nel.org, qun-wei.lin@...iatek.com,
rppt@...nel.org, senozhatsky@...omium.org, SeongJae Park <sj@...nel.org>
Subject: Re: [PATCH] mm: Add Kcompressd for accelerated memory compression
On Sun, Jun 15, 2025 at 8:41 PM Barry Song <21cnbao@...il.com> wrote:
>
> Hi Nhat, Johannes,
>
> >> The way you implemented this adds time-and-space overhead even on
> >> systems that don't have any sort of swap compression enabled.
>
> I agree — we can eliminate the time and space overhead by refining the
> code to hook kcompressd only when zswap or zram is enabled.
>
> >>
> >> That seems unnecessary. There is an existing method for asynchronous
> >> writeback, and pageout() is naturally fully set up to handle this.
> >>
> >> IMO the better way to do this is to make zswap_store() (and
> >> zram_bio_write()?) asynchronous. Make those functions queue the work
> >> and wake the compression daemon, and then have the daemon call
> >> folio_end_writeback() / bio_endio() when it's done with it.
>
> > +1.
>
>
> But how could this be possible for zswap? zswap_store() is only a
> frontend — we still need its return value to determine whether
> __swap_writepage() is required. Waiting for the result of
> zswap_store() is inherently a synchronous step.
Hmm, I might be misunderstanding either of you, but it sounds like
what you're describing here does not contradict what Johannes is
proposing?
>
> My point is that folio_end_writeback() and bio_endio() can only be
> called after the entire zswap_store() → __swap_writepage() sequence is
> completed. That’s why both are placed in the new kcompressd.
Hmm, how about:

1. Inside zswap_store(), we first obtain the obj_cgroup reference,
check the cgroup and pool limits, and grab a zswap pool reference (in
effect, determining the slot allocator and compressor).

2. Next, we try to queue the work to kcompressd, saving the folio and
the zswap pool (and whatever else we need for the continuation). If
this fails, we fall back to the old synchronous path.

3. In the kcompressd daemon, we perform the continuation of
zswap_store(): compression, slot allocation, storing, zswap's LRU
updates, etc. If this fails, we check whether the mem_cgroup enables
writeback. If it does, we can call __swap_writepage(). Ideally, if
writeback is disabled, we should activate the page, but that might
not be possible since shrink_folio_list() might already have re-added
the page to the inactive LRU. Maybe some modification of pageout() and
shrink_folio_list() can make this work, but I haven't thought too
deeply about it :) If it's impossible, we can perform async
compression only for cgroups that enable writeback for now. Once we
fix zswap's handling of incompressible pages, we can revisit this
decision (+ SJ).

TLDR: move the work-queueing step forward a bit, into the middle of
zswap_store(). A rough sketch of what I mean follows below.

One benefit of this is that we skip pages from cgroups that disable
zswap, or when the zswap pool is full.
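
Very roughly, I'm imagining something like the sketch below. To be
clear, this is only meant to show the shape: zswap_store_prep(),
zswap_store_finish(), kcompressd_queue(), kcompressd_swap_writeout()
and struct zswap_store_work are all made-up names, the sketch is not
compile-tested, and error handling, refcounting and the writeback-flag
details are omitted.

/*
 * Hypothetical continuation state handed from zswap_store() to
 * kcompressd. None of these names exist upstream today.
 */
struct zswap_store_work {
        struct list_head list;
        struct folio *folio;
        struct zswap_pool *pool;
        struct obj_cgroup *objcg;
};

bool zswap_store(struct folio *folio)
{
        struct zswap_store_work *work;

        /* Step 1: objcg ref, cgroup/pool limit checks, pool ref. */
        work = zswap_store_prep(folio);         /* hypothetical */
        if (!work)
                return false;   /* cgroup disables zswap, pool full, ... */

        /* Step 2: hand the rest off to kcompressd if we can. */
        if (kcompressd_queue(work))             /* hypothetical */
                return true;    /* daemon will complete the writeback */

        /* Queueing failed: old synchronous path. */
        return zswap_store_finish(work);        /* hypothetical */
}

/* Step 3: runs in the kcompressd thread for each queued work item. */
static void kcompressd_do_work(struct zswap_store_work *work)
{
        struct folio *folio = work->folio;

        /* Compression, slot allocation, store, LRU update. */
        if (zswap_store_finish(work)) {         /* hypothetical */
                folio_end_writeback(folio);
                return;
        }

        /*
         * Store failed: if the memcg allows writeback, continue to the
         * backing device; otherwise we hit the incompressible-page
         * problem discussed above.
         */
        if (mem_cgroup_zswap_writeback_enabled(folio_memcg(folio)))
                kcompressd_swap_writeout(folio);        /* hypothetical wrapper around __swap_writepage() */
        else
                folio_end_writeback(folio);     /* placeholder, see caveat above */
}

The caller-visible contract changes a bit in this scheme: a true return
from zswap_store() now means "handed off" rather than "stored", so
writeback completion has to happen from the daemon, which is exactly
where the step-3 caveat about memcg writeback comes in.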
>
> The use of folio_end_writeback() and bio_endio() was the case for zRAM
> in Qun-Wei's RFC.
>
> https://lore.kernel.org/linux-mm/20250307120141.1566673-3-qun-wei.lin@mediatek.com/
>
> However, the implementation tightly coupled zRAM with reclamation logic.
> For example, zRAM needed to know whether it was running in the kswapd
> context, which is not ideal for a generic block device — the role zRAM
> is supposed to play. Additionally, the code was not shared between zswap
> and zRAM.
>
> Thanks
> Barry