lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250501140226.GE2020@cmpxchg.org>
Date: Thu, 1 May 2025 10:02:26 -0400
From: Johannes Weiner <hannes@...xchg.org>
To: Qun-Wei Lin <qun-wei.lin@...iatek.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
	Mike Rapoport <rppt@...nel.org>,
	Matthias Brugger <matthias.bgg@...il.com>,
	AngeloGioacchino Del Regno <angelogioacchino.delregno@...labora.com>,
	Nhat Pham <nphamcs@...il.com>,
	Sergey Senozhatsky <senozhatsky@...omium.org>,
	Minchan Kim <minchan@...nel.org>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
	linux-mediatek@...ts.infradead.org,
	Casper Li <casper.li@...iatek.com>,
	Chinwen Chang <chinwen.chang@...iatek.com>,
	Andrew Yang <andrew.yang@...iatek.com>,
	James Hsu <james.hsu@...iatek.com>, Barry Song <21cnbao@...il.com>
Subject: Re: [PATCH] mm: Add Kcompressd for accelerated memory compression

On Wed, Apr 30, 2025 at 04:26:41PM +0800, Qun-Wei Lin wrote:
> This patch series introduces a new mechanism called kcompressd to
> improve the efficiency of memory reclaiming in the operating system.
> 
> Problem:
>   In the current system, the kswapd thread is responsible for both scanning
>   the LRU pages and handling memory compression tasks (such as those
>   involving ZSWAP/ZRAM, if enabled). This combined responsibility can lead
>   to significant performance bottlenecks, especially under high memory
>   pressure. The kswapd thread becomes a single point of contention, causing
>   delays in memory reclaiming and overall system performance degradation.
> 
> Solution:
>   Introduced kcompressd to handle asynchronous compression during memory
>   reclaim, improving efficiency by offloading compression tasks from
>   kswapd. This allows kswapd to focus on its primary task of page reclaim
>   without being burdened by the additional overhead of compression.
> 
> In our handheld devices, we found that applying this mechanism under high
> memory pressure scenarios can increase the rate of pgsteal_anon per second
> by over 260% compared to the situation with only kswapd. Additionally, we
> observed a reduction of over 50% in page allocation stall occurrences,
> further demonstrating the effectiveness of kcompressd in alleviating memory
> pressure and improving system responsiveness.

Yes, I think parallelizing this work makes a lot of sense.

> Co-developed-by: Barry Song <21cnbao@...il.com>
> Signed-off-by: Barry Song <21cnbao@...il.com>
> Signed-off-by: Qun-Wei Lin <qun-wei.lin@...iatek.com>
> Reference: Re: [PATCH 0/2] Improve Zram by separating compression context from kswapd - Barry Song
>            https://lore.kernel.org/lkml/20250313093005.13998-1-21cnbao@gmail.com/
> ---
>  include/linux/mmzone.h |  6 ++++
>  mm/mm_init.c           |  1 +
>  mm/page_io.c           | 71 ++++++++++++++++++++++++++++++++++++++++++
>  mm/swap.h              |  6 ++++
>  mm/vmscan.c            | 25 +++++++++++++++
>  5 files changed, 109 insertions(+)
> 
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 6ccec1bf2896..93c9195a54ae 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -23,6 +23,7 @@
>  #include <linux/page-flags.h>
>  #include <linux/local_lock.h>
>  #include <linux/zswap.h>
> +#include <linux/kfifo.h>
>  #include <asm/page.h>
>  
>  /* Free memory management - zoned buddy allocator.  */
> @@ -1398,6 +1399,11 @@ typedef struct pglist_data {
>  
>  	int kswapd_failures;		/* Number of 'reclaimed == 0' runs */
>  
> +#define KCOMPRESS_FIFO_SIZE 256
> +	wait_queue_head_t kcompressd_wait;
> +	struct task_struct *kcompressd;
> +	struct kfifo kcompress_fifo;

The way you implemented this adds time-and-space overhead even on
systems that don't have any sort of swap compression enabled.

That seems unnecessary. There is an existing method for asynchronous
writeback, and pageout() is naturally fully set up to handle this.

IMO the better way to do this is to make zswap_store() (and
zram_bio_write()?) asynchronous. Make those functions queue the work
and wake the compression daemon, and then have the daemon call
folio_end_writeback() / bio_endio() when it's done with it.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ