Message-ID: <b9fc1efa457aa29104737e5f6a4868602a3078e5.camel@mediatek.com>
Date: Mon, 10 Mar 2025 13:21:35 +0000
From: Qun-wei Lin (林群崴) <Qun-wei.Lin@...iatek.com>
To: "21cnbao@...il.com" <21cnbao@...il.com>
CC: Andrew Yang (楊智強) <Andrew.Yang@...iatek.com>,
	Casper Li (李中榮) <casper.li@...iatek.com>,
	"chrisl@...nel.org" <chrisl@...nel.org>,
	James Hsu (徐慶薰) <James.Hsu@...iatek.com>,
	AngeloGioacchino Del Regno <angelogioacchino.delregno@...labora.com>,
	"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-mediatek@...ts.infradead.org" <linux-mediatek@...ts.infradead.org>,
	"ira.weiny@...el.com" <ira.weiny@...el.com>, "linux-mm@...ck.org"
	<linux-mm@...ck.org>, "dave.jiang@...el.com" <dave.jiang@...el.com>,
	"schatzberg.dan@...il.com" <schatzberg.dan@...il.com>,
	Chinwen Chang (張錦文)
	<chinwen.chang@...iatek.com>, "viro@...iv.linux.org.uk"
	<viro@...iv.linux.org.uk>, "ryan.roberts@....com" <ryan.roberts@....com>,
	"minchan@...nel.org" <minchan@...nel.org>, "axboe@...nel.dk"
	<axboe@...nel.dk>, "linux-block@...r.kernel.org"
	<linux-block@...r.kernel.org>, "kasong@...cent.com" <kasong@...cent.com>,
	"nvdimm@...ts.linux.dev" <nvdimm@...ts.linux.dev>, "vishal.l.verma@...el.com"
	<vishal.l.verma@...el.com>, "matthias.bgg@...il.com"
	<matthias.bgg@...il.com>, "linux-arm-kernel@...ts.infradead.org"
	<linux-arm-kernel@...ts.infradead.org>, "ying.huang@...el.com"
	<ying.huang@...el.com>, "senozhatsky@...omium.org"
	<senozhatsky@...omium.org>, "dan.j.williams@...el.com"
	<dan.j.williams@...el.com>
Subject: Re: [PATCH 0/2] Improve Zram by separating compression context from
 kswapd

On Sat, 2025-03-08 at 08:34 +1300, Barry Song wrote:
>
> On Sat, Mar 8, 2025 at 1:02 AM Qun-Wei Lin <qun-wei.lin@...iatek.com>
> wrote:
> > 
> > This patch series introduces a new mechanism called kcompressd to
> > improve the efficiency of memory reclaiming in the operating system.
> > The main goal is to separate the tasks of page scanning and page
> > compression into distinct processes or threads, thereby reducing the
> > load on the kswapd thread and enhancing overall system performance
> > under high memory pressure conditions.
> > 
> > Problem:
> >  In the current system, the kswapd thread is responsible for both
> >  scanning the LRU pages and compressing pages into the ZRAM. This
> >  combined responsibility can lead to significant performance
> >  bottlenecks, especially under high memory pressure. The kswapd
> >  thread becomes a single point of contention, causing delays in
> >  memory reclaiming and overall system performance degradation.
> > 
> > Target:
> >  The target of this invention is to improve the efficiency of memory
> >  reclaiming. By separating the tasks of page scanning and page
> >  compression into distinct processes or threads, the system can
> >  handle memory pressure more effectively.
> 
> Sounds great. However, we also have a time window where folios under
> writeback are kept, whereas without your patch writeback was done
> synchronously. This may temporarily increase memory usage until the
> kept folios are re-scanned.
>
> So, you’ve observed that folio_rotate_reclaimable() runs shortly
> after the async thread completes compression, and that the kept
> folios are then re-scanned soon afterwards?
> 

Yes, these folios may need to be re-scanned, so
folio_rotate_reclaimable() will be run. This can be observed from the
increase in pgrotated in /proc/vmstat.
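
For anyone who wants to watch this on their own device, here is a
minimal user-space sketch (not part of the patch series) that samples
/proc/vmstat twice and reports the per-second increase of pgrotated and
pgsteal_anon. The helper names parse_vmstat, rates and sample are
hypothetical, and the sketch assumes the kernel exposes both counters;
a counter that is missing is simply treated as 0.

```python
import time


def parse_vmstat(text):
    """Parse the 'name value' lines of /proc/vmstat into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def rates(before, after, seconds, names=("pgrotated", "pgsteal_anon")):
    """Per-second increase of the selected counters between two snapshots."""
    return {n: (after.get(n, 0) - before.get(n, 0)) / seconds for n in names}


def sample(path="/proc/vmstat", interval=1.0):
    """Take two snapshots 'interval' seconds apart and return the rates."""
    with open(path) as f:
        before = parse_vmstat(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_vmstat(f.read())
    return rates(before, after, interval)
```

Calling sample() in a loop while the memory-pressure workload runs, with
and without kcompressd, gives a rough view of how both counters move.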

> > 
> > Patch 1:
> > - Introduces 2 new feature flags, BLK_FEAT_READ_SYNCHRONOUS and
> >   SWP_READ_SYNCHRONOUS_IO.
> > 
> > Patch 2:
> > - Implemented the core functionality of Kcompressd and made necessary
> >   modifications to the zram driver to support it.
> > 
> > In our handheld devices, we found that applying this mechanism under
> > high memory pressure scenarios can increase the rate of pgsteal_anon
> > per second by over 260% compared to the situation with only kswapd.
> 
> Sounds really great.
> 
> What compression algorithm is being used? I assume that after
> switching to a different compression algorithm, the benefits will
> change significantly. For example, Zstd might not show as much
> improvement.
> What CPU usage ratio between page scan/unmap and compression was
> observed before applying this patch?
> 

The original tests were based on LZ4.
We observed that the CPU time spent on scanning the LRU versus
compressing folios is approximately 3:7.

We also tried ZSTD as the zram backend, but the number of anonymous
folios reclaimed per second did not differ significantly from LZ4 (the
effect of switching algorithms was far smaller than what could be
achieved with parallel processing). Even with ZSTD, we were still able
to reach around 800,000 pgsteal_anon per second using kcompressd.
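
As a back-of-the-envelope illustration of why moving compression out of
kswapd matters at a 3:7 split, the sketch below uses an idealized
two-stage pipeline model. It is an assumption for illustration only, not
a measurement and not how the patch computes anything: it ignores
queueing, CPU contention and the re-scan cost of kept folios, and the
"workers" parameter (number of parallel compression threads) is
hypothetical.

```python
def reclaim_speedup(scan_share, compress_share, workers):
    """Idealized throughput gain from overlapping scan and compression.

    Serial cost per folio is scan_share + compress_share. Once the two
    stages overlap, throughput is limited by the slower stage, where the
    compression stage is spread evenly over 'workers' threads.
    """
    total = scan_share + compress_share
    bottleneck = max(scan_share, compress_share / workers)
    return total / bottleneck


# With a 3:7 split and a single compression thread, compression is the
# bottleneck (10/7). With enough threads, scanning becomes the limit and
# the model caps at 10/3 no matter how many more threads are added.
one_thread = reclaim_speedup(3, 7, 1)
three_threads = reclaim_speedup(3, 7, 3)
```

Real numbers depend on core count, scheduling and the compressor, so
measured gains can differ from this simple bound in either direction.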


> > 
> > Qun-Wei Lin (2):
> >   mm: Split BLK_FEAT_SYNCHRONOUS and SWP_SYNCHRONOUS_IO into separate
> >     read and write flags
> >   kcompressd: Add Kcompressd for accelerated zram compression
> > 
> >  drivers/block/brd.c             |   3 +-
> >  drivers/block/zram/Kconfig      |  11 ++
> >  drivers/block/zram/Makefile     |   3 +-
> >  drivers/block/zram/kcompressd.c | 340 ++++++++++++++++++++++++++++++++
> >  drivers/block/zram/kcompressd.h |  25 +++
> >  drivers/block/zram/zram_drv.c   |  21 +-
> >  drivers/nvdimm/btt.c            |   3 +-
> >  drivers/nvdimm/pmem.c           |   5 +-
> >  include/linux/blkdev.h          |  24 ++-
> >  include/linux/swap.h            |  31 +--
> >  mm/memory.c                     |   4 +-
> >  mm/page_io.c                    |   6 +-
> >  mm/swapfile.c                   |   7 +-
> >  13 files changed, 446 insertions(+), 37 deletions(-)
> >  create mode 100644 drivers/block/zram/kcompressd.c
> >  create mode 100644 drivers/block/zram/kcompressd.h
> > 
> > --
> > 2.45.2
> > 
> 
> Thanks
> Barry

Best Regards,
Qun-wei
