linux-kernel - Re: [PATCH] mm: Add Kcompressd for accelerated memory compression

Message-ID: <CAKEwX=MNVwd_Z1PyBt7swd2VhUVivRN-5E+kHm-3XAPka0d84w@mail.gmail.com>
Date: Thu, 1 May 2025 08:50:24 -0700
From: Nhat Pham <nphamcs@...il.com>
To: Qun-Wei Lin <qun-wei.lin@...iatek.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>, Mike Rapoport <rppt@...nel.org>, 
	Matthias Brugger <matthias.bgg@...il.com>, 
	AngeloGioacchino Del Regno <angelogioacchino.delregno@...labora.com>, 
	Sergey Senozhatsky <senozhatsky@...omium.org>, Minchan Kim <minchan@...nel.org>, linux-mm@...ck.org, 
	linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org, 
	linux-mediatek@...ts.infradead.org, Casper Li <casper.li@...iatek.com>, 
	Chinwen Chang <chinwen.chang@...iatek.com>, Andrew Yang <andrew.yang@...iatek.com>, 
	James Hsu <james.hsu@...iatek.com>, Barry Song <21cnbao@...il.com>, 
	Johannes Weiner <hannes@...xchg.org>, Yosry Ahmed <yosry.ahmed@...ux.dev>, 
	Chengming Zhou <chengming.zhou@...ux.dev>, Shakeel Butt <shakeel.butt@...ux.dev>, 
	Kairui Song <ryncsn@...il.com>, Joshua Hahn <joshua.hahnjy@...il.com>
Subject: Re: [PATCH] mm: Add Kcompressd for accelerated memory compression

On Wed, Apr 30, 2025 at 1:27 AM Qun-Wei Lin <qun-wei.lin@...iatek.com> wrote:
>
> This patch series introduces a new mechanism called kcompressd to
> improve the efficiency of memory reclaiming in the operating system.
>
> Problem:
>   In the current system, the kswapd thread is responsible for both scanning
>   the LRU pages and handling memory compression tasks (such as those
>   involving ZSWAP/ZRAM, if enabled). This combined responsibility can lead
>   to significant performance bottlenecks, especially under high memory
>   pressure. The kswapd thread becomes a single point of contention, causing
>   delays in memory reclaiming and overall system performance degradation.
>
> Solution:
>   Introduced kcompressd to handle asynchronous compression during memory
>   reclaim, improving efficiency by offloading compression tasks from
>   kswapd. This allows kswapd to focus on its primary task of page reclaim
>   without being burdened by the additional overhead of compression.
>
> In our handheld devices, we found that applying this mechanism under high
> memory pressure scenarios can increase the rate of pgsteal_anon per second
> by over 260% compared to the situation with only kswapd. Additionally, we
> observed a reduction of over 50% in page allocation stall occurrences,
> further demonstrating the effectiveness of kcompressd in alleviating memory
> pressure and improving system responsiveness.
>

Oh btw, testing this on a simple kernel building task triggers this:

[  133.349908] WARNING: CPU: 0 PID: 50 at mm/memcontrol.c:5330 obj_cgroup_charge_zswap+0x22e/0x250
[  133.350505] Modules linked in: virtio_net pata_acpi net_failover failover virtio_rng rng_core ata_piix libata scsi_mod scsi_common
[  133.351366] CPU: 0 UID: 0 PID: 50 Comm: kcompressd0 Not tainted 6.14.0-ge65b549702a5 #218
[  133.351940] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[  133.352717] RIP: 0010:obj_cgroup_charge_zswap+0x22e/0x250
[  133.353118] Code: d2 ff 85 c0 0f 85 7a fe ff ff be ff ff ff ff 48 c7 c7 88 da f1 91 e8 a1 b4 a3 00 85 c0 0f 85 61 fe ff ff 0f 0b e9 5a fe ff ff <0f> 0b e9 f5 fd ff ff e8 36 ae a3 00 e9 78 fe ff ff e8 2c ae a3 00
[  133.354372] RSP: 0018:ffff9f99803bbc00 EFLAGS: 00010246
[  133.354782] RAX: ffff970f42a9a900 RBX: 000000000000013e RCX: 0000000000000002
[  133.355269] RDX: 0000000000000000 RSI: 000000000000013e RDI: ffff970f475eab40
[  133.355774] RBP: ffff970f475eab40 R08: 0000000000000000 R09: 0000000000000000
[  133.356269] R10: ffffffff90a21205 R11: ffffffff90a211ab R12: ffffffff90a21205
[  133.356782] R13: ffffc4984041ff40 R14: ffff970f42e66000 R15: 000000000000013e
[  133.357279] FS:  0000000000000000(0000) GS:ffff970fbdc00000(0000) knlGS:0000000000000000
[  133.357807] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  133.358186] CR2: 00007f33950c5030 CR3: 00000000038ea000 CR4: 00000000000006f0
[  133.358656] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  133.359121] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  133.359597] Call Trace:
[  133.359767]  <TASK>
[  133.359914]  ? __warn+0x94/0x190
[  133.360136]  ? obj_cgroup_charge_zswap+0x22e/0x250
[  133.360476]  ? report_bug+0x168/0x170
[  133.360742]  ? handle_bug+0x53/0x90
[  133.360982]  ? exc_invalid_op+0x18/0x70
[  133.361240]  ? asm_exc_invalid_op+0x1a/0x20
[  133.361536]  ? zswap_store+0x755/0xf80
[  133.361798]  ? zswap_store+0x6fb/0xf80
[  133.362071]  ? zswap_store+0x755/0xf80
[  133.362338]  ? obj_cgroup_charge_zswap+0x22e/0x250
[  133.362661]  ? zswap_store+0x755/0xf80
[  133.362943]  zswap_store+0x7e7/0xf80
[  133.363203]  ? __pfx_kcompressd+0x10/0x10
[  133.363472]  kcompressd+0xb1/0x180
[  133.363724]  ? __pfx_autoremove_wake_function+0x10/0x10
[  133.364082]  kthread+0xef/0x230
[  133.364298]  ? __pfx_kthread+0x10/0x10
[  133.364548]  ret_from_fork+0x34/0x50
[  133.364810]  ? __pfx_kthread+0x10/0x10
[  133.365063]  ret_from_fork_asm+0x1a/0x30
[  133.365321]  </TASK>
[  133.365471] irq event stamp: 18
[  133.365680] hardirqs last  enabled at (17): [<ffffffff914bd0ef>] _raw_spin_unlock_irqrestore+0x4f/0x60
[  133.366289] hardirqs last disabled at (18): [<ffffffff914b2031>] __schedule+0x6b1/0xe80
[  133.366824] softirqs last  enabled at (0): [<ffffffff906b1caf>] copy_process+0x9af/0x2b50
[  133.367366] softirqs last disabled at (0): [<0000000000000000>] 0x0
[  133.367844] ---[ end trace 0000000000000000 ]---

Seems like we're triggering this warning in the zswap cgroup charge path
(see obj_cgroup_charge_zswap() in mm/memcontrol.c, the function named in
the trace above, for more details):

VM_WARN_ON_ONCE(!(current->flags & PF_MEMALLOC));

Might wanna fix this...
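
For illustration only, here is a minimal userspace model of what that
check expects. kswapd runs with PF_MEMALLOC set, so zswap_store() called
from kswapd passes the check; the trace above shows the call arriving
from the kcompressd kthread instead. The kernel helpers
memalloc_noreclaim_save() and memalloc_noreclaim_restore() (see
include/linux/sched/mm.h) are the usual way to enter and leave such a
context. Everything prefixed with model_ below is a hypothetical
stand-in, not kernel code, and whether wrapping the store like this is
the right fix for the patch is an assumption left to the author. Unlike
VM_WARN_ON_ONCE(), the model counts every failed check rather than
warning once.

```c
#include <assert.h>
#include <stdio.h>

/* Same bit value as PF_MEMALLOC in include/linux/sched.h; the rest of
 * this file is a simplified userspace model, not kernel code. */
#define PF_MEMALLOC 0x00000800u

struct model_task {
	unsigned int flags;
};

static struct model_task worker;                    /* models the kcompressd kthread */
static struct model_task *current_task = &worker;   /* models `current` */
static int warnings;                                /* number of failed checks */

/* Models memalloc_noreclaim_save(): set PF_MEMALLOC and return the bits
 * that were newly set, so nested sections restore correctly. */
static unsigned int model_noreclaim_save(void)
{
	unsigned int newly_set = ~current_task->flags & PF_MEMALLOC;

	current_task->flags |= PF_MEMALLOC;
	return newly_set;
}

/* Models memalloc_noreclaim_restore(): clear only what save() set. */
static void model_noreclaim_restore(unsigned int newly_set)
{
	assert((newly_set & ~PF_MEMALLOC) == 0);
	current_task->flags &= ~newly_set;
}

/* Models the context check in the zswap charge path. */
static void model_charge_zswap(void)
{
	if (!(current_task->flags & PF_MEMALLOC)) {
		warnings++;
		fprintf(stderr, "WARNING: zswap charge outside PF_MEMALLOC context\n");
	}
}

/* Hypothetical worker step: store one page, optionally inside a
 * noreclaim section. */
static void model_worker_store(int wrap)
{
	unsigned int saved = 0;

	if (wrap)
		saved = model_noreclaim_save();
	model_charge_zswap();
	if (wrap)
		model_noreclaim_restore(saved);
}
```

Calling model_worker_store(0) on a task without the flag increments
warnings, which mirrors the splat above; model_worker_store(1) does not,
and it leaves the task's flags as they were before the call.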