Message-ID: <CAKEwX=PhhAiZ_P6YmdsJrtrftuHwzjbR7Hn6n-3aaYD4mVdPYQ@mail.gmail.com>
Date: Fri, 26 Jul 2024 11:13:21 -0700
From: Nhat Pham <nphamcs@...il.com>
To: Takero Funaki <flintglass@...il.com>
Cc: Johannes Weiner <hannes@...xchg.org>, Yosry Ahmed <yosryahmed@...gle.com>, 
	Chengming Zhou <chengming.zhou@...ux.dev>, Jonathan Corbet <corbet@....net>, 
	Andrew Morton <akpm@...ux-foundation.org>, 
	Domenico Cerasuolo <cerasuolodomenico@...il.com>, linux-mm@...ck.org, linux-doc@...r.kernel.org, 
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 0/6] mm: zswap: global shrinker fix and proactive shrink

On Mon, Jul 15, 2024 at 1:20 AM Takero Funaki <flintglass@...il.com> wrote:
>
> On Sat, Jul 13, 2024 at 8:02 AM Nhat Pham <nphamcs@...il.com> wrote:
>
> It was tested on an Azure VM with SSD-backed storage. Total IOPS was
> capped at 4K by the VM host. The max throughput of the global
> shrinker was around 16 MB/s. Proactive shrinking cannot prevent
> pool_limit_hit since memory allocation can be on the order of GB/s.
> (The benchmark script allocates 2 GB sequentially, which was
> compressed to 1.3 GB, while the zswap pool was limited to 200 MB.)

Hmmm, I noticed that many other swap read/write paths (e.g.
__read_swap_cache_async(), or shrink_lruvec()) do block device
plugging (blk_{start|finish}_plug()). The global shrinker path,
however, currently does not - it runs from a workqueue, separate
from all these reclaim paths.
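
For reference, the shape of that pattern, as a minimal sketch -
read_one_swap_page() is a hypothetical stand-in for the actual
__read_swap_cache_async() call and its surrounding bookkeeping:

#include <linux/blkdev.h>
#include <linux/swap.h>

static void readahead_window(swp_entry_t *entries, int nr)
{
	struct blk_plug plug;
	int i;

	blk_start_plug(&plug);	/* bios queue up on a per-task plug list */
	for (i = 0; i < nr; i++)
		read_one_swap_page(entries[i]);	/* hypothetical helper */
	blk_finish_plug(&plug);	/* flush: batched, mergeable dispatch */
}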

I wonder if there is any value in doing the same for the zswap global
shrinker. We do acquire a mutex for every page, and blocking on it
would flush the plug, but IIUC we only sleep when the mutex is
currently held by another task, and the mutex is per-CPU. The
compression algorithm is usually non-sleeping as well (e.g. zstd). So
maybe there could be an improvement in throughput here?
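
Concretely, something like this - a rough sketch only; the loop shape
and helper names are paraphrased from mm/zswap.c from memory, and
zswap_shrink_one() is a hypothetical stand-in for the actual
per-entry writeback logic:

#include <linux/blkdev.h>

static void shrink_worker(struct work_struct *w)
{
	struct blk_plug plug;
	int failures = 0;

	/*
	 * Hold one plug across the whole shrink loop so the writeback
	 * bios for consecutive entries can be merged and dispatched in
	 * batches. If we do block on the per-CPU mutex, schedule()
	 * flushes the plug for us, so correctness is unaffected.
	 */
	blk_start_plug(&plug);
	do {
		if (zswap_shrink_one())	/* hypothetical stand-in */
			failures++;
		cond_resched();
	} while (!zswap_can_accept() && failures < MAX_RECLAIM_RETRIES);
	blk_finish_plug(&plug);
}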

(Btw - friendly reminder that everyone should use zsmalloc as the default :))

Anyway, I haven't really played with this, and I don't have the right
setup that mimics your use case. If you do decide to give this a shot,
let me know :)
