Message-ID: <983965b6-2262-4f72-a672-39085dcdaa3c@gmail.com>
Date: Tue, 8 Apr 2025 14:04:06 +0100
From: Usama Arif <usamaarif642@...il.com>
To: Nhat Pham <nphamcs@...il.com>, linux-mm@...ck.org
Cc: akpm@...ux-foundation.org, hannes@...xchg.org, hughd@...gle.com,
yosry.ahmed@...ux.dev, mhocko@...nel.org, roman.gushchin@...ux.dev,
shakeel.butt@...ux.dev, muchun.song@...ux.dev, len.brown@...el.com,
chengming.zhou@...ux.dev, kasong@...cent.com, chrisl@...nel.org,
huang.ying.caritas@...il.com, ryan.roberts@....com, viro@...iv.linux.org.uk,
baohua@...nel.org, osalvador@...e.de, lorenzo.stoakes@...cle.com,
christophe.leroy@...roup.eu, pavel@...nel.org, kernel-team@...a.com,
linux-kernel@...r.kernel.org, cgroups@...r.kernel.org,
linux-pm@...r.kernel.org
Subject: Re: [RFC PATCH 00/14] Virtual Swap Space
On 08/04/2025 00:42, Nhat Pham wrote:
>
> V. Benchmarking
>
> As a proof of concept, I run the prototype through some simple
> benchmarks:
>
> 1. usemem: 16 threads, 2G each, memory.max = 16G
>
> I benchmarked the following usemem commands:
>
> time usemem --init-time -w -O -s 10 -n 16 2g
>
> Baseline:
> real: 33.96s
> user: 25.31s
> sys: 341.09s
> average throughput: 111295.45 KB/s
> average free time: 2079258.68 usecs
>
> New Design:
> real: 35.87s
> user: 25.15s
> sys: 373.01s
> average throughput: 106965.46 KB/s
> average free time: 3192465.62 usecs
>
> To root cause this regression, I ran perf on the usemem program, as
> well as on the following stress-ng program:
>
> perf record -ag -e cycles -G perf_cg -- ./stress-ng/stress-ng --pageswap $(nproc) --pageswap-ops 100000
>
> and observed the (predicted) increase in lock contention on swap cache
> accesses. This regression is alleviated if I put together the
> following hack: limit the virtual swap space to a sufficient size for
> the benchmark, range partition the swap-related data structures (swap
> cache, zswap tree, etc.) based on the limit, and distribute the
> allocation of virtual swap slots among these partitions (on a per-CPU
> basis):
>
> real: 34.94s
> user: 25.28s
> sys: 360.25s
> average throughput: 108181.15 KB/s
> average free time: 2680890.24 usecs
>
> As mentioned above, I will implement proper dynamic swap range
> partitioning in follow-up work.
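Purely to illustrate the static partitioning idea described above (this is not code from the patchset; every name, size, and the naive bump allocator below are made up), a userspace sketch of range-partitioned, per-CPU slot allocation could look roughly like this:

/*
 * Illustrative sketch only: split a fixed virtual swap space into ranges,
 * each with its own lock, and have each CPU allocate from "its" range so
 * that allocations on different CPUs do not contend on a single lock.
 */
#include <pthread.h>
#include <stdio.h>

#define NR_VSWAP_SLOTS  (1UL << 20)     /* total virtual swap slots (example) */
#define NR_PARTITIONS   8               /* e.g. one range per CPU */
#define SLOTS_PER_PART  (NR_VSWAP_SLOTS / NR_PARTITIONS)

struct vswap_partition {
        pthread_mutex_t lock;           /* protects next_free of this range only */
        unsigned long base;             /* first slot id owned by this partition */
        unsigned long next_free;        /* naive bump allocator within the range */
};

static struct vswap_partition partitions[NR_PARTITIONS];

static void vswap_init(void)
{
        for (int i = 0; i < NR_PARTITIONS; i++) {
                pthread_mutex_init(&partitions[i].lock, NULL);
                partitions[i].base = (unsigned long)i * SLOTS_PER_PART;
                partitions[i].next_free = 0;
        }
}

/* Allocate a slot from the range owned by @cpu; returns -1 if it is full. */
static long vswap_alloc(int cpu)
{
        struct vswap_partition *p = &partitions[cpu % NR_PARTITIONS];
        long slot = -1;

        pthread_mutex_lock(&p->lock);
        if (p->next_free < SLOTS_PER_PART)
                slot = (long)(p->base + p->next_free++);
        pthread_mutex_unlock(&p->lock);
        return slot;
}

int main(void)
{
        vswap_init();
        /* Slots handed to different CPUs come from disjoint ranges. */
        printf("cpu0 slot=%ld, cpu3 slot=%ld\n", vswap_alloc(0), vswap_alloc(3));
        return 0;
}

The point is only that each CPU takes a different lock, so slot allocation (and any per-range structures such as the swap cache or zswap tree partitions) stops serializing on one global structure.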
>
> 2. Kernel building: zswap enabled, 52 workers (one per processor),
> memory.max = 3G.
>
> Baseline:
> real: 183.55s
> user: 5119.01s
> sys: 655.16s
>
> New Design:
> real: mean: 184.5s
> user: mean: 5117.4s
> sys: mean: 695.23s
>
> New Design (Static Partition):
> real: 183.95s
> user: 5119.29s
> sys: 664.24s
>
Hi Nhat,
Thanks for the patches! I have glanced over a couple of them, but the following was the main question that came to mind.
Just wanted to check if you had a look at the memory regression during these benchmarks?
Also, what is sizeof(swp_desc)? Maybe we can estimate the memory overhead as sizeof(swp_desc) * swap size / PAGE_SIZE?
For a 64G swap filled with private anon pages that is 16M slots, so the extra overhead in bytes might be (sizeof(swp_desc) * 16M) - 16M (swap map, 1 byte per slot) - 2M (zeromap, 1 bit per slot)?
This looks like a sizeable memory regression?
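To put rough numbers on it (taking sizeof(swp_desc) = 48 bytes purely as a placeholder, since I don't know the real size): 16M slots * 48 bytes = 768M of descriptors, versus the ~16M swap map and ~2M zeromap they would replace, i.e. on the order of 750M extra for a fully occupied 64G swap.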
Thanks,
Usama