Message-ID: <983965b6-2262-4f72-a672-39085dcdaa3c@gmail.com>
Date: Tue, 8 Apr 2025 14:04:06 +0100
From: Usama Arif <usamaarif642@...il.com>
To: Nhat Pham <nphamcs@...il.com>, linux-mm@...ck.org
Cc: akpm@...ux-foundation.org, hannes@...xchg.org, hughd@...gle.com,
yosry.ahmed@...ux.dev, mhocko@...nel.org, roman.gushchin@...ux.dev,
shakeel.butt@...ux.dev, muchun.song@...ux.dev, len.brown@...el.com,
chengming.zhou@...ux.dev, kasong@...cent.com, chrisl@...nel.org,
huang.ying.caritas@...il.com, ryan.roberts@....com, viro@...iv.linux.org.uk,
baohua@...nel.org, osalvador@...e.de, lorenzo.stoakes@...cle.com,
christophe.leroy@...roup.eu, pavel@...nel.org, kernel-team@...a.com,
linux-kernel@...r.kernel.org, cgroups@...r.kernel.org,
linux-pm@...r.kernel.org
Subject: Re: [RFC PATCH 00/14] Virtual Swap Space
On 08/04/2025 00:42, Nhat Pham wrote:
>
> V. Benchmarking
>
> As a proof of concept, I run the prototype through some simple
> benchmarks:
>
> 1. usemem: 16 threads, 2G each, memory.max = 16G
>
> I benchmarked the following usemem commands:
>
> time usemem --init-time -w -O -s 10 -n 16 2g
>
> Baseline:
> real: 33.96s
> user: 25.31s
> sys: 341.09s
> average throughput: 111295.45 KB/s
> average free time: 2079258.68 usecs
>
> New Design:
> real: 35.87s
> user: 25.15s
> sys: 373.01s
> average throughput: 106965.46 KB/s
> average free time: 3192465.62 usecs
>
> To root cause this regression, I ran perf on the usemem program, as
> well as on the following stress-ng program:
>
> perf record -ag -e cycles -G perf_cg -- ./stress-ng/stress-ng --pageswap $(nproc) --pageswap-ops 100000
>
> and observed the (predicted) increase in lock contention on swap cache
> accesses. This regression is alleviated if I put together the
> following hack: limit the virtual swap space to a sufficient size for
> the benchmark, range partition the swap-related data structures (swap
> cache, zswap tree, etc.) based on the limit, and distribute the
> allocation of virtual swap slots among these partitions (on a per-CPU
> basis):
>
> real: 34.94s
> user: 25.28s
> sys: 360.25s
> average throughput: 108181.15 KB/s
> average free time: 2680890.24 usecs
>
> As mentioned above, I will implement proper dynamic swap range
> partitioning in follow-up work.
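Purely to illustrate the static partitioning idea described above (this is not code from the patchset; every name, size, and the naive bump allocator below are made up), a userspace sketch of range-partitioned, per-CPU slot allocation could look roughly like this:

/*
 * Illustrative sketch only: split a fixed virtual swap space into ranges,
 * each with its own lock, and have each CPU allocate from "its" range so
 * that allocations on different CPUs do not contend on a single lock.
 */
#include <pthread.h>
#include <stdio.h>

#define NR_VSWAP_SLOTS  (1UL << 20)     /* total virtual swap slots (example) */
#define NR_PARTITIONS   8               /* e.g. one range per CPU */
#define SLOTS_PER_PART  (NR_VSWAP_SLOTS / NR_PARTITIONS)

struct vswap_partition {
        pthread_mutex_t lock;           /* protects next_free of this range only */
        unsigned long base;             /* first slot id owned by this partition */
        unsigned long next_free;        /* naive bump allocator within the range */
};

static struct vswap_partition partitions[NR_PARTITIONS];

static void vswap_init(void)
{
        for (int i = 0; i < NR_PARTITIONS; i++) {
                pthread_mutex_init(&partitions[i].lock, NULL);
                partitions[i].base = (unsigned long)i * SLOTS_PER_PART;
                partitions[i].next_free = 0;
        }
}

/* Allocate a slot from the range owned by @cpu; returns -1 if it is full. */
static long vswap_alloc(int cpu)
{
        struct vswap_partition *p = &partitions[cpu % NR_PARTITIONS];
        long slot = -1;

        pthread_mutex_lock(&p->lock);
        if (p->next_free < SLOTS_PER_PART)
                slot = (long)(p->base + p->next_free++);
        pthread_mutex_unlock(&p->lock);
        return slot;
}

int main(void)
{
        vswap_init();
        /* Slots handed to different CPUs come from disjoint ranges. */
        printf("cpu0 slot=%ld, cpu3 slot=%ld\n", vswap_alloc(0), vswap_alloc(3));
        return 0;
}

The point is only that each CPU takes a different lock, so slot allocation (and any per-range structures such as the swap cache or zswap tree partitions) stops serializing on one global structure.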
>
> 2. Kernel building: zswap enabled, 52 workers (one per processor),
> memory.max = 3G.
>
> Baseline:
> real: 183.55s
> user: 5119.01s
> sys: 655.16s
>
> New Design:
> real: mean: 184.5s
> user: mean: 5117.4s
> sys: mean: 695.23s
>
> New Design (Static Partition):
> real: 183.95s
> user: 5119.29s
> sys: 664.24s
>
Hi Nhat,
Thanks for the patches! I have glanced over a couple of them, but the following was the main question that came to mind.
Just wanted to check if you had a look at the memory regression during these benchmarks?
Also, what is sizeof(swp_desc)? Maybe we can estimate the memory overhead as sizeof(swp_desc) * swap size / PAGE_SIZE?
For a 64G swap filled with private anon pages that is 16M slots, so the extra overhead in bytes might be (sizeof(swp_desc) * 16M) - 16M (swap map, 1 byte per slot) - 2M (zeromap, 1 bit per slot)?
This looks like a sizeable memory regression?
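To put rough numbers on it (taking sizeof(swp_desc) = 48 bytes purely as a placeholder, since I don't know the real size): 16M slots * 48 bytes = 768M of descriptors, versus the ~16M swap map and ~2M zeromap they would replace, i.e. on the order of 750M extra for a fully occupied 64G swap.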
Thanks,
Usama