Message-Id: <20240210-zswap-global-lru-v3-0-200495333595@bytedance.com>
Date: Fri, 16 Feb 2024 08:55:03 +0000
From: Chengming Zhou <zhouchengming@...edance.com>
To: Johannes Weiner <hannes@...xchg.org>, Yosry Ahmed <yosryahmed@...gle.com>, Nhat Pham <nphamcs@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>
Cc: linux-mm@...ck.org, Yosry Ahmed <yosryahmed@...gle.com>, linux-kernel@...r.kernel.org,
Chengming Zhou <zhouchengming@...edance.com>
Subject: [PATCH v3 0/2] mm/zswap: optimize for dynamic zswap_pools
Changes in v3:
- Improve the commit messages and comments, per Yosry.
- Use percpu_ref_is_zero() for debug purpose, per Yosry.
- Collect tag.
- Link to v2: https://lore.kernel.org/r/20240210-zswap-global-lru-v2-0-fbee3b11a62e@bytedance.com
Changes in v2:
- fix build error when !CONFIG_MEMCG_KMEM.
- make zswap struct static and fix some error paths, per Yosry.
- add another shrink_lock to protect zswap.next_shrink, per Yosry.
- keep "WARN_ON(percpu_ref_tryget(&pool->ref))" in pool release path
for debug, per Nhat.
- improve the commit messages.
- Link to v1: https://lore.kernel.org/r/20240210-zswap-global-lru-v1-0-853473d7b0da@bytedance.com
Dynamic pool creation has been supported for a long time, though it is
probably not used much in practice. But with the per-memcg lru merged, the
current structure of zswap_pool's lru and shrinker has become less optimal.
In the current structure, each zswap_pool has its own lru, shrinker and
shrink_work, but only the latest zswap_pool is the one currently in use.
1. Under memory pressure, the shrinkers of all zswap_pools try to
shrink their own lru lists, with no ordering between them.
2. When the zswap limit is hit, only the latest zswap_pool's shrink_work
tries to shrink its own lru, which is inefficient.
A more natural way is to have a global zswap lru shared between all
zswap_pools, along with a single shared shrinker. The code becomes much
simpler too.
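To illustrate the idea only, here is a minimal userspace sketch of one LRU
list shared by entries from several pools, so that shrinking always evicts
the globally oldest entry regardless of which pool owns it. All names here
(struct pool, struct entry, struct global_lru, lru_add, lru_shrink_one) are
hypothetical stand-ins and not the code in mm/zswap.c; the sketch also
ignores locking and per-memcg lists.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; not the kernel's structures. */
struct pool {
	int id;
	int nr_entries;		/* entries currently owned by this pool */
};

struct entry {
	struct pool *pool;	/* owning pool, any pool may appear on the list */
	struct entry *prev, *next;
};

struct global_lru {
	struct entry head;	/* sentinel: head.next is newest, head.prev is oldest */
	size_t nr_items;
};

static void lru_init(struct global_lru *lru)
{
	lru->head.pool = NULL;
	lru->head.prev = lru->head.next = &lru->head;
	lru->nr_items = 0;
}

/* Insert an entry at the newest end of the single shared list. */
static void lru_add(struct global_lru *lru, struct entry *e, struct pool *p)
{
	assert(p != NULL);
	e->pool = p;
	e->next = lru->head.next;
	e->prev = &lru->head;
	lru->head.next->prev = e;
	lru->head.next = e;
	lru->nr_items++;
	p->nr_entries++;
}

/* Unlink and return the oldest entry across all pools, or NULL if empty. */
static struct entry *lru_shrink_one(struct global_lru *lru)
{
	struct entry *victim = lru->head.prev;

	if (victim == &lru->head)
		return NULL;
	victim->prev->next = victim->next;
	victim->next->prev = victim->prev;
	victim->prev = victim->next = NULL;
	lru->nr_items--;
	victim->pool->nr_entries--;
	return victim;
}
```

With entries added in the order old pool, current pool, old pool, repeated
calls to lru_shrink_one() return them in that same order, which is the
ordering that separate per-pool lists cannot provide.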
Another optimization is changing the zswap_pool kref to a percpu_ref,
since a reference is taken by every zswap entry. So the scalability is
better.
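As a rough illustration of why a per-CPU reference scales better than a
single shared counter, here is a single-threaded userspace model: while the
reference is live, get and put only touch a per-slot counter, and the slots
are folded into one shared count only when the reference is killed. This is
a sketch under loud assumptions: struct sharded_ref and the ref_* helpers
are hypothetical names, there are no atomics or RCU here, and it is not the
kernel's percpu_ref implementation.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_SLOTS 4	/* hypothetical stand-in for the number of CPUs */

struct sharded_ref {
	long slot[NR_SLOTS];	/* per-slot deltas, used while the ref is live */
	long count;		/* shared count, starts at 1 for the initial ref */
	bool dead;		/* set once ref_kill() has folded the slots */
	bool released;		/* set when the shared count drops to zero */
};

static void ref_init(struct sharded_ref *r)
{
	for (int i = 0; i < NR_SLOTS; i++)
		r->slot[i] = 0;
	r->count = 1;
	r->dead = false;
	r->released = false;
}

/* Live: touch only the caller's slot. Dead: fall back to the shared count. */
static void ref_get(struct sharded_ref *r, unsigned int cpu)
{
	if (!r->dead)
		r->slot[cpu % NR_SLOTS]++;
	else
		r->count++;
}

static void ref_put(struct sharded_ref *r, unsigned int cpu)
{
	if (!r->dead) {
		/* A slot may go negative; only the sum over all slots matters. */
		r->slot[cpu % NR_SLOTS]--;
		return;
	}
	assert(r->count > 0);
	if (--r->count == 0)
		r->released = true;
}

/* Fold all slots into the shared count, then drop the initial reference. */
static void ref_kill(struct sharded_ref *r)
{
	assert(!r->dead);
	for (int i = 0; i < NR_SLOTS; i++) {
		r->count += r->slot[i];
		r->slot[i] = 0;
	}
	r->dead = true;
	ref_put(r, 0);
}

static bool ref_is_zero(const struct sharded_ref *r)
{
	return r->dead && r->count == 0;
}
```

In this model a get on one slot can be balanced by a put on another, and the
object is released only after ref_kill() and the last outstanding put. In
zswap terms, each entry would hold one such reference on its pool.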
Testing kernel build (32 threads) in tmpfs with memory.max=2GB.
(zswap shrinker and writeback enabled with one 50GB swapfile,
on a 128 CPUs x86-64 machine, below is the average of 5 runs)
        mm-unstable  zswap-global-lru
real    63.20        63.12
user    1061.75      1062.95
sys     268.74       264.44
Signed-off-by: Chengming Zhou <zhouchengming@...edance.com>
---
Chengming Zhou (2):
      mm/zswap: global lru and shrinker shared by all zswap_pools
      mm/zswap: change zswap_pool kref to percpu_ref

 mm/zswap.c | 207 +++++++++++++++++++++++++++----------------------------------
 1 file changed, 93 insertions(+), 114 deletions(-)
---
base-commit: 191d97734e41a5c9f90a2f6636fdd335ae1d435d
change-id: 20240210-zswap-global-lru-94d49316178b
Best regards,
--
Chengming Zhou <zhouchengming@...edance.com>