Message-ID: <20250806161748.76651-1-ryncsn@gmail.com>
Date: Thu, 7 Aug 2025 00:17:45 +0800
From: Kairui Song <ryncsn@...il.com>
To: linux-mm@...ck.org
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Kemeng Shi <shikemeng@...weicloud.com>,
Chris Li <chrisl@...nel.org>,
Nhat Pham <nphamcs@...il.com>,
Baoquan He <bhe@...hat.com>,
Barry Song <baohua@...nel.org>,
"Huang, Ying" <ying.huang@...ux.alibaba.com>,
linux-kernel@...r.kernel.org,
Kairui Song <kasong@...cent.com>
Subject: [PATCH v2 0/3] mm, swap: improve cluster scan strategy
From: Kairui Song <kasong@...cent.com>
This series improves large allocation performance and reduces the
failure rate. Thorough testing showed that parts of the cluster
allocator's design could be improved.

The allocator spends too much effort scanning the fragment list. This
rarely helps in most setups, yet it causes serious contention on the
list lock (si->lock). In addition, for historical reasons the allocator
prefers free clusters when searching for a new cluster, which causes
fragmentation.

So make the allocator scan only one cluster in the fragment list for
high-order allocations, and prefer nonfull clusters over free ones.
This both improves performance and reduces fragmentation.
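
The search order described above can be sketched with a small userspace
toy model. This is only an illustration, not the code in mm/swapfile.c:
every name here (toy_list, toy_cluster, toy_scan, toy_alloc) is
hypothetical, the "fits" flag stands in for the real per-cluster slot
checks, and letting order-0 requests scan the whole fragment list is an
assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; names are illustrative, not from mm/swapfile.c. */
enum toy_list { TOY_NONFULL, TOY_FREE, TOY_FRAG };

struct toy_cluster {
	enum toy_list list;	/* which list the cluster sits on */
	bool fits;		/* assumed: cluster can satisfy this request */
};

/*
 * Inspect at most `limit` clusters that sit on `list` and return the index
 * of the first one that fits, or -1. *scanned counts inspected clusters.
 */
static int toy_scan(const struct toy_cluster *c, size_t n, enum toy_list list,
		    size_t limit, size_t *scanned)
{
	*scanned = 0;
	for (size_t i = 0; i < n && *scanned < limit; i++) {
		if (c[i].list != list)
			continue;
		(*scanned)++;
		if (c[i].fits)
			return (int)i;
	}
	return -1;
}

/*
 * Try nonfull clusters first, then free clusters, then the fragment list.
 * For order > 0 only one fragment cluster is inspected. Returns the chosen
 * cluster index, or -1 when the caller would have to fall back.
 */
int toy_alloc(const struct toy_cluster *c, size_t n, int order,
	      size_t *frag_scanned)
{
	size_t scanned;
	int idx;

	assert(frag_scanned != NULL);
	*frag_scanned = 0;

	idx = toy_scan(c, n, TOY_NONFULL, n, &scanned);
	if (idx >= 0)
		return idx;

	idx = toy_scan(c, n, TOY_FREE, n, &scanned);
	if (idx >= 0)
		return idx;

	/* Bounded scan: a single fragment cluster for high-order requests. */
	return toy_scan(c, n, TOY_FRAG, order > 0 ? 1 : n, frag_scanned);
}
```

In this model a high-order request gives up after inspecting one
fragment cluster instead of walking the whole list, which is the part
meant to limit time spent under the list lock.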

For example, a kernel build test with make -j96 and 10G ZRAM with 64kB
mTHP enabled shows better performance and a lower failure rate:

         sys time     64kB/swpout   64kB/swpout_fallback
Before:  11609.69s    1787051       20917
After:    5587.53s    1811598       0

System time is cut in half, and the failure rate drops to zero. Larger
allocations in a hybrid workload also showed a major improvement:

512kB swap failure rate:
  Before: swpout:11663 swpout_fallback:1767
  After:  swpout:14480 swpout_fallback:6

2M swap failure rate:
  Before: swpout:24    swpout_fallback:1442
  After:  swpout:1329  swpout_fallback:7
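
To turn the counters above into a rate, one possible reading is sketched
below. The cover letter does not define "failure rate", so the formula
fallback / (swpout + fallback) is an assumption, and fallback_rate is a
hypothetical helper name.

```c
#include <assert.h>

/*
 * Assumed definition (not stated in the cover letter): the failure rate is
 * the share of swap-out attempts that fell back, that is
 * fallback / (swpout + fallback).
 */
double fallback_rate(unsigned long swpout, unsigned long fallback)
{
	unsigned long attempts = swpout + fallback;

	/* Guard against unsigned wrap-around of the sum. */
	assert(attempts >= swpout);

	/* No attempts means nothing could have failed. */
	if (attempts == 0)
		return 0.0;

	return (double)fallback / (double)attempts;
}
```

Under that assumption, the 2M figures work out to roughly 98% of
attempts falling back before the series and about 0.5% after.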

Kairui Song (3):
  mm, swap: only scan one cluster in fragment list
  mm, swap: remove fragment clusters counter
  mm, swap: prefer nonfull over free clusters

 include/linux/swap.h |  1 -
 mm/swapfile.c        | 68 +++++++++++++++++++++++---------------------
 2 files changed, 36 insertions(+), 33 deletions(-)
---
V1: https://lore.kernel.org/linux-mm/20250804172439.2331-1-ryncsn@gmail.com/
Changelog:
- Split into 3 patches, no code change [ Chris Li ]
- Rebase and rerun the tests to see whether removing the fragment cluster
  counter improves performance. As expected, the gain is marginal.
- Collect Acked-by/Reviewed-by tags [ Nhat Pham, Chris Li ]
--
2.50.1