Message-ID: <20240129175423.1987-1-ryncsn@gmail.com>
Date: Tue, 30 Jan 2024 01:54:15 +0800
From: Kairui Song <ryncsn@...il.com>
To: linux-mm@...ck.org
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Chris Li <chrisl@...nel.org>,
"Huang, Ying" <ying.huang@...el.com>,
Hugh Dickins <hughd@...gle.com>,
Johannes Weiner <hannes@...xchg.org>,
Matthew Wilcox <willy@...radead.org>,
Michal Hocko <mhocko@...e.com>,
Yosry Ahmed <yosryahmed@...gle.com>,
David Hildenbrand <david@...hat.com>,
linux-kernel@...r.kernel.org,
Kairui Song <kasong@...cent.com>
Subject: [PATCH v3 0/7] swapin refactor for optimization and unified readahead
From: Kairui Song <kasong@...cent.com>
This series unifies and cleans up the swapin path, introduces minor
optimizations, and makes both shmem and swapoff use the SWP_SYNCHRONOUS_IO
flag to skip readahead and the swap cache for better performance.
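To illustrate the idea of a single entry point that picks the swapin
strategy, here is a toy userspace model in Python. It is not kernel code
and not the implementation in this series: the names SwapDevice,
SwapinPath and choose_swapin_path are hypothetical, and the extra
"swap_count == 1" condition for the direct path is an assumption made for
the sketch.

```python
from dataclasses import dataclass
from enum import Enum, auto


class SwapinPath(Enum):
    """Hypothetical labels for the swapin strategies discussed above."""
    DIRECT = auto()             # skip readahead and the swap cache
    VMA_READAHEAD = auto()      # readahead based on the faulting VMA
    CLUSTER_READAHEAD = auto()  # readahead based on swap slot locality


@dataclass
class SwapDevice:
    # Models whether the device has the SWP_SYNCHRONOUS_IO flag set.
    synchronous_io: bool


def choose_swapin_path(dev: SwapDevice, swap_count: int,
                       vma_readahead_enabled: bool) -> SwapinPath:
    """Pick one swapin strategy for a swap entry (toy policy).

    Assumption: the direct path is only taken when the entry has a
    single user, so bypassing the swap cache cannot hide it from others.
    """
    if dev.synchronous_io and swap_count == 1:
        return SwapinPath.DIRECT
    if vma_readahead_enabled:
        return SwapinPath.VMA_READAHEAD
    return SwapinPath.CLUSTER_READAHEAD


if __name__ == "__main__":
    zram_like = SwapDevice(synchronous_io=True)
    disk_like = SwapDevice(synchronous_io=False)
    print(choose_swapin_path(zram_like, 1, True))
    print(choose_swapin_path(disk_like, 1, True))
    print(choose_swapin_path(disk_like, 1, False))
```

The point of the sketch is only that anonymous faults, shmem and swapoff
could all call one such helper instead of each open-coding the choice.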
Test results:
- swap out 10G of zero-filled data to ZRAM, then read it back in:
Before: 11143285 us
After: 10692644 us (+4.1%)
- swapping off a 10G ZRAM (lzo-rle) after the same workload:
Before:
time swapoff /dev/zram0
real 0m12.337s
user 0m0.001s
sys 0m12.329s
After:
time swapoff /dev/zram0
real 0m9.728s
user 0m0.001s
sys 0m9.719s
- shmem FIO test 1 on a Ryzen 5900HX:
fio -name=tmpfs --numjobs=16 --directory=/tmpfs --size=960m \
--ioengine=mmap --rw=randread --random_distribution=zipf:0.5 \
--time_based --ramp_time=1m --runtime=5m --group_reporting
(using brd as swap, 2G memcg limit)
Before:
bw ( MiB/s): min= 1167, max= 1732, per=100.00%, avg=1460.82, stdev= 4.38, samples=9536
iops : min=298938, max=443557, avg=373964.41, stdev=1121.27, samples=9536
After (+3.5%):
bw ( MiB/s): min= 1285, max= 1738, per=100.00%, avg=1512.88, stdev= 4.34, samples=9456
iops : min=328957, max=445105, avg=387294.21, stdev=1111.15, samples=9456
- shmem FIO test 2 on a Ryzen 5900HX:
fio -name=tmpfs --numjobs=16 --directory=/tmpfs --size=960m \
--ioengine=mmap --rw=randread --random_distribution=zipf:1.2 \
--time_based --ramp_time=1m --runtime=5m --group_reporting
(using brd as swap, 2G memcg limit)
Before:
bw ( MiB/s): min= 5296, max= 7112, per=100.00%, avg=6131.93, stdev=17.09, samples=9536
iops : min=1355934, max=1820833, avg=1569769.11, stdev=4375.93, samples=9536
After (+3.1%):
bw ( MiB/s): min= 5466, max= 7173, per=100.00%, avg=6324.51, stdev=16.66, samples=9521
iops : min=1399355, max=1836435, avg=1619068.90, stdev=4263.94, samples=9521
- Some built objects are very slightly smaller (gcc 13.2.1):
/scripts/bloat-o-meter ./vmlinux ./vmlinux.new
add/remove: 4/2 grow/shrink: 1/10 up/down: 818/-983 (-165)
Function old new delta
swapin_entry - 482 +482
mm_counter - 248 +248
shmem_swapin_folio 1412 1468 +56
__pfx_swapin_entry - 16 +16
__pfx_mm_counter - 16 +16
__read_swap_cache_async 738 736 -2
copy_present_pte 1258 1249 -9
mem_cgroup_swapin_charge_folio 297 285 -12
__pfx_swapin_readahead 16 - -16
swap_cache_get_folio 364 345 -19
do_anonymous_page 1488 1458 -30
unuse_pte_range 889 833 -56
free_p4d_range 524 446 -78
restore_exclusive_pte 937 822 -115
do_swap_page 2969 2817 -152
swapin_readahead 239 - -239
copy_nonpresent_pte 1478 1223 -255
Total: Before=26056243, After=26056078, chg -0.00%
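For anyone who wants to cross-check the figures above, the following
Python sketch recomputes the relative changes and the bloat-o-meter
totals from the numbers as reported. The helper names (pct_gain,
pct_time_saved) are made up for this example, and the percentage for the
ZRAM read test depends on which value is taken as the baseline, so a
small difference from the rounded figure quoted above is expected.

```python
def pct_gain(before: float, after: float) -> float:
    """Relative increase of `after` over `before`, in percent."""
    return (after - before) / before * 100.0


def pct_time_saved(before: float, after: float) -> float:
    """Relative reduction in elapsed time, in percent of `before`."""
    return (before - after) / before * 100.0


# (before, after) pairs copied from the test results above.
zram_swapin_us = (11143285, 10692644)   # swap-in time, microseconds
swapoff_real_s = (12.337, 9.728)        # `time swapoff`, real seconds
fio_zipf05_bw = (1460.82, 1512.88)      # FIO test 1, avg MiB/s
fio_zipf12_bw = (6131.93, 6324.51)      # FIO test 2, avg MiB/s

# Per-symbol deltas copied from the bloat-o-meter table above.
bloat_deltas = [482, 248, 56, 16, 16, -2, -9, -12, -16, -19,
                -30, -56, -78, -115, -152, -239, -255]

bloat_up = sum(d for d in bloat_deltas if d > 0)
bloat_down = sum(d for d in bloat_deltas if d < 0)
bloat_net = bloat_up + bloat_down

if __name__ == "__main__":
    print(f"ZRAM swap-in time saved: {pct_time_saved(*zram_swapin_us):.2f}%")
    print(f"swapoff time saved:      {pct_time_saved(*swapoff_real_s):.2f}%")
    print(f"FIO zipf:0.5 bw gain:    {pct_gain(*fio_zipf05_bw):.2f}%")
    print(f"FIO zipf:1.2 bw gain:    {pct_gain(*fio_zipf12_bw):.2f}%")
    print(f"bloat up/down/net:       {bloat_up}/{bloat_down} ({bloat_net})")
```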
V2: https://lore.kernel.org/linux-mm/20240102175338.62012-1-ryncsn@gmail.com/
Update from V2:
- Many code path cleanups (merge swapin_entry with swapin_entry_mpol,
drop the second param of mem_cgroup_swapin_charge_folio, swapin_entry
takes a pointer to folio as return value instead of a pointer to
boolean to reduce LOC and logic), thanks to Huang, Ying.
- Don't use cluster readahead for swapoff; its performance is worse
than VMA readahead on NVMe.
- Add a refactor patch for swap_cache_get_folio.
V1: https://lore.kernel.org/linux-mm/20231119194740.94101-1-ryncsn@gmail.com/T/
Update from V1:
- Rebased on mm-unstable.
- Remove behaviour-changing patches; they will be submitted in a separate
series later.
- Code style, naming and comments updates.
- Thanks to Chris Li for very detailed and helpful review of V1. Thanks
to Matthew Wilcox and Huang Ying for helpful suggestions.
Kairui Song (7):
mm/swapfile.c: add back some comment
mm/swap: move no readahead swapin code to a stand-alone helper
mm/swap: always account swapped in page into current memcg
mm/swap: introduce swapin_entry for unified readahead policy
mm/swap: avoid a duplicated swap cache lookup for SWP_SYNCHRONOUS_IO
mm/swap, shmem: use unified swapin helper for shmem
mm/swap: refactor swap_cache_get_folio
include/linux/memcontrol.h | 4 +-
mm/memcontrol.c | 5 +-
mm/memory.c | 45 ++--------
mm/shmem.c | 50 +++++++----
mm/swap.h | 23 ++---
mm/swap_state.c | 176 ++++++++++++++++++++++++++-----------
mm/swapfile.c | 20 +++--
7 files changed, 190 insertions(+), 133 deletions(-)
--
2.43.0