Message-Id: <20200520232525.798933-4-hannes@cmpxchg.org>
Date: Wed, 20 May 2020 19:25:14 -0400
From: Johannes Weiner <hannes@...xchg.org>
To: linux-mm@...ck.org
Cc: Rik van Riel <riel@...riel.com>,
Minchan Kim <minchan.kim@...il.com>,
Michal Hocko <mhocko@...e.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Joonsoo Kim <iamjoonsoo.kim@....com>,
linux-kernel@...r.kernel.org, kernel-team@...com
Subject: [PATCH 03/14] mm: allow swappiness that prefers reclaiming anon over the file workingset
With the advent of fast random IO devices (SSDs, PMEM) and in-memory
swap devices such as zswap, it's possible for swap to be much faster
than filesystems, and for swapping to be preferable over thrashing
filesystem caches.

Allow setting swappiness - which defines the rough relative IO cost of
cache misses between page cache and swap-backed pages - to reflect
such situations by making the swap-preferred range configurable.

v2: clarify how to calculate swappiness (Minchan Kim)

Signed-off-by: Johannes Weiner <hannes@...xchg.org>
---
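Note, not part of the patch: the swappiness arithmetic in the vm.rst
hunk (x + 2x = 200, 2x = 133.33) generalizes to any speed ratio. A
minimal sketch in C, assuming a hypothetical helper named
swappiness_for_speeds() that takes the relative speeds of swap and
filesystem IO as integers and truncates the result, which matches the
133 given in the documentation. Nothing like it exists in the kernel;
the value is still set by hand through the sysctl.

```c
/*
 * Hypothetical helper, for illustration only (not kernel code).
 *
 * Split the 0..200 swappiness range in proportion to the relative
 * speeds of swap IO and filesystem IO:
 *
 *     file_share + swap_share = 200
 *     swap_share / file_share = swap_speed / fs_speed
 *
 * The returned value is swap_share, truncated to an integer.
 */
int swappiness_for_speeds(unsigned int swap_speed, unsigned int fs_speed)
{
	/* Assumption: with no usable measurement, treat both costs as equal. */
	if (swap_speed + fs_speed == 0)
		return 100;

	return (int)(200UL * swap_speed / (swap_speed + fs_speed));
}
```

With swap twice as fast as the filesystem, swappiness_for_speeds(2, 1)
yields 133; with equal speeds, swappiness_for_speeds(1, 1) yields 100.
As the documentation says, such a value is only a starting point for
experimentation.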
Documentation/admin-guide/sysctl/vm.rst | 23 ++++++++++++++++++-----
kernel/sysctl.c | 3 ++-
mm/vmscan.c | 2 +-
3 files changed, 21 insertions(+), 7 deletions(-)
diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
index 0329a4d3fa9e..d46d5b7013c6 100644
--- a/Documentation/admin-guide/sysctl/vm.rst
+++ b/Documentation/admin-guide/sysctl/vm.rst
@@ -831,14 +831,27 @@ When page allocation performance is not a bottleneck and you want all
swappiness
==========
-This control is used to define how aggressive the kernel will swap
-memory pages. Higher values will increase aggressiveness, lower values
-decrease the amount of swap. A value of 0 instructs the kernel not to
-initiate swap until the amount of free and file-backed pages is less
-than the high water mark in a zone.
+This control is used to define the rough relative IO cost of swapping
+and filesystem paging, as a value between 0 and 200. At 100, the VM
+assumes equal IO cost and will thus apply memory pressure to the page
+cache and swap-backed pages equally; lower values signify more
+expensive swap IO, higher values indicate cheaper.
+
+Keep in mind that filesystem IO patterns under memory pressure tend to
+be more efficient than swap's random IO. An optimal value will require
+experimentation and will also be workload-dependent.
The default value is 60.
+For in-memory swap, like zram or zswap, as well as hybrid setups that
+have swap on faster devices than the filesystem, values beyond 100 can
+be considered. For example, if the random IO against the swap device
+is on average 2x faster than IO from the filesystem, swappiness should
+be 133 (x + 2x = 200, 2x = 133.33).
+
+At 0, the kernel will not initiate swap until the amount of free and
+file-backed pages is less than the high watermark in a zone.
+
unprivileged_userfaultfd
========================
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 8a176d8727a3..7f15d292e44c 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -131,6 +131,7 @@ static unsigned long zero_ul;
static unsigned long one_ul = 1;
static unsigned long long_max = LONG_MAX;
static int one_hundred = 100;
+static int two_hundred = 200;
static int one_thousand = 1000;
#ifdef CONFIG_PRINTK
static int ten_thousand = 10000;
@@ -1391,7 +1392,7 @@ static struct ctl_table vm_table[] = {
.mode = 0644,
.proc_handler = proc_dointvec_minmax,
.extra1 = SYSCTL_ZERO,
- .extra2 = &one_hundred,
+ .extra2 = &two_hundred,
},
#ifdef CONFIG_HUGETLB_PAGE
{
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 70b0e2c6a4b9..43f88b1a4f14 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -161,7 +161,7 @@ struct scan_control {
#endif
/*
- * From 0 .. 100. Higher means more swappy.
+ * From 0 .. 200. Higher means more swappy.
*/
int vm_swappiness = 60;
/*
--
2.26.2