Message-Id: <20200323234151.10AF5617@viggo.jf.intel.com>
Date: Mon, 23 Mar 2020 16:41:51 -0700
From: Dave Hansen <dave.hansen@...ux.intel.com>
To: linux-kernel@...r.kernel.org
Cc: Dave Hansen <dave.hansen@...ux.intel.com>, mhocko@...e.com,
jannh@...gle.com, vbabka@...e.cz, minchan@...nel.org,
dancol@...gle.com, joel@...lfernandes.org,
akpm@...ux-foundation.org
Subject: [PATCH 2/2] mm/madvise: skip MADV_PAGEOUT on shared swap cache pages
From: Dave Hansen <dave.hansen@...ux.intel.com>
MADV_PAGEOUT might interfere with other processes if it is
allowed to reclaim pages shared with other processes. A
previous patch tried to avoid this for anonymous pages
which were shared by a fork(). It did this by checking
page_mapcount().
That works great for mapped pages, but it cannot detect
unmapped swap cache pages. This was not a problem until
the previous patch, which added the ability for
MADV_PAGEOUT to *find* swap cache pages.
A process doing MADV_PAGEOUT which finds an unmapped swap
cache page and evicts it might interfere with another process
which had the same page mapped. But, such a page would have
a page_mapcount() of 1 since the page is only actually mapped
in the *other* process. The page_mapcount() test would fail
to detect the situation.
Thankfully, there is a reference count for swap entries.
To fix this, simply consult both page_mapcount() and the swap
reference count via page_swapcount().
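As a rough userspace model of the combined check (not kernel
code: struct page_model, may_pageout() and mapcount_only() are
names made up for illustration, and the fields merely stand in
for PageSwapCache(), page_swapcount() and page_mapcount()):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the page state the check looks at. */
struct page_model {
	bool in_swap_cache;	/* models PageSwapCache(page) */
	int swapcount;		/* models page_swapcount(page) */
	int mapcount;		/* models page_mapcount(page) */
};

/* Old test: only looks at page table mappings. */
static bool mapcount_only(const struct page_model *p)
{
	return p->mapcount == 1;
}

/* New test: swap entry references count as users too. */
static bool may_pageout(const struct page_model *p)
{
	int nr_page_references = 0;

	if (p->in_swap_cache)
		nr_page_references += p->swapcount;
	nr_page_references += p->mapcount;

	return nr_page_references <= 1;
}

/* Walk through the cases described in the changelog. */
static void check_examples(void)
{
	/* Private page mapped once: fine to reclaim. */
	struct page_model private_page = { false, 0, 1 };
	/* Swap cache page: caller holds a swap entry, another
	 * process has the page mapped. */
	struct page_model shared_swapcache = { true, 1, 1 };

	assert(may_pageout(&private_page));
	assert(mapcount_only(&shared_swapcache));	/* old test misses it */
	assert(!may_pageout(&shared_swapcache));	/* new test skips it */
}
```

The second case is the one the mapcount-only test lets through:
the single mapping belongs to the other process, and only the
swap reference reveals that the page has a second user.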
I rigged up a little test program to try to create these
situations. Basically, if the parent "reader" RSS changes
in response to MADV_PAGEOUT actions in the child, there is
a problem.
https://www.sr71.net/~dave/intel/madv-pageout.c
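The linked program is not reproduced here. A minimal sketch of
the same idea, under assumptions, might look like the code
below. resident_pages() and probe_parent_rss() are hypothetical
helpers, RSS is read from /proc/self/statm, and only the simpler
fork()-shared case is exercised. Hitting the unmapped swap cache
case would also need swap configured and enough memory pressure
to push pages there.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21	/* asm-generic value; older libc headers lack it */
#endif

/* Resident set size of the caller in pages, or -1 on error. */
static long resident_pages(void)
{
	long size, resident;
	FILE *f = fopen("/proc/self/statm", "r");

	if (!f)
		return -1;
	if (fscanf(f, "%ld %ld", &size, &resident) != 2)
		resident = -1;
	fclose(f);
	return resident;
}

/*
 * Fault in 'len' bytes, fork, and let the child call MADV_PAGEOUT
 * on the inherited range. Stores the parent's RSS before and after.
 * Returns 0 on success, -1 on setup failure.
 */
static int probe_parent_rss(size_t len, long *before, long *after)
{
	char *buf;
	pid_t pid;

	assert(len > 0);
	buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;
	memset(buf, 1, len);		/* make the pages resident */
	*before = resident_pages();

	pid = fork();
	if (pid < 0) {
		munmap(buf, len);
		return -1;
	}
	if (pid == 0) {
		/* Child: fails with EINVAL on kernels without MADV_PAGEOUT */
		madvise(buf, len, MADV_PAGEOUT);
		_exit(0);
	}
	waitpid(pid, NULL, 0);
	*after = resident_pages();
	munmap(buf, len);
	return 0;
}
```

If the parent's "after" value drops well below "before", the
child's MADV_PAGEOUT reclaimed pages the parent still had mapped.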
Signed-off-by: Dave Hansen <dave.hansen@...ux.intel.com>
Cc: Michal Hocko <mhocko@...e.com>
Cc: Jann Horn <jannh@...gle.com>
Cc: Vlastimil Babka <vbabka@...e.cz>
Cc: Minchan Kim <minchan@...nel.org>
Cc: Daniel Colascione <dancol@...gle.com>
Cc: "Joel Fernandes (Google)" <joel@...lfernandes.org>
Cc: Andrew Morton <akpm@...ux-foundation.org>
---
b/mm/madvise.c | 37 +++++++++++++++++++++++++++++--------
1 file changed, 29 insertions(+), 8 deletions(-)
diff -puN mm/madvise.c~madv-pageout-ignore-shared-swap-cache mm/madvise.c
--- a/mm/madvise.c~madv-pageout-ignore-shared-swap-cache 2020-03-23 16:30:52.022385888 -0700
+++ b/mm/madvise.c 2020-03-23 16:41:15.448384333 -0700
@@ -261,6 +261,7 @@ static struct page *pte_get_reclaim_page
{
swp_entry_t entry;
struct page *page;
+ int nr_page_references = 0;
/* Totally empty PTE: */
if (pte_none(ptent))
@@ -271,7 +272,7 @@ static struct page *pte_get_reclaim_page
page = vm_normal_page(vma, addr, ptent);
if (page)
get_page(page);
- return page;
+ goto got_page;
}
/*
@@ -292,7 +293,33 @@ static struct page *pte_get_reclaim_page
* The PTE was a true swap entry. The page may be in
* the swap cache.
*/
- return lookup_swap_cache(entry, vma, addr);
+ page = lookup_swap_cache(entry, vma, addr);
+got_page:
+ if (!page)
+ return NULL;
+ /*
+ * Account for references to the swap entry. These
+ * might be "upgraded" to a normal mapping at any
+ * time.
+ */
+ if (PageSwapCache(page))
+ nr_page_references += page_swapcount(page);
+
+ /*
+ * Account for all mappings of the page, including
+ * when it is in the swap cache. This ensures that
+ * MADV_PAGEOUT does not interfere with anything shared
+ * with another process.
+ */
+ nr_page_references += page_mapcount(page);
+
+ /* Any extra references? Do not reclaim it. */
+ if (nr_page_references > 1) {
+ put_page(page);
+ return NULL;
+ }
+
+ return page;
}
/*
@@ -477,12 +504,6 @@ regular_page:
continue;
}
- /* Do not interfere with other mappings of this page */
- if (page_mapcount(page) != 1) {
- put_page(page);
- continue;
- }
-
VM_BUG_ON_PAGE(PageTransCompound(page), page);
if (!is_swap_pte(ptent) && pte_young(ptent)) {
_