Message-Id: <20200323234151.10AF5617@viggo.jf.intel.com>
Date:   Mon, 23 Mar 2020 16:41:51 -0700
From:   Dave Hansen <dave.hansen@...ux.intel.com>
To:     linux-kernel@...r.kernel.org
Cc:     Dave Hansen <dave.hansen@...ux.intel.com>, mhocko@...e.com,
        jannh@...gle.com, vbabka@...e.cz, minchan@...nel.org,
        dancol@...gle.com, joel@...lfernandes.org,
        akpm@...ux-foundation.org
Subject: [PATCH 2/2] mm/madvise: skip MADV_PAGEOUT on shared swap cache pages


From: Dave Hansen <dave.hansen@...ux.intel.com>

MADV_PAGEOUT might interfere with other processes if it is
allowed to reclaim pages shared with other processes.  A
previous patch tried to avoid this for anonymous pages
which were shared by a fork().  It did this by checking
page_mapcount().

That works great for mapped pages.  But, it cannot detect
unmapped swap cache pages.  This was not a problem until
the previous patch, which added the ability for
MADV_PAGEOUT to *find* swap cache pages.

A process doing MADV_PAGEOUT which finds an unmapped swap
cache page and evicts it might interfere with another process
which had the same page mapped.  But, such a page would have
a page_mapcount() of 1 since the page is only actually mapped
in the *other* process.  The page_mapcount() test would fail
to detect the situation.

Thankfully, there is a reference count for swap entries.
To fix this, simply consult both page_mapcount() and the swap
reference count via page_swapcount().
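
The combined check can be modeled in userspace roughly as below.
This is only a sketch of the logic, not kernel code: struct
page_model and both helper names are hypothetical stand-ins for
PageSwapCache(), page_swapcount() and page_mapcount().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the page state the check consults. */
struct page_model {
	bool in_swap_cache;	/* models PageSwapCache(page) */
	int swapcount;		/* models page_swapcount(page) */
	int mapcount;		/* models page_mapcount(page) */
};

/* Older test: only mappings are counted. */
static bool may_reclaim_mapcount_only(const struct page_model *p)
{
	return p->mapcount == 1;
}

/* Combined test: swap entry references plus mappings. */
static bool may_reclaim(const struct page_model *p)
{
	int nr_page_references = 0;

	assert(p->swapcount >= 0 && p->mapcount >= 0);

	/* Swap entry references only count for swap cache pages. */
	if (p->in_swap_cache)
		nr_page_references += p->swapcount;

	nr_page_references += p->mapcount;

	/* Any extra references?  Do not reclaim it. */
	return nr_page_references <= 1;
}
```

For the case described above (a swap cache page mapped only in
the *other* process, with the caller still holding a swap entry),
the model has mapcount 1 and swapcount 1.  The mapcount-only test
allows reclaim, while the combined test sees two references and
refuses.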

I rigged up a little test program to try to create these
situations.  Basically, if the parent "reader" RSS changes
in response to MADV_PAGEOUT actions in the child, there is
a problem.

	https://www.sr71.net/~dave/intel/madv-pageout.c
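
For illustration only, the general shape of such a check might
look like the sketch below.  It is NOT the program at the URL
above, and the names rss_pages() and pageout_in_child() are
hypothetical.  It covers only the simple fork()-shared mapped
case; creating unmapped swap cache pages additionally needs swap
configured and memory pressure, which is not attempted here.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21	/* Linux uapi value, for older libc headers */
#endif

/* Resident set size of the caller in pages, or -1 on error. */
static long rss_pages(void)
{
	long size, resident;
	FILE *f = fopen("/proc/self/statm", "r");

	if (!f)
		return -1;
	if (fscanf(f, "%ld %ld", &size, &resident) != 2)
		resident = -1;
	fclose(f);
	return resident;
}

/*
 * The parent faults in an anonymous buffer and forks.  The child
 * calls MADV_PAGEOUT on the inherited range.  The parent's RSS is
 * sampled before the fork and after the child exits.
 * Returns 0 on success, -1 if setup or sampling failed.
 */
static int pageout_in_child(size_t len, long *rss_before, long *rss_after)
{
	pid_t pid;
	char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (buf == MAP_FAILED)
		return -1;

	memset(buf, 0xaa, len);		/* fault every page in */
	*rss_before = rss_pages();

	pid = fork();
	if (pid < 0) {
		munmap(buf, len);
		return -1;
	}
	if (pid == 0) {
		/* Child: the pages are still shared with the parent. */
		_exit(madvise(buf, len, MADV_PAGEOUT) ? 1 : 0);
	}

	waitpid(pid, NULL, 0);
	*rss_after = rss_pages();
	munmap(buf, len);

	return (*rss_before < 0 || *rss_after < 0) ? -1 : 0;
}
```

A caller would pass a buffer size and compare the two samples,
for example pageout_in_child(64 << 20, &before, &after).  A
large drop in the parent's RSS would suggest that the child's
MADV_PAGEOUT reclaimed pages the parent still had mapped.  RSS
can also move for unrelated reasons, so small differences are
not meaningful.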

Signed-off-by: Dave Hansen <dave.hansen@...ux.intel.com>
Cc: Michal Hocko <mhocko@...e.com>
Cc: Jann Horn <jannh@...gle.com>
Cc: Vlastimil Babka <vbabka@...e.cz>
Cc: Minchan Kim <minchan@...nel.org>
Cc: Daniel Colascione <dancol@...gle.com>
Cc: "Joel Fernandes (Google)" <joel@...lfernandes.org>
Cc: Andrew Morton <akpm@...ux-foundation.org>
---

 b/mm/madvise.c |   37 +++++++++++++++++++++++++++++--------
 1 file changed, 29 insertions(+), 8 deletions(-)

diff -puN mm/madvise.c~madv-pageout-ignore-shared-swap-cache mm/madvise.c
--- a/mm/madvise.c~madv-pageout-ignore-shared-swap-cache	2020-03-23 16:30:52.022385888 -0700
+++ b/mm/madvise.c	2020-03-23 16:41:15.448384333 -0700
@@ -261,6 +261,7 @@ static struct page *pte_get_reclaim_page
 {
 	swp_entry_t entry;
 	struct page *page;
+	int nr_page_references = 0;
 
 	/* Totally empty PTE: */
 	if (pte_none(ptent))
@@ -271,7 +272,7 @@ static struct page *pte_get_reclaim_page
 		page = vm_normal_page(vma, addr, ptent);
 		if (page)
 			get_page(page);
-		return page;
+		goto got_page;
 	}
 
 	/*
@@ -292,7 +293,33 @@ static struct page *pte_get_reclaim_page
 	 * The PTE was a true swap entry.  The page may be in
 	 * the swap cache.
 	 */
-	return lookup_swap_cache(entry, vma, addr);
+	page = lookup_swap_cache(entry, vma, addr);
+	if (!page)
+		return NULL;
+got_page:
+	/*
+	 * Account for references to the swap entry.  These
+	 * might be "upgraded" to a normal mapping at any
+	 * time.
+	 */
+	if (PageSwapCache(page))
+		nr_page_references += page_swapcount(page);
+
+	/*
+	 * Account for all mappings of the page, including
+	 * when it is in the swap cache.  This ensures that
+	 * MADV_PAGEOUT does not interfere with anything shared
+	 * with another process.
+	 */
+	nr_page_references += page_mapcount(page);
+
+	/* Any extra references?  Do not reclaim it. */
+	if (nr_page_references > 1) {
+		put_page(page);
+		return NULL;
+	}
+
+	return page;
 }
 
 /*
@@ -477,12 +504,6 @@ regular_page:
 			continue;
 		}
 
-		/* Do not interfere with other mappings of this page */
-		if (page_mapcount(page) != 1) {
-			put_page(page);
-			continue;
-		}
-
 		VM_BUG_ON_PAGE(PageTransCompound(page), page);
 
 		if (!is_swap_pte(ptent) && pte_young(ptent)) {
_
