lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 12 Mar 2020 09:22:48 +0100
From:   Michal Hocko <mhocko@...nel.org>
To:     Jann Horn <jannh@...gle.com>
Cc:     Minchan Kim <minchan@...nel.org>, Linux-MM <linux-mm@...ck.org>,
        kernel list <linux-kernel@...r.kernel.org>,
        Daniel Colascione <dancol@...gle.com>,
        Dave Hansen <dave.hansen@...el.com>,
        "Joel Fernandes (Google)" <joel@...lfernandes.org>,
        Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: interaction of MADV_PAGEOUT with CoW anonymous mappings?

[Cc akpm]

So what about this?

>From eca97990372679c097a88164ff4b3d7879b0e127 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@...e.com>
Date: Thu, 12 Mar 2020 09:04:35 +0100
Subject: [PATCH] mm: do not allow MADV_PAGEOUT for CoW pages

Jann has brought up a very interesting point [1]. While shared pages are
excluded from MADV_PAGEOUT normally, CoW pages can be easily reclaimed
that way. This can lead to all sorts of hard to debug problems. E.g.
performance problems outlined by Daniel [2]. There are runtime
environments where there is a substantial memory shared among security
domains via CoW memory and a easy to reclaim way of that memory, which
MADV_{COLD,PAGEOUT} offers, can lead to either performance degradation
in for the parent process which might be more privileged or even open
side channel attacks. The feasibility of the later is not really clear
to me TBH but there is no real reason for exposure at this stage. It
seems there is no real use case to depend on reclaiming CoW memory via
madvise at this stage so it is much easier to simply disallow it and
this is what this patch does. Put it simply MADV_{PAGEOUT,COLD} can
operate only on the exclusively owned memory which is a straightforward
semantic.

[1] http://lkml.kernel.org/r/CAG48ez0G3JkMq61gUmyQAaCq=_TwHbi1XKzWRooxZkv08PQKuw@mail.gmail.com
[2] http://lkml.kernel.org/r/CAKOZueua_v8jHCpmEtTB6f3i9e2YnmX4mqdYVWhV4E=Z-n+zRQ@mail.gmail.com

Signed-off-by: Michal Hocko <mhocko@...e.com>
---
 mm/madvise.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/mm/madvise.c b/mm/madvise.c
index 43b47d3fae02..4bb30ed6c8d2 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -335,12 +335,14 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
 		}
 
 		page = pmd_page(orig_pmd);
+
+		/* Do not interfere with other mappings of this page */
+		if (page_mapcount(page) != 1)
+			goto huge_unlock;
+
 		if (next - addr != HPAGE_PMD_SIZE) {
 			int err;
 
-			if (page_mapcount(page) != 1)
-				goto huge_unlock;
-
 			get_page(page);
 			spin_unlock(ptl);
 			lock_page(page);
@@ -426,6 +428,10 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
 			continue;
 		}
 
+		/* Do not interfere with other mappings of this page */
+		if (page_mapcount(page) != 1)
+			continue;
+
 		VM_BUG_ON_PAGE(PageTransCompound(page), page);
 
 		if (pte_young(ptent)) {
-- 
2.24.1

-- 
Michal Hocko
SUSE Labs

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ