lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20240221085036.105621-1-21cnbao@gmail.com>
Date: Wed, 21 Feb 2024 21:50:36 +1300
From: Barry Song <21cnbao@...il.com>
To: akpm@...ux-foundation.org,
	linux-mm@...ck.org
Cc: linux-kernel@...r.kernel.org,
	Barry Song <v-songbaohua@...o.com>,
	Yin Fengwei <fengwei.yin@...el.com>,
	Yu Zhao <yuzhao@...gle.com>,
	Ryan Roberts <ryan.roberts@....com>,
	David Hildenbrand <david@...hat.com>,
	Kefeng Wang <wangkefeng.wang@...wei.com>,
	Matthew Wilcox <willy@...radead.org>,
	Minchan Kim <minchan@...nel.org>,
	Vishal Moola <vishal.moola@...il.com>,
	Yang Shi <shy828301@...il.com>
Subject: [PATCH] madvise:madvise_cold_or_pageout_pte_range(): allow split while folio_estimated_sharers = 0

From: Barry Song <v-songbaohua@...o.com>

The purpose is stopping splitting large folios whose mapcount are 2 or
above. Folios whose estimated_shares = 0 should be still perfect and
even better candidates than estimated_shares = 1.

Consider a pte-mapped large folio with 16 subpages, if we unmap 1-15,
the current code will split folios and reclaim them while madvise goes
on this folio; but if we unmap subpage 0, we will keep this folio and
break. This is weird.

For pmd-mapped large folios, we can still use "= 1" as the condition
as anyway we have the entire map for it. So this patch doesn't change
the condition for pmd-mapped large folios.
This also explains why we had been using "= 1" for both pmd-mapped and
pte-mapped large folios before commit 07e8c82b5eff ("madvise: convert
madvise_cold_or_pageout_pte_range() to use folios"), because in the
past, we used the mapcount of the specific subpage, since the subpage
had pte present, its mapcount wouldn't be 0.

The problem can be quite easily reproduced by writing a small program,
unmapping the first subpage of a pte-mapped large folio vs. unmapping
anyone other than the first subpage.

Fixes: 2f406263e3e9 ("madvise:madvise_cold_or_pageout_pte_range(): don't use mapcount() against large folio for sharing check")
Cc: Yin Fengwei <fengwei.yin@...el.com>
Cc: Yu Zhao <yuzhao@...gle.com>
Cc: Ryan Roberts <ryan.roberts@....com>
Cc: David Hildenbrand <david@...hat.com>
Cc: Kefeng Wang <wangkefeng.wang@...wei.com>
Cc: Matthew Wilcox <willy@...radead.org>
Cc: Minchan Kim <minchan@...nel.org>
Cc: Vishal Moola (Oracle) <vishal.moola@...il.com>
Cc: Yang Shi <shy828301@...il.com>
Signed-off-by: Barry Song <v-songbaohua@...o.com>
---
 mm/madvise.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/madvise.c b/mm/madvise.c
index cfa5e7288261..abde3edb04f0 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -453,7 +453,7 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
 		if (folio_test_large(folio)) {
 			int err;
 
-			if (folio_estimated_sharers(folio) != 1)
+			if (folio_estimated_sharers(folio) > 1)
 				break;
 			if (pageout_anon_only_filter && !folio_test_anon(folio))
 				break;
-- 
2.34.1


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ