lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250129115411.2077152-11-david@redhat.com>
Date: Wed, 29 Jan 2025 12:54:08 +0100
From: David Hildenbrand <david@...hat.com>
To: linux-kernel@...r.kernel.org
Cc: linux-doc@...r.kernel.org,
	dri-devel@...ts.freedesktop.org,
	linux-mm@...ck.org,
	nouveau@...ts.freedesktop.org,
	David Hildenbrand <david@...hat.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Jérôme Glisse <jglisse@...hat.com>,
	Jonathan Corbet <corbet@....net>,
	Alex Shi <alexs@...nel.org>,
	Yanteng Si <si.yanteng@...ux.dev>,
	Karol Herbst <kherbst@...hat.com>,
	Lyude Paul <lyude@...hat.com>,
	Danilo Krummrich <dakr@...nel.org>,
	David Airlie <airlied@...il.com>,
	Simona Vetter <simona@...ll.ch>,
	"Liam R. Howlett" <Liam.Howlett@...cle.com>,
	Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
	Vlastimil Babka <vbabka@...e.cz>,
	Jann Horn <jannh@...gle.com>,
	Pasha Tatashin <pasha.tatashin@...een.com>,
	Peter Xu <peterx@...hat.com>,
	Alistair Popple <apopple@...dia.com>,
	Jason Gunthorpe <jgg@...dia.com>
Subject: [PATCH v1 10/12] mm/rmap: handle device-exclusive entries correctly in folio_referenced_one()

Ever since commit b756a3b5e7ea ("mm: device exclusive memory access")
we can return with a device-exclusive entry from page_vma_mapped_walk().

folio_referenced_one() is not prepared for that, so teach it about these
non-present nonswap PTEs.

We'll likely never hit that path with device-private entries, but we
could with device-exclusive ones.

It's not really clear what to do: the device could be accessing this
PTE, but we don't have that information in the PTE. Likely MMU notifiers
should be taking care of that, and we can just assume "not referenced by
the CPU".

Note that we could currently only run into this case with
device-exclusive entries on THPs. For order-0 folios, we still adjust
the mapcount on conversion to device-exclusive, making the rmap walk
abort early (folio_mapcount() == 0). We'll fix that next, now that
folio_referenced_one() can handle it.

Fixes: b756a3b5e7ea ("mm: device exclusive memory access")
Signed-off-by: David Hildenbrand <david@...hat.com>
---
 mm/rmap.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index 903a78e60781..77b063e9aec4 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -899,8 +899,14 @@ static bool folio_referenced_one(struct folio *folio,
 			if (lru_gen_look_around(&pvmw))
 				referenced++;
 		} else if (pvmw.pte) {
-			if (ptep_clear_flush_young_notify(vma, address,
-						pvmw.pte))
+			/*
+			 * We can end up here with selected non-swap entries
+			 * that actually map pages similar to PROT_NONE; see
+			 * page_vma_mapped_walk()->check_pte(). From a CPU
+			 * perspective, these PTEs are old.
+			 */
+			if (pte_present(ptep_get(pvmw.pte)) &&
+			    ptep_clear_flush_young_notify(vma, address, pvmw.pte))
 				referenced++;
 		} else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
 			if (pmdp_clear_flush_young_notify(vma, address,
-- 
2.48.1


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ