lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Thu, 23 May 2024 18:37:44 -0400
From: Peter Xu <peterx@...hat.com>
To: linux-mm@...ck.org,
	linux-kernel@...r.kernel.org
Cc: Thomas Gleixner <tglx@...utronix.de>,
	Jason Gunthorpe <jgg@...dia.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Al Viro <viro@...iv.linux.org.uk>,
	Dave Hansen <dave.hansen@...ux.intel.com>,
	Andy Lutomirski <luto@...nel.org>,
	Matthew Wilcox <willy@...radead.org>,
	peterx@...hat.com,
	Dan Williams <dan.j.williams@...el.com>,
	"Kirill A . Shutemov" <kirill@...temov.name>,
	Mike Rapoport <rppt@...nel.org>,
	Ingo Molnar <mingo@...hat.com>,
	Michal Hocko <mhocko@...e.com>,
	Alex Williamson <alex.williamson@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Suren Baghdasaryan <surenb@...gle.com>,
	Borislav Petkov <bp@...en8.de>,
	x86@...nel.org
Subject: [PATCH RFC 1/2] mm/x86/pat: Only untrack the pfn range if unmap region

X86 uses pfn tracking to do pfnmaps.  It looks like the tracking is
normally started at mmap() of device drivers, then untracked when munmap().

However in the current code the untrack is done in unmap_single_vma().
This might be problematic.

For example, unmap_single_vma() can be used nowadays even for zapping a
single page rather than the whole vmas.  It's very confusing to do whole
vma untracking in this function even if a caller would like to zap one
page.

That may not yet be a problem, may because of multiple reasons:

 (1) Things like MADV_DONTNEED won't ever work for pfnmaps and it'll fail
     already before reaching here.

 (2) If some pfn tracking is lost by accident, the default fallback is to
     use UC-, which might be safe enough even if it may not be wanted.  One
     can see track_pfn_insert() -> lookup_memtype() which can fall back to
     the default _PAGE_CACHE_MODE_UC_MINUS for a MMIO range.

However there's ongoing work [1] on VFIO driver to allow tearing down MMIO
pgtables before an munmap(), in which case we may not want to untrack the
pfns if we're only tearing down the pgtables.  IOW, we may want to keep the
pfn tracking information as those pfn mappings can be restored later with
the same vma object.  In VFIO's use case it can be remapped later if the
VFIO mapped MMIO region got re-enabled (e.g. PCI_COMMAND_MEMORY set).  In
this case we should do pfn track for the whole lifecycle of the vma.

IIUC, this was overlooked when zap_page_range_single() was introduced,
while in the past it was only used in the munmap() path which wants to
always unmap the region completely.  E.g., commit f5cc4eef9987 ("VM: make
zap_page_range() callers that act on a single VMA use separate helper") is
the initial commit that introduced unmap_single_vma(), in which the chunk
of untrack_pfn() was moved over from unmap_vmas().

Recover that behavior to untrack pfnmap only when unmap regions.

[1] https://lore.kernel.org/r/20240523195629.218043-1-alex.williamson@redhat.com

Cc: Alex Williamson <alex.williamson@...hat.com>
Cc: Jason Gunthorpe <jgg@...dia.com>
Cc: Al Viro <viro@...iv.linux.org.uk>
Cc: Dave Hansen <dave.hansen@...ux.intel.com>
Cc: Andy Lutomirski <luto@...nel.org>
Cc: Peter Zijlstra <peterz@...radead.org>
Cc: Thomas Gleixner <tglx@...utronix.de>
Cc: Ingo Molnar <mingo@...hat.com>
Cc: Borislav Petkov <bp@...en8.de>
Cc: Kirill A. Shutemov <kirill@...temov.name>
Cc: x86@...nel.org
Signed-off-by: Peter Xu <peterx@...hat.com>
---
 mm/memory.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index b5453b86ec4b..9db5e8730d12 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1821,9 +1821,6 @@ static void unmap_single_vma(struct mmu_gather *tlb,
 	if (vma->vm_file)
 		uprobe_munmap(vma, start, end);
 
-	if (unlikely(vma->vm_flags & VM_PFNMAP))
-		untrack_pfn(vma, 0, 0, mm_wr_locked);
-
 	if (start != end) {
 		if (unlikely(is_vm_hugetlb_page(vma))) {
 			/*
@@ -1888,6 +1885,8 @@ void unmap_vmas(struct mmu_gather *tlb, struct ma_state *mas,
 		unsigned long start = start_addr;
 		unsigned long end = end_addr;
 		hugetlb_zap_begin(vma, &start, &end);
+		if (unlikely(vma->vm_flags & VM_PFNMAP))
+			untrack_pfn(vma, 0, 0, mm_wr_locked);
 		unmap_single_vma(tlb, vma, start, end, &details,
 				 mm_wr_locked);
 		hugetlb_zap_end(vma, &details);
-- 
2.45.0


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ