[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7b6f57b5-c99f-4be7-b89c-1db06788d2b5@redhat.com>
Date: Wed, 2 Apr 2025 13:36:32 +0200
From: David Hildenbrand <david@...hat.com>
To: linux-kernel@...r.kernel.org
Cc: linux-mm@...ck.org, x86@...nel.org, xingwei lee <xrivendell7@...il.com>,
yuxin wang <wang1315768607@....com>,
Marius Fleischer <fleischermarius@...il.com>, Ingo Molnar
<mingo@...nel.org>, Borislav Petkov <bp@...en8.de>,
Dan Carpenter <dan.carpenter@...aro.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Dave Hansen <dave.hansen@...ux.intel.com>, Andy Lutomirski
<luto@...nel.org>, Peter Zijlstra <peterz@...radead.org>,
Rik van Riel <riel@...riel.com>, "H. Peter Anvin" <hpa@...or.com>,
Peter Xu <peterx@...hat.com>, Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
Subject: Re: [PATCH v3] x86/mm/pat: Fix VM_PAT handling when fork() fails in
copy_page_range()
On 25.03.25 20:19, David Hildenbrand wrote:
> If track_pfn_copy() fails, we already added the dst VMA to the maple
> tree. As fork() fails, we'll cleanup the maple tree, and stumble over
> the dst VMA for which we neither performed any reservation nor copied
> any page tables.
>
> Consequently untrack_pfn() will see VM_PAT and try obtaining the
> PAT information from the page table -- which fails because the page
> table was not copied.
>
> The easiest fix would be to simply clear the VM_PAT flag of the dst VMA
> if track_pfn_copy() fails. However, the whole thing is about "simply"
> clearing the VM_PAT flag is shaky as well: if we passed track_pfn_copy()
> and performed a reservation, but copying the page tables fails, we'll
> simply clear the VM_PAT flag, not properly undoing the reservation ...
> which is also wrong.
>
> So let's fix it properly: set the VM_PAT flag only if the reservation
> succeeded (leaving it clear initially), and undo the reservation if
> anything goes wrong while copying the page tables: clearing the VM_PAT
> flag after undoing the reservation.
>
> Note that any copied page table entries will get zapped when the VMA will
> get removed later, after copy_page_range() succeeded; as VM_PAT is not set
> then, we won't try cleaning VM_PAT up once more and untrack_pfn() will be
> happy. Note that leaving these page tables in place without a reservation
> is not a problem, as we are aborting fork(); this process will never run.
>
> A reproducer can trigger this usually at the first try:
>
> https://gitlab.com/davidhildenbrand/scratchspace/-/raw/main/reproducers/pat_fork.c
>
> [ 45.239440] WARNING: CPU: 26 PID: 11650 at arch/x86/mm/pat/memtype.c:983 get_pat_info+0xf6/0x110
> [ 45.241082] Modules linked in: ...
> [ 45.249119] CPU: 26 UID: 0 PID: 11650 Comm: repro3 Not tainted 6.12.0-rc5+ #92
> [ 45.250598] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-2.fc40 04/01/2014
> [ 45.252181] RIP: 0010:get_pat_info+0xf6/0x110
> ...
> [ 45.268513] Call Trace:
> [ 45.269003] <TASK>
> [ 45.269425] ? __warn.cold+0xb7/0x14d
> [ 45.270131] ? get_pat_info+0xf6/0x110
> [ 45.270846] ? report_bug+0xff/0x140
> [ 45.271519] ? handle_bug+0x58/0x90
> [ 45.272192] ? exc_invalid_op+0x17/0x70
> [ 45.272935] ? asm_exc_invalid_op+0x1a/0x20
> [ 45.273717] ? get_pat_info+0xf6/0x110
> [ 45.274438] ? get_pat_info+0x71/0x110
> [ 45.275165] untrack_pfn+0x52/0x110
> [ 45.275835] unmap_single_vma+0xa6/0xe0
> [ 45.276549] unmap_vmas+0x105/0x1f0
> [ 45.277256] exit_mmap+0xf6/0x460
> [ 45.277913] __mmput+0x4b/0x120
> [ 45.278512] copy_process+0x1bf6/0x2aa0
> [ 45.279264] kernel_clone+0xab/0x440
> [ 45.279959] __do_sys_clone+0x66/0x90
> [ 45.280650] do_syscall_64+0x95/0x180
>
> Likely this case was missed in commit d155df53f310 ("x86/mm/pat: clear
> VM_PAT if copy_p4d_range failed")
>
> ... and instead of undoing the reservation we simply cleared the VM_PAT flag.
>
> Keep the documentation of these functions in include/linux/pgtable.h,
> one place is more than sufficient -- we should clean that up for the other
> functions like track_pfn_remap/untrack_pfn separately.
>
> Reported-by: xingwei lee <xrivendell7@...il.com>
> Reported-by: yuxin wang <wang1315768607@....com>
> Closes: https://lore.kernel.org/lkml/CABOYnLx_dnqzpCW99G81DmOr+2UzdmZMk=T3uxwNxwz+R1RAwg@mail.gmail.com/
> Reported-by: Marius Fleischer <fleischermarius@...il.com>
> Closes: https://lore.kernel.org/lkml/CAJg=8jwijTP5fre8woS4JVJQ8iUA6v+iNcsOgtj9Zfpc3obDOQ@mail.gmail.com/
> Fixes: d155df53f310 ("x86/mm/pat: clear VM_PAT if copy_p4d_range failed")
> Fixes: 2ab640379a0a ("x86: PAT: hooks in generic vm code to help archs to track pfnmap regions - v3")
> Cc: Ingo Molnar <mingo@...nel.org>
> Cc: Borislav Petkov <bp@...en8.de>
> Cc: Dan Carpenter <dan.carpenter@...aro.org>
> Cc: Andrew Morton <akpm@...ux-foundation.org>
> Cc: Linus Torvalds <torvalds@...ux-foundation.org>
> Cc: Dave Hansen <dave.hansen@...ux.intel.com>
> Cc: Andy Lutomirski <luto@...nel.org>
> Cc: Peter Zijlstra <peterz@...radead.org>
> Cc: Rik van Riel <riel@...riel.com>
> Cc: "H. Peter Anvin" <hpa@...or.com>
> Cc: Peter Xu <peterx@...hat.com>
> Signed-off-by: David Hildenbrand <david@...hat.com>
> ---
Apparently smatch is not happy about some scenarios. The following might
make it happy, and make track_pfn_copy() obey the documentation "pfn set
if rc == 0".
diff --git a/arch/x86/mm/pat/memtype.c b/arch/x86/mm/pat/memtype.c
index d721cc19addbd..a51d21d2e5198 100644
--- a/arch/x86/mm/pat/memtype.c
+++ b/arch/x86/mm/pat/memtype.c
@@ -992,8 +992,10 @@ int track_pfn_copy(struct vm_area_struct *dst_vma,
pgprot_t pgprot;
int rc;
- if (!(src_vma->vm_flags & VM_PAT))
+ if (!(src_vma->vm_flags & VM_PAT)) {
+ *pfn = 0;
return 0;
+ }
/*
* Duplicate the PAT information for the dst VMA based on the src
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 4c107e17c547e..d4b564aacab8f 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -1515,6 +1515,7 @@ static inline void track_pfn_insert(struct
vm_area_struct *vma, pgprot_t *prot,
static inline int track_pfn_copy(struct vm_area_struct *dst_vma,
struct vm_area_struct *src_vma, unsigned long *pfn)
{
+ *pfn = 0;
return 0;
}
--
Cheers,
David / dhildenb
Powered by blists - more mailing lists