Message-ID: <20250930043351.34927-1-lance.yang@linux.dev>
Date: Tue, 30 Sep 2025 12:33:51 +0800
From: Lance Yang <lance.yang@...ux.dev>
To: akpm@...ux-foundation.org,
david@...hat.com,
lorenzo.stoakes@...cle.com
Cc: peterx@...hat.com,
ziy@...dia.com,
baolin.wang@...ux.alibaba.com,
baohua@...nel.org,
ryan.roberts@....com,
dev.jain@....com,
npache@...hat.com,
riel@...riel.com,
Liam.Howlett@...cle.com,
vbabka@...e.cz,
harry.yoo@...cle.com,
jannh@...gle.com,
matthew.brost@...el.com,
joshua.hahnjy@...il.com,
rakie.kim@...com,
byungchul@...com,
gourry@...rry.net,
ying.huang@...ux.alibaba.com,
apopple@...dia.com,
usamaarif642@...il.com,
yuzhao@...gle.com,
linux-kernel@...r.kernel.org,
linux-mm@...ck.org,
ioworker0@...il.com,
stable@...r.kernel.org,
Lance Yang <lance.yang@...ux.dev>
Subject: [PATCH v2 1/1] mm/rmap: fix soft-dirty and uffd-wp bit loss when remapping zero-filled mTHP subpage to shared zeropage
From: Lance Yang <lance.yang@...ux.dev>
When splitting an mTHP and replacing a zero-filled subpage with the shared
zeropage, try_to_map_unused_to_zeropage() currently drops several important
PTE bits.

For userspace tools like CRIU, which rely on the soft-dirty mechanism for
incremental snapshots, losing the soft-dirty bit means modified pages are
missed, leading to inconsistent memory state after restore.

As pointed out by David, the more critical uffd-wp bit is also dropped.
This breaks the userfaultfd write-protection mechanism, causing writes
to be silently missed by monitoring applications, which can lead to data
corruption.

Preserve both the soft-dirty and uffd-wp bits from the old PTE when
creating the new zeropage mapping to ensure they are correctly tracked.
Cc: <stable@...r.kernel.org>
Fixes: b1f202060afe ("mm: remap unused subpages to shared zeropage when splitting isolated thp")
Suggested-by: David Hildenbrand <david@...hat.com>
Suggested-by: Dev Jain <dev.jain@....com>
Acked-by: David Hildenbrand <david@...hat.com>
Signed-off-by: Lance Yang <lance.yang@...ux.dev>
---
v1 -> v2:
- Avoid calling ptep_get() multiple times (per Dev)
- Double-check the uffd-wp bit (per David)
- Collect Acked-by from David - thanks!
- https://lore.kernel.org/linux-mm/20250928044855.76359-1-lance.yang@linux.dev/
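
For readers who want the idea without the kernel context, the carry-over
can be sketched as a small userspace toy model. This is only an
illustration under stated assumptions: the toy_pte_t type, the TOY_* bit
positions and toy_zeropage_pte() are hypothetical names invented here and
do not match any real architecture's PTE layout or any kernel API. The one
property taken from the patch is that the old PTE is non-present at this
point (the VM_BUG_ON_PAGE() check in the diff asserts as much), so the bits
are read in their swap-style encoding and re-applied in the present-PTE
encoding.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Toy model only: the type and all bit positions below are made up for
 * illustration and do not correspond to a real PTE format.
 */
typedef uint64_t toy_pte_t;

#define TOY_PRESENT        (1ULL << 0)
#define TOY_SPECIAL        (1ULL << 1)
#define TOY_SOFT_DIRTY     (1ULL << 2)   /* encoding used in a present PTE */
#define TOY_UFFD_WP        (1ULL << 3)   /* encoding used in a present PTE */
#define TOY_SWP_SOFT_DIRTY (1ULL << 10)  /* encoding used in a non-present PTE */
#define TOY_SWP_UFFD_WP    (1ULL << 11)  /* encoding used in a non-present PTE */

/*
 * Build a present, special "zeropage" entry from a non-present old entry,
 * carrying over the soft-dirty and uffd-wp state instead of dropping it.
 */
static toy_pte_t toy_zeropage_pte(toy_pte_t oldpte)
{
	toy_pte_t newpte = TOY_PRESENT | TOY_SPECIAL;

	/* Mirrors the precondition in the patch: the old entry is not present. */
	assert(!(oldpte & TOY_PRESENT));

	if (oldpte & TOY_SWP_SOFT_DIRTY)
		newpte |= TOY_SOFT_DIRTY;
	if (oldpte & TOY_SWP_UFFD_WP)
		newpte |= TOY_UFFD_WP;

	return newpte;
}
```

In this model, building the new entry without the two conditionals always
yields TOY_PRESENT | TOY_SPECIAL, whatever the old entry held, which
corresponds to the bit loss described in the commit message.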
mm/migrate.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/mm/migrate.c b/mm/migrate.c
index ce83c2c3c287..50aa91d9ab4e 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -300,13 +300,14 @@ static bool try_to_map_unused_to_zeropage(struct page_vma_mapped_walk *pvmw,
 					  unsigned long idx)
 {
 	struct page *page = folio_page(folio, idx);
+	pte_t oldpte = ptep_get(pvmw->pte);
 	pte_t newpte;
 	if (PageCompound(page))
 		return false;
 	VM_BUG_ON_PAGE(!PageAnon(page), page);
 	VM_BUG_ON_PAGE(!PageLocked(page), page);
-	VM_BUG_ON_PAGE(pte_present(ptep_get(pvmw->pte)), page);
+	VM_BUG_ON_PAGE(pte_present(oldpte), page);
 	if (folio_test_mlocked(folio) || (pvmw->vma->vm_flags & VM_LOCKED) ||
 	    mm_forbids_zeropage(pvmw->vma->vm_mm))
@@ -322,6 +323,12 @@ static bool try_to_map_unused_to_zeropage(struct page_vma_mapped_walk *pvmw,
 	newpte = pte_mkspecial(pfn_pte(my_zero_pfn(pvmw->address),
 					pvmw->vma->vm_page_prot));
+
+	if (pte_swp_soft_dirty(oldpte))
+		newpte = pte_mksoft_dirty(newpte);
+	if (pte_swp_uffd_wp(oldpte))
+		newpte = pte_mkuffd_wp(newpte);
+
 	set_pte_at(pvmw->vma->vm_mm, pvmw->address, pvmw->pte, newpte);
 	dec_mm_counter(pvmw->vma->vm_mm, mm_counter(folio));
--
2.49.0