[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aMDRAr1UP1Ix6zaY@hyeyoo>
Date: Wed, 10 Sep 2025 10:14:50 +0900
From: Harry Yoo <harry.yoo@...cle.com>
To: Jane Chu <jane.chu@...cle.com>
Cc: osalvador@...e.de, liushixin2@...wei.com, muchun.song@...ux.dev,
david@...hat.com, akpm@...ux-foundation.org, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, jannh@...gle.com
Subject: Re: [PATCH] mm/hugetlb: fix copy_hugetlb_page_range() to use
->pt_share_count
On Tue, Sep 09, 2025 at 12:43:57PM -0600, Jane Chu wrote:
> commit 59d9094df3d79 introduced ->pt_share_count dedicated to
nit: the format should be:
commit <sha1> ("summary")
?
> hugetlb PMD share count tracking, but omitted fixing
> copy_hugetlb_page_range(), leaving the function relying on
> page_count() for tracking that no longer works.
>
> When lazy page table copy for hugetlb is disabled (commit bcd51a3c679d),
same here.
> fork()'ing with hugetlb PMD sharing quickly lockup -
>
> [ 239.446559] watchdog: BUG: soft lockup - CPU#75 stuck for 27s!
> [ 239.446611] RIP: 0010:native_queued_spin_lock_slowpath+0x7e/0x2e0
> [ 239.446631] Call Trace:
> [ 239.446633] <TASK>
> [ 239.446636] _raw_spin_lock+0x3f/0x60
> [ 239.446639] copy_hugetlb_page_range+0x258/0xb50
> [ 239.446645] copy_page_range+0x22b/0x2c0
> [ 239.446651] dup_mmap+0x3e2/0x770
> [ 239.446654] dup_mm.constprop.0+0x5e/0x230
> [ 239.446657] copy_process+0xd17/0x1760
> [ 239.446660] kernel_clone+0xc0/0x3e0
> [ 239.446661] __do_sys_clone+0x65/0xa0
> [ 239.446664] do_syscall_64+0x82/0x930
> [ 239.446668] ? count_memcg_events+0xd2/0x190
> [ 239.446671] ? syscall_trace_enter+0x14e/0x1f0
> [ 239.446676] ? syscall_exit_work+0x118/0x150
> [ 239.446677] ? arch_exit_to_user_mode_prepare.constprop.0+0x9/0xb0
> [ 239.446681] ? clear_bhb_loop+0x30/0x80
> [ 239.446684] ? clear_bhb_loop+0x30/0x80
> [ 239.446686] entry_SYSCALL_64_after_hwframe+0x76/0x7e
>
> There are two options to resolve the potential latent issue:
> 1. remove the PMD sharing awareness from copy_hugetlb_page_range(),
> 2. fix it.
> This patch opts for the second option.
>
> Fixes: 59d9094df3d79 ("mm: hugetlb: independent PMD page table shared
> count")
nit: we don't add newline even when Fixes: tag is line is over 75 characters?
> Signed-off-by: Jane Chu <jane.chu@...cle.com>
> ---
The change in general looks good to me,
Reviewed-by: Harry Yoo <harry.yoo@...cle.com>
+Cc Jann Horn who backported the commit 59d9094df3d79.
Elaborating a little bit why it doesn't need to be backported:
TL;DR: One needs to backport commit 3aa4ed8040e15 to pre-v6.0 kernels,
or revert commit bcd51a3c679d to trigger this.
commit 59d9094df3d79 ("mm: hugetlb: independent PMD page table shared count")
is introduced in 6.13 and backported to 5.10.y, 5.15.y, 6.1.y, 6.6.y, 6.12.y.
As lazy page table copy is enabled in v6.0 by commit bcd51a3c679d
("hugetlb: lazy page table copies in fork()"), the bug is not triggered
because shared hugetlb VMAs go through lazy page table copy logic
(vma_needs_copy() returns false) or they can't share page tables
(uffd_disable_huge_pmd_share() returns false).
They shouldn't have anon_vma, VM_PFNMAP, VM_MIXEDMAP and UFFD-WP VMA
cannot share page tables - so either vma_needs_copy() return false, or
page tables cannot be shared.
And before commit 3aa4ed8040e15 ("mm/hugetlb: make detecting shared pte
more reliable") introduced in v6.1, copy_hugetlb_page_range() doesn't check
refcount to determine whether the page table is shared.
> mm/hugetlb.c | 13 ++++---------
> 1 file changed, 4 insertions(+), 9 deletions(-)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 753f99b4c718..8ca5b4f7805f 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -5594,18 +5594,13 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
> break;
> }
>
> - /*
> - * If the pagetables are shared don't copy or take references.
> - *
> - * dst_pte == src_pte is the common case of src/dest sharing.
> - * However, src could have 'unshared' and dst shares with
> - * another vma. So page_count of ptep page is checked instead
> - * to reliably determine whether pte is shared.
> - */
> - if (page_count(virt_to_page(dst_pte)) > 1) {
> +#ifdef CONFIG_HUGETLB_PMD_PAGE_TABLE_SHARING
> + /* If the pagetables are shared don't copy or take references. */
> + if (ptdesc_pmd_pts_count(virt_to_ptdesc(dst_pte)) > 0) {
> addr |= last_addr_mask;
> continue;
> }
> +#endif
>
> dst_ptl = huge_pte_lock(h, dst, dst_pte);
> src_ptl = huge_pte_lockptr(h, src, src_pte);
> --
> 2.43.5
>
--
Cheers,
Harry / Hyeonggon
Powered by blists - more mailing lists