Message-ID: <aMDRAr1UP1Ix6zaY@hyeyoo>
Date: Wed, 10 Sep 2025 10:14:50 +0900
From: Harry Yoo <harry.yoo@...cle.com>
To: Jane Chu <jane.chu@...cle.com>
Cc: osalvador@...e.de, liushixin2@...wei.com, muchun.song@...ux.dev,
        david@...hat.com, akpm@...ux-foundation.org, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, jannh@...gle.com
Subject: Re: [PATCH] mm/hugetlb: fix copy_hugetlb_page_range() to use
 ->pt_share_count

On Tue, Sep 09, 2025 at 12:43:57PM -0600, Jane Chu wrote:
> commit 59d9094df3d79 introduced ->pt_share_count dedicated to

nit: the format should be:
  commit <sha1> ("summary")
?

> hugetlb PMD share count tracking, but omitted fixing
> copy_hugetlb_page_range(), leaving the function relying on
> page_count() for tracking, which no longer works.
> 
> When lazy page table copy for hugetlb is disabled (commit bcd51a3c679d),

same here.

> fork()'ing with hugetlb PMD sharing quickly results in a soft lockup -
> 
> [  239.446559] watchdog: BUG: soft lockup - CPU#75 stuck for 27s!
> [  239.446611] RIP: 0010:native_queued_spin_lock_slowpath+0x7e/0x2e0
> [  239.446631] Call Trace:
> [  239.446633]  <TASK>
> [  239.446636]  _raw_spin_lock+0x3f/0x60
> [  239.446639]  copy_hugetlb_page_range+0x258/0xb50
> [  239.446645]  copy_page_range+0x22b/0x2c0
> [  239.446651]  dup_mmap+0x3e2/0x770
> [  239.446654]  dup_mm.constprop.0+0x5e/0x230
> [  239.446657]  copy_process+0xd17/0x1760
> [  239.446660]  kernel_clone+0xc0/0x3e0
> [  239.446661]  __do_sys_clone+0x65/0xa0
> [  239.446664]  do_syscall_64+0x82/0x930
> [  239.446668]  ? count_memcg_events+0xd2/0x190
> [  239.446671]  ? syscall_trace_enter+0x14e/0x1f0
> [  239.446676]  ? syscall_exit_work+0x118/0x150
> [  239.446677]  ? arch_exit_to_user_mode_prepare.constprop.0+0x9/0xb0
> [  239.446681]  ? clear_bhb_loop+0x30/0x80
> [  239.446684]  ? clear_bhb_loop+0x30/0x80
> [  239.446686]  entry_SYSCALL_64_after_hwframe+0x76/0x7e
> 
> There are two options to resolve the potential latent issue:
>   1. remove the PMD sharing awareness from copy_hugetlb_page_range(),
>   2. fix it.
> This patch opts for the second option.
> 
> Fixes: 59d9094df3d79 ("mm: hugetlb: independent PMD page table shared
> count")

nit: we don't add a newline even when the Fixes: tag line is over 75 characters, right?
 
> Signed-off-by: Jane Chu <jane.chu@...cle.com>
> ---

The change in general looks good to me,
Reviewed-by: Harry Yoo <harry.yoo@...cle.com>

+Cc Jann Horn, who backported commit 59d9094df3d79.

Elaborating a little on why this fix doesn't need to be backported:
TL;DR: one would need to backport commit 3aa4ed8040e15 to pre-v6.0 kernels,
or revert commit bcd51a3c679d, to trigger this.

commit 59d9094df3d79 ("mm: hugetlb: independent PMD page table shared count")
was introduced in v6.13 and backported to 5.10.y, 5.15.y, 6.1.y, 6.6.y and 6.12.y.

Since lazy page table copying was enabled in v6.0 by commit bcd51a3c679d
("hugetlb: lazy page table copies in fork()"), the bug is not triggered:
shared hugetlb VMAs either go through the lazy page table copy logic
(vma_needs_copy() returns false, so the copy is skipped) or cannot share
page tables in the first place (uffd_disable_huge_pmd_share() returns true
for them).

Shared hugetlb VMAs shouldn't have an anon_vma, VM_PFNMAP or VM_MIXEDMAP,
and UFFD-WP VMAs cannot share page tables - so either vma_needs_copy()
returns false, or the page tables cannot be shared.

And before commit 3aa4ed8040e15 ("mm/hugetlb: make detecting shared pte
more reliable"), introduced in v6.1, copy_hugetlb_page_range() did not check
the refcount to determine whether the page table is shared.
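
IIRC the pre-v6.1 check was simply a pointer comparison, along the lines of
(an approximation, not the exact code):

	/* Rough sketch: before 3aa4ed8040e15, only the common
	 * src == dst sharing case was caught. */
	if (dst_pte == src_pte)
		continue;

so the page_count()-based detection that this patch replaces only exists
from v6.1 onwards.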

>  mm/hugetlb.c | 13 ++++---------
>  1 file changed, 4 insertions(+), 9 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 753f99b4c718..8ca5b4f7805f 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -5594,18 +5594,13 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
>  			break;
>  		}
>  
> -		/*
> -		 * If the pagetables are shared don't copy or take references.
> -		 *
> -		 * dst_pte == src_pte is the common case of src/dest sharing.
> -		 * However, src could have 'unshared' and dst shares with
> -		 * another vma. So page_count of ptep page is checked instead
> -		 * to reliably determine whether pte is shared.
> -		 */
> -		if (page_count(virt_to_page(dst_pte)) > 1) {
> +#ifdef CONFIG_HUGETLB_PMD_PAGE_TABLE_SHARING
> +		/* If the pagetables are shared don't copy or take references. */
> +		if (ptdesc_pmd_pts_count(virt_to_ptdesc(dst_pte)) > 0) {
>  			addr |= last_addr_mask;
>  			continue;
>  		}
> +#endif
>  
>  		dst_ptl = huge_pte_lock(h, dst, dst_pte);
>  		src_ptl = huge_pte_lockptr(h, src, src_pte);
> -- 
> 2.43.5
> 
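
For anyone following along, the ->pt_share_count helpers that the new check
relies on were added by commit 59d9094df3d79; from memory they look roughly
like this (an approximation of the include/linux/mm.h definitions, not a
verbatim copy):

/* Rough sketch of the accessors, not the verbatim kernel code. */
#ifdef CONFIG_HUGETLB_PMD_PAGE_TABLE_SHARING
static inline void ptdesc_pmd_pts_init(struct ptdesc *ptdesc)
{
	atomic_set(&ptdesc->pt_share_count, 0);
}

static inline void ptdesc_pmd_pts_inc(struct ptdesc *ptdesc)
{
	atomic_inc(&ptdesc->pt_share_count);
}

static inline void ptdesc_pmd_pts_dec(struct ptdesc *ptdesc)
{
	atomic_dec(&ptdesc->pt_share_count);
}

static inline int ptdesc_pmd_pts_count(struct ptdesc *ptdesc)
{
	return atomic_read(&ptdesc->pt_share_count);
}
#endif

The count is only touched by the PMD sharing code itself, so a value > 0 is
a reliable "this page table is shared" signal, unlike page_count(), which no
longer reflects PMD sharing after that commit.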

-- 
Cheers,
Harry / Hyeonggon
