lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Y3bdPuRR8Oq1Kexx@x1n>
Date:   Thu, 17 Nov 2022 20:17:50 -0500
From:   Peter Xu <peterx@...hat.com>
To:     linux-mm@...ck.org, linux-kernel@...r.kernel.org
Cc:     Rik van Riel <riel@...riel.com>,
        Muchun Song <songmuchun@...edance.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        James Houghton <jthoughton@...gle.com>,
        Nadav Amit <nadav.amit@...il.com>,
        Andrea Arcangeli <aarcange@...hat.com>,
        David Hildenbrand <david@...hat.com>,
        Miaohe Lin <linmiaohe@...wei.com>,
        Mike Kravetz <mike.kravetz@...cle.com>
Subject: Re: [PATCH RFC v2 06/12] mm/hugetlb: Protect huge_pmd_share() with
 walker lock

On Thu, Nov 17, 2022 at 08:10:19PM -0500, Peter Xu wrote:
> huge_pmd_share() is normally called with vma lock held already, it lets us
> feel like we don't really need the walker lock.  But that's not true,
> because we only took the vma lock for "current" vma, but not all the vma
> pgtables we're going to walk upon.
> 
> Taking each vma lock may lead to deadlock and hard to order.  The safe
> approach is just to use the walker lock which guarantees the pgtable page
> being alive, then we should use get_page_unless_zero() rather than
> get_page() only, to make sure the pgtable page is not being freed at the
> meantime.
> 
> Signed-off-by: Peter Xu <peterx@...hat.com>
> ---
>  mm/hugetlb.c | 16 +++++++++++++---
>  1 file changed, 13 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 61a1fa678172..5ef883184885 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -7008,6 +7008,13 @@ pte_t *huge_pmd_share(struct mm_struct *mm, struct vm_area_struct *vma,
>  	spinlock_t *ptl;
>  
>  	i_mmap_lock_read(mapping);
> +
> +	/*
> +	 * NOTE: even if we've got the vma read lock, here we still need to
> +	 * take the walker lock, because we're not walking the current vma,
> +	 * but some other mm's!
> +	 */
> +	hugetlb_walker_lock();
>  	vma_interval_tree_foreach(svma, &mapping->i_mmap, idx, idx) {
>  		if (svma == vma)
>  			continue;
> @@ -7016,12 +7023,15 @@ pte_t *huge_pmd_share(struct mm_struct *mm, struct vm_area_struct *vma,
>  		if (saddr) {
>  			spte = huge_pte_offset(svma->vm_mm, saddr,
>  					       vma_mmu_pagesize(svma));
> -			if (spte) {
> -				get_page(virt_to_page(spte));
> +			/*
> +			 * When page ref==0, it means it's probably being
> +			 * freed; continue with the next one.
> +			 */
> +			if (spte && get_page_unless_zero(virt_to_page(spte)))
>  				break;
> -			}
>  		}
>  	}
> +	hugetlb_walker_unlock();
>  
>  	if (!spte)
>  		goto out;
> -- 
> 2.37.3
> 

Will squash this in..

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 6c77ae7a3d94..f9abede9de47 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -7045,6 +7045,8 @@ pte_t *huge_pmd_share(struct mm_struct *mm, struct vm_area_struct *vma,
                         */
                        if (spte && get_page_unless_zero(virt_to_page(spte)))
                                break;
+                       else
+                               spte = NULL;
                }
        }
        hugetlb_walker_unlock();

-- 
Peter Xu

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ