Message-ID: <fd16b219-bc46-484a-8581-a21240440fa6@redhat.com>
Date: Wed, 5 Jun 2024 14:46:12 +0200
From: David Hildenbrand <david@...hat.com>
To: Lance Yang <ioworker0@...il.com>, akpm@...ux-foundation.org
Cc: willy@...radead.org, sj@...nel.org, baolin.wang@...ux.alibaba.com,
 maskray@...gle.com, ziy@...dia.com, ryan.roberts@....com, 21cnbao@...il.com,
 mhocko@...e.com, fengwei.yin@...el.com, zokeefe@...gle.com,
 shy828301@...il.com, xiehuan09@...il.com, libang.li@...group.com,
 wangkefeng.wang@...wei.com, songmuchun@...edance.com, peterx@...hat.com,
 minchan@...nel.org, linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v6 2/3] mm/rmap: integrate PMD-mapped folio splitting into
 pagewalk loop

On 21.05.24 06:02, Lance Yang wrote:
> In preparation for supporting try_to_unmap_one() to unmap PMD-mapped
> folios, start the pagewalk first, then call split_huge_pmd_address() to
> split the folio.
> 
> Since the TTU_SPLIT_HUGE_PMD split will no longer be performed immediately, we might
> encounter a PMD-mapped THP missing the mlock in the VM_LOCKED range during
> the page walk. It’s probably necessary to mlock this THP to prevent it from
> being picked up during page reclaim.
> 
> Suggested-by: David Hildenbrand <david@...hat.com>
> Suggested-by: Baolin Wang <baolin.wang@...ux.alibaba.com>
> Signed-off-by: Lance Yang <ioworker0@...il.com>
> ---

[...] again, sorry for the late review.

> diff --git a/mm/rmap.c b/mm/rmap.c
> index ddffa30c79fb..08a93347f283 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1640,9 +1640,6 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
>   	if (flags & TTU_SYNC)
>   		pvmw.flags = PVMW_SYNC;
>   
> -	if (flags & TTU_SPLIT_HUGE_PMD)
> -		split_huge_pmd_address(vma, address, false, folio);
> -
>   	/*
>   	 * For THP, we have to assume the worse case ie pmd for invalidation.
>   	 * For hugetlb, it could be much worse if we need to do pud
> @@ -1668,20 +1665,35 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
>   	mmu_notifier_invalidate_range_start(&range);
>   
>   	while (page_vma_mapped_walk(&pvmw)) {
> -		/* Unexpected PMD-mapped THP? */
> -		VM_BUG_ON_FOLIO(!pvmw.pte, folio);
> -
>   		/*
>   		 * If the folio is in an mlock()d vma, we must not swap it out.
>   		 */
>   		if (!(flags & TTU_IGNORE_MLOCK) &&
>   		    (vma->vm_flags & VM_LOCKED)) {
>   			/* Restore the mlock which got missed */
> -			if (!folio_test_large(folio))
> +			if (!folio_test_large(folio) ||
> +			    (!pvmw.pte && (flags & TTU_SPLIT_HUGE_PMD)))
>   				mlock_vma_folio(folio, vma);

Can you elaborate on why you think this would be required? If we had
performed the split_huge_pmd_address() beforehand, we would still be
left with a large folio, no?
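
To make the question concrete, here is a toy userspace model of the two mlock checks, not kernel code. The struct, the function names and the flag value are hypothetical stand-ins for folio_test_large(), pvmw.pte and the TTU flags. It only illustrates that a large folio never took the mlock path under the old check, whether or not its PMD had been split first.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the TTU flag; the value is illustrative only. */
#define TTU_SPLIT_HUGE_PMD 0x4u

/* Toy model of the state the mlock check looks at (assumed names). */
struct walk_state {
	bool folio_large;   /* models folio_test_large(folio) */
	bool pte_mapped;    /* models pvmw.pte != NULL */
	unsigned int flags; /* models the TTU flags */
};

/* Check before the patch: only small folios get the mlock restored. */
static bool restores_mlock_before(const struct walk_state *s)
{
	return !s->folio_large;
}

/* Check after the patch: also PMD-mapped folios that are about to be split. */
static bool restores_mlock_after(const struct walk_state *s)
{
	return !s->folio_large ||
	       (!s->pte_mapped && (s->flags & TTU_SPLIT_HUGE_PMD));
}
```

With the split done up front, the walk sees a PTE-mapped but still large folio, so restores_mlock_before() is false. The patched check returns true for the same folio when it is still PMD-mapped, which is the behavioural difference being asked about.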

>   			goto walk_done_err;
>   		}
>   
> +		if (!pvmw.pte && (flags & TTU_SPLIT_HUGE_PMD)) {
> +			/*
> +			 * We temporarily have to drop the PTL and start once
> +			 * again from that now-PTE-mapped page table.
> +			 */
> +			split_huge_pmd_locked(vma, range.start, pvmw.pmd, false,
> +					      folio);

Using range.start here is a bit weird. Wouldn't this be pvmw.address? 
[did not check]

> +			pvmw.pmd = NULL;
> +			spin_unlock(pvmw.ptl);
> +			pvmw.ptl = NULL;


Would we want a page_vma_mapped_walk_restart() that is exactly for that
purpose?
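
For illustration only, a rough userspace sketch of what such a helper might do. The mock lock, the reduced struct and the mock_ prefixed names are all assumptions made so the snippet compiles outside the kernel. The real helper would operate on struct page_vma_mapped_walk and use spin_unlock(). It assumes the caller is in the PMD-mapped case, so no PTE is mapped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mock spinlock: userspace stand-in for the kernel's spinlock_t. */
typedef struct { bool locked; } mock_spinlock_t;

static void mock_spin_lock(mock_spinlock_t *l)
{
	assert(!l->locked);
	l->locked = true;
}

static void mock_spin_unlock(mock_spinlock_t *l)
{
	assert(l->locked);
	l->locked = false;
}

/* Reduced mock of struct page_vma_mapped_walk: only the fields used here. */
struct mock_pvmw {
	void *pmd;
	void *pte;
	mock_spinlock_t *ptl;
};

/*
 * Hypothetical helper: drop the page table lock and forget the PMD so
 * that the next walk iteration starts over on the now-PTE-mapped table.
 * Assumption: only called while PMD-mapped, i.e. with no PTE mapped.
 */
static void mock_page_vma_mapped_walk_restart(struct mock_pvmw *pvmw)
{
	assert(!pvmw->pte);

	if (pvmw->ptl)
		mock_spin_unlock(pvmw->ptl);

	pvmw->ptl = NULL;
	pvmw->pmd = NULL;
}
```

In the hunk above, a helper along these lines would replace the three open-coded lines that reset pvmw.pmd, unlock pvmw.ptl and reset pvmw.ptl.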

> +			flags &= ~TTU_SPLIT_HUGE_PMD;
> +			continue;
> +		}
> +
> +		/* Unexpected PMD-mapped THP? */
> +		VM_BUG_ON_FOLIO(!pvmw.pte, folio);

-- 
Cheers,

David / dhildenb

