Message-ID: <e430344b-76b9-43be-b497-0152a8910faf@arm.com>
Date: Tue, 6 May 2025 15:40:32 +0530
From: Anshuman Khandual <anshuman.khandual@....com>
To: Dev Jain <dev.jain@....com>, akpm@...ux-foundation.org
Cc: Liam.Howlett@...cle.com, lorenzo.stoakes@...cle.com, vbabka@...e.cz,
 jannh@...gle.com, pfalcato@...e.de, linux-mm@...ck.org,
 linux-kernel@...r.kernel.org, david@...hat.com, peterx@...hat.com,
 ryan.roberts@....com, mingo@...nel.org, libang.li@...group.com,
 maobibo@...ngson.cn, zhengqi.arch@...edance.com, baohua@...nel.org,
 willy@...radead.org, ioworker0@...il.com, yang@...amperecomputing.com
Subject: Re: [PATCH 3/3] mm: Optimize mremap() by PTE batching

On 5/6/25 10:30, Dev Jain wrote:
> Use folio_pte_batch() to optimize move_ptes(). Use get_and_clear_full_ptes()
> so as to elide TLBIs on each contig block, which was previously done by
> ptep_get_and_clear().
> 
> Signed-off-by: Dev Jain <dev.jain@....com>
> ---
>  mm/mremap.c | 24 +++++++++++++++++++-----
>  1 file changed, 19 insertions(+), 5 deletions(-)
> 
> diff --git a/mm/mremap.c b/mm/mremap.c
> index 1a08a7c3b92f..3621c07d8eea 100644
> --- a/mm/mremap.c
> +++ b/mm/mremap.c
> @@ -176,7 +176,7 @@ static int move_ptes(struct pagetable_move_control *pmc,
>  	struct vm_area_struct *vma = pmc->old;
>  	bool need_clear_uffd_wp = vma_has_uffd_without_event_remap(vma);
>  	struct mm_struct *mm = vma->vm_mm;
> -	pte_t *old_ptep, *new_ptep, pte;
> +	pte_t *old_ptep, *new_ptep, old_pte, pte;
>  	pmd_t dummy_pmdval;
>  	spinlock_t *old_ptl, *new_ptl;
>  	bool force_flush = false;
> @@ -185,6 +185,7 @@ static int move_ptes(struct pagetable_move_control *pmc,
>  	unsigned long old_end = old_addr + extent;
>  	unsigned long len = old_end - old_addr;
>  	int err = 0;
> +	int nr;
>  
>  	/*
>  	 * When need_rmap_locks is true, we take the i_mmap_rwsem and anon_vma
> @@ -237,10 +238,14 @@ static int move_ptes(struct pagetable_move_control *pmc,
>  
>  	for (; old_addr < old_end; old_ptep++, old_addr += PAGE_SIZE,
>  				   new_ptep++, new_addr += PAGE_SIZE) {
> -		if (pte_none(ptep_get(old_ptep)))
> +		const fpb_t fpb_flags = FPB_IGNORE_DIRTY | FPB_IGNORE_SOFT_DIRTY;
> +		int max_nr = (old_end - old_addr) >> PAGE_SHIFT;
> +
> +		nr = 1;
> +		old_pte = ptep_get(old_ptep);
> +		if (pte_none(old_pte))
>  			continue;
>  
> -		pte = ptep_get_and_clear(mm, old_addr, old_ptep);
>  		/*
>  		 * If we are remapping a valid PTE, make sure
>  		 * to flush TLB before we drop the PTL for the
> @@ -252,8 +257,17 @@ static int move_ptes(struct pagetable_move_control *pmc,
>  		 * the TLB entry for the old mapping has been
>  		 * flushed.
>  		 */
> -		if (pte_present(pte))
> +		if (pte_present(old_pte)) {
> +			if ((max_nr != 1) && maybe_contiguous_pte_pfns(old_ptep, old_pte)) {

So the maybe_contiguous_pte_pfns() cost is only paid for memory
areas larger than a single PAGE_SIZE (i.e. max_nr != 1)? This
helper reads one additional consecutive PTE, ensures that it is
a valid mapping, and extracts its PFN before comparing for the span.
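For reference, a minimal sketch of what such a pre-check could look
like. The helper body is not shown in this patch, so the shape below
is an assumption for illustration, not the actual implementation from
the series:

static inline bool maybe_contiguous_pte_pfns(pte_t *ptep, pte_t pte)
{
	/* Illustrative sketch only -- assumed shape of the helper. */
	pte_t next = ptep_get(ptep + 1);	/* the extra PTE read */

	/* The next entry must be a valid (present) mapping ... */
	if (!pte_present(next))
		return false;

	/* ... and its PFN must immediately follow the current one. */
	return pte_pfn(next) == pte_pfn(pte) + 1;
}

So for every present PTE in a multi-page range, this adds one extra
PTE read plus a PFN comparison before folio_pte_batch() is even
attempted.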

There is some cost associated with this code sequence, which looks
justified for sequential accesses to memory buffers backed by
consecutive physical memory. But what happens when such buffers are
less probable: will those buffers take a performance hit for all the
comparisons that just turn out to be negative?

> +				struct folio *folio = vm_normal_folio(vma, old_addr, old_pte);
> +
> +				if (folio && folio_test_large(folio))
> +					nr = folio_pte_batch(folio, old_addr, old_ptep,
> +							     old_pte, max_nr, fpb_flags, NULL, NULL, NULL);
> +			}
>  			force_flush = true;
> +		}
> +		pte = get_and_clear_full_ptes(mm, old_addr, old_ptep, nr, 0);
>  		pte = move_pte(pte, old_addr, new_addr);
>  		pte = move_soft_dirty_pte(pte);
>  
> @@ -266,7 +280,7 @@ static int move_ptes(struct pagetable_move_control *pmc,
>  				else if (is_swap_pte(pte))
>  					pte = pte_swp_clear_uffd_wp(pte);
>  			}
> -			set_pte_at(mm, new_addr, new_ptep, pte);
> +			set_ptes(mm, new_addr, new_ptep, pte, nr);
>  		}
>  	}
>  
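As the commit message describes, the core of the optimization is that
a single get_and_clear_full_ptes() call now spans the whole batch
where the old code did one ptep_get_and_clear() per PTE, which is
what lets the TLB invalidations on each contig block be elided. A
rough sketch of the before/after shape (condensed, not the exact
patch context):

	/* Before: read-and-clear one PTE per loop iteration. */
	pte = ptep_get_and_clear(mm, old_addr, old_ptep);
	set_pte_at(mm, new_addr, new_ptep, pte);

	/*
	 * After: one call covers 'nr' consecutive PTEs of the same
	 * large folio, and the destination PTEs are written in one go.
	 */
	pte = get_and_clear_full_ptes(mm, old_addr, old_ptep, nr, /* full */ 0);
	set_ptes(mm, new_addr, new_ptep, pte, nr);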
