lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <858ecac5-9ba7-48da-8f34-ffda28d17609@arm.com>
Date: Fri, 7 Feb 2025 11:05:36 +0530
From: Anshuman Khandual <anshuman.khandual@....com>
To: Ryan Roberts <ryan.roberts@....com>,
 Catalin Marinas <catalin.marinas@....com>, Will Deacon <will@...nel.org>,
 Muchun Song <muchun.song@...ux.dev>,
 Pasha Tatashin <pasha.tatashin@...een.com>,
 Andrew Morton <akpm@...ux-foundation.org>,
 Uladzislau Rezki <urezki@...il.com>, Christoph Hellwig <hch@...radead.org>,
 Mark Rutland <mark.rutland@....com>, Ard Biesheuvel <ardb@...nel.org>,
 Dev Jain <dev.jain@....com>, Alexandre Ghiti <alexghiti@...osinc.com>,
 Steve Capper <steve.capper@...aro.org>, Kevin Brodsky <kevin.brodsky@....com>
Cc: linux-arm-kernel@...ts.infradead.org, linux-mm@...ck.org,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH v1 08/16] arm64/mm: Hoist barriers out of ___set_ptes()
 loop



On 2/5/25 20:39, Ryan Roberts wrote:
> ___set_ptes() previously called __set_pte() for each PTE in the range,
> which would conditionally issue a DSB and ISB to make the new PTE value
> immediately visible to the table walker if the new PTE was valid and for
> kernel space.
> 
> We can do better than this; let's hoist those barriers out of the loop
> so that they are only issued once at the end of the loop. We then reduce
> the cost by the number of PTEs in the range.
> 
> Signed-off-by: Ryan Roberts <ryan.roberts@....com>
> ---
>  arch/arm64/include/asm/pgtable.h | 14 ++++++++++----
>  1 file changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index 3b55d9a15f05..1d428e9c0e5a 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -317,10 +317,8 @@ static inline void __set_pte_nosync(pte_t *ptep, pte_t pte)
>  	WRITE_ONCE(*ptep, pte);
>  }
>  
> -static inline void __set_pte(pte_t *ptep, pte_t pte)
> +static inline void __set_pte_complete(pte_t pte)
>  {
> -	__set_pte_nosync(ptep, pte);
> -
>  	/*
>  	 * Only if the new pte is valid and kernel, otherwise TLB maintenance
>  	 * or update_mmu_cache() have the necessary barriers.
> @@ -331,6 +329,12 @@ static inline void __set_pte(pte_t *ptep, pte_t pte)
>  	}
>  }
>  
> +static inline void __set_pte(pte_t *ptep, pte_t pte)
> +{
> +	__set_pte_nosync(ptep, pte);
> +	__set_pte_complete(pte);
> +}
> +
>  static inline pte_t __ptep_get(pte_t *ptep)
>  {
>  	return READ_ONCE(*ptep);
> @@ -647,12 +651,14 @@ static inline void ___set_ptes(struct mm_struct *mm, pte_t *ptep, pte_t pte,
>  
>  	for (;;) {
>  		__check_safe_pte_update(mm, ptep, pte);
> -		__set_pte(ptep, pte);
> +		__set_pte_nosync(ptep, pte);
>  		if (--nr == 0)
>  			break;
>  		ptep++;
>  		pte = pte_advance_pfn(pte, stride);
>  	}
> +
> +	__set_pte_complete(pte);

Given that the loop now iterates over number of page table entries without corresponding
consecutive dsb/isb sync, could there be a situation where something else gets scheduled
on the cpu before __set_pte_complete() is called ? Hence leaving the entire page table
entries block without desired mapping effect. IOW how __set_pte_complete() is ensured to
execute once the loop above completes. Otherwise this change LGTM.

>  }
>  
>  static inline void __set_ptes(struct mm_struct *mm,

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ