Message-ID: <fbfef7fc-4030-462b-b514-498eea6620aa@arm.com>
Date: Thu, 27 Nov 2025 16:57:06 +0000
From: Ryan Roberts <ryan.roberts@....com>
To: Samuel Holland <samuel.holland@...ive.com>,
 Palmer Dabbelt <palmer@...belt.com>, Paul Walmsley <pjw@...nel.org>,
 linux-riscv@...ts.infradead.org, Andrew Morton <akpm@...ux-foundation.org>,
 David Hildenbrand <david@...hat.com>, linux-mm@...ck.org
Cc: devicetree@...r.kernel.org, Suren Baghdasaryan <surenb@...gle.com>,
 linux-kernel@...r.kernel.org, Mike Rapoport <rppt@...nel.org>,
 Michal Hocko <mhocko@...e.com>, Conor Dooley <conor@...nel.org>,
 Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
 Krzysztof Kozlowski <krzk+dt@...nel.org>, Alexandre Ghiti <alex@...ti.fr>,
 Emil Renner Berthing <kernel@...il.dk>, Rob Herring <robh+dt@...nel.org>,
 Vlastimil Babka <vbabka@...e.cz>, "Liam R. Howlett"
 <Liam.Howlett@...cle.com>
Subject: Re: [PATCH v3 08/22] mm: Allow page table accessors to be
 non-idempotent

On 13/11/2025 01:45, Samuel Holland wrote:
> Currently, some functions such as pte_offset_map() are passed both
> pointers to hardware page tables, and pointers to previously-read PMD
> entries on the stack. To ensure correctness in the first case, these
> functions must use the page table accessor function (pmdp_get()) to
> dereference the supplied pointer. However, this means pmdp_get() is
> called twice in the second case. This double call must be avoided if
> pmdp_get() applies some non-idempotent transformation to the value.
> 
> Avoid the double transformation by calling set_pmd() on the stack
> variables where necessary to keep set_pmd()/pmdp_get() calls balanced.

I don't think this is a good solution.

arm64, at least, expects and requires that only pointers to entries in pgtables
are passed to the arch helpers (e.g. set_pte(), ptep_get()). For PTEs, arm64
accesses adjacent entries within the page table to manage contiguous mappings.
If it is passed a pointer to a stack variable instead, it may erroneously read
other data on the stack and treat it as page table entries.
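
To make that concrete, here's a rough sketch of the access pattern I mean
(illustrative only; this is not arm64's actual contpte code, and
pte_block_is_cont() plus the CONT_PTES value are made-up names for this
example):

#define CONT_PTES	16	/* entries spanned by one contiguous mapping */

/*
 * Sketch: a helper like this assumes ptep points into a real page table,
 * so it can index the sibling entries of the same contiguous block.
 */
static bool pte_block_is_cont(pte_t *ptep)
{
	/* first entry of the naturally aligned contiguous block */
	pte_t *first = PTR_ALIGN_DOWN(ptep, CONT_PTES * sizeof(pte_t));
	int i;

	for (i = 0; i < CONT_PTES; i++) {
		/*
		 * If ptep pointed at a stack variable, these reads would
		 * wander off into unrelated stack memory.
		 */
		if (!pte_cont(READ_ONCE(first[i])))
			return false;
	}
	return true;
}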

I think we should formalize this as a clear requirement for all these
functions: all pte/pmd/pud/p4d/pgd pointers passed to the arch pgtable helpers
must always point to entries in pgtables.
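
Concretely, the contract would allow the first pattern below and forbid the
second (sketch only, using the names from this patch):

	pmd_t *pmdp = pmd_offset(pudp, addr);	/* points into a pgtable */
	pmd_t pmd = pmdp_get(pmdp);		/* OK: pgtable entry pointer */

	/*
	 * Not OK under the proposed rule: &pmd is a stack variable, but
	 * pte_offset_map() hands it on to the arch helpers (pmdp_get()
	 * internally) as if it pointed at a pgtable entry.
	 */
	pte_t *ptep = pte_offset_map(&pmd, addr);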

arm64 will very likely take advantage of this in the pmd/pud/... helpers in
future, as it does today at the pte level. But even today, arm64's set_pmd()
emits barriers which are entirely unnecessary when operating on a stack
variable that the HW PTW will never see.
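
For illustration, the shape of the thing (a simplified sketch along the lines
of arm64's set_pmd(), not the exact upstream code):

static inline void set_pmd(pmd_t *pmdp, pmd_t pmd)
{
	WRITE_ONCE(*pmdp, pmd);

	/*
	 * Make the new entry visible to the HW page table walker. This is
	 * pure overhead if pmdp is a stack variable no walker will ever read.
	 */
	if (pmd_valid(pmd)) {
		dsb(ishst);
		isb();
	}
}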

Thanks,
Ryan

> 
> Signed-off-by: Samuel Holland <samuel.holland@...ive.com>
> ---
> 
> (no changes since v2)
> 
> Changes in v2:
>  - New patch for v2
> 
>  kernel/events/core.c  | 2 ++
>  mm/gup.c              | 3 +++
>  mm/khugepaged.c       | 6 ++++--
>  mm/page_table_check.c | 3 +++
>  mm/pgtable-generic.c  | 2 ++
>  5 files changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index fa4f9165bd94..7969b060bf2d 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -8154,6 +8154,8 @@ static u64 perf_get_pgtable_size(struct mm_struct *mm, unsigned long addr)
>  	if (pmd_leaf(pmd))
>  		return pmd_leaf_size(pmd);
>  
> +	/* transform pmd as if &pmd pointed to a hardware page table */
> +	set_pmd(&pmd, pmd);
>  	ptep = pte_offset_map(&pmd, addr);
>  	if (!ptep)
>  		goto again;
> diff --git a/mm/gup.c b/mm/gup.c
> index 549f9e868311..aba61704049e 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -2844,7 +2844,10 @@ static int gup_fast_pte_range(pmd_t pmd, pmd_t *pmdp, unsigned long addr,
>  	int ret = 0;
>  	pte_t *ptep, *ptem;
>  
> +	/* transform pmd as if &pmd pointed to a hardware page table */
> +	set_pmd(&pmd, pmd);
>  	ptem = ptep = pte_offset_map(&pmd, addr);
> +	pmd = pmdp_get(&pmd);
>  	if (!ptep)
>  		return 0;
>  	do {
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 1bff8ade751a..ab1f68a7bc83 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -1724,7 +1724,7 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
>  		struct mmu_notifier_range range;
>  		struct mm_struct *mm;
>  		unsigned long addr;
> -		pmd_t *pmd, pgt_pmd;
> +		pmd_t *pmd, pgt_pmd, pmdval;
>  		spinlock_t *pml;
>  		spinlock_t *ptl;
>  		bool success = false;
> @@ -1777,7 +1777,9 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
>  		 */
>  		if (check_pmd_state(pmd) != SCAN_SUCCEED)
>  			goto drop_pml;
> -		ptl = pte_lockptr(mm, pmd);
> +		/* pte_lockptr() needs a value, not a pointer to a page table */
> +		pmdval = pmdp_get(pmd);
> +		ptl = pte_lockptr(mm, &pmdval);
>  		if (ptl != pml)
>  			spin_lock_nested(ptl, SINGLE_DEPTH_NESTING);
>  
> diff --git a/mm/page_table_check.c b/mm/page_table_check.c
> index 31f4c39d20ef..77d6688db0de 100644
> --- a/mm/page_table_check.c
> +++ b/mm/page_table_check.c
> @@ -260,7 +260,10 @@ void __page_table_check_pte_clear_range(struct mm_struct *mm,
>  		return;
>  
>  	if (!pmd_bad(pmd) && !pmd_leaf(pmd)) {
> +		/* transform pmd as if &pmd pointed to a hardware page table */
> +		set_pmd(&pmd, pmd);
>  		pte_t *ptep = pte_offset_map(&pmd, addr);
> +		pmd = pmdp_get(&pmd);
>  		unsigned long i;
>  
>  		if (WARN_ON(!ptep))
> diff --git a/mm/pgtable-generic.c b/mm/pgtable-generic.c
> index 63a573306bfa..6602deb002f1 100644
> --- a/mm/pgtable-generic.c
> +++ b/mm/pgtable-generic.c
> @@ -299,6 +299,8 @@ pte_t *___pte_offset_map(pmd_t *pmd, unsigned long addr, pmd_t *pmdvalp)
>  		pmd_clear_bad(pmd);
>  		goto nomap;
>  	}
> +	/* transform pmdval as if &pmdval pointed to a hardware page table */
> +	set_pmd(&pmdval, pmdval);
>  	return __pte_map(&pmdval, addr);
>  nomap:
>  	rcu_read_unlock();

