Message-Id: <20250503182858.5a02729fcffd6d4723afcfc2@linux-foundation.org>
Date: Sat, 3 May 2025 18:28:58 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: Petr Vaněk <arkamar@...as.cz>
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org, David Hildenbrand
 <david@...hat.com>, Ryan Roberts <ryan.roberts@....com>,
 xen-devel@...ts.xenproject.org, x86@...nel.org, stable@...r.kernel.org
Subject: Re: [PATCH v2 1/1] mm: fix folio_pte_batch() on XEN PV

On Fri,  2 May 2025 23:50:19 +0200 Petr Vaněk <arkamar@...as.cz> wrote:

> On XEN PV, folio_pte_batch() can incorrectly batch beyond the end of a
> folio due to a corner case in pte_advance_pfn(). Specifically, when the
> PFN following the folio maps to an invalidated MFN,
> 
> 	expected_pte = pte_advance_pfn(expected_pte, nr);
> 
> produces a pte_none(). If the actual next PTE in memory is also
> pte_none(), the pte_same() succeeds,
> 
> 	if (!pte_same(pte, expected_pte))
> 		break;
> 
> the loop is not broken, and batching continues into unrelated memory.
> 
> ...
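
The failure mode quoted above can be reproduced in a small userspace model. This is only a sketch under stated assumptions, not kernel code: `model_pte`, `model_pfn_has_mfn`, `model_batch()` and `model_demo()` are hypothetical names, the PFN-to-MFN lookup is reduced to a boolean table, and the walk advances one PFN at a time. It shows how an "expected" PTE that collapses to none compares equal to an unmapped PTE just past the folio, and how clamping max_nr to the PFNs remaining in the folio stops the walk.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a PTE: a PFN plus a present bit. */
typedef struct {
	unsigned long pfn;
	bool present;
} model_pte;

#define MODEL_NR_PFNS 8

/* false marks a PFN whose MFN is invalidated (assumed: PFN 4). */
static const bool model_pfn_has_mfn[MODEL_NR_PFNS] = {
	true, true, true, true, false, true, true, true
};

/* Building a PTE for a PFN without a valid MFN yields a none PTE. */
static model_pte model_mk_pte(unsigned long pfn)
{
	model_pte pte = { 0, false };

	if (pfn < MODEL_NR_PFNS && model_pfn_has_mfn[pfn]) {
		pte.pfn = pfn;
		pte.present = true;
	}
	return pte;
}

static model_pte model_pte_advance_pfn(model_pte pte, unsigned long nr)
{
	return model_mk_pte(pte.pfn + nr);
}

static bool model_pte_same(model_pte a, model_pte b)
{
	return a.present == b.present && a.pfn == b.pfn;
}

/* Count consecutive PTEs that match the expected PTE, optionally clamped. */
static int model_batch(const model_pte *ptes, int max_nr,
		       unsigned long folio_pfn, unsigned long folio_nr,
		       bool clamp)
{
	model_pte expected = ptes[0];
	int nr = 0;

	assert(max_nr >= 1);
	if (clamp) {
		/* Limit max_nr to the PFNs remaining in the folio. */
		unsigned long remaining = folio_pfn + folio_nr - ptes[0].pfn;

		if ((unsigned long)max_nr > remaining)
			max_nr = (int)remaining;
	}

	while (nr < max_nr) {
		if (!model_pte_same(ptes[nr], expected))
			break;
		nr++;
		expected = model_pte_advance_pfn(expected, 1);
	}
	return nr;
}

/* Folio covers PFNs 0..3; the two PTEs after it are unmapped (none). */
int model_demo(bool clamp)
{
	const model_pte none = { 0, false };
	const model_pte ptes[6] = {
		model_mk_pte(0), model_mk_pte(1), model_mk_pte(2),
		model_mk_pte(3), none, none
	};

	return model_batch(ptes, 6, 0, 4, clamp);
}
```

In this model, model_demo(false) returns 5 for a 4-page folio because the none PTE at index 4 matches the none "expected" PTE, while model_demo(true) returns 4.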

Looks OK for now I guess but it looks like we should pay some attention
to what types we're using.

> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -248,11 +248,9 @@ static inline int folio_pte_batch(struct folio *folio, unsigned long addr,
>  		pte_t *start_ptep, pte_t pte, int max_nr, fpb_t flags,
>  		bool *any_writable, bool *any_young, bool *any_dirty)
>  {
> -	unsigned long folio_end_pfn = folio_pfn(folio) + folio_nr_pages(folio);
> -	const pte_t *end_ptep = start_ptep + max_nr;
>  	pte_t expected_pte, *ptep;
>  	bool writable, young, dirty;
> -	int nr;
> +	int nr, cur_nr;
>  
>  	if (any_writable)
>  		*any_writable = false;
> @@ -265,11 +263,15 @@ static inline int folio_pte_batch(struct folio *folio, unsigned long addr,
>  	VM_WARN_ON_FOLIO(!folio_test_large(folio) || max_nr < 1, folio);
>  	VM_WARN_ON_FOLIO(page_folio(pfn_to_page(pte_pfn(pte))) != folio, folio);
>  
> +	/* Limit max_nr to the actual remaining PFNs in the folio we could batch. */
> +	max_nr = min_t(unsigned long, max_nr,
> +		       folio_pfn(folio) + folio_nr_pages(folio) - pte_pfn(pte));
> +

Methinks max_nr really wants to be unsigned long.  That will permit the
cleanup of quite a bit of truncation, extension, signedness conversion
and general type chaos in folio_pte_batch()'s various callers.
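
To make the conversions concrete, here is a minimal userspace sketch. MODEL_MIN_T is a hypothetical stand-in that only casts both operands and compares them; the kernel's real min_t() macro does more than this. clamp_int() and clamp_ulong() are illustrative names, not functions from the patch.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in: cast both sides to 'type', then take the smaller. */
#define MODEL_MIN_T(type, a, b) \
	((type)(a) < (type)(b) ? (type)(a) : (type)(b))

/*
 * Shape of the clamp with an int max_nr: the int is widened to
 * unsigned long for the comparison and the result is narrowed back
 * to int on return.
 */
int clamp_int(int max_nr, unsigned long remaining)
{
	return (int)MODEL_MIN_T(unsigned long, max_nr, remaining);
}

/* Same clamp with an unsigned long max_nr: no casts are needed. */
unsigned long clamp_ulong(unsigned long max_nr, unsigned long remaining)
{
	return max_nr < remaining ? max_nr : remaining;
}
```

With an int parameter, a negative max_nr is converted to a huge unsigned value before the comparison, so clamp_int(-1, 16) quietly returns 16. The unsigned long variant has no such conversion step to reason about.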

And...

Why does folio_nr_pages() return a signed quantity?  It's a count.

And why the heck is folio_pte_batch() inlined?  It's larger than my
first hard disk and it has five callsites!