lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <fa3a6258-5304-40b2-bb58-e6081ed845d1@redhat.com>
Date: Tue, 1 Jul 2025 20:44:09 +0200
From: David Hildenbrand <david@...hat.com>
To: Luiz Capitulino <luizcap@...hat.com>, willy@...radead.org
Cc: akpm@...ux-foundation.org, linux-kernel@...r.kernel.org,
 linux-mm@...ck.org, lcapitulino@...il.com, shivankg@....com
Subject: Re: [PATCH 3/3] fs: stable_page_flags(): use snapshot_page()

On 26.06.25 20:16, Luiz Capitulino wrote:
> A race condition is possible in stable_page_flags() where user-space is
> reading /proc/kpageflags concurrently to a folio split. This may lead to
> oopses or BUG_ON()s being triggered.
> 
> To fix this, this commit uses snapshot_page() in stable_page_flags() so
> that stable_page_flags() works with a stable page and folio snapshots
> instead.
> 
> Note that stable_page_flags() makes use of some functions that require
> the original page or folio pointer to work properly (eg.
> is_free_budy_page() and folio_test_idle()). Since those functions can't
> be used on the page snapshot, we replace their usage with flags that
> were set by snapshot_page() for this purpose.
> 
> Signed-off-by: Luiz Capitulino <luizcap@...hat.com>
> ---
>   fs/proc/page.c | 25 ++++++++++++++-----------
>   1 file changed, 14 insertions(+), 11 deletions(-)
> 
> diff --git a/fs/proc/page.c b/fs/proc/page.c
> index 936f8bbe5a6f..a2ee95f727f0 100644
> --- a/fs/proc/page.c
> +++ b/fs/proc/page.c
> @@ -147,6 +147,7 @@ static inline u64 kpf_copy_bit(u64 kflags, int ubit, int kbit)
>   u64 stable_page_flags(const struct page *page)
>   {
>   	const struct folio *folio;
> +	struct page_snapshot ps;
>   	unsigned long k;
>   	unsigned long mapping;
>   	bool is_anon;
> @@ -158,7 +159,9 @@ u64 stable_page_flags(const struct page *page)
>   	 */
>   	if (!page)
>   		return 1 << KPF_NOPAGE;
> -	folio = page_folio(page);
> +
> +	snapshot_page(&ps, page);
> +	folio = &ps.folio_snapshot;
>   
>   	k = folio->flags;
>   	mapping = (unsigned long)folio->mapping;
> @@ -167,7 +170,7 @@ u64 stable_page_flags(const struct page *page)
>   	/*
>   	 * pseudo flags for the well known (anonymous) memory mapped pages
>   	 */
> -	if (page_mapped(page))
> +	if (folio_mapped(folio))
>   		u |= 1 << KPF_MMAP;
>   	if (is_anon) {
>   		u |= 1 << KPF_ANON;
> @@ -179,7 +182,7 @@ u64 stable_page_flags(const struct page *page)
>   	 * compound pages: export both head/tail info
>   	 * they together define a compound page's start/end pos and order
>   	 */
> -	if (page == &folio->page)
> +	if (ps.idx == 0)
>   		u |= kpf_copy_bit(k, KPF_COMPOUND_HEAD, PG_head);
>   	else
>   		u |= 1 << KPF_COMPOUND_TAIL;
> @@ -189,10 +192,10 @@ u64 stable_page_flags(const struct page *page)
>   	         folio_test_large_rmappable(folio)) {
>   		/* Note: we indicate any THPs here, not just PMD-sized ones */
>   		u |= 1 << KPF_THP;
> -	} else if (is_huge_zero_folio(folio)) {
> +	} else if (ps.flags & PAGE_SNAPSHOT_PG_HUGE_ZERO) {

For that, we could use

is_huge_zero_pfn(ps.pfn)

from

https://lkml.kernel.org/r/20250617154345.2494405-10-david@redhat.com


You should be able to cherry pick that commit (only minor conflict in 
vm_normal_page_pmd()) and include it in this series.

>   		u |= 1 << KPF_ZERO_PAGE;
>   		u |= 1 << KPF_THP;
> -	} else if (is_zero_folio(folio)) {
> +	} else if (is_zero_pfn(ps.pfn)) {
>   		u |= 1 << KPF_ZERO_PAGE;
>   	}
>   
> @@ -200,14 +203,14 @@ u64 stable_page_flags(const struct page *page)
>   	 * Caveats on high order pages: PG_buddy and PG_slab will only be set
>   	 * on the head page.
>   	 */
> -	if (PageBuddy(page))
> +	if (PageBuddy(&ps.page_snapshot))
>   		u |= 1 << KPF_BUDDY;
> -	else if (page_count(page) == 0 && is_free_buddy_page(page))
 > +	else if (ps.flags & PAGE_SNAPSHOT_PG_FREE)

Yeah, that is nasty, and inherently racy. So detecting it an snapshot 
time might be best.

Which makes me wonder if this whole block should simply be

if (ps.flags & PAGE_SNAPSHOT_PG_BUDDY)
	u |= 1 << KPF_BUDDY;

and you move all buddy detection into the snapshotting function. That 
is, PAGE_SNAPSHOT_PG_BUDDY gets set for head and tail pages of buddy pages.

Looks less special that way ;)

>   		u |= 1 << KPF_BUDDY;
>   
> -	if (PageOffline(page))
> +	if (folio_test_offline(folio))
>   		u |= 1 << KPF_OFFLINE;
 > -	if (PageTable(page))> +	if (folio_test_pgtable(folio))
>   		u |= 1 << KPF_PGTABLE;

I assume, long-term none of these will actually be folios. But we can 
change that once we get to it.

(likely, going back to pages ... just like for the slab case below)

>   	if (folio_test_slab(folio))
>   		u |= 1 << KPF_SLAB;
> @@ -215,7 +218,7 @@ u64 stable_page_flags(const struct page *page)
>   #if defined(CONFIG_PAGE_IDLE_FLAG) && defined(CONFIG_64BIT)
>   	u |= kpf_copy_bit(k, KPF_IDLE,          PG_idle);
>   #else
> -	if (folio_test_idle(folio))
> +	if (ps.flags & PAGE_SNAPSHOT_PG_IDLE)
>   		u |= 1 << KPF_IDLE;

Another nasty 32bit case. At least once we decouple pages from folios,
the while test-idle in page-ext will vanish and this will get cleaned up.

-- 
Cheers,

David / dhildenb


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ