Open Source and information security mailing list archives
 
Message-ID: <0BC1D792-80CA-4E60-AEA0-187F73BD4723@nvidia.com>
Date: Sat, 07 Feb 2026 18:00:51 -0500
From: Zi Yan <ziy@...dia.com>
To: "David Hildenbrand (Arm)" <david@...nel.org>
Cc: Mikhail Gavrilov <mikhail.v.gavrilov@...il.com>, linux-mm@...ck.org,
 akpm@...ux-foundation.org, vbabka@...e.cz, surenb@...gle.com,
 mhocko@...e.com, jackmanb@...gle.com, hannes@...xchg.org, npiggin@...il.com,
 linux-kernel@...r.kernel.org, kasong@...cent.com, hughd@...gle.com,
 chrisl@...nel.org, ryncsn@...il.com, stable@...r.kernel.org,
 willy@...radead.org
Subject: Re: [PATCH v3] mm/page_alloc: clear page->private in
 free_pages_prepare()

On 7 Feb 2026, at 17:02, David Hildenbrand (Arm) wrote:

> On 2/7/26 18:36, Mikhail Gavrilov wrote:
>
> Thanks!
>
>> Several subsystems (slub, shmem, ttm, etc.) use page->private but don't
>> clear it before freeing pages. When these pages are later allocated as
>> high-order pages and split via split_page(), tail pages retain stale
>> page->private values.
>>
>> This causes a use-after-free in the swap subsystem. The swap code uses
>> page->private to track swap count continuations, assuming freshly
>> allocated pages have page->private == 0. When stale values are present,
>> swap_count_continued() incorrectly assumes the continuation list is valid
>> and iterates over uninitialized page->lru containing LIST_POISON values,
>> causing a crash:
>>
>>    KASAN: maybe wild-memory-access in range [0xdead000000000100-0xdead000000000107]
>>    RIP: 0010:__do_sys_swapoff+0x1151/0x1860
>>
>> Fix this by clearing page->private in free_pages_prepare(), ensuring all
>> freed pages have clean state regardless of previous use.
>
> I could have sworn we discussed something like that already in the past.

This[1] is my earlier discussion on this topic, where I managed to convince people that
we should keep ->private zero on all pages.

[1] https://lore.kernel.org/all/20250925085006.23684-1-zhongjinji@honor.com/
>
> I recall that freeing pages with page->private set was allowed, although
> I once wondered whether we should actually change that.

But if that is allowed, we can end up with a tail page whose ->private is non-zero,
because a free page can merge with a lower-PFN buddy, and the higher-PFN page's
->private is never reset. See __free_one_page().

>
>>
>> Fixes: 3b8000ae185c ("mm/vmalloc: huge vmalloc backing pages should be split rather than compound")
>> Cc: stable@...r.kernel.org
>> Suggested-by: Zi Yan <ziy@...dia.com>
>> Acked-by: Zi Yan <ziy@...dia.com>
>> Signed-off-by: Mikhail Gavrilov <mikhail.v.gavrilov@...il.com>
>> ---
>
> Next time, please don't send patches as a reply to another thread; that
> way they can easily get lost in a bigger thread.
>
> You want to get people's attention :)
>
>>   mm/page_alloc.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index cbf758e27aa2..24ac34199f95 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -1430,6 +1430,7 @@ __always_inline bool free_pages_prepare(struct page *page,
>>    	page_cpupid_reset_last(page);
>>   	page->flags.f &= ~PAGE_FLAGS_CHECK_AT_PREP;
>> +	page->private = 0;
>
> Should we be using set_page_private()? It's a bit inconsistent :)
>
> I wonder, if it's really just the split_page() problem, why not
> handle it there, where we already iterate over all ("tail") pages?
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index cbf758e27aa2..cbbcfdf3ed26 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3122,8 +3122,10 @@ void split_page(struct page *page, unsigned int order)
>         VM_BUG_ON_PAGE(PageCompound(page), page);
>         VM_BUG_ON_PAGE(!page_count(page), page);
>  -       for (i = 1; i < (1 << order); i++)
> +       for (i = 1; i < (1 << order); i++) {
>                 set_page_refcounted(page + i);
> +               set_page_private(page + i, 0);
> +       }
>         split_page_owner(page, order, 0);
>         pgalloc_tag_split(page_folio(page), order, 0);
>         split_page_memcg(page, order);
>
>
> But then I thought about "what actually happens during a folio split".
>
> We had a check in __split_folio_to_order() that got removed in 4265d67e405a, for some
> undocumented reason (and the patch got merged with 0 tags :( ). I assume because with zone-device
> there was now a legitimate way to have ->private set. But we removed the safety check for

That was at the end of last year, and review traffic was heavy. No one had time to look at
it.

> all other folios.
>
> -               /*
> -                * page->private should not be set in tail pages. Fix up and warn once
> -                * if private is unexpectedly set.
> -                */
> -               if (unlikely(new_folio->private)) {
> -                       VM_WARN_ON_ONCE_PAGE(true, new_head);
> -                       new_folio->private = NULL;
> -               }
>
>
> I would have thought that we could have triggered that check easily before. Why didn't we?
>
> Who would have cleared the private field of tail pages?
>
> @Zi Yan, any idea why the folio splitting code wouldn't have revealed a similar problem?
>

For the issue reported by Mikhail[2], the page comes from vmalloc(), so it will not be split.
For other cases, a page/folio needs to be compound to be splittable, and prep_compound_tail()
sets every tail page's ->private to 0. So that check is not that useful.

And the issue we are handling here is a non-compound high-order page allocation. Nothing
clears ->private for those pages right now.

OK, I think we need to decide whether it is OK for a page to have ->private set at
page free time. If not, we can take this patch and add a VM_WARN_ON_ONCE(page->private)
to catch all violators. If it is OK, we can use Mikhail's original patch, zeroing ->private
in split_page(), and add a comment on ->private:

1. for compound page allocations, prep_compound_tail() is responsible for resetting ->private;
2. for non-compound high-order page allocations, split_page() is responsible for resetting ->private.


[2] https://lore.kernel.org/linux-mm/CABXGCsNqk6pOkocJ0ctcHssCvke2kqhzoR2BGf_Hh1hWPZATuA@mail.gmail.com/



--
Best Regards,
Yan, Zi
