Message-ID: <7E6AC98B-29FC-4B92-BF72-56615D22CBF0@nvidia.com>
Date: Mon, 20 Oct 2025 15:55:56 -0400
From: Zi Yan <ziy@...dia.com>
To: jinji zhong <jinji.z.zhong@...il.com>
Cc: akpm@...ux-foundation.org, feng.han@...or.com, hannes@...xchg.org,
 jackmanb@...gle.com, linux-kernel@...r.kernel.org, linux-mm@...ck.org,
 liulu.liu@...or.com, mhocko@...e.com, surenb@...gle.com, vbabka@...e.cz,
 zhongjinji@...or.com
Subject: Re: [PATCH v0] mm/page_alloc: Cleanup for __del_page_from_free_list()

On 20 Oct 2025, at 11:06, jinji zhong wrote:

>> On 1 Oct 2025, at 0:38, jinji zhong wrote:
>>
>>>> On 30 Sep 2025, at 9:55, Vlastimil Babka wrote:
>>>
>>>>> On 9/25/25 10:50, zhongjinji wrote:
>>>>>> It is unnecessary to set page->private in __del_page_from_free_list().
>>>>>>
>>>>>> If the page is about to be allocated, page->private will be cleared by
>>>>>> post_alloc_hook() before the page is handed out. If the page is expanded
>>>>>> or merged, page->private will be reset by set_buddy_order, and no one
>>>>>> will retrieve the page's buddy_order without the PageBuddy flag being set.
>>>>>> If the page is isolated, it will also reset page->private when it
>>>>>> succeeds.
>>>>>
>>>>> Seems correct.
>>>
>>>> This means high order free pages will have head[2N].private set to a non-zero
>>>> value, where head[N*2].private is 1, head[N*(2^2)].private is 2, ...
>>>> head[N*(2^M)].private is M and head[0].private is the actual free page order.
>>>> If such a high order free page is used as high order folio, it should be fine.
>>>> But if user allocates a non-compound high order page and uses split_page()
>>>> to get a list of order-0 pages from this high order page, some pages will
>>>> have non zero private. I wonder if these users are prepared for that.
>>>
>>> Having non-empty page->private in tail pages of non-compound high-order
>>> pages is not an issue, as pages from the pcp lists never guarantee their
>>> initial state. If ensuring empty page->private for tail pages is required,
>>
>> Sure. But is it because all page allocation users return used pages with
>
> Some users [2] do not reset the private back to 0. When the page is a tail
> page, the non-zero private value will persist until the page is split.
>
>> ->private set back to 0? And can all page allocation users handle non-zero
>> ->private? Otherwise, it can cause subtle bugs.
>
> Yes, you are right. Some users (like swapfile [1]) cannot handle a non-zero private.
>
>>> we should handle this in prep_new_page(), similar to the approach taken in
>>> prep_compound_page().
>>>
>>>> For example, kernel/events/ring_buffer.c does it. In its comment, it says
>>>> “set its first page's private to this order; !PagePrivate(page) means it's
>>>> just a normal page.”
>>>> (see https://elixir.bootlin.com/linux/v6.17/source/kernel/events/ring_buffer.c#L634)
>>>
>>> PagePrivate is a flag in page->flags that indicates page->private is
>>> already in use. While PageBuddy serves a similar purpose, it additionally
>>> signifies that the page is part of the buddy system.
>>
>> OK. You mean ->private will never be used if PagePrivate is not set
>> in ring buffer code?
>
> The ring buffer code only uses the private field of the head page,
> but I recently found that the swapfile code [1] assumes page->private is
> zero even when the page is a tail page, which seems a bit dangerous.
> Applying this patch would make that situation worse.

Yeah, thank you for the detective work. The comment in [1] sounds really
alarming, as the code relies on a specific behavior:
	/*
	 * Page allocation does not initialize the page's lru field,
	 * but it does always reset its private field.
	 */

We can revisit your patch once page->private is gone, as Vlastimil suggested.
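
To make the concern concrete, here is a minimal userspace sketch of the
scenario discussed in this thread. It is a toy model, not kernel code:
struct toy_page, toy_free(), toy_del_from_free_list() and
toy_stale_after_split() are hypothetical names, and the merge loop only
loosely mirrors what the buddy allocator does. It assumes the head page's
private field is cleared at allocation time, as the patch description says,
and counts how many pages of a split order-3 block still carry a non-zero
private value.

```c
#include <assert.h>
#include <stdbool.h>

#define NPAGES 8	/* one order-3 block in this toy model */

/* Toy stand-in for struct page: only the fields discussed in the thread. */
struct toy_page {
	unsigned long private;	/* models page->private (buddy order) */
	bool buddy;		/* models the PageBuddy flag */
};

static struct toy_page pages[NPAGES];

/* Models __del_page_from_free_list(), with the reset made optional. */
static void toy_del_from_free_list(struct toy_page *p, bool reset_private)
{
	p->buddy = false;
	if (reset_private)
		p->private = 0;
}

/* Free a block and merge it with free buddies of the same order. */
static void toy_free(unsigned int pfn, unsigned int order, bool reset_private)
{
	assert(pfn < NPAGES);

	while ((1u << (order + 1)) <= NPAGES) {
		unsigned int buddy = pfn ^ (1u << order);

		/* The order is only trusted while the buddy flag is set. */
		if (!pages[buddy].buddy || pages[buddy].private != order)
			break;
		toy_del_from_free_list(&pages[buddy], reset_private);
		pfn &= buddy;	/* the merged block starts at the lower pfn */
		order++;
	}
	pages[pfn].private = order;	/* models set_buddy_order() */
	pages[pfn].buddy = true;
}

/* Build an order-3 free block, allocate it, count stale private values. */
static int toy_stale_after_split(bool reset_private)
{
	int stale = 0;

	for (unsigned int i = 0; i < NPAGES; i++)
		pages[i] = (struct toy_page){ 0 };
	for (unsigned int i = 0; i < NPAGES; i++)
		toy_free(i, 0, reset_private);

	toy_del_from_free_list(&pages[0], reset_private);
	pages[0].private = 0;	/* assumed: head is cleared on allocation */

	/* View the block as NPAGES order-0 pages, as after a split. */
	for (unsigned int i = 0; i < NPAGES; i++)
		stale += pages[i].private != 0;
	return stale;
}
```

In this model, toy_stale_after_split(true) finds no stale values, while
toy_stale_after_split(false) finds at least one former sub-block head that
still holds its old order. Which pages end up stale depends on the order in
which the blocks were freed and merged.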

>
> link: https://elixir.bootlin.com/linux/v6.17/source/mm/swapfile.c#L3745 [1]
> link: https://elixir.bootlin.com/linux/v6.17/source/mm/swapfile.c#L3887 [2]
>
>> If you are confident that it is OK for some pages’ ->private to be
>> non-zero at allocation, I am not going to block the patch.
>>
>>>
>>>> I wonder if non zero page->private would cause any issue there.
>>>
>>>> Maybe split_page() should set all page->private to 0.
>>>
>>>> Let me know if I get anything wrong.
>>>
>>>>>
>>>>>> Since __del_page_from_free_list() is a hot path in the kernel, it would be
>>>>>> better to remove the unnecessary set_page_private().
>>>>>>
>>>>>> Signed-off-by: zhongjinji <zhongjinji@...or.com>
>>>>>
>>>>> Reviewed-by: Vlastimil Babka <vbabka@...e.cz>
>>>>>
>>>>>> ---
>>>>>>  mm/page_alloc.c | 1 -
>>>>>>  1 file changed, 1 deletion(-)
>>>>>>
>>>>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>>>>> index d1d037f97c5f..1999eb7e7c14 100644
>>>>>> --- a/mm/page_alloc.c
>>>>>> +++ b/mm/page_alloc.c
>>>>>> @@ -868,7 +868,6 @@ static inline void __del_page_from_free_list(struct page *page, struct zone *zon
>>>>>>
>>>>>>  	list_del(&page->buddy_list);
>>>>>>  	__ClearPageBuddy(page);
>>>>>> -	set_page_private(page, 0);
>>>>>>  	zone->free_area[order].nr_free--;
>>>>>>
>>>>>>  	if (order >= pageblock_order && !is_migrate_isolate(migratetype))
>>>
>>>
>>>> Best Regards,
>>>> Yan, Zi
>>
>>
>> Best Regards,
>> Yan, Zi


--
Best Regards,
Yan, Zi
