Message-ID: <e46ab5c6-8b95-4d2f-b820-14659c615e81@redhat.com>
Date: Tue, 19 Nov 2024 17:52:06 +0100
From: David Hildenbrand <david@...hat.com>
To: Zi Yan <ziy@...dia.com>
Cc: Yu Zhao <yuzhao@...gle.com>, Andrew Morton <akpm@...ux-foundation.org>,
Muchun Song <muchun.song@...ux.dev>,
"Matthew Wilcox (Oracle)" <willy@...radead.org>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH mm-unstable v2 1/3] mm/contig_alloc: support __GFP_COMP
On 19.11.24 17:49, David Hildenbrand wrote:
> On 19.11.24 17:41, Zi Yan wrote:
>> On 19 Nov 2024, at 11:31, David Hildenbrand wrote:
>>
>>> On 19.11.24 17:12, Zi Yan wrote:
>>>> On 19 Nov 2024, at 10:29, David Hildenbrand wrote:
>>>>
>>>>>> +/* Split a multi-block free page into its individual pageblocks. */
>>>>>> +static void split_large_buddy(struct zone *zone, struct page *page,
>>>>>> +			      unsigned long pfn, int order, fpi_t fpi)
>>>>>> +{
>>>>>> +	unsigned long end = pfn + (1 << order);
>>>>>> +
>>>>>> +	VM_WARN_ON_ONCE(!IS_ALIGNED(pfn, 1 << order));
>>>>>> +	/* Caller removed page from freelist, buddy info cleared! */
>>>>>> +	VM_WARN_ON_ONCE(PageBuddy(page));
>>>>>> +
>>>>>> +	if (order > pageblock_order)
>>>>>> +		order = pageblock_order;
>>>>>> +
>>>>>> +	while (pfn != end) {
>>>>>> +		int mt = get_pfnblock_migratetype(page, pfn);
>>>>>> +
>>>>>> +		__free_one_page(page, pfn, zone, order, mt, fpi);
>>>>>> +		pfn += 1 << order;
>>>>>> +		page = pfn_to_page(pfn);
>>>>>> +	}
>>>>>> +}
>>>>>
>>>>> Hi,
>>>>>
>>>>> stumbling over this while digging through the code ....
>>>>>
>>>>>> +
>>>>>>  static void free_one_page(struct zone *zone, struct page *page,
>>>>>>  			  unsigned long pfn, unsigned int order,
>>>>>>  			  fpi_t fpi_flags)
>>>>>>  {
>>>>>>  	unsigned long flags;
>>>>>> -	int migratetype;
>>>>>>  	spin_lock_irqsave(&zone->lock, flags);
>>>>>> -	migratetype = get_pfnblock_migratetype(page, pfn);
>>>>>> -	__free_one_page(page, pfn, zone, order, migratetype, fpi_flags);
>>>>>
>>>>> This change is rather undesired:
>>>>>
>>>>> via __free_pages_core()->__free_pages_ok() we can easily end up here with order=MAX_PAGE_ORDER.
>>>>
>>>> Do you have a concrete example? PMD THP on x86_64 is pageblock_order.
>>>> We do not have PMD level mTHP yet. Any other possible source?
>>>>
>>>>>
>>>>> What your new code will do is split this perfectly reasonable MAX_PAGE_ORDER chunk via split_large_buddy() into pageblock-sized chunks, and let the buddy merging logic undo our unnecessary splitting.
>>>>>
>>>>> Is there a way to avoid this and just process the whole MAX_PAGE_ORDER chunk like we used to?
>>>>
>>>> Probably split_large_buddy() can check the migratetypes of the to-be-freed
>>>> page, if order > pageblock_order. If all migratetypes are the same, the page
>>>> can be freed at MAX_PAGE_ORDER, otherwise pageblock_order.
>>>
>>> Thinking about this: why do we care about the migratetype?
>>>
>>> We only have to fall back to pageblocks if any pageblock is "MIGRATE_ISOLATE" (and maybe MIGRATE_CMA), but not all of them are. Otherwise, we can just ignore the migratetype (or rather overwrite it).
>>
>> There are VM_WARN_ONCEs around *_free_list() operations to make sure
>> page migratetype matches the migratetype of the list it is on. Ignoring
>> migratetype would trigger these WARNs. Overwriting it would work but
>> free page migratetype accounting needs to be taken care of.
>>
>> An implicit reason is that __free_one_page() does not support >MAX_PAGE_ORDER
>> and gigantic hugetlb folios are freed via free_one_page(), where
>> split_large_buddy() is used to work around the limitation.
>
> Yes, I saw that change. But that rare case can easily be identified with
> unlikely(order > MAX_PAGE_ORDER) and handled in a special way.
>
>> For the two memory init cases you mentioned in the other email, maybe a new
>> fpi flag to make free_one_page() use __free_one_page() for them,
>> since migratetypes should be the same across the MAX_PAGE_ORDER range there, right?
>
> In the context of alloc_frozen_range()/free_frozen_range() I want to
> free MAX_PAGE_ORDER chunks when possible, and not have some odd logic in
> the freeing path undo some of that effort.
Adding a pointer to that discussion:
https://lkml.kernel.org/r/ZzZdnuZBf-xgiwD2@casper.infradead.org
--
Cheers,
David / dhildenb