Message-ID: <c6a6059d-460b-4f7c-8976-f05b0a58b5e1@redhat.com>
Date: Tue, 24 Jun 2025 11:47:38 +0200
From: David Hildenbrand <david@...hat.com>
To: Matthew Wilcox <willy@...radead.org>, Bharata B Rao <bharata@....com>
Cc: linux-kernel@...r.kernel.org, linux-mm@...ck.org,
Jonathan.Cameron@...wei.com, dave.hansen@...el.com, gourry@...rry.net,
hannes@...xchg.org, mgorman@...hsingularity.net, mingo@...hat.com,
peterz@...radead.org, raghavendra.kt@....com, riel@...riel.com,
rientjes@...gle.com, sj@...nel.org, weixugc@...gle.com,
ying.huang@...ux.alibaba.com, ziy@...dia.com, dave@...olabs.net,
nifan.cxl@...il.com, xuezhengchu@...wei.com, yiannis@...corp.com,
akpm@...ux-foundation.org
Subject: Re: page_ext and memdescs

On 16.06.25 16:05, Matthew Wilcox wrote:
> On Mon, Jun 16, 2025 at 07:09:30PM +0530, Bharata B Rao wrote:
>> diff --git a/include/linux/page_ext.h b/include/linux/page_ext.h
>> index 76c817162d2f..4300c9dbafec 100644
>> --- a/include/linux/page_ext.h
>> +++ b/include/linux/page_ext.h
>> @@ -40,8 +40,25 @@ enum page_ext_flags {
>>  	PAGE_EXT_YOUNG,
>>  	PAGE_EXT_IDLE,
>>  #endif
>> +	/*
>> +	 * 32 bits following this are used by the migrator.
>> +	 * The next available bit position is 33.
>> +	 */
>> +	PAGE_EXT_MIGRATE_READY,
>>  };
>>
>> +#define PAGE_EXT_MIG_NID_WIDTH 10
>> +#define PAGE_EXT_MIG_FREQ_WIDTH 3
>> +#define PAGE_EXT_MIG_TIME_WIDTH 18
>> +
>> +#define PAGE_EXT_MIG_NID_SHIFT (PAGE_EXT_MIGRATE_READY + 1)
>> +#define PAGE_EXT_MIG_FREQ_SHIFT (PAGE_EXT_MIG_NID_SHIFT + PAGE_EXT_MIG_NID_WIDTH)
>> +#define PAGE_EXT_MIG_TIME_SHIFT (PAGE_EXT_MIG_FREQ_SHIFT + PAGE_EXT_MIG_FREQ_WIDTH)
>> +
>> +#define PAGE_EXT_MIG_NID_MASK ((1UL << PAGE_EXT_MIG_NID_SHIFT) - 1)
>> +#define PAGE_EXT_MIG_FREQ_MASK ((1UL << PAGE_EXT_MIG_FREQ_SHIFT) - 1)
>> +#define PAGE_EXT_MIG_TIME_MASK ((1UL << PAGE_EXT_MIG_TIME_SHIFT) - 1)
>
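For readers following the quoted hunk, here is a minimal, standalone sketch of
the shift/width/mask packing it describes. It is not part of the posted patch:
the MIG_* names, the bit position of the ready flag, and the mig_set()/mig_get()
helpers are all hypothetical, and only the three widths (10, 3, 18; 31 bits in
total) are taken from the hunk. The sketch derives each mask from the field
width, whereas the quoted defines derive it from the shift, so the two are not
interchangeable.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: only the three widths come from the quoted hunk. */
#define MIG_READY_BIT   0	/* assumed; the real bit follows the existing enum */
#define MIG_NID_WIDTH   10
#define MIG_FREQ_WIDTH  3
#define MIG_TIME_WIDTH  18

#define MIG_NID_SHIFT   (MIG_READY_BIT + 1)
#define MIG_FREQ_SHIFT  (MIG_NID_SHIFT + MIG_NID_WIDTH)
#define MIG_TIME_SHIFT  (MIG_FREQ_SHIFT + MIG_FREQ_WIDTH)

/* Width-based masks: each covers exactly one field once shifted down. */
#define MIG_NID_MASK    ((UINT64_C(1) << MIG_NID_WIDTH) - 1)
#define MIG_FREQ_MASK   ((UINT64_C(1) << MIG_FREQ_WIDTH) - 1)
#define MIG_TIME_MASK   ((UINT64_C(1) << MIG_TIME_WIDTH) - 1)

static_assert(MIG_TIME_SHIFT + MIG_TIME_WIDTH <= 64,
	      "packed fields must fit in a 64-bit word");

/* Store val in the field at (shift, mask), leaving the other bits alone. */
static inline uint64_t mig_set(uint64_t flags, unsigned int shift,
			       uint64_t mask, uint64_t val)
{
	return (flags & ~(mask << shift)) | ((val & mask) << shift);
}

/* Read back the field at (shift, mask). */
static inline uint64_t mig_get(uint64_t flags, unsigned int shift,
			       uint64_t mask)
{
	return (flags >> shift) & mask;
}
```

With this layout, updating one field (say the node id) leaves the frequency and
time fields untouched, which is the property a per-page packed word needs.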
> OK, so we need to have a conversation about page_ext. Sorry this is
> happening to you. I've kind of skipped over page_ext when talking
> about folios and memdescs up to now, so it's not that you've missed
> anything.
>
> As the comment says,
>
> * Page Extension can be considered as an extended mem_map.
>
> and we need to do this because we don't want to grow struct page beyond
> 64 bytes. But memdescs are dynamically allocated, so we don't need
> page_ext any more, and all that code can go away.
>
> lib/alloc_tag.c:struct page_ext_operations page_alloc_tagging_ops = {

In this case, though, we might not have an allocated memdesc for every
allocation. Think of memory ballooning allocating "offline" pages in the
future.

Of course, the easy solution is to not track these non-memdesc allocations.

> mm/page_ext.c:static struct page_ext_operations page_idle_ops __initdata = {

That should be per-folio.

> mm/page_ext.c:static struct page_ext_operations *page_ext_ops[] __initdata = {

That's just the lookup table for the others.

> mm/page_owner.c:struct page_ext_operations page_owner_ops = {

Hm, probably like tagging above.

> mm/page_table_check.c:struct page_ext_operations page_table_check_ops = {

That should be per-folio as well IIUC.

--
Cheers,
David / dhildenb