Message-ID: <b2947260-becd-4f17-8822-b4c7325a176d@gmail.com>
Date: Thu, 5 Feb 2026 10:05:03 -0800
From: Usama Arif <usamaarif642@...il.com>
To: "David Hildenbrand (Arm)" <david@...nel.org>,
Matthew Wilcox <willy@...radead.org>
Cc: Zi Yan <ziy@...dia.com>, Kiryl Shutsemau <kas@...nel.org>,
lorenzo.stoakes@...cle.com, Andrew Morton <akpm@...ux-foundation.org>,
linux-mm@...ck.org, hannes@...xchg.org, riel@...riel.com,
shakeel.butt@...ux.dev, baohua@...nel.org, dev.jain@....com,
baolin.wang@...ux.alibaba.com, npache@...hat.com, Liam.Howlett@...cle.com,
ryan.roberts@....com, vbabka@...e.cz, lance.yang@...ux.dev,
linux-kernel@...r.kernel.org, kernel-team@...a.com
Subject: Re: [RFC 01/12] mm: add PUD THP ptdesc and rmap support
On 05/02/2026 09:40, David Hildenbrand (Arm) wrote:
> On 2/5/26 06:13, Usama Arif wrote:
>>
>>
>> On 04/02/2026 20:21, Matthew Wilcox wrote:
>>> On Thu, Feb 05, 2026 at 04:17:19AM +0000, Matthew Wilcox wrote:
>>>> Why are you even talking about "the next series"? The approach is
>>>> wrong. You need to put this POC aside and solve the problems that
>>>> you've bypassed to create this POC.
>>
>>
>> Ah, is the issue the code duplication that Lorenzo raised (I of course
>> completely agree there is quite a bit of it), the lru.next patch I did
>> (which [1] hopefully improves), or investigating whether it might be
>> interfering with DAX/VFIO, as Lorenzo pointed out (I will of course
>> investigate before sending the next revision)? The mapcount work
>> (which I think David is working on?) that is needed to allow splitting
>> PUDs into PMDs is a completely separate issue and can be tackled in
>> parallel to this.
>
> I would enjoy seeing an investigation into what would have to be done to avoid preallocating page tables for anonymous memory THPs, and instead try allocating them on demand when remapping. If allocation fails, it's just another -ENOMEM or -EAGAIN.
>
> That would not only reduce the page table overhead when using THPs, it would also avoid the preallocation of two levels like you need here.
>
> Maybe it's doable, maybe not.
>
> Last time I looked into it I was like "there must be a better way to achieve that" :)
>
> Spinlocks might require preallocating etc.
Thanks for this! I am going to try to implement this now and stress test it for 2M THPs.
I also have access to some production workloads that use a lot of THPs, and I can add
counters to see how often this even happens in prod workloads, i.e. how often page table
allocation actually fails for 2M THPs if it's done on demand instead of preallocated.
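To make the shape of this concrete, here is a purely illustrative userspace model (not kernel code; all names here are hypothetical) of the fallback being discussed: allocate the lower-level table only at remap time, surface -ENOMEM to the caller on failure, and bump a counter like the one I'd add for the prod measurement:

```c
/*
 * Illustrative userspace sketch, NOT kernel code. Models the idea of
 * allocating a page-table page on demand at remap time instead of
 * preallocating it at THP map time. All names are hypothetical.
 */
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* counter akin to the prod instrumentation mentioned above */
static unsigned long on_demand_alloc_failures;

/* stand-in for a page-table page allocation that can fail */
static void *alloc_pte_page(bool simulate_oom)
{
    if (simulate_oom)
        return NULL;
    /* one table's worth of PTE slots (512 on x86-64) */
    return calloc(512, sizeof(unsigned long));
}

/*
 * Remap path: the table is allocated only here, when the huge mapping
 * actually has to be split/remapped. On failure there is no
 * preallocated table to fall back on; the caller just sees -ENOMEM
 * and can retry or abandon the operation.
 */
static int remap_huge_mapping(void **ptep_out, bool simulate_oom)
{
    void *pte = alloc_pte_page(simulate_oom);

    if (!pte) {
        on_demand_alloc_failures++;
        return -ENOMEM; /* "just another -ENOMEM or -EAGAIN" */
    }
    *ptep_out = pte;
    return 0;
}
```

The counter is the interesting part for the measurement above: if it stays near zero across real workloads, preallocation (two levels of it, for PUD THPs) is buying very little.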
>
> (as raised elsewhere, starting with shmem support avoids the page table problem)
>