Message-ID: <c8f4e753-836d-4ca4-8a94-c54738b7db45@redhat.com>
Date: Wed, 5 Nov 2025 20:56:34 +0100
From: David Hildenbrand <dhildenb@...hat.com>
To: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>,
Gregory Price <gourry@...rry.net>
Cc: Matthew Wilcox <willy@...radead.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Christian Borntraeger <borntraeger@...ux.ibm.com>,
Janosch Frank <frankja@...ux.ibm.com>,
Claudio Imbrenda <imbrenda@...ux.ibm.com>,
Alexander Gordeev <agordeev@...ux.ibm.com>,
Gerald Schaefer <gerald.schaefer@...ux.ibm.com>,
Heiko Carstens <hca@...ux.ibm.com>, Vasily Gorbik <gor@...ux.ibm.com>,
Sven Schnelle <svens@...ux.ibm.com>, Peter Xu <peterx@...hat.com>,
Alexander Viro <viro@...iv.linux.org.uk>,
Christian Brauner <brauner@...nel.org>, Jan Kara <jack@...e.cz>,
Arnd Bergmann <arnd@...db.de>, Zi Yan <ziy@...dia.com>,
Baolin Wang <baolin.wang@...ux.alibaba.com>,
"Liam R . Howlett" <Liam.Howlett@...cle.com>, Nico Pache
<npache@...hat.com>, Ryan Roberts <ryan.roberts@....com>,
Dev Jain <dev.jain@....com>, Barry Song <baohua@...nel.org>,
Lance Yang <lance.yang@...ux.dev>, Muchun Song <muchun.song@...ux.dev>,
Oscar Salvador <osalvador@...e.de>, Vlastimil Babka <vbabka@...e.cz>,
Mike Rapoport <rppt@...nel.org>, Suren Baghdasaryan <surenb@...gle.com>,
Michal Hocko <mhocko@...e.com>, Matthew Brost <matthew.brost@...el.com>,
Joshua Hahn <joshua.hahnjy@...il.com>, Rakie Kim <rakie.kim@...com>,
Byungchul Park <byungchul@...com>, Ying Huang
<ying.huang@...ux.alibaba.com>, Alistair Popple <apopple@...dia.com>,
Axel Rasmussen <axelrasmussen@...gle.com>, Yuanchu Xie <yuanchu@...gle.com>,
Wei Xu <weixugc@...gle.com>, Kemeng Shi <shikemeng@...weicloud.com>,
Kairui Song <kasong@...cent.com>, Nhat Pham <nphamcs@...il.com>,
Baoquan He <bhe@...hat.com>, Chris Li <chrisl@...nel.org>,
SeongJae Park <sj@...nel.org>, Jason Gunthorpe <jgg@...pe.ca>,
Leon Romanovsky <leon@...nel.org>, Xu Xin <xu.xin16@....com.cn>,
Chengming Zhou <chengming.zhou@...ux.dev>, Jann Horn <jannh@...gle.com>,
Miaohe Lin <linmiaohe@...wei.com>, Naoya Horiguchi
<nao.horiguchi@...il.com>, Pedro Falcato <pfalcato@...e.de>,
Pasha Tatashin <pasha.tatashin@...een.com>, Rik van Riel <riel@...riel.com>,
Harry Yoo <harry.yoo@...cle.com>, Hugh Dickins <hughd@...gle.com>,
linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
linux-s390@...r.kernel.org, linux-fsdevel@...r.kernel.org,
linux-mm@...ck.org, linux-arch@...r.kernel.org, damon@...ts.linux.dev
Subject: Re: [PATCH 02/16] mm: introduce leaf entry type and use to simplify leaf entry logic

On 05.11.25 20:52, Lorenzo Stoakes wrote:
> On Wed, Nov 05, 2025 at 02:25:34PM -0500, Gregory Price wrote:
>> On Wed, Nov 05, 2025 at 07:06:11PM +0000, Matthew Wilcox wrote:
>>> On Mon, Nov 03, 2025 at 12:31:43PM +0000, Lorenzo Stoakes wrote:
>>>> The kernel maintains leaf page table entries which contain either:
>>>>
>>>> - Nothing ('none' entries)
>>>> - Present entries (that is stuff the hardware can navigate without fault)
>>>> - Everything else that will cause a fault which the kernel handles
>>>
>>> The problem is that we're already using 'pmd leaf entries' to mean "this
>>> is a pointer to a PMD entry rather than a table of PTEs".
>>
>> Having not looked at the implications of this for leafent_t prototypes
>> ...
>> Can't this be solved by just adding a leafent type "Pointer" which
>> implies there's exactly one leaf-ent type which won't cause faults?
>>
>> is_present() => (table_ptr || leafent_ptr)
>> else(): => !leafent_ptr
>>
>> if is_none()
>> do the none-thing
>> if is_present()
>> if is_leafent(ent) (== is_leafent_ptr)
>> do the pointer thing
>> else
>> do the table thing
>> else()
>> type = leafent_type(ent)
>> switch(type)
>> do the software things
>> can't be a present entry (see above)
>>
>>
>> A leaf is a leaf :shrug:
>>
>> ~Gregory
>
> I thought about doing this but it doesn't really work as the type is
> _abstracted_ from the architecture-specific value, _and_ we use what is
> currently the swp_type field to identify what this is.
>
> So we would lose the architecture-specific information that any 'hardware leaf'
> entry would require and not be able to reliably identify it without losing bits.
>
> Trying to preserve the value _and_ correctly identify it as a present entry
> would be difficult.
>
> And I _really_ didn't want to go on a deep dive through all the architectures to
> see if we could encode it differently to allow for this.
>
> Rather I think it's better to differentiate between s/w + h/w leaf entries.

(Being rather silent because I'm busy with all kinds of other stuff)

I agree :)

As Willy said, something that spells out "sw leaf" would be nice.
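
For readers following the thread, the none / present / software-leaf split
discussed above can be sketched roughly as below. This is a toy encoding only:
toy_entry_t, the bit positions, and every toy_* name are hypothetical and are
not taken from the patch series or from any architecture's page table layout.
The sketch only illustrates why a software type field can be decoded safely
once an entry is known to be neither none nor present.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy encoding; real architectures lay these bits out differently. */
typedef uint64_t toy_entry_t;

#define TOY_PRESENT_BIT   ((toy_entry_t)1 << 0)         /* hardware can walk it */
#define TOY_SW_TYPE_SHIFT 59                            /* top 5 bits: software type */
#define TOY_SW_TYPE_MASK  ((toy_entry_t)0x1f)
#define TOY_PAYLOAD_MASK  (((toy_entry_t)1 << 58) - 1)  /* bits 1..58 */

enum toy_kind { TOY_NONE, TOY_PRESENT, TOY_SW_LEAF };

/* Three-way split: nothing, hardware-navigable, or software-handled. */
static inline enum toy_kind toy_classify(toy_entry_t ent)
{
	if (ent == 0)
		return TOY_NONE;
	if (ent & TOY_PRESENT_BIT)
		return TOY_PRESENT;
	return TOY_SW_LEAF;
}

/* Build a software leaf entry; type 0 is reserved so the result is never 'none'. */
static inline toy_entry_t toy_make_sw_leaf(unsigned int type, toy_entry_t payload)
{
	assert(type != 0 && type <= TOY_SW_TYPE_MASK);
	return ((toy_entry_t)type << TOY_SW_TYPE_SHIFT) |
	       ((payload & TOY_PAYLOAD_MASK) << 1);
}

/* The type field is only meaningful for software leaf entries. */
static inline unsigned int toy_sw_leaf_type(toy_entry_t ent)
{
	assert(toy_classify(ent) == TOY_SW_LEAF);
	return (unsigned int)((ent >> TOY_SW_TYPE_SHIFT) & TOY_SW_TYPE_MASK);
}
```

In this toy model a present entry keeps its raw value untouched, which is the
property that would be lost if hardware leaves had to be squeezed into the
same abstracted type field.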
--
Cheers
David