Message-ID: <617169bc-e18c-40fa-be3a-99c118a6d7fe@redhat.com>
Date: Thu, 4 Jul 2024 12:44:38 +0200
From: David Hildenbrand <david@...hat.com>
To: Oscar Salvador <osalvador@...e.de>,
Andrew Morton <akpm@...ux-foundation.org>
Cc: linux-kernel@...r.kernel.org, linux-mm@...ck.org,
Peter Xu <peterx@...hat.com>, Muchun Song <muchun.song@...ux.dev>,
SeongJae Park <sj@...nel.org>, Miaohe Lin <linmiaohe@...wei.com>,
Michal Hocko <mhocko@...e.com>, Matthew Wilcox <willy@...radead.org>,
Christophe Leroy <christophe.leroy@...roup.eu>
Subject: Re: [PATCH 00/45] hugetlb pagewalk unification
On 04.07.24 06:30, Oscar Salvador wrote:
> Hi all,
>
> During Peter's talk at the LSFMM, it was agreed that one of the things
> that need to be done in order to further integrate hugetlb into mm core,
> is to unify generic and hugetlb pagewalkers.
> I started with this one, which is unifying hugetlb into generic
> pagewalk, instead of having its hugetlb_entry entries.
> Which means that pmd_entry/pte_entry(for cont-pte) entries will also deal with
> hugetlb vmas as well, and so will new pud_entry entries since hugetlb can be
> pud mapped (devm pages as well but we seem not to care about those with
> the exception of hmm code).
>
> The outcome is this RFC.
First of all, a good step in the right direction, but maybe not what
we want long-term. So I'm questioning whether we want this intermediate
approach. walk_page_range() and friends are simply not a good design
(e.g., indirect function calls).
There are roughly two categories of page table walkers we have:
1) We actually only want to walk present folios (to be precise, page
ranges of folios). We should look into moving away from the page
walker API where possible, and have something better that
directly gives us the folio (page ranges). Any PTE batching would be
done internally.
2) We want to deal with non-present folios as well (swp entries and all
kinds of other stuff). We should maybe implement our custom page
table walker and move away from walk_page_range(). We are not walking
"pages" after all but everything else included :)
Then, there is a subset of 1) where we only want to walk to a single
address (a single folio). I'm working on that right now to get rid of
follow_page() and some (IIRC 3: KSM and DAMON) walk_page_range() users.
Hugetlb will still remain a bit special, but I'm afraid we cannot hide
that completely.
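To make the difference between the callback style and what I have in mind
for 1) a bit more concrete, here is a toy user-space sketch. It is not
kernel code: the "page table" is a flat array, and every name in it
(toy_pte, toy_walk_range(), toy_next_folio_range(), ...) is made up for
illustration. Style A does one indirect call per entry, roughly like the
pte_entry callbacks; style B hands the caller ranges of present entries
that belong to the same folio, with the batching done internally.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: a flat array stands in for the page table. */
enum toy_kind { TOY_NONE, TOY_PRESENT, TOY_SWAP };

struct toy_pte {
	enum toy_kind kind;
	int folio_id;		/* only meaningful for TOY_PRESENT */
};

/* Style A: callback-based walker, one indirect call per entry. */
struct toy_walk_ops {
	int (*pte_entry)(const struct toy_pte *pte, size_t idx, void *priv);
};

static int toy_walk_range(const struct toy_pte *tbl, size_t start, size_t end,
			  const struct toy_walk_ops *ops, void *priv)
{
	for (size_t i = start; i < end; i++) {
		int ret = ops->pte_entry(&tbl[i], i, priv);

		if (ret)
			return ret;	/* a non-zero return stops the walk */
	}
	return 0;
}

/* Style B: iterator that yields ranges of present entries of one folio. */
struct toy_folio_range {
	int folio_id;
	size_t start;	/* index of the first entry in the range */
	size_t nr;	/* number of consecutive entries */
};

/*
 * Find the next present range at or after *pos. Returns 1 and fills *out
 * on success, 0 when the end is reached. Non-present entries are skipped.
 */
static int toy_next_folio_range(const struct toy_pte *tbl, size_t *pos,
				size_t end, struct toy_folio_range *out)
{
	size_t i = *pos;

	assert(i <= end);
	while (i < end && tbl[i].kind != TOY_PRESENT)
		i++;
	if (i == end) {
		*pos = end;
		return 0;
	}
	out->folio_id = tbl[i].folio_id;
	out->start = i;
	out->nr = 0;
	/* "Batching": extend the range while the same folio stays mapped. */
	while (i < end && tbl[i].kind == TOY_PRESENT &&
	       tbl[i].folio_id == out->folio_id) {
		out->nr++;
		i++;
	}
	*pos = i;
	return 1;
}

/* Callback for style A: has to filter non-present entries itself. */
static int toy_count_present(const struct toy_pte *pte, size_t idx, void *priv)
{
	(void)idx;
	if (pte->kind == TOY_PRESENT)
		(*(size_t *)priv)++;
	return 0;
}

/* Count present entries via the callback walker. */
static size_t toy_count_callback(const struct toy_pte *tbl, size_t n)
{
	const struct toy_walk_ops ops = { .pte_entry = toy_count_present };
	size_t count = 0;

	toy_walk_range(tbl, 0, n, &ops, &count);
	return count;
}

/* Count present entries via the iterator; also report the range count. */
static size_t toy_count_iter(const struct toy_pte *tbl, size_t n,
			     size_t *nr_ranges)
{
	struct toy_folio_range r;
	size_t pos = 0, count = 0;

	*nr_ranges = 0;
	while (toy_next_folio_range(tbl, &pos, n, &r)) {
		count += r.nr;
		(*nr_ranges)++;
	}
	return count;
}
```

With style B the caller is a plain loop with no ops structure and no
per-entry indirect call, and it never sees swap or none entries. A walker
for category 2) would need to expose those as well, which this sketch
deliberately does not attempt.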
--
Cheers,
David / dhildenb