Message-ID: <d2313c1d-1e50-49b7-bed7-840431af799a@arm.com>
Date: Thu, 23 Nov 2023 19:11:19 +0000
From: Ryan Roberts <ryan.roberts@....com>
To: Peter Xu <peterx@...hat.com>, Matthew Wilcox <willy@...radead.org>
Cc: Christoph Hellwig <hch@...radead.org>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
Andrea Arcangeli <aarcange@...hat.com>,
James Houghton <jthoughton@...gle.com>,
Lorenzo Stoakes <lstoakes@...il.com>,
David Hildenbrand <david@...hat.com>,
Vlastimil Babka <vbabka@...e.cz>,
John Hubbard <jhubbard@...dia.com>,
Yang Shi <shy828301@...il.com>,
Rik van Riel <riel@...riel.com>,
Hugh Dickins <hughd@...gle.com>,
Jason Gunthorpe <jgg@...dia.com>,
Axel Rasmussen <axelrasmussen@...gle.com>,
"Kirill A . Shutemov" <kirill@...temov.name>,
Andrew Morton <akpm@...ux-foundation.org>,
linuxppc-dev@...ts.ozlabs.org, Mike Rapoport <rppt@...nel.org>,
Mike Kravetz <mike.kravetz@...cle.com>
Subject: Re: [PATCH RFC 06/12] mm/gup: Drop folio_fast_pin_allowed() in hugepd
processing
On 23/11/2023 17:22, Peter Xu wrote:
> On Thu, Nov 23, 2023 at 03:47:49PM +0000, Matthew Wilcox wrote:
>> It looks like ARM (in the person of Ryan) are going to add support for
>> something equivalent to hugepd.
>
> If it's about arm's cont_pte, then it looks ideal, because this series
> didn't yet touch cont_pte, assuming it'll just work. From that aspect, his
> work may help mine, and no immediate collapsing either.
Hi,
I'm not sure I've 100% understood the crossover between this series and my work
to support arm64's contpte mappings generally for anonymous and file-backed memory.
My approach is to transparently use contpte mappings when the core-mm requests
pte mappings that meet the requirements; it's all based around intercepting the
normal (non-hugetlb) helpers (e.g. set_ptes(), ptep_get() and friends). There is
no semantic change to the core-mm. See [1]. It relies on 1) the page cache using
large folios and 2) my "small-sized THP" series, which starts using
arbitrary-sized large folios for anonymous memory [2].
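To make that concrete, here is a very rough sketch of the kind of interception
I mean. This is illustrative only and not the actual code from [1];
contpte_suitable() is a made-up helper, and the real series lays out the
overrides and the management of the contiguous bit quite differently:

static inline bool contpte_suitable(unsigned long addr, pte_t pte,
				    unsigned int nr)
{
	/*
	 * Hypothetical check: the batch covers exactly one contpte block,
	 * and both the VA and the PA are block-aligned.
	 */
	return nr == CONT_PTES &&
	       IS_ALIGNED(addr, CONT_PTES * PAGE_SIZE) &&
	       IS_ALIGNED(pte_pfn(pte), CONT_PTES);
}

static inline void set_ptes(struct mm_struct *mm, unsigned long addr,
			    pte_t *ptep, pte_t pte, unsigned int nr)
{
	unsigned int i;

	/* If the batch lines up with a contpte block, set the contig bit. */
	if (contpte_suitable(addr, pte, nr))
		pte = pte_mkcont(pte);

	/* Write nr consecutive ptes, advancing the pfn by one page each time. */
	for (i = 0; i < nr; i++, ptep++, addr += PAGE_SIZE) {
		set_pte(ptep, pte);
		pte = pfn_pte(pte_pfn(pte) + 1, pte_pgprot(pte));
	}
}

The point is that the core-mm just calls set_ptes() as it would anywhere else;
whether the underlying mapping ends up using PTE_CONT is an arch-internal detail.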
If I've understood this conversation correctly, there is an object called hugepd,
which today is only supported by powerpc, but which could allow the core-mm to
control the mapping granularity? I can see some value in exposing that control
to the core-mm in the (very) long term.
[1] https://lore.kernel.org/all/20231115163018.1303287-1-ryan.roberts@arm.com/
[2] https://lore.kernel.org/linux-mm/20231115132734.931023-1-ryan.roberts@arm.com/
Thanks,
Ryan
>
> There can be a slight performance difference which I need to measure for
> arm's cont_pte already for hugetlb, but I didn't worry much about that;
> quoting my commit message in the last patch:
>
> There may be a slight difference of how the loops run when processing
> GUP over a large hugetlb range on either ARM64 (e.g. CONT_PMD) or RISCV
> (mostly its Svnapot extension on 64K huge pages): each loop of
> __get_user_pages() will resolve one pgtable entry with the patch
> applied, rather than relying on the size of hugetlb hstate, the latter
> may cover multiple entries in one loop.
>
> However, the performance difference should hopefully not be a major
> concern, considering that GUP just recently got 57edfcfd3419 ("mm/gup:
> accelerate thp gup even for "pages != NULL""), and that's not part of a
> performance analysis but a side dish. If performance becomes a concern,
> we can consider handling CONT_PTE in follow_page(), for example.
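[ For illustration, a purely hypothetical user-space sketch of the
iteration-count difference being described, assuming the arm64/64K CONT_PTE
case where a 2M hugetlb page is made up of 32 contiguous 64K entries; this is
not the real __get_user_pages() code: ]

#include <stdio.h>

#define RANGE_SIZE	(2UL * 1024 * 1024)	/* one 2M hugetlb page */
#define HSTATE_SIZE	(2UL * 1024 * 1024)	/* old stride: hstate size */
#define ENTRY_SIZE	(64UL * 1024)		/* new stride: one cont-pte entry */

int main(void)
{
	unsigned long addr, iters;

	/* Old behaviour: one iteration of the outer loop per hstate-sized page. */
	for (addr = 0, iters = 0; addr < RANGE_SIZE; addr += HSTATE_SIZE)
		iters++;
	printf("hstate-sized stride: %lu iteration(s)\n", iters);	/* prints 1 */

	/* New behaviour: one iteration per page table entry. */
	for (addr = 0, iters = 0; addr < RANGE_SIZE; addr += ENTRY_SIZE)
		iters++;
	printf("entry-sized stride:  %lu iteration(s)\n", iters);	/* prints 32 */

	return 0;
}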
>
> So IMHO it can be slightly different compared to e.g. a page fault, because
> each fault is still pretty slow as a whole if there is one fault for each
> small pte (of a large folio / cont_pte), while the loop in GUP is still
> relatively tight and short compared to a fault. I'd boldly guess there is
> more low-hanging fruit out there for large folios outside the GUP areas.
>
> In all cases, it'll be interesting to know if Ryan has worked on cont_pte
> support for gup on large folios, and whether there are any performance numbers
> to share. It's definitely good news to me, because it means Ryan's work can
> also then benefit hugetlb if this series is merged; I just don't know
> how much difference there will be.
>
> Thanks,
>