Message-ID: <ZW4C9I2LHmZY-COM@x1n>
Date: Mon, 4 Dec 2023 11:48:52 -0500
From: Peter Xu <peterx@...hat.com>
To: Ryan Roberts <ryan.roberts@....com>
Cc: Christophe Leroy <christophe.leroy@...roup.eu>,
Matthew Wilcox <willy@...radead.org>,
Christoph Hellwig <hch@...radead.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
Andrea Arcangeli <aarcange@...hat.com>,
James Houghton <jthoughton@...gle.com>,
Lorenzo Stoakes <lstoakes@...il.com>,
David Hildenbrand <david@...hat.com>,
Vlastimil Babka <vbabka@...e.cz>,
John Hubbard <jhubbard@...dia.com>,
Yang Shi <shy828301@...il.com>,
Rik van Riel <riel@...riel.com>,
Hugh Dickins <hughd@...gle.com>,
Jason Gunthorpe <jgg@...dia.com>,
Axel Rasmussen <axelrasmussen@...gle.com>,
"Kirill A . Shutemov" <kirill@...temov.name>,
Andrew Morton <akpm@...ux-foundation.org>,
"linuxppc-dev@...ts.ozlabs.org" <linuxppc-dev@...ts.ozlabs.org>,
Mike Rapoport <rppt@...nel.org>,
Mike Kravetz <mike.kravetz@...cle.com>
Subject: Re: [PATCH RFC 06/12] mm/gup: Drop folio_fast_pin_allowed() in
hugepd processing
On Mon, Dec 04, 2023 at 11:11:26AM +0000, Ryan Roberts wrote:
> To be honest, while I understand pte_cont() and friends, I don't understand
> their relevance (or at least potential future relevance) to GUP?
GUP in general could be smarter: it could recognize that a pte/pmd is part
of a cont_pte mapping and fetch the whole pte/pmd range when the caller asks
for it. Right now it loops over each pte/pmd.
Fast-gup is better off, as it at least doesn't take the pgtable lock. For
cont_pte it loops inside gup_pte_range(), which is good enough, but it still
does the folio checks for each sub-pte, even though the 2nd and later checks
should mostly give the same result (ignoring races where the folio changes
while the cont_pte chunk is being processed).
Slow-gup (which is what this series is about so far) doesn't do that either:
for each cont_pte entry as a whole it loops N times, frequently taking and
releasing the pgtable lock. A smarter slow-gup could essentially set up
follow_page_context.page_mask when it sees a cont_pte. There might be a
challenge around whether holding the head page's refcount would stabilize
the whole folio, but that may be another question to ask.
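To make the page_mask point concrete, here is a rough userspace model, not
kernel code. The stepping arithmetic is modelled on the page_increm
calculation in __get_user_pages(); everything else (model_slow_gup(), struct
walk_stats, the fixed PAGE_SHIFT of 12, and counting one "walk" per lookup
as a stand-in for one pgtable lock/unlock cycle) is a hypothetical
simplification for illustration only.

```c
#include <assert.h>

#define PAGE_SHIFT 12UL	/* assumption: 4K base pages */

/* Hypothetical bookkeeping for this model; not a kernel structure. */
struct walk_stats {
	unsigned long walks;	/* lookups; each would take/release the pgtable lock */
	unsigned long pages;	/* base pages covered in total */
};

/*
 * Model the slow-gup stepping loop over nr_pages base pages from start.
 * page_mask is (base pages per mapping - 1): 0 for a normal pte,
 * 15 for a 16-entry cont_pte block.
 */
static struct walk_stats model_slow_gup(unsigned long start,
					unsigned long nr_pages,
					unsigned long page_mask)
{
	struct walk_stats st = { 0, 0 };

	/* page_mask must be of the form 2^n - 1 */
	assert((page_mask & (page_mask + 1)) == 0);

	while (nr_pages) {
		/* pages left in the current block, counting the current one */
		unsigned long page_increm =
			1 + (~(start >> PAGE_SHIFT) & page_mask);

		if (page_increm > nr_pages)
			page_increm = nr_pages;

		st.walks++;
		st.pages += page_increm;
		start += page_increm << PAGE_SHIFT;
		nr_pages -= page_increm;
	}
	return st;
}
```

With page_mask == 0 the model does one walk per base page; with page_mask
== 15 an aligned 16-page request completes in a single walk, and an
unaligned one only pays one extra walk for the partial block. It says
nothing about the refcount question above.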
I think I also overlooked that PPC_8XX has cont_pte support too, so we
actually have three users, not counting potential future archs that add
support to get the same TLB benefit.
Thanks,
--
Peter Xu