lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZV-KQ0e0y9BTsHGv@x1n>
Date:   Thu, 23 Nov 2023 12:22:11 -0500
From:   Peter Xu <peterx@...hat.com>
To:     Matthew Wilcox <willy@...radead.org>
Cc:     Christoph Hellwig <hch@...radead.org>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        Andrea Arcangeli <aarcange@...hat.com>,
        James Houghton <jthoughton@...gle.com>,
        Lorenzo Stoakes <lstoakes@...il.com>,
        David Hildenbrand <david@...hat.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        John Hubbard <jhubbard@...dia.com>,
        Yang Shi <shy828301@...il.com>,
        Rik van Riel <riel@...riel.com>,
        Hugh Dickins <hughd@...gle.com>,
        Jason Gunthorpe <jgg@...dia.com>,
        Axel Rasmussen <axelrasmussen@...gle.com>,
        "Kirill A . Shutemov" <kirill@...temov.name>,
        Andrew Morton <akpm@...ux-foundation.org>,
        linuxppc-dev@...ts.ozlabs.org, Mike Rapoport <rppt@...nel.org>,
        Mike Kravetz <mike.kravetz@...cle.com>,
        Ryan Roberts <ryan.roberts@....com>
Subject: Re: [PATCH RFC 06/12] mm/gup: Drop folio_fast_pin_allowed() in
 hugepd processing

On Thu, Nov 23, 2023 at 03:47:49PM +0000, Matthew Wilcox wrote:
> It looks like ARM (in the person of Ryan) are going to add support for
> something equivalent to hugepd.

If it's about arm's cont_pte, then it looks ideal because this series
didn't yet touch cont_pte, assuming it'll just work.  From that aspect, his
work may help mine, and no immediately collapsing either.

There can be a slight performance difference which I need to measure for
arm's cont_pte already for hugetlb, but I didn't worry much on that;
quotting my commit message in the last patch:

    There may be a slight difference of how the loops run when processing
    GUP over a large hugetlb range on either ARM64 (e.g. CONT_PMD) or RISCV
    (mostly its Svnapot extension on 64K huge pages): each loop of
    __get_user_pages() will resolve one pgtable entry with the patch
    applied, rather than relying on the size of hugetlb hstate, the latter
    may cover multiple entries in one loop.
    
    However, the performance difference should hopefully not be a major
    concern, considering that GUP just yet got 57edfcfd3419 ("mm/gup:
    accelerate thp gup even for "pages != NULL""), and that's not part of a
    performance analysis but a side dish.  If the performance will be a
    concern, we can consider handle CONT_PTE in follow_page(), for example.

So IMHO it can be slightly different comparing to e.g. page fault, because
each fault is still pretty slow as a whole if one fault for each small pte
(of a large folio / cont_pte), while the loop in GUP is still relatively
tight and short, comparing to a fault.  I'd boldly guess more low hanging
fruits out there for large folio outside GUP areas.

In all cases, it'll be interesting to know if Ryan has worked on cont_pte
support for gup on large folios, and whether there's any performance number
to share.  It's definitely good news to me because it means Ryan's work can
also then benefit hugetlb if this series will be merged, I just don't know
how much difference there will be.

Thanks,

-- 
Peter Xu

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ