lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CADrL8HUm7g4pBLv9vjmB-LhJqxm4jyksGJQAdwRsweKKAnofDg@mail.gmail.com>
Date:   Tue, 7 Feb 2023 14:46:04 -0800
From:   James Houghton <jthoughton@...gle.com>
To:     Peter Xu <peterx@...hat.com>
Cc:     Mike Kravetz <mike.kravetz@...cle.com>,
        David Hildenbrand <david@...hat.com>,
        Muchun Song <songmuchun@...edance.com>,
        David Rientjes <rientjes@...gle.com>,
        Axel Rasmussen <axelrasmussen@...gle.com>,
        Mina Almasry <almasrymina@...gle.com>,
        "Zach O'Keefe" <zokeefe@...gle.com>,
        Manish Mishra <manish.mishra@...anix.com>,
        Naoya Horiguchi <naoya.horiguchi@....com>,
        "Dr . David Alan Gilbert" <dgilbert@...hat.com>,
        "Matthew Wilcox (Oracle)" <willy@...radead.org>,
        Vlastimil Babka <vbabka@...e.cz>,
        Baolin Wang <baolin.wang@...ux.alibaba.com>,
        Miaohe Lin <linmiaohe@...wei.com>,
        Yang Shi <shy828301@...il.com>,
        Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 21/46] hugetlb: use struct hugetlb_pte for walk_hugetlb_range

> Here is the result: [1] (sorry it took a little while heh). The
> implementation of the "RFC v1" way is pretty horrible[2] (and this
> implementation probably has bugs anyway; it doesn't account for the
> folio_referenced() problem).
>
> Matthew is trying to solve the same problem with THPs right now: [3].
> I haven't figured out how we can apply Matthews's approach to HGM
> right now, but there probably is a way. (If we left the mapcount
> increment bits in the same place, we couldn't just check the
> hstate-level PTE; it would have already been made present.)
>
> We could:
> - use the THP-like way and tolerate ~1 second collapses

Another thought here. We don't necessarily *need* to collapse the page
table mappings in between mmu_notifier_invalidate_range_start() and
mmu_notifier_invalidate_range_end(), as the pfns aren't changing,
we aren't punching any holes, and we aren't changing permission bits.
If we had an MMU notifier that simply informed KVM that we collapsed
the page tables *after* we finished collapsing, then it would be ok
for hugetlb_collapse() to be slow.

If this MMU notifier is something that makes sense, it probably
applies to MADV_COLLAPSE for THPs as well.


> - use the (non-RFC) v1 way and tolerate the migration/smaps differences
> - use the RFC v1 way and tolerate the complicated mapcount accounting
> - flesh out [3] and see if it can be applied to HGM nicely
>
> I'm happy to go with any of these approaches.
>
> [1]: https://pastebin.com/raw/hJzFJHiD
> [2]: https://github.com/48ca/linux/commit/4495f16a09b660aff44b3edcc125aa3a3df85976
> [3]: https://lore.kernel.org/linux-mm/Y+FkV4fBxHlp6FTH@casper.infradead.org/

- James

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ