Message-ID: <Ys8HrW+52EwQbeh8@monkey>
Date: Wed, 13 Jul 2022 10:58:05 -0700
From: Mike Kravetz <mike.kravetz@...cle.com>
To: David Hildenbrand <david@...hat.com>
Cc: Khalid Aziz <khalid.aziz@...cle.com>,
Andrew Morton <akpm@...ux-foundation.org>, willy@...radead.org,
aneesh.kumar@...ux.ibm.com, arnd@...db.de, 21cnbao@...il.com,
corbet@....net, dave.hansen@...ux.intel.com, ebiederm@...ssion.com,
hagen@...u.net, jack@...e.cz, keescook@...omium.org,
kirill@...temov.name, kucharsk@...il.com, linkinjeon@...nel.org,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, longpeng2@...wei.com, luto@...nel.org,
markhemm@...glemail.com, pcc@...gle.com, rppt@...nel.org,
sieberf@...zon.com, sjpark@...zon.de, surenb@...gle.com,
tst@...oebel-theuer.de, yzaikin@...gle.com
Subject: Re: [PATCH v2 0/9] Add support for shared PTEs across processes

On 07/13/22 16:00, David Hildenbrand wrote:
> On 08.07.22 21:36, Khalid Aziz wrote:
> > On 7/8/22 05:47, David Hildenbrand wrote:
> >> On 02.07.22 06:24, Andrew Morton wrote:
> >>> On Wed, 29 Jun 2022 16:53:51 -0600 Khalid Aziz <khalid.aziz@...cle.com> wrote:
>
> > suggestion to extend hugetlb PMD sharing was discussed briefly. The conclusion from that discussion and earlier discussion
> > on the mailing list was that hugetlb PMD sharing is built with special-case code in too many places in the kernel, and it is
> > better to replace it with something more general purpose than to build even more on it. Mike can correct me if I got that
> > wrong.
>
> Yes, I pushed for the removal of that yet-another-hugetlb-special-stuff,
> and asked the honest question if we can just remove it and replace it by
> something generic in the future. And as I learned, we most probably
> cannot rip that out without affecting existing user space. Even
> replacing it by mshare() would degrade existing user space.
>
> So the natural thing to reduce page table consumption (again, what this
> cover letter talks about) for user space (semi- ?)automatically for
> MAP_SHARED files is to factor out what hugetlb has, and teach generic MM
> code to cache and reuse page tables (PTE and PMD tables should be
> sufficient) where suitable.
>
> For reasonably aligned mappings and mapping sizes, it shouldn't be too
> hard (I know, locking ...), to cache and reuse page tables attached to
> files -- similar to what hugetlb does, just in a generic way. We might
> want a mechanism to enable/disable this for specific processes and/or
> VMAs, but these are minor details.
>
> And that could come for free for existing user space, because page
> tables, and how they are handled, would just be an implementation detail.
>
>
> I'd be really interested in what major roadblocks/downsides
> file-based page table sharing has. Because I am not convinced that a
> mechanism like mshare() -- that has to be explicitly implemented+used by
> user space -- is required for that.

Perhaps this is an 'opportunity' for me to write up in detail how
hugetlb PMD sharing works. As you know, I have been struggling with
keeping that working AND safe AND performant. Who knows, this may lead
to changes in the existing implementation.
--
Mike Kravetz