Message-ID: <9abab5ad-98c0-48bb-b6be-59f2b3d3924a@redhat.com>
Date: Wed, 16 Oct 2024 10:50:12 +0200
From: David Hildenbrand <david@...hat.com>
To: Ackerley Tng <ackerleytng@...gle.com>, Peter Xu <peterx@...hat.com>
Cc: tabba@...gle.com, quic_eberman@...cinc.com, roypat@...zon.co.uk,
jgg@...dia.com, rientjes@...gle.com, fvdl@...gle.com, jthoughton@...gle.com,
seanjc@...gle.com, pbonzini@...hat.com, zhiquan1.li@...el.com,
fan.du@...el.com, jun.miao@...el.com, isaku.yamahata@...el.com,
muchun.song@...ux.dev, erdemaktas@...gle.com, vannapurve@...gle.com,
qperret@...gle.com, jhubbard@...dia.com, willy@...radead.org,
shuah@...nel.org, brauner@...nel.org, bfoster@...hat.com,
kent.overstreet@...ux.dev, pvorel@...e.cz, rppt@...nel.org,
richard.weiyang@...il.com, anup@...infault.org, haibo1.xu@...el.com,
ajones@...tanamicro.com, vkuznets@...hat.com,
maciej.wieczor-retman@...el.com, pgonda@...gle.com, oliver.upton@...ux.dev,
linux-kernel@...r.kernel.org, linux-mm@...ck.org, kvm@...r.kernel.org,
linux-kselftest@...r.kernel.org
Subject: Re: [RFC PATCH 26/39] KVM: guest_memfd: Track faultability within a
struct kvm_gmem_private
>> I also don't know how you treat things like folio_test_hugetlb() on
>> possible assumptions that the VMA must be a hugetlb vma. I'd confess I
>> didn't yet check the rest of the patchset yet - reading a large series
>> without a git tree is sometimes challenging to me.
>>
>
> I'm thinking to basically never involve folio_test_hugetlb(), and the
> VMAs used by guest_memfd will also never be a HugeTLB VMA. That's
> because only the HugeTLB allocator is used, but by the time the folio is
> mapped to userspace, it would have already have been split. After the
> page is split, the folio loses its HugeTLB status. guest_memfd folios
> will never be mapped to userspace while they still have a HugeTLB
> status.

We absolutely must convert these hugetlb folios to non-hugetlb folios.

That is one of the reasons why I raised at LPC that we should focus on
leaving hugetlb out of the picture and rather have a global pool, with
the option to move folios back and forth between the global pool and
hugetlb or guest_memfd.

What exactly that would look like is TBD.

For the time being, I think we could add a "hack" to take hugetlb folios
from hugetlb for our purposes, but we would absolutely have to convert
them to non-hugetlb folios, especially when we split them to small
folios and start using the mapcount. But it doesn't feel quite clean.

Simply starting with a separate global pool (e.g., boot-time allocation
similar to what hugetlb does, or CMA) might be cleaner, and a lot of
stuff could be factored out from hugetlb code to achieve that.
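
To make the invariant above concrete, here is a toy userspace model in C.
It is only a sketch and not kernel code: every type and function name in
it (toy_folio, toy_pool_give, toy_borrow_from_hugetlb, ...) is made up
for illustration, and the "hugetlb" field merely stands in for what
folio_test_hugetlb() would report. It assumes a folio has exactly one
owner at a time and that splitting is refused while the hugetlb status
is still set.

```c
#include <assert.h>
#include <stdbool.h>

/* All names below are hypothetical; nothing here is kernel API. */
enum toy_owner { TOY_POOL, TOY_HUGETLB, TOY_GUEST_MEMFD };

struct toy_folio {
	enum toy_owner owner;
	bool hugetlb;	/* stand-in for what folio_test_hugetlb() would report */
	bool split;	/* true once broken up into small folios */
};

/* Hand a folio from the global pool to hugetlb or to guest_memfd. */
static bool toy_pool_give(struct toy_folio *f, enum toy_owner to)
{
	if (f->owner != TOY_POOL || to == TOY_POOL)
		return false;
	f->owner = to;
	f->hugetlb = (to == TOY_HUGETLB);
	return true;
}

/* Return an unsplit folio to the global pool, dropping any hugetlb status. */
static bool toy_pool_take_back(struct toy_folio *f)
{
	if (f->owner == TOY_POOL || f->split)
		return false;
	f->owner = TOY_POOL;
	f->hugetlb = false;
	return true;
}

/* The interim "hack": take a folio straight from hugetlb and convert it. */
static bool toy_borrow_from_hugetlb(struct toy_folio *f)
{
	if (f->owner != TOY_HUGETLB)
		return false;
	f->owner = TOY_GUEST_MEMFD;
	f->hugetlb = false;	/* conversion happens before any split */
	return true;
}

/* Only guest_memfd folios that are no longer hugetlb may be split. */
static bool toy_split(struct toy_folio *f)
{
	if (f->owner != TOY_GUEST_MEMFD || f->hugetlb || f->split)
		return false;
	f->split = true;
	assert(!f->hugetlb);	/* a split folio never carries hugetlb status */
	return true;
}
```

In this model both paths (via the global pool, or via the interim
borrow from hugetlb) end with a folio that guest_memfd owns and that no
longer carries hugetlb status before it can be split. How a real
implementation would track ownership is exactly the part that is still
TBD.
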
--
Cheers,
David / dhildenb