Message-ID: <CAGtprH8sn48pNC29SSNqCCV88O8mjU1JiOFvLbLrm_7LGjGRuQ@mail.gmail.com>
Date: Fri, 9 May 2025 17:41:06 -0700
From: Vishal Annapurve <vannapurve@...gle.com>
To: "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>
Cc: "Zhao, Yan Y" <yan.y.zhao@...el.com>,
"quic_eberman@...cinc.com" <quic_eberman@...cinc.com>, "Shutemov, Kirill" <kirill.shutemov@...el.com>,
"Li, Xiaoyao" <xiaoyao.li@...el.com>, "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"Hansen, Dave" <dave.hansen@...el.com>, "david@...hat.com" <david@...hat.com>,
"thomas.lendacky@....com" <thomas.lendacky@....com>, "tabba@...gle.com" <tabba@...gle.com>,
"vbabka@...e.cz" <vbabka@...e.cz>, "Du, Fan" <fan.du@...el.com>,
"michael.roth@....com" <michael.roth@....com>, "seanjc@...gle.com" <seanjc@...gle.com>,
"Weiny, Ira" <ira.weiny@...el.com>, "pbonzini@...hat.com" <pbonzini@...hat.com>,
"binbin.wu@...ux.intel.com" <binbin.wu@...ux.intel.com>,
"ackerleytng@...gle.com" <ackerleytng@...gle.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, "Yamahata, Isaku" <isaku.yamahata@...el.com>,
"Peng, Chao P" <chao.p.peng@...el.com>, "Li, Zhiquan1" <zhiquan1.li@...el.com>,
"jroedel@...e.de" <jroedel@...e.de>, "Miao, Jun" <jun.miao@...el.com>,
"pgonda@...gle.com" <pgonda@...gle.com>, "x86@...nel.org" <x86@...nel.org>
Subject: Re: [RFC PATCH 08/21] KVM: TDX: Increase/decrease folio ref for huge pages
On Fri, May 9, 2025 at 4:45 PM Edgecombe, Rick P
<rick.p.edgecombe@...el.com> wrote:
>
> On Fri, 2025-05-09 at 07:20 -0700, Vishal Annapurve wrote:
> > I might be misusing some terminology here, then.
> > VM_PFNMAP flag can be set for memory backed by folios/page structs.
> > udmabuf seems to be working with pinned "folios" in the backend.
> >
> > The goal is to get to a stage where guest_memfd is backed by pfn
> > ranges, unmanaged by the kernel, that guest_memfd owns and distributes
> > to userspace, KVM, and IOMMU subject to shareability attributes. If the
> > shareability changes, the users will get notified and will have to
> > invalidate their mappings. guest_memfd will allow mmapping such ranges
> > with the VM_PFNMAP flag set by default in the VMAs to indicate the need
> > for special handling/lack of page structs.
>
> I see the point about how operating on PFNs can allow smoother transition to a
> solution that saves struct page memory, but I wonder about the wisdom of
> building this 2MB TDX code against eventual goals.
This discussion was mostly in response to a few questions from Yan [1].
My point in raising it was to ensure that:
1) There is more awareness of the future roadmap.
2) There is a line of sight towards supporting guest memory (at least
guest private memory) without page structs.

No need to solve these problems right away, but it would be good to
ensure that the design choices are aligned with the future direction.

One thing that does need to be resolved right away: no refcounts on
guest memory from outside guest_memfd [2] (error situations aside).
[1] https://lore.kernel.org/lkml/aBldhnTK93+eKcMq@yzhao56-desk.sh.intel.com/
[2] https://lore.kernel.org/lkml/CAGtprH_ggm8N-R9QbV1f8mo8-cQkqyEta3W=h2jry-NRD7_6OA@mail.gmail.com/
>
> We were thinking to enable 2MB TDX huge pages on top of:
> 1. Mmap shared pages
> 2. In-place conversion
> 3. 2MB huge page support
>
> Where do you think struct page-less guestmemfd fits in that roadmap?
Ideally the roadmap should be:
1. mmap support
2. Huge page support in guest_memfd with in-place conversion
3. 2MB huge page EPT mappings support
4. Private memory without page structs
5. Private/shared memory without page structs

Newer RFC series for 1 and 2 should be landing soon. In my opinion, as
long as huge page EPT support (3) is reviewed, tested, and stable
enough, it can land upstream sooner than 2 as well.
>
> >
> > As an intermediate stage, it makes sense to me to just not have
> > private memory backed by page structs and use a special "filemap" to
> > map file offsets to these private memory ranges. This step will also
> > need a similar contract with users:
> > 1) memory is pinned by guest_memfd
> > 2) users will get invalidation notifiers on shareability changes
> >
> > I am sure there is a lot of work here and many quirks to be addressed;
> > let's discuss this further once there is more context. A few related
> > RFC series are planned to be posted in the near future.
>
> Look forward to collecting more context, and thanks for your patience while we
> catch up. But why not an iterative approach? We can't save struct page memory on
> guestmemfd huge pages until we have guestmemfd huge pages.