Message-ID: <Z4mablD78z45k1u6@google.com>
Date: Thu, 16 Jan 2025 15:46:54 -0800
From: Sean Christopherson <seanjc@...gle.com>
To: Peter Xu <peterx@...hat.com>
Cc: James Houghton <jthoughton@...gle.com>, Paolo Bonzini <pbonzini@...hat.com>,
Jonathan Corbet <corbet@....net>, Marc Zyngier <maz@...nel.org>, Oliver Upton <oliver.upton@...ux.dev>,
Yan Zhao <yan.y.zhao@...el.com>, Nikita Kalyazin <kalyazin@...zon.com>,
Anish Moorthy <amoorthy@...gle.com>, Peter Gonda <pgonda@...gle.com>,
David Matlack <dmatlack@...gle.com>, Wei W <wei.w.wang@...el.com>, kvm@...r.kernel.org,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org, kvmarm@...ts.linux.dev
Subject: Re: [PATCH v1 00/13] KVM: Introduce KVM Userfault

On Thu, Jan 16, 2025, Peter Xu wrote:
> On Thu, Jan 16, 2025 at 03:04:45PM -0800, James Houghton wrote:
> > On Thu, Jan 16, 2025 at 2:16 PM Sean Christopherson <seanjc@...gle.com> wrote:
> > >
> > > On Thu, Jan 16, 2025, Peter Xu wrote:
> > > > On Thu, Jan 16, 2025 at 03:19:49PM -0500, Peter Xu wrote:
> > > > > > For the gmem case, userfaultfd cannot be used, so KVM Userfault isn't
> > > > > > replacing it. And as of right now anyway, KVM Userfault *does* provide
> > > > > > a complete post-copy system for gmem.
> > > > > >
> > > > > > When gmem pages can be mapped into userspace, for post-copy to remain
> > > > > > functional, userspace-mapped gmem will need userfaultfd integration.
> > > > > > Keep in mind that even after this integration happens, userfaultfd
> > > > > > alone will *not* be a complete post-copy solution, as vCPU faults
> > > > > > won't be resolved via the userspace page tables.
> > > > >
> > > > > Do you know, in the context of CoCo, whether a private page can be
> > > > > accessed at all outside of KVM?
> > > > >
> > > > > I'm now pretty sure a private page can never be mapped into
> > > > > userspace. However, can another module, like vhost-kernel, access it
> > > > > during postcopy? My impression is still yes, but then how about
> > > > > vhost-user?
> > > > >
> > > > > Here, the "vhost-kernel" part asks whether private pages can be
> > > > > accessed at all outside KVM, while the "vhost-user" part asks
> > > > > whether, assuming the vhost-kernel question answers "yes it can",
> > > > > such an access can happen from another process/task (which not only
> > > > > lacks KVM context, but also doesn't share the same task context).
> > > >
> > > > Right after I sent it, I recalled that whenever a device needs to
> > > > access a page, the page needs to be converted to shared first.
> > >
> > > FWIW, once Trusted I/O comes along, "trusted" devices will be able to access guest
> > > private memory. The basic gist is that the IOMMU will enforce access to private
> > > memory, e.g. on AMD the IOMMU will check the RMP[*], and I believe the plan for
> > > TDX is to have the IOMMU share the Secure-EPT tables that are used by the CPU.
> > >
> > > [*] https://www.amd.com/content/dam/amd/en/documents/developer/sev-tio-whitepaper.pdf
>
> Thanks, Sean. This is interesting to know.
>
> >
> > Hi Sean,
> >
> > Do you know what API the IOMMU driver would use to get the private
> > pages to map? Normally it'd use GUP, but GUP would/should fail for
> > guest-private pages, right?
>
> James,
>
> I'm still reading the link Sean shared; it looks like the white paper has
> an answer on this for assigned devices:
>
> TDIs access memory via either guest virtual address (GVA) space or
> guest physical address (GPA) space. The I/O Memory Management Unit
> (IOMMU) in the host hardware is responsible for translating the
> provided GVAs or GPAs into system physical addresses
> (SPAs). Because SEV-SNP enforces access control at the time of
> translation, the IOMMU performs RMP entry lookups on translation
>
> So I suppose that after the device is attested and trusted, it can directly
> map everything if it wants, and DMA directly to the encrypted pages.

But as James called out, the kernel still needs to actually map guest_memfd
memory (all other memory is shared), and guest_memfd does not and will not
ever support GUP/mmap() of *private* memory.
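
To illustrate, here's a rough userspace sketch (not from this thread;
assumes a 6.8+ kernel with KVM_CREATE_GUEST_MEMFD and an x86 VM type with
private-memory support, e.g. KVM_X86_SW_PROTECTED_VM; error handling
omitted).  Since guest_memfd has no mmap() support, the mmap() is expected
to fail, and without a userspace mapping there is nothing for GUP to walk:

#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/kvm.h>

int main(void)
{
	int kvm = open("/dev/kvm", O_RDWR);
	int vm = ioctl(kvm, KVM_CREATE_VM, KVM_X86_SW_PROTECTED_VM);
	struct kvm_create_guest_memfd gmem = {
		.size = 0x200000,	/* 2MiB of guest-private memory */
	};
	int fd = ioctl(vm, KVM_CREATE_GUEST_MEMFD, &gmem);

	/*
	 * guest_memfd doesn't support mmap() of private memory, so this
	 * is expected to fail; GUP on such a mapping can never happen,
	 * because the mapping never exists in the first place.
	 */
	void *p = mmap(NULL, 0x200000, PROT_READ | PROT_WRITE,
		       MAP_SHARED, fd, 0);
	if (p == MAP_FAILED)
		perror("mmap(guest_memfd)");
	return 0;
}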

There's an RFC under heavy discussion that I assume will handle some (all?)
of this (I have largely ignored the thread):

https://lore.kernel.org/all/20250107142719.179636-1-yilun.xu@linux.intel.com
> OTOH, for my specific question (on vhost-kernel, or vhost-user), I suppose
> they cannot be attested, since they're still plain host software, so I'm
> guessing they'll need to stick with shared pages and use a bounce buffer
> to do DMAs.

Yep. There's no sane way to attest software that runs in "regular" mode on
the CPU, and so things like device emulation and vhost will always be
restricted to shared memory.
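
Conceptually, the bounce-buffer dance looks like this (a hand-wavy sketch;
alloc_shared_buffer() and dma_to_device() are hypothetical placeholders, the
real plumbing in a CoCo guest being swiotlb with force_dma_unencrypted()):

#include <stddef.h>
#include <string.h>

/* Hypothetical helpers, for illustration only. */
extern void *alloc_shared_buffer(size_t len);	/* shared/decrypted */
extern void dma_to_device(void *buf, size_t len);

/*
 * The host-side device model (vhost, device emulation, etc.) can't
 * touch the guest's private (encrypted) pages, so data is staged
 * through a buffer the guest has explicitly converted to shared.
 */
static void dma_write_bounced(const void *private_buf, size_t len)
{
	void *bounce = alloc_shared_buffer(len);

	memcpy(bounce, private_buf, len);	/* private => shared */
	dma_to_device(bounce, len);		/* device sees only shared */
}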