Message-ID: <20240516031155.meola5hmlk24qv52@amd.com>
Date: Wed, 15 May 2024 22:11:55 -0500
From: Michael Roth <michael.roth@....com>
To: Sean Christopherson <seanjc@...gle.com>
CC: Paolo Bonzini <pbonzini@...hat.com>, <kvm@...r.kernel.org>,
<linux-kernel@...r.kernel.org>
Subject: Re: [PULL 13/19] KVM: SEV: Implement gmem hook for invalidating
private pages
On Wed, May 15, 2024 at 03:32:31PM -0700, Sean Christopherson wrote:
> On Fri, May 10, 2024, Michael Roth wrote:
> > Implement a platform hook to do the work of restoring the direct map
> > entries of gmem-managed pages and transitioning the corresponding RMP
> > table entries back to the default shared/hypervisor-owned state.
>
> ...
>
> > +void sev_gmem_invalidate(kvm_pfn_t start, kvm_pfn_t end)
> > +{
> > + kvm_pfn_t pfn;
> > +
> > + pr_debug("%s: PFN start 0x%llx PFN end 0x%llx\n", __func__, start, end);
> > +
> > + for (pfn = start; pfn < end;) {
> > + bool use_2m_update = false;
> > + int rc, rmp_level;
> > + bool assigned;
> > +
> > + rc = snp_lookup_rmpentry(pfn, &assigned, &rmp_level);
> > + if (WARN_ONCE(rc, "SEV: Failed to retrieve RMP entry for PFN 0x%llx error %d\n",
> > + pfn, rc))
> > + goto next_pfn;
>
> This is comically trivial to hit, as it fires when running guest_memfd_test on a
> !SNP host. Presumably the correct fix is to simply do nothing for !sev_snp_guest(),
> but that's easier said than done due to the lack of a @kvm in .gmem_invalidate().
Yah, the code assumes that SNP is the only SVM guest type that would be
backed by gmem pages. Unfortunately, KVM_X86_SW_PROTECTED_VM is the one
other case where that can happen. The minimal fix would be to squash
the below into this patch:
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 176ba117413a..56b0b59b8263 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -4675,6 +4675,9 @@ void sev_gmem_invalidate(kvm_pfn_t start, kvm_pfn_t end)
 {
 	kvm_pfn_t pfn;
 
+	if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
+		return;
+
 	pr_debug("%s: PFN start 0x%llx PFN end 0x%llx\n", __func__, start, end);
 
 	for (pfn = start; pfn < end;) {
It's not perfect because the callback will still run for
KVM_X86_SW_PROTECTED_VM if SNP is enabled, but in the context of
KVM_X86_SW_PROTECTED_VM being a stand-in for testing SNP/TDX, that
might not be such a bad thing.
Longer term, if we need something more robust, one option would be to
modify the free_folio callback path to pass along folio->mapping, or to
switch to something else that provides similar functionality. Another
approach might be to set .free_folio dynamically based on the vm_type
of the gmem user when creating the gmem instance.
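As a rough sketch of that second idea (hypothetical only -- names like
kvm_gmem_create_aops() and kvm_arch_gmem_needs_invalidate() are made up
here, and the real guest_memfd wiring may end up looking different),
gmem could pick its address_space_operations per-instance so that only
VM types that actually need invalidation get a .free_folio hook:

static const struct address_space_operations kvm_gmem_default_aops = {
	.dirty_folio		= noop_dirty_folio,
	.migrate_folio		= kvm_gmem_migrate_folio,
	.error_remove_folio	= kvm_gmem_error_folio,
};

static const struct address_space_operations kvm_gmem_inval_aops = {
	.dirty_folio		= noop_dirty_folio,
	.migrate_folio		= kvm_gmem_migrate_folio,
	.error_remove_folio	= kvm_gmem_error_folio,
	/* Ends up in the arch hook, e.g. sev_gmem_invalidate() on SNP hosts. */
	.free_folio		= kvm_gmem_free_folio,
};

static const struct address_space_operations *kvm_gmem_create_aops(struct kvm *kvm)
{
	/*
	 * Hypothetical per-arch helper returning true only for vm_types
	 * (e.g. SNP) whose gmem pages need cleanup when freed.
	 */
	if (kvm_arch_gmem_needs_invalidate(kvm))
		return &kvm_gmem_inval_aops;

	return &kvm_gmem_default_aops;
}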
>
> That too is not a big fix, but that's beside the point. IMO, the fact that I'm
> the first person to (completely inadvertently) hit this rather basic bug is a
> good hint that we should wait until 6.11 to merge SNP support.
We do regular testing of normal guests with and without SNP enabled,
but unfortunately we've only been doing KVM selftest (KST) runs on
SNP-enabled hosts.

I've retested with the above fix, and SVM/SEV/SEV-ES/SNP guests and the
selftests all look good both with and without SNP enabled, but I
understand if we still have reservations after this.
-Mike