Message-ID: <20240516031155.meola5hmlk24qv52@amd.com>
Date: Wed, 15 May 2024 22:11:55 -0500
From: Michael Roth <michael.roth@....com>
To: Sean Christopherson <seanjc@...gle.com>
CC: Paolo Bonzini <pbonzini@...hat.com>, <kvm@...r.kernel.org>,
<linux-kernel@...r.kernel.org>
Subject: Re: [PULL 13/19] KVM: SEV: Implement gmem hook for invalidating
private pages
On Wed, May 15, 2024 at 03:32:31PM -0700, Sean Christopherson wrote:
> On Fri, May 10, 2024, Michael Roth wrote:
> > Implement a platform hook to do the work of restoring the direct map
> > entries of gmem-managed pages and transitioning the corresponding RMP
> > table entries back to the default shared/hypervisor-owned state.
>
> ...
>
> > +void sev_gmem_invalidate(kvm_pfn_t start, kvm_pfn_t end)
> > +{
> > + kvm_pfn_t pfn;
> > +
> > + pr_debug("%s: PFN start 0x%llx PFN end 0x%llx\n", __func__, start, end);
> > +
> > + for (pfn = start; pfn < end;) {
> > + bool use_2m_update = false;
> > + int rc, rmp_level;
> > + bool assigned;
> > +
> > + rc = snp_lookup_rmpentry(pfn, &assigned, &rmp_level);
> > + if (WARN_ONCE(rc, "SEV: Failed to retrieve RMP entry for PFN 0x%llx error %d\n",
> > + pfn, rc))
> > + goto next_pfn;
>
> This is comically trivial to hit, as it fires when running guest_memfd_test on a
> !SNP host. Presumably the correct fix is to simply do nothing for !sev_snp_guest(),
> but that's easier said than done due to the lack of a @kvm in .gmem_invalidate().
Yah, the code assumes that SNP is the only SVM guest type that would be
backed by gmem pages. Unfortunately, KVM_X86_SW_PROTECTED_VM is the one
other case where that can happen. The minimal fix would be to squash
the below into this patch:
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 176ba117413a..56b0b59b8263 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -4675,6 +4675,9 @@ void sev_gmem_invalidate(kvm_pfn_t start, kvm_pfn_t end)
 {
 	kvm_pfn_t pfn;
 
+	if (!cc_platform_has(CC_ATTR_HOST_SEV_SNP))
+		return;
+
 	pr_debug("%s: PFN start 0x%llx PFN end 0x%llx\n", __func__, start, end);
 
 	for (pfn = start; pfn < end;) {
It's not perfect because the callback will still run for
KVM_X86_SW_PROTECTED_VM if SNP is enabled, but in the context of
KVM_X86_SW_PROTECTED_VM being a stand-in for testing SNP/TDX, that
might not be such a bad thing.
Longer term, if we need something more robust, one option would be to
modify the free_folio callback path to pass along folio->mapping, or to
switch to something else that provides similar functionality. Another
approach might be to set .free_folio dynamically based on the vm_type
of the gmem user when creating the gmem instance.
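As a rough sketch of that second idea (hypothetical only -- names like
kvm_gmem_create_aops() and kvm_arch_gmem_needs_invalidate() are made up
here, and the real guest_memfd wiring may end up looking different),
gmem could pick its address_space_operations per-instance so that only
VM types that actually need invalidation get a .free_folio hook:

static const struct address_space_operations kvm_gmem_default_aops = {
	.dirty_folio		= noop_dirty_folio,
	.migrate_folio		= kvm_gmem_migrate_folio,
	.error_remove_folio	= kvm_gmem_error_folio,
};

static const struct address_space_operations kvm_gmem_inval_aops = {
	.dirty_folio		= noop_dirty_folio,
	.migrate_folio		= kvm_gmem_migrate_folio,
	.error_remove_folio	= kvm_gmem_error_folio,
	/* Ends up in the arch hook, e.g. sev_gmem_invalidate() on SNP hosts. */
	.free_folio		= kvm_gmem_free_folio,
};

static const struct address_space_operations *kvm_gmem_create_aops(struct kvm *kvm)
{
	/*
	 * Hypothetical per-arch helper returning true only for vm_types
	 * (e.g. SNP) whose gmem pages need cleanup when freed.
	 */
	if (kvm_arch_gmem_needs_invalidate(kvm))
		return &kvm_gmem_inval_aops;

	return &kvm_gmem_default_aops;
}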
>
> That too is not a big fix, but that's beside the point. IMO, the fact that I'm
> the first person to (completely inadvertently) hit this rather basic bug is a
> good hint that we should wait until 6.11 to merge SNP support.
We do regular testing of normal guests with and without SNP enabled,
but unfortunately we've only been doing KVM selftest (KST) runs on
SNP-enabled hosts.

I've retested with the above fix, and SVM/SEV/SEV-ES/SNP guests and the
selftests all look good both with and without SNP enabled, but I
understand if we still have reservations after this.
-Mike