linux-kernel - Re: [PATCH] KVM: SEV: Fix guest memory leak when handling guest requests

Message-ID: <ZkzSvPGass4z4u9p@google.com>
Date: Tue, 21 May 2024 09:58:36 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Michael Roth <michael.roth@....com>
Cc: Michael Roth <mdroth@...xas.edu>, pbonzini@...hat.com, kvm@...r.kernel.org, 
	linux-kernel@...r.kernel.org, ashish.kalra@....com, thomas.lendacky@....com, 
	rick.p.edgecombe@...el.com
Subject: Re: [PATCH] KVM: SEV: Fix guest memory leak when handling guest requests

On Tue, May 21, 2024, Michael Roth wrote:
> On Tue, May 21, 2024 at 07:09:04AM -0700, Sean Christopherson wrote:
> > On Mon, May 20, 2024, Michael Roth wrote:
> > > On Mon, May 20, 2024 at 04:32:04PM -0700, Sean Christopherson wrote:
> > > > On Mon, May 20, 2024, Michael Roth wrote:
> > > > > > But there is a possibility that the guest will attempt to access the response
> > > > > PFN before/during that reporting and spin on an #NPF instead though. So
> > > > > maybe the safer more repeatable approach is to handle the error directly
> > > > > from KVM and propagate it to userspace.
> > > > 
> > > > I was thinking more along the lines of KVM marking the VM as dead/bugged.  
> > > 
> > > In practice userspace will get an unhandled exit and kill the vcpu/guest,
> > > but we could additionally flag the guest as dead.
> > 
> > Honest question, does it make sense for KVM to make the VM unusable?  E.g. is
> > it feasible for userspace to keep running the VM?  Does the page that's in a bad
> > state present any danger to the host?
> 
> If the reclaim fails (which it shouldn't), then KVM has a unique situation
> where a non-gmem guest page is in a bad state. In theory, if the
> guest/userspace could somehow induce a reclaim failure, then they can
> potentially trick the host into trying to access that same page as a shared
> page and induce a host RMP #PF.
> 
> So it does seem like a good idea to force the guest to stop executing. Then
> once the guest is fully destroyed the bad page will stay leaked so it
> won't affect subsequent activities.
> 
> > 
> > > Is there an existing mechanism for this?
> > 
> > kvm_vm_dead()
> 
> Nice, that would do the trick. I'll modify the logic to also call that
> after a reclaim failure.

Hmm, assuming there's no scenario where snp_page_reclaim() is expected to fail, and
such a failure is always unrecoverable, e.g. has similar potential for inducing
host RMP #PFs, then KVM_BUG_ON() is more appropriate.

Ah, and there are already WARNs in the lower level helpers.  Those WARNs should
be KVM_BUG_ON(), because AFAICT there's no scenario where letting the VM live on
is safe/sensible.  And unless I'm missing something, snp_page_reclaim() should
do the private=>shared conversion, because the only reason to reclaim a page is
to move it back to shared state.

Lastly, I vote to rename host_rmp_make_shared() to kvm_rmp_make_shared() to make
it more obvious that it's a KVM helper, whereas rmp_make_shared() is a generic
kernel helper, i.e. _can't_ bug the VM because it doesn't (and shouldn't) have a
pointer to the VM.

E.g. end up with something like this:

/*
 * Transition a page to hypervisor-owned/shared state in the RMP table. This
 * should not fail under normal conditions, but leak the page should that
 * happen since it will no longer be usable by the host due to RMP protections.
 */
static int kvm_rmp_make_shared(struct kvm *kvm, u64 pfn, enum pg_level level)
{
	if (KVM_BUG_ON(rmp_make_shared(pfn, level), kvm)) {
		snp_leak_pages(pfn, page_level_size(level) >> PAGE_SHIFT);
		return -EIO;
	}

	return 0;
}

/*
 * Certain page-states, such as Pre-Guest and Firmware pages (as documented
 * in Chapter 5 of the SEV-SNP Firmware ABI under "Page States") cannot be
 * directly transitioned back to normal/hypervisor-owned state via RMPUPDATE
 * unless they are reclaimed first.
 *
 * Until they are reclaimed and subsequently transitioned via RMPUPDATE, they
 * might not be usable by the host due to being set as immutable or still
 * being associated with a guest ASID.
 *
 * Bug the VM and leak the page if reclaim fails, or if the RMP entry can't be
 * converted back to shared, as the page is no longer usable due to RMP
 * protections, and it's infeasible for the guest to continue on.
 */
static int snp_page_reclaim(struct kvm *kvm, u64 pfn)
{
	struct sev_data_snp_page_reclaim data = {0};
	int err;

	data.paddr = __sme_set(pfn << PAGE_SHIFT);

	if (KVM_BUG_ON(sev_do_cmd(SEV_CMD_SNP_PAGE_RECLAIM, &data, &err), kvm)) {
		snp_leak_pages(pfn, 1);
		return -EIO;
	}

	if (kvm_rmp_make_shared(kvm, pfn, PG_LEVEL_4K))
		return -EIO;

	return 0;
}
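
To poke at the control flow outside the kernel, the bug-and-leak pattern above can be approximated with a small userspace mock. Everything below is a hypothetical stand-in (mock_vm, mock_bug_on(), the fail_* knobs, and the mock_* helpers). It is not the kernel's KVM_BUG_ON(), sev_do_cmd(), rmp_make_shared(), or snp_leak_pages(), only a rough sketch of how the two error paths compose.

```c
#include <assert.h>	/* only needed for the usage checks */
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the bits of struct kvm this sketch cares about. */
struct mock_vm {
	bool bugged;		/* set on the first failure, never cleared */
	unsigned long leaked;	/* number of 4K pages given up on */
};

/* Test knobs that force the mocked firmware/RMP operations to fail. */
static bool fail_reclaim;
static bool fail_make_shared;

/*
 * Rough approximation of KVM_BUG_ON(): evaluate to the condition and, if it
 * is true, mark the VM as bugged.  The real macro also WARNs.
 */
static bool mock_bug_on(bool cond, struct mock_vm *vm)
{
	if (cond)
		vm->bugged = true;
	return cond;
}

/* Mocked firmware reclaim command; non-zero means failure. */
static int mock_fw_reclaim(uint64_t pfn)
{
	(void)pfn;
	return fail_reclaim ? -1 : 0;
}

/* Mocked generic RMP helper; it has no VM pointer, so it cannot bug the VM. */
static int mock_rmp_make_shared(uint64_t pfn)
{
	(void)pfn;
	return fail_make_shared ? -1 : 0;
}

/* VM-aware wrapper: bug the VM and leak the page if the conversion fails. */
static int mock_kvm_rmp_make_shared(struct mock_vm *vm, uint64_t pfn)
{
	if (mock_bug_on(mock_rmp_make_shared(pfn) != 0, vm)) {
		vm->leaked++;
		return -EIO;
	}

	return 0;
}

/* Reclaim, then convert back to shared; either failure bugs the VM. */
static int mock_page_reclaim(struct mock_vm *vm, uint64_t pfn)
{
	if (mock_bug_on(mock_fw_reclaim(pfn) != 0, vm)) {
		vm->leaked++;
		return -EIO;
	}

	return mock_kvm_rmp_make_shared(vm, pfn);
}
```

With both knobs left false, mock_page_reclaim() returns 0 and leaves the VM untouched. Setting either knob makes it return -EIO with the VM marked bugged and exactly one page counted as leaked, which is the behavior the proposed helpers are aiming for.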