linux-kernel - Re: [PATCH Part2 RFC v4 24/40] KVM: SVM: Add KVM_SEV_SNP_LAUNCH

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YPHlt4VCm6b2MZMs@google.com>
Date:   Fri, 16 Jul 2021 20:01:59 +0000
From:   Sean Christopherson <seanjc@...gle.com>
To:     Brijesh Singh <brijesh.singh@....com>
Cc:     x86@...nel.org, linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
        linux-efi@...r.kernel.org, platform-driver-x86@...r.kernel.org,
        linux-coco@...ts.linux.dev, linux-mm@...ck.org,
        linux-crypto@...r.kernel.org, Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Joerg Roedel <jroedel@...e.de>,
        Tom Lendacky <thomas.lendacky@....com>,
        "H. Peter Anvin" <hpa@...or.com>, Ard Biesheuvel <ardb@...nel.org>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Andy Lutomirski <luto@...nel.org>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Sergio Lopez <slp@...hat.com>, Peter Gonda <pgonda@...gle.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Srinivas Pandruvada <srinivas.pandruvada@...ux.intel.com>,
        David Rientjes <rientjes@...gle.com>,
        Dov Murik <dovmurik@...ux.ibm.com>,
        Tobin Feldman-Fitzthum <tobin@....com>,
        Borislav Petkov <bp@...en8.de>,
        Michael Roth <michael.roth@....com>,
        Vlastimil Babka <vbabka@...e.cz>, tony.luck@...el.com,
        npmccallum@...hat.com, brijesh.ksingh@...il.com
Subject: Re: [PATCH Part2 RFC v4 24/40] KVM: SVM: Add
 KVM_SEV_SNP_LAUNCH_UPDATE command

On Wed, Jul 07, 2021, Brijesh Singh wrote:
> +static int snp_launch_update(struct kvm *kvm, struct kvm_sev_cmd *argp)
> +{
> +	unsigned long npages, vaddr, vaddr_end, i, next_vaddr;
> +	struct kvm_sev_info *sev = &to_kvm_svm(kvm)->sev_info;
> +	struct sev_data_snp_launch_update data = {};
> +	struct kvm_sev_snp_launch_update params;
> +	int *error = &argp->error;
> +	struct kvm_vcpu *vcpu;
> +	struct page **inpages;
> +	struct rmpupdate e;
> +	int ret;
> +
> +	if (!sev_snp_guest(kvm))
> +		return -ENOTTY;
> +
> +	if (!sev->snp_context)
> +		return -EINVAL;
> +
> +	if (copy_from_user(&params, (void __user *)(uintptr_t)argp->data, sizeof(params)))
> +		return -EFAULT;
> +
> +	data.gctx_paddr = __psp_pa(sev->snp_context);
> +
> +	/* Lock the user memory. */
> +	inpages = sev_pin_memory(kvm, params.uaddr, params.len, &npages, 1);

params.uaddr needs to be checked for validity, e.g. proper alignment.
sev_pin_memory() does some checks, but not all checks.

> +	if (!inpages)
> +		return -ENOMEM;
> +
> +	vcpu = kvm_get_vcpu(kvm, 0);
> +	vaddr = params.uaddr;
> +	vaddr_end = vaddr + params.len;
> +
> +	for (i = 0; vaddr < vaddr_end; vaddr = next_vaddr, i++) {
> +		unsigned long psize, pmask;
> +		int level = PG_LEVEL_4K;
> +		gpa_t gpa;
> +
> +		if (!hva_to_gpa(kvm, vaddr, &gpa)) {

I'm having a bit of deja vu...  This flow needs to hold kvm->srcu to do a memslot
lookup.

That said, IMO having KVM do the hva->gpa is not a great ABI.  The memslots are
completely arbitrary (from a certain point of view) and have no impact on the
validity of the memory pinning or PSP command.  E.g. a memslot update while this
code is in-flight would be all kinds of weird.

In other words, make userspace provide both the hva (because it's sadly needed
to pin memory) as well as the target gpa.  That prevents KVM from having to deal
with memslot lookups and also means that userspace can issue the command before
configuring the memslots (though I've no idea if that's actually feasible for
any userspace VMM).

> +			ret = -EINVAL;
> +			goto e_unpin;
> +		}
> +
> +		psize = page_level_size(level);
> +		pmask = page_level_mask(level);

Is there any hope of this path supporting 2mb/1gb pages in the not-too-distant
future?  If not, then I vote to do away with the indirection and just hardcode
4kg sizes in the flow.  I.e. if this works on 4kb chunks, make that obvious.

> +		gpa = gpa & pmask;
> +
> +		/* Transition the page state to pre-guest */
> +		memset(&e, 0, sizeof(e));
> +		e.assigned = 1;
> +		e.gpa = gpa;
> +		e.asid = sev_get_asid(kvm);
> +		e.immutable = true;
> +		e.pagesize = X86_TO_RMP_PG_LEVEL(level);
> +		ret = rmpupdate(inpages[i], &e);

What happens if userspace pulls a stupid and assigns the same page to multiple
SNP guests?  Does RMPUPDATE fail?  Can one RMPUPDATE overwrite another?

> +		if (ret) {
> +			ret = -EFAULT;
> +			goto e_unpin;
> +		}
> +
> +		data.address = __sme_page_pa(inpages[i]);
> +		data.page_size = e.pagesize;
> +		data.page_type = params.page_type;
> +		data.vmpl3_perms = params.vmpl3_perms;
> +		data.vmpl2_perms = params.vmpl2_perms;
> +		data.vmpl1_perms = params.vmpl1_perms;
> +		ret = __sev_issue_cmd(argp->sev_fd, SEV_CMD_SNP_LAUNCH_UPDATE, &data, error);
> +		if (ret) {
> +			snp_page_reclaim(inpages[i], e.pagesize);
> +			goto e_unpin;
> +		}
> +
> +		next_vaddr = (vaddr & pmask) + psize;
> +	}
> +
> +e_unpin:
> +	/* Content of memory is updated, mark pages dirty */
> +	memset(&e, 0, sizeof(e));
> +	for (i = 0; i < npages; i++) {
> +		set_page_dirty_lock(inpages[i]);
> +		mark_page_accessed(inpages[i]);
> +
> +		/*
> +		 * If its an error, then update RMP entry to change page ownership
> +		 * to the hypervisor.
> +		 */
> +		if (ret)
> +			rmpupdate(inpages[i], &e);

This feels wrong since it's purging _all_ RMP entries, not just those that were
successfully modified.  And maybe add a RMP "reset" helper, e.g. why is zeroing
the RMP entry the correct behavior?

> +	}
> +
> +	/* Unlock the user pages */
> +	sev_unpin_memory(kvm, inpages, npages);
> +
> +	return ret;
> +}
> +