[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZS614OSoritrE1d2@google.com>
Date: Tue, 17 Oct 2023 09:27:12 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Dionna Amalie Glaze <dionnaglaze@...gle.com>
Cc: Michael Roth <michael.roth@....com>, kvm@...r.kernel.org,
linux-coco@...ts.linux.dev, linux-mm@...ck.org,
linux-crypto@...r.kernel.org, x86@...nel.org,
linux-kernel@...r.kernel.org, tglx@...utronix.de, mingo@...hat.com,
jroedel@...e.de, thomas.lendacky@....com, hpa@...or.com,
ardb@...nel.org, pbonzini@...hat.com, vkuznets@...hat.com,
jmattson@...gle.com, luto@...nel.org, dave.hansen@...ux.intel.com,
slp@...hat.com, pgonda@...gle.com, peterz@...radead.org,
srinivas.pandruvada@...ux.intel.com, rientjes@...gle.com,
dovmurik@...ux.ibm.com, tobin@....com, bp@...en8.de,
vbabka@...e.cz, kirill@...temov.name, ak@...ux.intel.com,
tony.luck@...el.com, marcorr@...gle.com,
sathyanarayanan.kuppuswamy@...ux.intel.com, alpergun@...gle.com,
jarkko@...nel.org, ashish.kalra@....com, nikunj.dadhania@....com,
pankaj.gupta@....com, liam.merwick@...cle.com,
zhi.a.wang@...el.com, Brijesh Singh <brijesh.singh@....com>,
Alexey Kardashevskiy <aik@....com>
Subject: Re: [PATCH v10 48/50] KVM: SEV: Provide support for SNP_GUEST_REQUEST
NAE event
On Mon, Oct 16, 2023, Dionna Amalie Glaze wrote:
> > +
> > + /*
> > + * If a VMM-specific certificate blob hasn't been provided, grab the
> > + * host-wide one.
> > + */
> > + snp_certs = sev_snp_certs_get(sev->snp_certs);
> > + if (!snp_certs)
> > + snp_certs = sev_snp_global_certs_get();
> > +
>
> This is where the generation I suggested adding would get checked. If
> the instance certs' generation is not the global generation, then I
> think we need a way to return to the VMM to make that right before
> continuing to provide outdated certificates.
> This might be an unreasonable request, but the fact that the certs and
> reported_tcb can be set while a VM is running makes this an issue.
Before we get that far, the changelogs need to explain why the kernel is storing
userspace blobs in the first place. The whole thing is a bit of a mess.
sev_snp_global_certs_get() has data races that could lead to variations of TOCTOU
bugs: sev_ioctl_snp_set_config() can overwrite psp_master->sev_data->snp_certs
while sev_snp_global_certs_get() is running. If the compiler reloads snp_certs
between bumping the refcount and grabbing the pointer, KVM will end up leaking a
refcount and consuming a pointer without a refcount.
if (!kref_get_unless_zero(&certs->kref))
return NULL;
return certs;
If allocating memory for the certs fails, the kernel will have set the config
but not store the corresponding certs.
ret = __sev_do_cmd_locked(SEV_CMD_SNP_CONFIG, &config, &argp->error);
if (ret)
goto e_free;
memcpy(&sev->snp_config, &config, sizeof(config));
}
/*
* If the new certs are passed then cache it else free the old certs.
*/
if (input.certs_len) {
snp_certs = sev_snp_certs_new(certs, input.certs_len);
if (!snp_certs) {
ret = -ENOMEM;
goto e_free;
}
}
Reasoning about ordering is also difficult, e.g. what is KVM's contract with
userspace in terms of recognizing new global certs?
I don't understand why the kernel needs to manage the certs. AFAICT the so called
global certs aren't an input to SEV_CMD_SNP_CONFIG, i.e. SNP_SET_EXT_CONFIG is
purely a software defined thing.
The easiest solution I can think of is to have KVM provide a chunk of memory in
kvm_sev_info for SNP guests that userspace can mmap(), a la vcpu->run.
struct sev_snp_certs {
u8 data[KVM_MAX_SEV_SNP_CERT_SIZE];
u32 size;
u8 pad[<size to make the struct page aligned>];
};
When the guest requests the certs, KVM does something like:
certs_size = READ_ONCE(sev->snp_certs->size);
if (certs_size > sizeof(sev->snp_certs->data) ||
!IS_ALIGNED(certs_size, PAGE_SIZE))
certs_size = 0;
if (certs_size && (data_npages << PAGE_SHIFT) < certs_size) {
vcpu->arch.regs[VCPU_REGS_RBX] = certs_size >> PAGE_SHIFT;
exitcode = SNP_GUEST_VMM_ERR(SNP_GUEST_VMM_ERR_INVALID_LEN);
goto cleanup;
}
...
if (certs_size &&
kvm_write_guest(kvm, data_gpa, sev->snp_certs->data, certs_size))
exitcode = SEV_RET_INVALID_ADDRESS;
If userspace wants to provide garbage to the guest, so be it, not KVM's problem.
That way, whether the VM gets the global cert or a per-VM cert is purely a userspace
concern.
If userspace needs to *stall* cert requests, e.g. while the certs are being updated,
then that's a different issue entirely. If the GHCB allows telling the guest to
retry the request, then it should be trivially easy to solve, e.g. add a flag in
sev_snp_certs. If KVM must "immediately" handle the request, then we'll need more
elaborate uAPI.
Powered by blists - more mailing lists