Message-ID: <aHU1PGWwp9f6q8sk@google.com>
Date: Mon, 14 Jul 2025 09:50:04 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Zheyun Shen <szy0127@...u.edu.cn>
Cc: Srikanth Aithal <sraithal@....com>, linux-next@...r.kernel.org, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [BUG] NULL pointer dereference in sev_writeback_caches during KVM
SEV migration kselftest on AMD platform
On Mon, Jul 14, 2025, Sean Christopherson wrote:
> On Mon, Jul 14, 2025, Zheyun Shen wrote:
> I think this is the fix, testing now...
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 95668e84ab86..1476e877b2dc 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -1936,6 +1936,7 @@ static void sev_migrate_from(struct kvm *dst_kvm, struct kvm *src_kvm)
> dst->enc_context_owner = src->enc_context_owner;
> dst->es_active = src->es_active;
> dst->vmsa_features = src->vmsa_features;
> + memcpy(&dst->have_run_cpus, &src->have_run_cpus, sizeof(src->have_run_cpus));
>
> src->asid = 0;
> src->active = false;
> @@ -1943,6 +1944,7 @@ static void sev_migrate_from(struct kvm *dst_kvm, struct kvm *src_kvm)
> src->pages_locked = 0;
> src->enc_context_owner = NULL;
> src->es_active = false;
> + memset(&src->have_run_cpus, 0, sizeof(src->have_run_cpus));
>
> list_cut_before(&dst->regions_list, &src->regions_list, &src->regions_list);
Gah, that's neither sufficient nor correct. I was thinking KVM_VM_DEAD would guard
accesses to the bitmask, but that only handles the KVM_RUN path. And we don't
want to skip the WBINVD when tearing down the source, because nothing guarantees
the destination has pinned all of the source's memory.
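For anyone following along at home, the splat in $subject is consistent with
sev_writeback_caches() walking a have_run_cpus mask that was never allocated for
the migrated/mirror VM. Very roughly, purely as a sketch from memory (other work
in these functions omitted, and the flush-on-mask helper name below is a
placeholder, not necessarily what the series calls it):

	/* Record each pCPU that may hold dirty, guest-tagged cache lines. */
	static void pre_sev_run(struct vcpu_svm *svm, int cpu)
	{
		struct kvm_sev_info *sev = to_kvm_sev_info(svm->vcpu.kvm);

		cpumask_set_cpu(cpu, sev->have_run_cpus);
	}

	/* Flush caches only on the pCPUs that actually ran the guest. */
	void sev_writeback_caches(struct kvm *kvm)
	{
		/*
		 * If have_run_cpus was never allocated, e.g. because the VM
		 * got its kvm_sev_info via the intra-host migration or mirror
		 * paths instead of sev_guest_init(), this dereferences a NULL
		 * cpumask when CONFIG_CPUMASK_OFFSTACK=y.
		 */
		wbnoinvd_on_cpus_mask(to_kvm_sev_info(kvm)->have_run_cpus);
	}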
And conversely, I don't think KVM needs to copy over the mask itself. If a CPU
was used for the source VM but not the destination VM, then it can only have
cached memory that was accessible to the source VM. And a CPU that was run in
the source and is also used by the destination is no different from a CPU that
was run in the destination only.
So as much as I want to avoid allocating another cpumask (ugh), it's the right
thing to do. And practically speaking, I doubt many real-world users of SEV will
be using MAXSMP, i.e. the allocations don't exist anyway.
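For anyone not steeped in cpumask_var_t: the allocation only exists when
CONFIG_CPUMASK_OFFSTACK=y, which in practice means MAXSMP. Paraphrasing
include/linux/cpumask.h (sketch, not verbatim):

	#ifdef CONFIG_CPUMASK_OFFSTACK
	/* Huge NR_CPUS: the mask is a separately allocated object. */
	typedef struct cpumask *cpumask_var_t;
	bool zalloc_cpumask_var(cpumask_var_t *mask, gfp_t flags);
	#else
	/* Small NR_CPUS: the mask is embedded, "allocating" just clears it. */
	typedef struct cpumask cpumask_var_t[1];
	static inline bool zalloc_cpumask_var(cpumask_var_t *mask, gfp_t flags)
	{
		cpumask_clear(*mask);
		return true;
	}
	#endif

I.e. on non-MAXSMP configs the zalloc_cpumask_var() calls added below can't
fail and don't allocate anything.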
Unless someone objects and/or has a better idea, I'll squash this:
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 95668e84ab86..e39726d258b8 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -2072,6 +2072,17 @@ int sev_vm_move_enc_context_from(struct kvm *kvm, unsigned int source_fd)
if (ret)
goto out_source_vcpu;
+ /*
+ * Allocate a new have_run_cpus for the destination, i.e. don't copy
+ * the set of CPUs from the source. If a CPU was used to run a vCPU in
+ * the source VM but is never used for the destination VM, then the CPU
+ * can only have cached memory that was accessible to the source VM.
+ */
+ if (!zalloc_cpumask_var(&dst_sev->have_run_cpus, GFP_KERNEL_ACCOUNT)) {
+ ret = -ENOMEM;
+ goto out_source_vcpu;
+ }
+
sev_migrate_from(kvm, source_kvm);
kvm_vm_dead(source_kvm);
cg_cleanup_sev = src_sev;
@@ -2771,13 +2782,18 @@ int sev_vm_copy_enc_context_from(struct kvm *kvm, unsigned int source_fd)
goto e_unlock;
}
+ mirror_sev = to_kvm_sev_info(kvm);
+ if (!zalloc_cpumask_var(&mirror_sev->have_run_cpus, GFP_KERNEL_ACCOUNT)) {
+ ret = -ENOMEM;
+ goto e_unlock;
+ }
+
/*
* The mirror kvm holds an enc_context_owner ref so its asid can't
* disappear until we're done with it
*/
source_sev = to_kvm_sev_info(source_kvm);
kvm_get_kvm(source_kvm);
- mirror_sev = to_kvm_sev_info(kvm);
list_add_tail(&mirror_sev->mirror_entry, &source_sev->mirror_vms);
/* Set enc_context_owner and copy its encryption context over */