Message-ID: <aHU1PGWwp9f6q8sk@google.com>
Date: Mon, 14 Jul 2025 09:50:04 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Zheyun Shen <szy0127@...u.edu.cn>
Cc: Srikanth Aithal <sraithal@....com>, linux-next@...r.kernel.org, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [BUG] NULL pointer dereference in sev_writeback_caches during KVM
SEV migration kselftest on AMD platform
On Mon, Jul 14, 2025, Sean Christopherson wrote:
> On Mon, Jul 14, 2025, Zheyun Shen wrote:
> I think this is the fix, testing now...
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 95668e84ab86..1476e877b2dc 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -1936,6 +1936,7 @@ static void sev_migrate_from(struct kvm *dst_kvm, struct kvm *src_kvm)
> dst->enc_context_owner = src->enc_context_owner;
> dst->es_active = src->es_active;
> dst->vmsa_features = src->vmsa_features;
> + memcpy(&dst->have_run_cpus, &src->have_run_cpus, sizeof(src->have_run_cpus));
>
> src->asid = 0;
> src->active = false;
> @@ -1943,6 +1944,7 @@ static void sev_migrate_from(struct kvm *dst_kvm, struct kvm *src_kvm)
> src->pages_locked = 0;
> src->enc_context_owner = NULL;
> src->es_active = false;
> + memset(&src->have_run_cpus, 0, sizeof(src->have_run_cpus));
>
> list_cut_before(&dst->regions_list, &src->regions_list, &src->regions_list);
Gah, that's neither sufficient nor correct. I was thinking KVM_VM_DEAD would guard
accesses to the bitmask, but that only handles the KVM_RUN path. And we don't
want to skip the WBINVD when tearing down the source, because nothing guarantees
the destination has pinned all of the source's memory.
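For anyone following along at home, the splat in $subject is consistent with
sev_writeback_caches() walking a have_run_cpus mask that was never allocated for
the migrated/mirror VM. Very roughly, purely as a sketch from memory (other work
in these functions omitted, and the flush-on-mask helper name below is a
placeholder, not necessarily what the series calls it):

	/* Record each pCPU that may hold dirty, guest-tagged cache lines. */
	static void pre_sev_run(struct vcpu_svm *svm, int cpu)
	{
		struct kvm_sev_info *sev = to_kvm_sev_info(svm->vcpu.kvm);

		cpumask_set_cpu(cpu, sev->have_run_cpus);
	}

	/* Flush caches only on the pCPUs that actually ran the guest. */
	void sev_writeback_caches(struct kvm *kvm)
	{
		/*
		 * If have_run_cpus was never allocated, e.g. because the VM
		 * got its kvm_sev_info via the intra-host migration or mirror
		 * paths instead of sev_guest_init(), this dereferences a NULL
		 * cpumask when CONFIG_CPUMASK_OFFSTACK=y.
		 */
		wbnoinvd_on_cpus_mask(to_kvm_sev_info(kvm)->have_run_cpus);
	}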
And conversely, I don't think KVM needs to copy over the mask itself. If a CPU
was used for the source VM but not the destination VM, then it can only have
cached memory that was accessible to the source VM. And a CPU that was run in
the source and is also used by the destination is no different from a CPU that
was run in the destination only.
So as much as I want to avoid allocating another cpumask (ugh), it's the right
thing to do. And practically speaking, I doubt many real-world users of SEV will
be using MAXSMP, i.e. the allocations don't exist anyway.
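For anyone not steeped in cpumask_var_t: the allocation only exists when
CONFIG_CPUMASK_OFFSTACK=y, which in practice means MAXSMP. Paraphrasing
include/linux/cpumask.h (sketch, not verbatim):

	#ifdef CONFIG_CPUMASK_OFFSTACK
	/* Huge NR_CPUS: the mask is a separately allocated object. */
	typedef struct cpumask *cpumask_var_t;
	bool zalloc_cpumask_var(cpumask_var_t *mask, gfp_t flags);
	#else
	/* Small NR_CPUS: the mask is embedded, "allocating" just clears it. */
	typedef struct cpumask cpumask_var_t[1];
	static inline bool zalloc_cpumask_var(cpumask_var_t *mask, gfp_t flags)
	{
		cpumask_clear(*mask);
		return true;
	}
	#endif

I.e. on non-MAXSMP configs the zalloc_cpumask_var() calls added below can't
fail and don't allocate anything.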
Unless someone objects and/or has a better idea, I'll squash this:
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 95668e84ab86..e39726d258b8 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -2072,6 +2072,17 @@ int sev_vm_move_enc_context_from(struct kvm *kvm, unsigned int source_fd)
if (ret)
goto out_source_vcpu;
+ /*
+ * Allocate a new have_run_cpus for the destination, i.e. don't copy
+ * the set of CPUs from the source. If a CPU was used to run a vCPU in
+ * the source VM but is never used for the destination VM, then the CPU
+ * can only have cached memory that was accessible to the source VM.
+ */
+ if (!zalloc_cpumask_var(&dst_sev->have_run_cpus, GFP_KERNEL_ACCOUNT)) {
+ ret = -ENOMEM;
+ goto out_source_vcpu;
+ }
+
sev_migrate_from(kvm, source_kvm);
kvm_vm_dead(source_kvm);
cg_cleanup_sev = src_sev;
@@ -2771,13 +2782,18 @@ int sev_vm_copy_enc_context_from(struct kvm *kvm, unsigned int source_fd)
goto e_unlock;
}
+ mirror_sev = to_kvm_sev_info(kvm);
+ if (!zalloc_cpumask_var(&mirror_sev->have_run_cpus, GFP_KERNEL_ACCOUNT)) {
+ ret = -ENOMEM;
+ goto e_unlock;
+ }
+
/*
* The mirror kvm holds an enc_context_owner ref so its asid can't
* disappear until we're done with it
*/
source_sev = to_kvm_sev_info(source_kvm);
kvm_get_kvm(source_kvm);
- mirror_sev = to_kvm_sev_info(kvm);
list_add_tail(&mirror_sev->mirror_entry, &source_sev->mirror_vms);
/* Set enc_context_owner and copy its encryption context over */