lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CADrL8HWvYwxTvRQFzk33aaDLgnSzgBvCaTW_1vP-fBuaC_K4Sw@mail.gmail.com>
Date: Tue, 3 Jun 2025 08:21:30 -0700
From: James Houghton <jthoughton@...gle.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Paolo Bonzini <pbonzini@...hat.com>, kvm@...r.kernel.org, linux-kernel@...r.kernel.org, 
	Alexander Potapenko <glider@...gle.com>, Peter Gonda <pgonda@...gle.com>, 
	Tom Lendacky <thomas.lendacky@....com>
Subject: Re: [PATCH 1/2] KVM: SVM: Reject SEV{-ES} intra host migration if
 vCPU creation is in-flight

On Mon, Jun 2, 2025 at 3:45 PM Sean Christopherson <seanjc@...gle.com> wrote:
>
> Reject migration of SEV{-ES} state if either the source or destination VM
> is actively creating a vCPU, i.e. if kvm_vm_ioctl_create_vcpu() is in the
> section between incrementing created_vcpus and online_vcpus.  The bulk of
> vCPU creation runs _outside_ of kvm->lock to allow creating multiple vCPUs
> in parallel, and so sev_info.es_active can get toggled from false=>true in
> the destination VM after (or during) svm_vcpu_create(), resulting in an
> SEV{-ES} VM effectively having a non-SEV{-ES} vCPU.
>
> The issue manifests most visibly as a crash when trying to free a vCPU's
> NULL VMSA page in an SEV-ES VM, but any number of things can go wrong.
>
>   BUG: unable to handle page fault for address: ffffebde00000000
>   #PF: supervisor read access in kernel mode
>   #PF: error_code(0x0000) - not-present page
>   PGD 0 P4D 0
>   Oops: Oops: 0000 [#1] SMP KASAN NOPTI
>   CPU: 227 UID: 0 PID: 64063 Comm: syz.5.60023 Tainted: G     U     O        6.15.0-smp-DEV #2 NONE
>   Tainted: [U]=USER, [O]=OOT_MODULE
>   Hardware name: Google, Inc. Arcadia_IT_80/Arcadia_IT_80, BIOS 12.52.0-0 10/28/2024
>   RIP: 0010:constant_test_bit arch/x86/include/asm/bitops.h:206 [inline]
>   RIP: 0010:arch_test_bit arch/x86/include/asm/bitops.h:238 [inline]
>   RIP: 0010:_test_bit include/asm-generic/bitops/instrumented-non-atomic.h:142 [inline]
>   RIP: 0010:PageHead include/linux/page-flags.h:866 [inline]
>   RIP: 0010:___free_pages+0x3e/0x120 mm/page_alloc.c:5067
>   Code: <49> f7 06 40 00 00 00 75 05 45 31 ff eb 0c 66 90 4c 89 f0 4c 39 f0
>   RSP: 0018:ffff8984551978d0 EFLAGS: 00010246
>   RAX: 0000777f80000001 RBX: 0000000000000000 RCX: ffffffff918aeb98
>   RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffffebde00000000
>   RBP: 0000000000000000 R08: ffffebde00000007 R09: 1ffffd7bc0000000
>   R10: dffffc0000000000 R11: fffff97bc0000001 R12: dffffc0000000000
>   R13: ffff8983e19751a8 R14: ffffebde00000000 R15: 1ffffd7bc0000000
>   FS:  0000000000000000(0000) GS:ffff89ee661d3000(0000) knlGS:0000000000000000
>   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>   CR2: ffffebde00000000 CR3: 000000793ceaa000 CR4: 0000000000350ef0
>   DR0: 0000000000000000 DR1: 0000000000000b5f DR2: 0000000000000000
>   DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>   Call Trace:
>    <TASK>
>    sev_free_vcpu+0x413/0x630 arch/x86/kvm/svm/sev.c:3169
>    svm_vcpu_free+0x13a/0x2a0 arch/x86/kvm/svm/svm.c:1515
>    kvm_arch_vcpu_destroy+0x6a/0x1d0 arch/x86/kvm/x86.c:12396
>    kvm_vcpu_destroy virt/kvm/kvm_main.c:470 [inline]
>    kvm_destroy_vcpus+0xd1/0x300 virt/kvm/kvm_main.c:490
>    kvm_arch_destroy_vm+0x636/0x820 arch/x86/kvm/x86.c:12895
>    kvm_put_kvm+0xb8e/0xfb0 virt/kvm/kvm_main.c:1310
>    kvm_vm_release+0x48/0x60 virt/kvm/kvm_main.c:1369
>    __fput+0x3e4/0x9e0 fs/file_table.c:465
>    task_work_run+0x1a9/0x220 kernel/task_work.c:227
>    exit_task_work include/linux/task_work.h:40 [inline]
>    do_exit+0x7f0/0x25b0 kernel/exit.c:953
>    do_group_exit+0x203/0x2d0 kernel/exit.c:1102
>    get_signal+0x1357/0x1480 kernel/signal.c:3034
>    arch_do_signal_or_restart+0x40/0x690 arch/x86/kernel/signal.c:337
>    exit_to_user_mode_loop kernel/entry/common.c:111 [inline]
>    exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline]
>    __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
>    syscall_exit_to_user_mode+0x67/0xb0 kernel/entry/common.c:218
>    do_syscall_64+0x7c/0x150 arch/x86/entry/syscall_64.c:100
>    entry_SYSCALL_64_after_hwframe+0x76/0x7e
>   RIP: 0033:0x7f87a898e969
>    </TASK>
>   Modules linked in: gq(O)
>   gsmi: Log Shutdown Reason 0x03
>   CR2: ffffebde00000000
>   ---[ end trace 0000000000000000 ]---
>
> Deliberately don't check for a NULL VMSA when freeing the vCPU, as crashing
> the host is likely desirable due to the VMSA being consumed by hardware.
> E.g. if KVM manages to allow VMRUN on the vCPU, hardware may read/write a
> bogus VMSA page.  Accessing PFN 0 is "fine"-ish now that it's sequestered
> away thanks to L1TF, but panicking in this scenario is preferable to
> potentially running with corrupted state.
>
> Reported-by: Alexander Potapenko <glider@...gle.com>
> Tested-by: Alexander Potapenko <glider@...gle.com>
> Fixes: 0b020f5af092 ("KVM: SEV: Add support for SEV-ES intra host migration")
> Fixes: b56639318bb2 ("KVM: SEV: Add support for SEV intra host migration")
> Cc: stable@...r.kernel.org
> Cc: James Houghton <jthoughton@...gle.com>
> Cc: Peter Gonda <pgonda@...gle.com>
> Signed-off-by: Sean Christopherson <seanjc@...gle.com>

Thanks Sean! Free free to add:

Reviewed-by: James Houghton <jthoughton@...gle.com>

> ---
>  arch/x86/kvm/svm/sev.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index a7a7dc507336..93d899454535 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -2032,6 +2032,10 @@ static int sev_check_source_vcpus(struct kvm *dst, struct kvm *src)
>         struct kvm_vcpu *src_vcpu;
>         unsigned long i;
>
> +       if (src->created_vcpus != atomic_read(&src->online_vcpus) ||
> +           dst->created_vcpus != atomic_read(&dst->online_vcpus))
> +               return -EINVAL;

I think -EBUSY (or perhaps -EAGAIN) might be a more proper return code.

> +
>         if (!sev_es_guest(src))
>                 return 0;
>
> --
> 2.49.0.1204.g71687c7c1d-goog
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ