linux-kernel - Re: [PATCH] KVM: riscv: Power on secondary vCPUs from migration

Message-Id: <DCTFU1UCDSZZ.3J6L3T6TYTELM@ventanamicro.com>
Date: Mon, 15 Sep 2025 16:19:21 +0200
From: Radim Krčmář <rkrcmar@...tanamicro.com>
To: "Jinyu Tang" <tjytimi@....com>, "Anup Patel" <anup@...infault.org>,
 "Atish Patra" <atish.patra@...ux.dev>, "Andrew Jones"
 <ajones@...tanamicro.com>, "Conor Dooley" <conor.dooley@...rochip.com>,
 "Yong-Xuan Wang" <yongxuan.wang@...ive.com>, "Paul Walmsley"
 <paul.walmsley@...ive.com>, "Nutty Liu" <nutty.liu@...mail.com>, "Tianshun
 Sun" <stsmail163@....com>
Cc: <kvm@...r.kernel.org>, <kvm-riscv@...ts.infradead.org>,
 <linux-riscv@...ts.infradead.org>, <linux-kernel@...r.kernel.org>,
 "linux-riscv" <linux-riscv-bounces@...ts.infradead.org>
Subject: Re: [PATCH] KVM: riscv: Power on secondary vCPUs from migration

2025-09-15T20:23:34+08:00, Jinyu Tang <tjytimi@....com>:
> The current logic keeps all secondary VCPUs powered off on their
> first run in kvm_arch_vcpu_postcreate(), relying on the boot VCPU
> to wake them up by an SBI call. This is correct for a fresh VM start,
> where VCPUs begin execution at the boot address (0x80000000).
>
> However, this behavior is not suitable for VCPUs that are being
> restored from saved state (e.g., during migration resume or snapshot
> load). These VCPUs have a saved program counter (sepc). Forcing
> them to wait for a wake-up from the boot VCPU, which may not
> happen or may happen incorrectly, leaves them stuck when
> migrating with QEMU if smp is larger than one.
>
> So distinguish a cold start from a warm resume by the value of the
> guest's sepc register. If the VCPU is running for the first time
> *and* its sepc is not the hardware boot address, it indicates a
> resumed vCPU that must be powered on immediately to continue
> execution from its saved context.
>
> Signed-off-by: Jinyu Tang <tjytimi@....com>
> Tested-by: Tianshun Sun <stsmail163@....com>
> ---

I don't like this approach.  Userspace controls the state of the VM, and
KVM shouldn't randomly change the state that userspace wants.

> diff --git a/arch/riscv/kvm/vcpu.c b/arch/riscv/kvm/vcpu.c
> @@ -867,8 +867,16 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
>  	struct kvm_cpu_trap trap;
>  	struct kvm_run *run = vcpu->run;
>  
> -	if (!vcpu->arch.ran_atleast_once)
> +	if (!vcpu->arch.ran_atleast_once) {
>  		kvm_riscv_vcpu_setup_config(vcpu);
> +		/*
> +		 * For VCPUs that are resuming (e.g., from migration)
> +		 * and not starting from the boot address, explicitly
> +		 * power them on.
> +		 */
> +		if (vcpu->arch.guest_context.sepc != 0x80000000)

Offlined VCPUs are not guaranteed to have sepc == 0x80000000, so this
patch would incorrectly wake them up.
(Depending on vcpu->arch.ran_atleast_once is flaky at best as well.)
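
To make the failure case concrete, here is a minimal sketch of the check
outside the kernel.  struct vcpu_model and both helpers are hypothetical
names for illustration, not KVM code; only the 0x80000000 constant comes
from the patch.  The example address used below is arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* The constant the patch compares sepc against. */
#define BOOT_ADDR UINT64_C(0x80000000)

/* Hypothetical model for illustration, not a KVM structure. */
struct vcpu_model {
	uint64_t sepc;  /* guest PC as restored by userspace */
	bool stopped;   /* power state that userspace requested */
};

/* The patch's test: power on at first run unless sepc is the boot address. */
static bool heuristic_powers_on(const struct vcpu_model *v)
{
	assert(v != NULL);
	return v->sepc != BOOT_ADDR;
}

/* True when the heuristic would start a VCPU that userspace left stopped. */
static bool heuristic_overrides_userspace(const struct vcpu_model *v)
{
	return v->stopped && heuristic_powers_on(v);
}
```

A VCPU restored as stopped with, say, sepc = 0x80201000 makes
heuristic_overrides_userspace() return true, which is the incorrect
wake-up described above.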

Please try to fix userspace instead,

Thanks.
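
A rough sketch of what a userspace-side fix could look like, assuming the
VMM records each VCPU's power state in the migration stream and replays it
with the generic KVM_SET_MP_STATE ioctl before the first KVM_RUN.  struct
saved_vcpu and both helper names are hypothetical; whether this is the
right change for QEMU is not settled in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Hypothetical per-VCPU record carried in the migration stream. */
struct saved_vcpu {
	bool stopped;  /* true if the VCPU was powered off on the source */
};

/* Map the saved power state to a KVM MP state value. */
static uint32_t mp_state_for(const struct saved_vcpu *s)
{
	assert(s != NULL);
	return s->stopped ? KVM_MP_STATE_STOPPED : KVM_MP_STATE_RUNNABLE;
}

/* Apply the saved state to one VCPU fd; returns 0 or a negative errno. */
static int restore_mp_state(int vcpu_fd, const struct saved_vcpu *s)
{
	struct kvm_mp_state mp = { .mp_state = mp_state_for(s) };

	if (ioctl(vcpu_fd, KVM_SET_MP_STATE, &mp) < 0)
		return -errno;
	return 0;
}
```

With this shape the decision stays in userspace: a VCPU that was running on
the source is marked runnable on the destination, and one that was offline
stays stopped, regardless of its sepc.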