linux-kernel - Re: [PATCH v2 1/1] KVM: VMX: configure SVI during runtime APICv activation

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <aRScMffMkpsdi5vs@google.com>
Date: Wed, 12 Nov 2025 06:47:54 -0800
From: Sean Christopherson <seanjc@...gle.com>
To: Dongli Zhang <dongli.zhang@...cle.com>
Cc: kvm@...r.kernel.org, x86@...nel.org, linux-kernel@...r.kernel.org, 
	chao.gao@...el.com, pbonzini@...hat.com, tglx@...utronix.de, mingo@...hat.com, 
	bp@...en8.de, dave.hansen@...ux.intel.com, hpa@...or.com, joe.jin@...cle.com, 
	alejandro.j.jimenez@...cle.com
Subject: Re: [PATCH v2 1/1] KVM: VMX: configure SVI during runtime APICv activation

On Sun, Nov 09, 2025, Dongli Zhang wrote:
> ---
> Changed since v2:
>   - Add support for guest mode (suggested by Chao Gao).
>   - Add comments in the code (suggested by Chao Gao).
>   - Remove WARN_ON_ONCE from vmx_hwapic_isr_update().
>   - Edit commit message "AMD SVM APICv" to "AMD SVM AVIC"
>     (suggested by Alejandro Jimenez).
> 
>  arch/x86/kvm/vmx/vmx.c | 9 ---------
>  arch/x86/kvm/x86.c     | 7 +++++++
>  2 files changed, 7 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index f87c216d976d..d263dbf0b917 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -6878,15 +6878,6 @@ void vmx_hwapic_isr_update(struct kvm_vcpu *vcpu, int max_isr)
>  	 * VM-Exit, otherwise L1 with run with a stale SVI.
>  	 */
>  	if (is_guest_mode(vcpu)) {
> -		/*
> -		 * KVM is supposed to forward intercepted L2 EOIs to L1 if VID
> -		 * is enabled in vmcs12; as above, the EOIs affect L2's vAPIC.
> -		 * Note, userspace can stuff state while L2 is active; assert
> -		 * that VID is disabled if and only if the vCPU is in KVM_RUN
> -		 * to avoid false positives if userspace is setting APIC state.
> -		 */
> -		WARN_ON_ONCE(vcpu->wants_to_run &&
> -			     nested_cpu_has_vid(get_vmcs12(vcpu)));
>  		to_vmx(vcpu)->nested.update_vmcs01_hwapic_isr = true;
>  		return;
>  	}
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index b4b5d2d09634..08b34431c187 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -10878,9 +10878,16 @@ void __kvm_vcpu_update_apicv(struct kvm_vcpu *vcpu)
>  	 * pending. At the same time, KVM_REQ_EVENT may not be set as APICv was
>  	 * still active when the interrupt got accepted. Make sure
>  	 * kvm_check_and_inject_events() is called to check for that.
> +	 *
> +	 * When APICv gets enabled, updating SVI is necessary; otherwise,
> +	 * SVI won't reflect the highest bit in vISR and the next EOI from
> +	 * the guest won't be virtualized correctly, as the CPU will clear
> +	 * the SVI bit from vISR.
>  	 */
>  	if (!apic->apicv_active)
>  		kvm_make_request(KVM_REQ_EVENT, vcpu);
> +	else
> +		kvm_apic_update_hwapic_isr(vcpu);

Rather than trigger the update from x86.c, what if we let VMX make the call?
Then we don't need to drop the WARN, and in the unlikely scenario L2 is active,
we'll save a pointless scan of the vISR (VMX will defer the update until L1 is
active).

We could even have kvm_apic_update_hwapic_isr() WARN if L2 is active.  E.g. with
an opportunistic typo fix in vmx_hwapic_isr_update()'s comment (completely untested):

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 0ae7f913d782..786ccfc24252 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -774,7 +774,8 @@ void kvm_apic_update_hwapic_isr(struct kvm_vcpu *vcpu)
 {
        struct kvm_lapic *apic = vcpu->arch.apic;
 
-       if (WARN_ON_ONCE(!lapic_in_kernel(vcpu)) || !apic->apicv_active)
+       if (WARN_ON_ONCE(!lapic_in_kernel(vcpu)) || !apic->apicv_active ||
+                        is_guest_mode(vcpu))
                return;
 
        kvm_x86_call(hwapic_isr_update)(vcpu, apic_find_highest_isr(apic));
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 91b6f2f3edc2..653b8b713547 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4430,6 +4430,14 @@ void vmx_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu)
                                                 kvm_vcpu_apicv_active(vcpu));
 
        vmx_update_msr_bitmap_x2apic(vcpu);
+
+       /*
+        * Refresh SVI if APICv is enabled, as any changes KVM made to vISR
+        * while APICv was disabled need to be reflected in SVI, e.g. so that
+        * the next accelerated EOI will clear the correct vector in vISR.
+        */
+       if (kvm_vcpu_apicv_active(vcpu))
+               kvm_apic_update_hwapic_isr(vcpu);
 }
 
 static u32 vmx_exec_control(struct vcpu_vmx *vmx)
@@ -6880,7 +6888,7 @@ void vmx_hwapic_isr_update(struct kvm_vcpu *vcpu, int max_isr)
 
        /*
         * If L2 is active, defer the SVI update until vmcs01 is loaded, as SVI
-        * is only relevant for if and only if Virtual Interrupt Delivery is
+        * is only relevant for L2 if and only if Virtual Interrupt Delivery is
         * enabled in vmcs12, and if VID is enabled then L2 EOIs affect L2's
         * vAPIC, not L1's vAPIC.  KVM must update vmcs01 on the next nested
         * VM-Exit, otherwise L1 with run with a stale SVI.