[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <72da0532-908b-40c2-a4e4-7ef1895547c7@oracle.com>
Date: Wed, 12 Nov 2025 19:06:24 -0800
From: Dongli Zhang <dongli.zhang@...cle.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: kvm@...r.kernel.org, x86@...nel.org, linux-kernel@...r.kernel.org,
chao.gao@...el.com, pbonzini@...hat.com, tglx@...utronix.de,
mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com,
hpa@...or.com, joe.jin@...cle.com, alejandro.j.jimenez@...cle.com
Subject: Re: [PATCH v2 1/1] KVM: VMX: configure SVI during runtime APICv
activation
Hi Sean,
On 11/12/25 6:47 AM, Sean Christopherson wrote:
> On Sun, Nov 09, 2025, Dongli Zhang wrote:
>> ---
>> Changed since v2:
>> - Add support for guest mode (suggested by Chao Gao).
>> - Add comments in the code (suggested by Chao Gao).
>> - Remove WARN_ON_ONCE from vmx_hwapic_isr_update().
>> - Edit commit message "AMD SVM APICv" to "AMD SVM AVIC"
>> (suggested by Alejandro Jimenez).
>>
>> arch/x86/kvm/vmx/vmx.c | 9 ---------
>> arch/x86/kvm/x86.c | 7 +++++++
>> 2 files changed, 7 insertions(+), 9 deletions(-)
>>
>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>> index f87c216d976d..d263dbf0b917 100644
>> --- a/arch/x86/kvm/vmx/vmx.c
>> +++ b/arch/x86/kvm/vmx/vmx.c
>> @@ -6878,15 +6878,6 @@ void vmx_hwapic_isr_update(struct kvm_vcpu *vcpu, int max_isr)
>> * VM-Exit, otherwise L1 with run with a stale SVI.
>> */
>> if (is_guest_mode(vcpu)) {
>> - /*
>> - * KVM is supposed to forward intercepted L2 EOIs to L1 if VID
>> - * is enabled in vmcs12; as above, the EOIs affect L2's vAPIC.
>> - * Note, userspace can stuff state while L2 is active; assert
>> - * that VID is disabled if and only if the vCPU is in KVM_RUN
>> - * to avoid false positives if userspace is setting APIC state.
>> - */
>> - WARN_ON_ONCE(vcpu->wants_to_run &&
>> - nested_cpu_has_vid(get_vmcs12(vcpu)));
>> to_vmx(vcpu)->nested.update_vmcs01_hwapic_isr = true;
>> return;
>> }
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index b4b5d2d09634..08b34431c187 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -10878,9 +10878,16 @@ void __kvm_vcpu_update_apicv(struct kvm_vcpu *vcpu)
>> * pending. At the same time, KVM_REQ_EVENT may not be set as APICv was
>> * still active when the interrupt got accepted. Make sure
>> * kvm_check_and_inject_events() is called to check for that.
>> + *
>> + * When APICv gets enabled, updating SVI is necessary; otherwise,
>> + * SVI won't reflect the highest bit in vISR and the next EOI from
>> + * the guest won't be virtualized correctly, as the CPU will clear
>> + * the SVI bit from vISR.
>> */
>> if (!apic->apicv_active)
>> kvm_make_request(KVM_REQ_EVENT, vcpu);
>> + else
>> + kvm_apic_update_hwapic_isr(vcpu);
>
> Rather than trigger the update from x86.c, what if we let VMX make the call?
> Then we don't need to drop the WARN, and in the unlikely scenario L2 is active,
> we'll save a pointless scan of the vISR (VMX will defer the update until L1 is
> active).
>
> We could even have kvm_apic_update_hwapic_isr() WARN if L2 is active. E.g. with
> an opportunistic typo fix in vmx_hwapic_isr_update()'s comment (completely untested):
>
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 0ae7f913d782..786ccfc24252 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -774,7 +774,8 @@ void kvm_apic_update_hwapic_isr(struct kvm_vcpu *vcpu)
> {
> struct kvm_lapic *apic = vcpu->arch.apic;
>
> - if (WARN_ON_ONCE(!lapic_in_kernel(vcpu)) || !apic->apicv_active)
> + if (WARN_ON_ONCE(!lapic_in_kernel(vcpu)) || !apic->apicv_active ||
> + is_guest_mode(vcpu))
> return;
>
> kvm_x86_call(hwapic_isr_update)(vcpu, apic_find_highest_isr(apic));
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 91b6f2f3edc2..653b8b713547 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -4430,6 +4430,14 @@ void vmx_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu)
> kvm_vcpu_apicv_active(vcpu));
>
> vmx_update_msr_bitmap_x2apic(vcpu);
> +
> + /*
> + * Refresh SVI if APICv is enabled, as any changes KVM made to vISR
> + * while APICv was disabled need to be reflected in SVI, e.g. so that
> + * the next accelerated EOI will clear the correct vector in vISR.
> + */
> + if (kvm_vcpu_apicv_active(vcpu))
> + kvm_apic_update_hwapic_isr(vcpu);
> }
>
> static u32 vmx_exec_control(struct vcpu_vmx *vmx)
> @@ -6880,7 +6888,7 @@ void vmx_hwapic_isr_update(struct kvm_vcpu *vcpu, int max_isr)
>
> /*
> * If L2 is active, defer the SVI update until vmcs01 is loaded, as SVI
> - * is only relevant for if and only if Virtual Interrupt Delivery is
> + * is only relevant for L2 if and only if Virtual Interrupt Delivery is
> * enabled in vmcs12, and if VID is enabled then L2 EOIs affect L2's
> * vAPIC, not L1's vAPIC. KVM must update vmcs01 on the next nested
> * VM-Exit, otherwise L1 with run with a stale SVI.
As a quick reply, the idea is to call kvm_apic_update_hwapic_isr() in
vmx_refresh_apicv_exec_ctrl(), instead of __kvm_vcpu_update_apicv().
I think the below case doesn't work:
1. APICv is activated when vCPU is in L2.
kvm_vcpu_update_apicv()
-> __kvm_vcpu_update_apicv()
-> vmx_refresh_apicv_exec_ctrl()
vmx_refresh_apicv_exec_ctrl() returns after setting:
vmx->nested.update_vmcs01_apicv_status = true.
2. On exit from L2 to L1, __nested_vmx_vmexit() requests for KVM_REQ_APICV_UPDATE.
__nested_vmx_vmexit()
-> leave_guest_mode(vcpu)
-> kvm_make_request(KVM_REQ_APICV_UPDATE, vcpu)
3. vCPU processes KVM_REQ_APICV_UPDATE again.
This time, __kvm_vcpu_update_apicv() returns without calling
refresh_apicv_exec_ctrl(), because (apic->apicv_active == activate).
vmx_refresh_apicv_exec_ctrl() doesn't get any chance to be called.
In order to call kvm_apic_update_hwapic_isr() in vmx_refresh_apicv_exec_ctrl(),
we may need to resolve the issue mentioned by Chao, for instance, with something
like:
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index bcea087b642f..1725c6a94f99 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -19,6 +19,7 @@
#include "trace.h"
#include "vmx.h"
#include "smm.h"
+#include "x86_ops.h"
static bool __read_mostly enable_shadow_vmcs = 1;
module_param_named(enable_shadow_vmcs, enable_shadow_vmcs, bool, S_IRUGO);
@@ -5216,7 +5217,7 @@ void __nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32
vm_exit_reason,
if (vmx->nested.update_vmcs01_apicv_status) {
vmx->nested.update_vmcs01_apicv_status = false;
- kvm_make_request(KVM_REQ_APICV_UPDATE, vcpu);
+ vmx_refresh_apicv_exec_ctrl(vcpu);
}
if (vmx->nested.update_vmcs01_hwapic_isr) {
Still validating if it works well.
Thank you very much!
Dongli Zhang
Powered by blists - more mailing lists