linux-kernel - Re: [PATCH v2 1/1] KVM: VMX: configure SVI during runtime APICv activation

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <2b0de566-0602-4a9e-9c5c-b947617f684f@oracle.com>
Date: Fri, 5 Dec 2025 10:12:10 -0800
From: Dongli Zhang <dongli.zhang@...cle.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: kvm@...r.kernel.org, x86@...nel.org, linux-kernel@...r.kernel.org,
        chao.gao@...el.com, pbonzini@...hat.com, tglx@...utronix.de,
        mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com,
        hpa@...or.com, joe.jin@...cle.com, alejandro.j.jimenez@...cle.com
Subject: Re: [PATCH v2 1/1] KVM: VMX: configure SVI during runtime APICv
 activation

Hi Sean,

On 12/4/25 6:15 PM, Sean Christopherson wrote:
> On Mon, Nov 17, 2025, Dongli Zhang wrote:

[snip]

>>
>> 1. Take the approach reviewed by Chao, and ...
> 
> Ya.  I spent a lot of time today wrapping my head around what all is going on,
> and my lightbulb moment came when reading this from the changelog:
> 
>   The issue is not applicable to AMD SVM which employs a different LAPIC
>   virtualization mechanism. 
> 
> That's not entirely true.  It's definitely true for SVI, but not for the bug that
> Chao pointed out.  SVM is "immune" from these bugs because KVM simply updates
> vmcb01 directly.  And looking at everything, there's zero reason we can't do the
> same for VMX.  Yeah, KVM needs to do a couple VMPTRLDs to swap between vmcs01 and
> vmcs02, but those aren't _that_ expensive, and these are slow paths.
> 
> And with a guard(), it's pretty trivial to run a section of code with vmcs01
> active.
> 
> static void vmx_load_vmcs01(struct kvm_vcpu *vcpu)
> {
> 	struct vcpu_vmx *vmx = to_vmx(vcpu);
> 
> 	if (!is_guest_mode(vcpu)) {
> 		WARN_ON_ONCE(vmx->loaded_vmcs != &vmx->vmcs01);
> 		return;
> 	}
> 
> 	WARN_ON_ONCE(vmx->loaded_vmcs != &vmx->nested.vmcs02);
> 	vmx_switch_loaded_vmcs(vcpu, &vmx->vmcs01);
> }
> 
> static void vmx_put_vmcs01(struct kvm_vcpu *vcpu)
> {
> 	if (!is_guest_mode(vcpu))
> 		return;
> 
> 	vmx_switch_loaded_vmcs(vcpu, &to_vmx(vcpu)->nested.vmcs02);
> }
> DEFINE_GUARD(vmx_vmcs01, struct kvm_vcpu *,
> 	     vmx_load_vmcs01(_T), vmx_put_vmcs01(_T))
> 
> I've got changes to convert everything to guard(vmx_vmcs01); except for
> vmx_set_virtual_apic_mode(), they're all quite trivial.  I also have a selftest
> that hits both this bug and the one Chao pointed out, so I'm reasonably confident
> the changes do indeed work.
> 
> But they're most definitely NOT stable material.  So my plan is to grab this
> and the below for 6.19, and then do the cleanup for 6.20 or later.
> 
> Oh, almost forgot.  We can also sink the hwapic_isr_update() call into
> kvm_apic_update_apicv() and drop kvm_apic_update_hwapic_isr() entirely, which is
> another argument for your approach.  That's actually a really good fit, because
> that's where KVM parses the vISR when APICv is being _disabled_.
> 
> I'll post a v3 with everything tomorrow (hopefully) after running the changes
> through more normal test flow.

Looking forward for how it looks like. The only concern is if it is simple
enough to backport to prior older kernel version, i.e. v5.15.196.

> 
>> 2. Fix the vmx_refresh_apicv_exec_ctrl() issue with an additional patch:
>>
>> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
>> index bcea087b642f..7d98c11a8920 100644
>> --- a/arch/x86/kvm/vmx/nested.c
>> +++ b/arch/x86/kvm/vmx/nested.c
>> @@ -19,6 +19,7 @@
>>  #include "trace.h"
>>  #include "vmx.h"
>>  #include "smm.h"
>> +#include "x86_ops.h"
>>
>>  static bool __read_mostly enable_shadow_vmcs = 1;
>>  module_param_named(enable_shadow_vmcs, enable_shadow_vmcs, bool, S_IRUGO);
>> @@ -5214,9 +5215,9 @@ void __nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32
>> vm_exit_reason,
>>                 kvm_make_request(KVM_REQ_APIC_PAGE_RELOAD, vcpu);
>>         }
>>
>> -       if (vmx->nested.update_vmcs01_apicv_status) {
>> -               vmx->nested.update_vmcs01_apicv_status = false;
>> -               kvm_make_request(KVM_REQ_APICV_UPDATE, vcpu);
>> +       if (vmx->nested.update_vmcs01_apicv_exec_ctrl) {
>> +               vmx->nested.update_vmcs01_apicv_exec_ctrl = false;
>> +               vmx_refresh_apicv_exec_ctrl(vcpu);
> 
> +1 to the fix, but I'll omit the update_vmcs01_apicv_exec_ctrl rename because I'm
> 99% certain we can get rid of it entirely.
> 
> Oh, and can you give your SoB for this?  I'll write the changelog, just need the
> SoB for the commit.  Thanks!
> 

Sure.

Signed-off-by: Dongli Zhang <dongli.zhang@...cle.com>

Thank you very much!

Dongli Zhang