[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aKjZC3peGKPj9NPq@google.com>
Date: Fri, 22 Aug 2025 13:54:35 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Naveen N Rao <naveen@...nel.org>
Cc: "Maciej S. Szmigiero" <mail@...iej.szmigiero.name>, Paolo Bonzini <pbonzini@...hat.com>, 
	Maxim Levitsky <mlevitsk@...hat.com>, Suravee Suthikulpanit <Suravee.Suthikulpanit@....com>, 
	Alejandro Jimenez <alejandro.j.jimenez@...cle.com>, kvm@...r.kernel.org, 
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/2] KVM: SVM: Sync TPR from LAPIC into VMCB::V_TPR when
 setting LAPIC regs
On Fri, Aug 22, 2025, Naveen N Rao wrote:
> On Thu, Aug 21, 2025 at 01:38:17PM -0700, Sean Christopherson wrote:
> > On Tue, Aug 19, 2025, Maciej S. Szmigiero wrote:
> > > +	/*
> > > +	 * Sync TPR from LAPIC TASKPRI into V_TPR field of the VMCB.
> > > +	 *
> > > +	 * When AVIC is enabled the normal pre-VMRUN sync in sync_lapic_to_cr8()
> > > +	 * is inhibited so any set TPR LAPIC state would not get reflected
> > > +	 * in V_TPR.
> > 
> > Hmm, I think that code is straight up wrong.  There's no justification, just a
> > claim:
> > 
> >   commit 3bbf3565f48ce3999b5a12cde946f81bd4475312
> >   Author:     Suravee Suthikulpanit <Suravee.Suthikulpanit@....com>
> >   AuthorDate: Wed May 4 14:09:51 2016 -0500
> >   Commit:     Paolo Bonzini <pbonzini@...hat.com>
> >   CommitDate: Wed May 18 18:04:31 2016 +0200
> > 
> >     svm: Do not intercept CR8 when enable AVIC
> >     
> >     When enable AVIC:
> >         * Do not intercept CR8 since this should be handled by AVIC HW.
> >         * Also, we don't need to sync cr8/V_TPR and APIC backing page.   <======
> >     
> >     Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@....com>
> >     [Rename svm_in_nested_interrupt_shadow to svm_nested_virtualize_tpr. - Paolo]
> >     Signed-off-by: Paolo Bonzini <pbonzini@...hat.com>
> > 
> > That claim assumes APIC[TPR] will _never_ be modified by anything other than
> > hardware. 
> 
> It also isn't clear to me why only sync_lapic_to_cr8() was gated when 
> AVIC was enabled, while sync_cr8_to_lapic() continues to copy V_TRP to 
> the backing page. If AVIC is enabled, then the AVIC hardware updates 
> both the backing page and V_TPR on a guest write to TPR.
> 
> > That's obviously false for state restore from userspace, and it's also
> > technically false at steady state, e.g. if KVM managed to trigger emulation of a
> > store to the APIC page, then KVM would bypass the automatic harware sync.
> 
> Do you mean emulation due to AVIC being inhibited? I initially thought 
> this could be a problem, but in this scenario, AVIC would be disabled on 
> the next VMRUN, so we will end up sync'ing TPR from the lapic to V_TPR.
No, if emulation is triggered when AVIC isn't inhibited.  E.g. a contrived but
entirely possible situation would be if MOVBE isn't supported in hardware, KVM
is emulating MOVBE for emulation, and the guest sets the TPR via MOVBE.  The MOVBE
#UDs, KVM emulates in response to the #UD, and Bob's your uncle.
There are other scenarios where KVM would emulate, though they're even more
contrived.
> > There's also the comically ancient KVM_SET_VAPIC_ADDR, which AFAICT appears to
> > be largely dead code with respect to vTPR (nothing sets KVM_APIC_CHECK_VAPIC
> > except for the initial ioctl), but could again set APIC[TPR] without updating
> > V_TPR.
> >
> > So, rather than manually do the update during state restore, my vote 
> > is to restore the sync logic.  And if we want to optimize that code 
> > (seems unnecessary), then we should hook all TPR writes.
> 
> I guess you mean apic_set_tpr()? 
Yep.
> We will need to hook into that in addition to updating
> avic_apicv_post_state_restore() since KVM_SET_LAPIC just does a memcpy of the
> register state.
Yeah, or explicitly call the hook, e.g. like kvm_apic_set_state() does for
hwapic_isr_update().  But I don't think we should add a hook unless someone
proves that unconditionally synchronizing before VMRUN affects performance.
> > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> > index d9931c6c4bc6..1bfebe40854f 100644
> > --- a/arch/x86/kvm/svm/svm.c
> > +++ b/arch/x86/kvm/svm/svm.c
> > @@ -4046,8 +4046,7 @@ static inline void sync_lapic_to_cr8(struct kvm_vcpu *vcpu)
> >         struct vcpu_svm *svm = to_svm(vcpu);
> >         u64 cr8;
> >  
> > -       if (nested_svm_virtualize_tpr(vcpu) ||
> > -           kvm_vcpu_apicv_active(vcpu))
> > +       if (nested_svm_virtualize_tpr(vcpu))
> >                 return;
> >  
> >         cr8 = kvm_get_cr8(vcpu);
> 
> I agree that this is a simpler fix, so would be good to do for backport 
> ease.
> 
> The code in sync_lapic_to_cr8 ends up being a function call to 
> kvm_get_cr8() and ~6 instructions, which isn't that much. But if we can 
> gate sync'ing V_TPR to the backing page in sync_cr8_to_lapic() as well, 
> then it might be good to do so.
> 
> 
> - Naveen
> 
Powered by blists - more mailing lists
 
