[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aKjZC3peGKPj9NPq@google.com>
Date: Fri, 22 Aug 2025 13:54:35 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Naveen N Rao <naveen@...nel.org>
Cc: "Maciej S. Szmigiero" <mail@...iej.szmigiero.name>, Paolo Bonzini <pbonzini@...hat.com>,
Maxim Levitsky <mlevitsk@...hat.com>, Suravee Suthikulpanit <Suravee.Suthikulpanit@....com>,
Alejandro Jimenez <alejandro.j.jimenez@...cle.com>, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/2] KVM: SVM: Sync TPR from LAPIC into VMCB::V_TPR when
setting LAPIC regs
On Fri, Aug 22, 2025, Naveen N Rao wrote:
> On Thu, Aug 21, 2025 at 01:38:17PM -0700, Sean Christopherson wrote:
> > On Tue, Aug 19, 2025, Maciej S. Szmigiero wrote:
> > > + /*
> > > + * Sync TPR from LAPIC TASKPRI into V_TPR field of the VMCB.
> > > + *
> > > + * When AVIC is enabled the normal pre-VMRUN sync in sync_lapic_to_cr8()
> > > + * is inhibited so any set TPR LAPIC state would not get reflected
> > > + * in V_TPR.
> >
> > Hmm, I think that code is straight up wrong. There's no justification, just a
> > claim:
> >
> > commit 3bbf3565f48ce3999b5a12cde946f81bd4475312
> > Author: Suravee Suthikulpanit <Suravee.Suthikulpanit@....com>
> > AuthorDate: Wed May 4 14:09:51 2016 -0500
> > Commit: Paolo Bonzini <pbonzini@...hat.com>
> > CommitDate: Wed May 18 18:04:31 2016 +0200
> >
> > svm: Do not intercept CR8 when enable AVIC
> >
> > When enable AVIC:
> > * Do not intercept CR8 since this should be handled by AVIC HW.
> > * Also, we don't need to sync cr8/V_TPR and APIC backing page. <======
> >
> > Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@....com>
> > [Rename svm_in_nested_interrupt_shadow to svm_nested_virtualize_tpr. - Paolo]
> > Signed-off-by: Paolo Bonzini <pbonzini@...hat.com>
> >
> > That claim assumes APIC[TPR] will _never_ be modified by anything other than
> > hardware.
>
> It also isn't clear to me why only sync_lapic_to_cr8() was gated when
> AVIC was enabled, while sync_cr8_to_lapic() continues to copy V_TRP to
> the backing page. If AVIC is enabled, then the AVIC hardware updates
> both the backing page and V_TPR on a guest write to TPR.
>
> > That's obviously false for state restore from userspace, and it's also
> > technically false at steady state, e.g. if KVM managed to trigger emulation of a
> > store to the APIC page, then KVM would bypass the automatic harware sync.
>
> Do you mean emulation due to AVIC being inhibited? I initially thought
> this could be a problem, but in this scenario, AVIC would be disabled on
> the next VMRUN, so we will end up sync'ing TPR from the lapic to V_TPR.
No, if emulation is triggered when AVIC isn't inhibited. E.g. a contrived but
entirely possible situation would be if MOVBE isn't supported in hardware, KVM
is emulating MOVBE for emulation, and the guest sets the TPR via MOVBE. The MOVBE
#UDs, KVM emulates in response to the #UD, and Bob's your uncle.
There are other scenarios where KVM would emulate, though they're even more
contrived.
> > There's also the comically ancient KVM_SET_VAPIC_ADDR, which AFAICT appears to
> > be largely dead code with respect to vTPR (nothing sets KVM_APIC_CHECK_VAPIC
> > except for the initial ioctl), but could again set APIC[TPR] without updating
> > V_TPR.
> >
> > So, rather than manually do the update during state restore, my vote
> > is to restore the sync logic. And if we want to optimize that code
> > (seems unnecessary), then we should hook all TPR writes.
>
> I guess you mean apic_set_tpr()?
Yep.
> We will need to hook into that in addition to updating
> avic_apicv_post_state_restore() since KVM_SET_LAPIC just does a memcpy of the
> register state.
Yeah, or explicitly call the hook, e.g. like kvm_apic_set_state() does for
hwapic_isr_update(). But I don't think we should add a hook unless someone
proves that unconditionally synchronizing before VMRUN affects performance.
> > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> > index d9931c6c4bc6..1bfebe40854f 100644
> > --- a/arch/x86/kvm/svm/svm.c
> > +++ b/arch/x86/kvm/svm/svm.c
> > @@ -4046,8 +4046,7 @@ static inline void sync_lapic_to_cr8(struct kvm_vcpu *vcpu)
> > struct vcpu_svm *svm = to_svm(vcpu);
> > u64 cr8;
> >
> > - if (nested_svm_virtualize_tpr(vcpu) ||
> > - kvm_vcpu_apicv_active(vcpu))
> > + if (nested_svm_virtualize_tpr(vcpu))
> > return;
> >
> > cr8 = kvm_get_cr8(vcpu);
>
> I agree that this is a simpler fix, so would be good to do for backport
> ease.
>
> The code in sync_lapic_to_cr8 ends up being a function call to
> kvm_get_cr8() and ~6 instructions, which isn't that much. But if we can
> gate sync'ing V_TPR to the backing page in sync_cr8_to_lapic() as well,
> then it might be good to do so.
>
>
> - Naveen
>
Powered by blists - more mailing lists