[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <ha3rq3ybk4c63bov2ix6yyajggep6zdulsd2e44vqw4ctufeiq@env4pl5m3wz6>
Date: Wed, 25 Jun 2025 20:58:16 +0530
From: Naveen N Rao <naveen@...nel.org>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Marc Zyngier <maz@...nel.org>, Oliver Upton <oliver.upton@...ux.dev>,
Paolo Bonzini <pbonzini@...hat.com>, Joerg Roedel <joro@...tes.org>,
David Woodhouse <dwmw2@...radead.org>, Lu Baolu <baolu.lu@...ux.intel.com>,
linux-arm-kernel@...ts.infradead.org, kvmarm@...ts.linux.dev, kvm@...r.kernel.org,
iommu@...ts.linux.dev, linux-kernel@...r.kernel.org, Sairaj Kodilkar <sarunkod@....com>,
Vasant Hegde <vasant.hegde@....com>, Maxim Levitsky <mlevitsk@...hat.com>,
Joao Martins <joao.m.martins@...cle.com>, Francesco Lavra <francescolavra.fl@...il.com>,
David Matlack <dmatlack@...gle.com>
Subject: Re: [PATCH v3 20/62] KVM: SVM: Add a comment to explain why
avic_vcpu_blocking() ignores IRQ blocking
On Mon, Jun 23, 2025 at 09:18:45AM -0700, Sean Christopherson wrote:
> On Mon, Jun 23, 2025, Naveen N Rao wrote:
> > On Wed, Jun 11, 2025 at 03:45:23PM -0700, Sean Christopherson wrote:
> > > Add a comment to explain why KVM clears IsRunning when putting a vCPU,
> > > even though leaving IsRunning=1 would be ok from a functional perspective.
> > > Per Maxim's experiments, a misbehaving VM could spam the AVIC doorbell so
> > > fast as to induce a 50%+ loss in performance.
> > >
> > > Link: https://lore.kernel.org/all/8d7e0d0391df4efc7cb28557297eb2ec9904f1e5.camel@redhat.com
> > > Cc: Maxim Levitsky <mlevitsk@...hat.com>
> > > Signed-off-by: Sean Christopherson <seanjc@...gle.com>
> > > ---
> > > arch/x86/kvm/svm/avic.c | 31 ++++++++++++++++++-------------
> > > 1 file changed, 18 insertions(+), 13 deletions(-)
> > >
> > > diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> > > index bf8b59556373..3cf929ac117f 100644
> > > --- a/arch/x86/kvm/svm/avic.c
> > > +++ b/arch/x86/kvm/svm/avic.c
> > > @@ -1121,19 +1121,24 @@ void avic_vcpu_blocking(struct kvm_vcpu *vcpu)
> > > if (!kvm_vcpu_apicv_active(vcpu))
> > > return;
> > >
> > > - /*
> > > - * Unload the AVIC when the vCPU is about to block, _before_
> > > - * the vCPU actually blocks.
> > > - *
> > > - * Any IRQs that arrive before IsRunning=0 will not cause an
> > > - * incomplete IPI vmexit on the source, therefore vIRR will also
> > > - * be checked by kvm_vcpu_check_block() before blocking. The
> > > - * memory barrier implicit in set_current_state orders writing
> > > - * IsRunning=0 before reading the vIRR. The processor needs a
> > > - * matching memory barrier on interrupt delivery between writing
> > > - * IRR and reading IsRunning; the lack of this barrier might be
> > > - * the cause of errata #1235).
> > > - */
> > > + /*
> > > + * Unload the AVIC when the vCPU is about to block, _before_ the vCPU
> > > + * actually blocks.
> > > + *
> > > + * Note, any IRQs that arrive before IsRunning=0 will not cause an
> > > + * incomplete IPI vmexit on the source; kvm_vcpu_check_block() handles
> > > + * this by checking vIRR one last time before blocking. The memory
> > > + * barrier implicit in set_current_state orders writing IsRunning=0
> > > + * before reading the vIRR. The processor needs a matching memory
> > > + * barrier on interrupt delivery between writing IRR and reading
> > > + * IsRunning; the lack of this barrier might be the cause of errata #1235).
> > > + *
> > > + * Clear IsRunning=0 even if guest IRQs are disabled, i.e. even if KVM
> > > + * doesn't need to detect events for scheduling purposes. The doorbell
> >
> > Nit: just IsRunning (you can drop the =0 part).
>
> Hmm, not really. It could be:
>
> /* Note, any IRQs that arrive while IsRunning is set will not cause an
>
> or
>
> /* Note, any IRQs that arrive while IsRunning=1 will not cause an
>
> but that's just regurgitating the spec. The slightly more interesting scenario
> that's being described here is what will happen if an IRQ arrives _just_ before
> the below code toggle IsRunning from 1 => 0.
Oh, you're right. I was referring to the last paragraph starting with
"Clear IsRunning=0 even if", which to me feels like a double negation.
It reads better if it is "Clear IsRunning even if ..."
>
> > Trying to understand the significance of IRQs being disabled here. Is
> > that a path KVM tries to optimize?
>
> Yep. KVM doesn't need a notification for the undelivered (virtual) IRQ, because
> it won't be handled by the vCPU until the vCPU enables IRQs, and thus it's not a
> valid wake event for the vCPU.
>
> So, *if* spurious doorbells didn't affect performance or functionality, then
> ideally KVM would leave IsRunning=1, e.g. so that the IOMMU doesn't need to
> generate GA log events, and so that other vCPUs aren't forced to VM-Exit when
> sending an IPI. Unfortunately, spurious doorbells are quite intrusive, and so
> KVM "needs" to clear IsRunning.
Ok, got that. kvm_arch_vcpu_blocking() looks like it exists precisely
for this reason.
Thanks,
Naveen
Powered by blists - more mailing lists