Message-ID: <7c862212b92efea218ed542e0db7ddf7627c525c.camel@redhat.com>
Date: Thu, 02 Dec 2021 12:47:21 +0200
From: Maxim Levitsky <mlevitsk@...hat.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Marc Zyngier <maz@...nel.org>, Huacai Chen <chenhuacai@...nel.org>,
Aleksandar Markovic <aleksandar.qemu.devel@...il.com>,
Paul Mackerras <paulus@...abs.org>,
Anup Patel <anup.patel@....com>,
Paul Walmsley <paul.walmsley@...ive.com>,
Palmer Dabbelt <palmer@...belt.com>,
Albert Ou <aou@...s.berkeley.edu>,
Christian Borntraeger <borntraeger@...ibm.com>,
Janosch Frank <frankja@...ux.ibm.com>,
Paolo Bonzini <pbonzini@...hat.com>,
James Morse <james.morse@....com>,
Alexandru Elisei <alexandru.elisei@....com>,
Suzuki K Poulose <suzuki.poulose@....com>,
Atish Patra <atish.patra@....com>,
David Hildenbrand <david@...hat.com>,
Cornelia Huck <cohuck@...hat.com>,
Claudio Imbrenda <imbrenda@...ux.ibm.com>,
Vitaly Kuznetsov <vkuznets@...hat.com>,
Wanpeng Li <wanpengli@...cent.com>,
Jim Mattson <jmattson@...gle.com>,
Joerg Roedel <joro@...tes.org>,
linux-arm-kernel@...ts.infradead.org, kvmarm@...ts.cs.columbia.edu,
linux-mips@...r.kernel.org, kvm@...r.kernel.org,
kvm-ppc@...r.kernel.org, kvm-riscv@...ts.infradead.org,
linux-riscv@...ts.infradead.org, linux-kernel@...r.kernel.org,
David Matlack <dmatlack@...gle.com>,
Oliver Upton <oupton@...gle.com>,
Jing Zhang <jingzhangos@...gle.com>
Subject: Re: [PATCH v2 11/43] KVM: Don't block+unblock when halt-polling is
successful
On Thu, 2021-12-02 at 12:20 +0200, Maxim Levitsky wrote:
> On Mon, 2021-11-29 at 17:25 +0000, Sean Christopherson wrote:
> > On Mon, Nov 29, 2021, Maxim Levitsky wrote:
> > > (This thing is that when you tell the IOMMU that a vCPU is not running,
> > > Another thing I discovered is that this patch series totally breaks my VMs
> > > without cpu_pm=on. The whole series (I didn't bisect it yet) makes even my
> > > fedora32 VM very laggy, almost unusable, and it only has one
> > > passed-through device, a NIC).
> >
> > Grrrr, the complete lack of comments in the KVM code and the separate paths for
> > VMX vs SVM when handling HLT with APICv make this all way more difficult to
> > understand than it should be.
> >
> > The hangs are likely due to:
> >
> > KVM: SVM: Unconditionally mark AVIC as running on vCPU load (with APICv)
> >
> > If a posted interrupt arrives after KVM has done its final search through the vIRR,
> > but before avic_update_iommu_vcpu_affinity() is called, the posted interrupt will
> > be set in the vIRR without triggering a host IRQ to wake the vCPU via the GA log.
> >
> > I.e. KVM is missing an equivalent to VMX's posted interrupt check for an outstanding
> > notification after switching to the wakeup vector.
> >
> > For now, the least awful approach is sadly to keep the vcpu_(un)blocking() hooks.
> > Unlike VMX's PI support, there's no fast check for an interrupt being posted (KVM
> > would have to rewalk the vIRR), no easy way to signal the current CPU to do a wakeup (I
> > don't think KVM even has access to the IRQ used by the owning IOMMU), and there's
> > no simplification of load/put code.
>
> I have an idea.
>
> Why do we even use/need the GA log?
> Why not just disable 'guest mode' in the IOMMU and let it send a good old normal interrupt
> when a vCPU is not running, just like we do when we inhibit the AVIC?
>
> The GA log makes all devices that share an IOMMU (there are 4 IOMMUs per package these days,
> some without useful devices) go through a single (!) MSI-like interrupt,
> which for some reason is even implemented as a threaded IRQ in the Linux kernel.
Yep, this gross hack works!
diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 958966276d00b8..6136b94f6b5f5e 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -987,8 +987,9 @@ void avic_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 		entry |= AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK;
 
 	WRITE_ONCE(*(svm->avic_physical_id_cache), entry);
-	avic_update_iommu_vcpu_affinity(vcpu, h_physical_id,
-					svm->avic_is_running);
+
+	svm_set_pi_irte_mode(vcpu, svm->avic_is_running);
+	avic_update_iommu_vcpu_affinity(vcpu, h_physical_id, true);
 }
 
 void avic_vcpu_put(struct kvm_vcpu *vcpu)
@@ -997,8 +998,9 @@ void avic_vcpu_put(struct kvm_vcpu *vcpu)
 	struct vcpu_svm *svm = to_svm(vcpu);
 
 	entry = READ_ONCE(*(svm->avic_physical_id_cache));
-	if (entry & AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK)
-		avic_update_iommu_vcpu_affinity(vcpu, -1, 0);
+	if (entry & AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK) {
+		svm_set_pi_irte_mode(vcpu, false);
+	}
 
 	entry &= ~AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK;
 	WRITE_ONCE(*(svm->avic_physical_id_cache), entry);
>
GA log interrupts are almost gone (there are still a few because svm_set_pi_irte_mode sets is_running to false),
and devices work as expected, sending normal interrupts unless the guest is loaded, at which point the normal
interrupts disappear, as expected.
Best regards,
Maxim Levitsky
>
> Best regards,
> Maxim Levitsky
>
> > If the scheduler were changed to support waking in the sched_out path, then I'd be
> > more inclined to handle this in avic_vcpu_put() by rewalking the vIRR one final
> > time, but for now it's not worth it.
> >
> > > If I apply the series only up to this patch, though, my fedora VM seems
> > > to work fine, but my Windows VM still locks up hard when I run 'LatencyTop'
> > > in it, which doesn't happen without this patch.
> >
> > By "run 'LatencyTop' in it", do you mean running something in the Windows guest?
> > The only search results I can find for LatencyTop are Linux specific.
> >
> > > So far the symptom I see is that on vCPU 0, the ISR has a quite high interrupt
> > > (0xe1 the last time I saw it), TPR and PPR are 0xe0 (although I have seen TPR
> > > have different values), and the IRR has plenty of interrupts with lower priority.
> > > The VM seems to be stuck in this case. As if its EOI got lost or something is
> > > preventing the IRQ handler from issuing EOI.
> > >
> > > LatencyTop does install some form of a kernel driver which likely does meddle
> > > with interrupts (maybe it sends lots of self IPIs?).
> > >
> > > 100% reproducible as soon as I start monitoring with LatencyTop.
> > >
> > > Without this patch it works (or if disabling halt polling),
> >
> > Huh. I assume everything works if you disable halt polling _without_ this patch
> > applied?
> >
> > If so, that implies that successful halt polling without mucking with vCPU IOMMU
> > affinity is somehow problematic. I can't think of any relevant side effects other
> > than timing.
> >