Message-ID: <ZRWdrtHynEbtQnpZ@google.com>
Date: Thu, 28 Sep 2023 08:37:18 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Maxim Levitsky <mlevitsk@...hat.com>
Cc: kvm@...r.kernel.org, Will Deacon <will@...nel.org>,
Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Suravee Suthikulpanit <suravee.suthikulpanit@....com>,
Thomas Gleixner <tglx@...utronix.de>,
Paolo Bonzini <pbonzini@...hat.com>, x86@...nel.org,
Robin Murphy <robin.murphy@....com>, iommu@...ts.linux.dev,
Ingo Molnar <mingo@...hat.com>, Joerg Roedel <joro@...tes.org>,
"H. Peter Anvin" <hpa@...or.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 5/5] x86: KVM: SVM: workaround for AVIC's errata #1235

Please use "KVM: SVM:" for the shortlog scope (applies to all relevant patches in this series).

On Thu, Sep 28, 2023, Maxim Levitsky wrote:
> On Zen2 (and likely on Zen1 as well), AVIC doesn't reliably detect a change
> in the 'is_running' bit during ICR write emulation and might skip a
> VM exit, if that bit was recently cleared.
>
> The absence of the VM exit, leads to the KVM not waking up / triggering
> nested vm exit on the target(s) of the IPI which can, in some cases,
> lead to an unbounded delays in the guest execution.
>
> As I recently discovered, a reasonable workaround exists: make the KVM

Nit, please just write "KVM", not "the KVM". KVM is a proper noun when used in
this way, e.g. saying "the KVM" is like saying "the Sean" or "the Maxim".

> never set the is_running bit.
>
> This workaround ensures that (*) all ICR writes always cause a VM exit
> and therefore correctly emulated, in expense of never enjoying VM exit-less
> ICR emulation.

This breaks svm_ir_list_add(), which relies on the vCPU's entry being up-to-date
and marked running to detect that the IOMMU needs to be immediately pointed at the
current pCPU.

	/*
	 * Update the target pCPU for IOMMU doorbells if the vCPU is running.
	 * If the vCPU is NOT running, i.e. is blocking or scheduled out, KVM
	 * will update the pCPU info when the vCPU is awakened and/or scheduled in.
	 * See also avic_vcpu_load().
	 */
	entry = READ_ONCE(*(svm->avic_physical_id_cache));
	if (entry & AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK)
		amd_iommu_update_ga(entry & AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK,
				    true, pi->ir_data);
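
To make the breakage concrete, here is a minimal userspace sketch of the check
above. It is not KVM code: the MODEL_* bit layout, struct model_iommu, and both
helpers are hypothetical stand-ins chosen for illustration. It only models the
fact that the IOMMU update is gated on the is_running bit of the cached entry.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit layout for this sketch only; not copied from the KVM headers. */
#define MODEL_ENTRY_HOST_PHYSICAL_ID_MASK	0xfffULL
#define MODEL_ENTRY_IS_RUNNING_MASK		(1ULL << 62)

/* Hypothetical stand-in for the IOMMU's view of the doorbell target. */
struct model_iommu {
	bool updated;
	uint64_t cpu;
};

/*
 * Hypothetical helper: build a physical ID table entry for a vCPU loaded
 * on @pcpu.  With the proposed workaround, @set_is_running would be false
 * even though the vCPU really is running.
 */
static uint64_t model_make_entry(uint64_t pcpu, bool set_is_running)
{
	uint64_t entry = pcpu & MODEL_ENTRY_HOST_PHYSICAL_ID_MASK;

	if (set_is_running)
		entry |= MODEL_ENTRY_IS_RUNNING_MASK;
	return entry;
}

/*
 * Hypothetical helper mirroring the gate in svm_ir_list_add(): the IOMMU
 * is pointed at the pCPU only if the entry is marked running.  Returns
 * true if the IOMMU was updated.
 */
static bool model_ir_list_add(uint64_t entry, struct model_iommu *iommu)
{
	if (!(entry & MODEL_ENTRY_IS_RUNNING_MASK))
		return false;

	iommu->cpu = entry & MODEL_ENTRY_HOST_PHYSICAL_ID_MASK;
	iommu->updated = true;
	return true;
}
```

In this model, an entry built with set_is_running == false never reaches the
update, regardless of which pCPU the vCPU is actually loaded on, which is the
case the workaround would make permanent.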
> This workaround does carry a performance penalty but according to my
> benchmarks is still much better than not using AVIC at all,
> because AVIC is still used for the receiving end of the IPIs, and for the
> posted interrupts.

I really, really don't like the idea of carrying a workaround like this in
perpetuity. If there is a customer that is determined to enable AVIC on Zen1/Zen2,
then *maybe* it's something to consider, but I don't think we should carry this
if the only anticipated beneficiary is one-off users and KVM developers. IMO, the
AVIC code is complex enough as it is.