Message-ID: <ZRWdrtHynEbtQnpZ@google.com>
Date:   Thu, 28 Sep 2023 08:37:18 -0700
From:   Sean Christopherson <seanjc@...gle.com>
To:     Maxim Levitsky <mlevitsk@...hat.com>
Cc:     kvm@...r.kernel.org, Will Deacon <will@...nel.org>,
        Borislav Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Suravee Suthikulpanit <suravee.suthikulpanit@....com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Paolo Bonzini <pbonzini@...hat.com>, x86@...nel.org,
        Robin Murphy <robin.murphy@....com>, iommu@...ts.linux.dev,
        Ingo Molnar <mingo@...hat.com>, Joerg Roedel <joro@...tes.org>,
        "H. Peter Anvin" <hpa@...or.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 5/5] x86: KVM: SVM: workaround for AVIC's errata #1235

Please use "KVM: SVM:" for the shortlog scope (this applies to all relevant patches in this series).

On Thu, Sep 28, 2023, Maxim Levitsky wrote:
> On Zen2 (and likely on Zen1 as well), AVIC doesn't reliably detect a change
> in the 'is_running' bit during ICR write emulation and might skip a
> VM exit, if that bit was recently cleared.
> 
> The absence of the VM exit, leads to the KVM not waking up / triggering
> nested vm exit on the target(s) of the IPI which can, in some cases,
> lead to an unbounded delays in the guest execution.
> 
> As I recently discovered, a reasonable workaround exists: make the KVM

Nit, please just write "KVM", not "the KVM".  KVM is a proper noun when used in
this way, e.g. saying "the KVM" is like saying "the Sean" or "the Maxim".

> never set the is_running bit.
> 
> This workaround ensures that (*) all ICR writes always cause a VM exit
> and therefore correctly emulated, in expense of never enjoying VM exit-less
> ICR emulation.

This breaks svm_ir_list_add(), which relies on the vCPU's entry being up-to-date
and marked running to detect that the IOMMU needs to be immediately pointed at
the current pCPU.

	/*
	 * Update the target pCPU for IOMMU doorbells if the vCPU is running.
	 * If the vCPU is NOT running, i.e. is blocking or scheduled out, KVM
	 * will update the pCPU info when the vCPU is awakened and/or scheduled in.
	 * See also avic_vcpu_load().
	 */
	entry = READ_ONCE(*(svm->avic_physical_id_cache));
	if (entry & AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK)
		amd_iommu_update_ga(entry & AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK,
				    true, pi->ir_data);
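
To make the dependency concrete, here is a minimal userspace sketch (not kernel
code) of the check above. Every name and mask value in it (the MODEL_* macros,
struct model_iommu, the model_* helpers) is a hypothetical stand-in invented for
illustration; only the shape of the is_running test is taken from the snippet.
It assumes the workaround amounts to loading the vCPU without ever setting
is_running.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the physical ID entry layout (illustrative values). */
#define MODEL_HOST_PHYSICAL_ID_MASK 0xFFULL
#define MODEL_IS_RUNNING_MASK       (1ULL << 62)

/* Hypothetical model of the IOMMU's doorbell target. */
struct model_iommu {
	int  target_cpu;   /* pCPU the IOMMU would notify, -1 if never set */
	bool updated;      /* true once the target has been written */
};

/*
 * Models loading a vCPU on a pCPU: record the pCPU in the entry and,
 * unless the workaround is active, mark the vCPU as running.
 */
static uint64_t model_vcpu_load(int cpu, bool set_is_running)
{
	uint64_t entry;

	assert(cpu >= 0 && (uint64_t)cpu <= MODEL_HOST_PHYSICAL_ID_MASK);

	entry = (uint64_t)cpu & MODEL_HOST_PHYSICAL_ID_MASK;
	if (set_is_running)
		entry |= MODEL_IS_RUNNING_MASK;
	return entry;
}

/* Models the quoted check: only update the IOMMU if the entry says "running". */
static void model_ir_list_add(uint64_t entry, struct model_iommu *iommu)
{
	if (entry & MODEL_IS_RUNNING_MASK) {
		iommu->target_cpu = (int)(entry & MODEL_HOST_PHYSICAL_ID_MASK);
		iommu->updated = true;
	}
}

/* Returns true if adding an IRTE for a loaded vCPU points the IOMMU at its pCPU. */
static bool model_iommu_tracks_vcpu(int cpu, bool set_is_running)
{
	struct model_iommu iommu = { .target_cpu = -1, .updated = false };

	model_ir_list_add(model_vcpu_load(cpu, set_is_running), &iommu);
	return iommu.updated && iommu.target_cpu == cpu;
}
```

In this model, model_iommu_tracks_vcpu(3, true) succeeds, while
model_iommu_tracks_vcpu(3, false) leaves the IOMMU target unset even though the
vCPU is loaded on pCPU 3, which is the failure mode described above.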

> This workaround does carry a performance penalty but according to my
> benchmarks is still much better than not using AVIC at all,
> because AVIC is still used for the receiving end of the IPIs, and for the
> posted interrupts.

I really, really don't like the idea of carrying a workaround like this in
perpetuity.  If there is a customer that is determined to enable AVIC on Zen1/Zen2,
then *maybe* it's something to consider, but I don't think we should carry this
if the only anticipated beneficiary is one-off users and KVM developers.  IMO, the
AVIC code is complex enough as it is.
