Message-ID: <ZgDyxpaf+HgQzYDp@chao-email>
Date: Mon, 25 Mar 2024 11:43:02 +0800
From: Chao Gao <chao.gao@...el.com>
To: Sean Christopherson <seanjc@...gle.com>
CC: Paolo Bonzini <pbonzini@...hat.com>, Lai Jiangshan
<jiangshanlai@...il.com>, "Paul E. McKenney" <paulmck@...nel.org>, "Josh
Triplett" <josh@...htriplett.org>, <kvm@...r.kernel.org>,
<rcu@...r.kernel.org>, <linux-kernel@...r.kernel.org>, Kevin Tian
<kevin.tian@...el.com>, Yan Zhao <yan.y.zhao@...el.com>, Yiwei Zhang
<zzyiwei@...gle.com>
Subject: Re: [PATCH 5/5] KVM: VMX: Always honor guest PAT on CPUs that
support self-snoop
On Fri, Mar 08, 2024 at 05:09:29PM -0800, Sean Christopherson wrote:
>Unconditionally honor guest PAT on CPUs that support self-snoop, as
>Intel has confirmed that CPUs that support self-snoop always snoop caches
>and store buffers. I.e. CPUs with self-snoop maintain cache coherency
>even in the presence of aliased memtypes, thus there is no need to trust
>the guest behaves and only honor PAT as a last resort, as KVM does today.
>
>Honoring guest PAT is desirable for use cases where the guest has access
>to non-coherent DMA _without_ bouncing through VFIO, e.g. when a virtual
>(mediated, for all intents and purposes) GPU is exposed to the guest, along
>with buffers that are consumed directly by the physical GPU, i.e. which
>can't be proxied by the host to ensure writes from the guest are performed
>with the correct memory type for the GPU.
>
>Cc: Yiwei Zhang <zzyiwei@...gle.com>
>Suggested-by: Yan Zhao <yan.y.zhao@...el.com>
>Suggested-by: Kevin Tian <kevin.tian@...el.com>
>Signed-off-by: Sean Christopherson <seanjc@...gle.com>
>---
> arch/x86/kvm/mmu/mmu.c | 8 +++++---
> arch/x86/kvm/vmx/vmx.c | 10 ++++++----
> 2 files changed, 11 insertions(+), 7 deletions(-)
>
>diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
>index 403cd8f914cd..7fa514830628 100644
>--- a/arch/x86/kvm/mmu/mmu.c
>+++ b/arch/x86/kvm/mmu/mmu.c
>@@ -4622,14 +4622,16 @@ static int kvm_tdp_mmu_page_fault(struct kvm_vcpu *vcpu,
> bool kvm_mmu_may_ignore_guest_pat(void)
> {
> /*
>- * When EPT is enabled (shadow_memtype_mask is non-zero), and the VM
>+ * When EPT is enabled (shadow_memtype_mask is non-zero), the CPU does
>+ * not support self-snoop (or is affected by an erratum), and the VM
> * has non-coherent DMA (DMA doesn't snoop CPU caches), KVM's ABI is to
> * honor the memtype from the guest's PAT so that guest accesses to
> * memory that is DMA'd aren't cached against the guest's wishes. As a
> * result, KVM _may_ ignore guest PAT, whereas without non-coherent DMA,
>- * KVM _always_ ignores guest PAT (when EPT is enabled).
>+ * KVM _always_ ignores or honors guest PAT, i.e. doesn't toggle SPTE
>+ * bits in response to non-coherent device (un)registration.
> */
>- return shadow_memtype_mask;
>+ return !static_cpu_has(X86_FEATURE_SELFSNOOP) && shadow_memtype_mask;
> }
>
> int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
>diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>index 17a8e4fdf9c4..5dc4c24ae203 100644
>--- a/arch/x86/kvm/vmx/vmx.c
>+++ b/arch/x86/kvm/vmx/vmx.c
>@@ -7605,11 +7605,13 @@ static u8 vmx_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio)
>
> /*
> * Force WB and ignore guest PAT if the VM does NOT have a non-coherent
>- * device attached. Letting the guest control memory types on Intel
>- * CPUs may result in unexpected behavior, and so KVM's ABI is to trust
>- * the guest to behave only as a last resort.
>+ * device attached and the CPU doesn't support self-snoop. Letting the
>+ * guest control memory types on Intel CPUs without self-snoop may
>+ * result in unexpected behavior, and so KVM's (historical) ABI is to
>+ * trust the guest to behave only as a last resort.
> */
>- if (!kvm_arch_has_noncoherent_dma(vcpu->kvm))
>+ if (!static_cpu_has(X86_FEATURE_SELFSNOOP) &&
>+ !kvm_arch_has_noncoherent_dma(vcpu->kvm))
> return (MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT) | VMX_EPT_IPAT_BIT;

With this change, guests without pass-through devices can also access UC
memory. A locked access to UC memory results in a bus lock, so guests without
pass-through devices can potentially launch DoS attacks against other CPUs on
the host. Isn't that a problem?
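
To make the behavior change concrete, here is a minimal userspace sketch of the two conditions the quoted hunks modify. It is a model, not kernel code: `struct host_model` and the `*_before`/`*_after` helpers are hypothetical names made up for illustration, and the sketch covers only the checks visible in the diff, not any earlier branches of vmx_get_mt_mask().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the state the patch consults. */
struct host_model {
	bool ept_enabled;       /* models shadow_memtype_mask being non-zero */
	bool cpu_has_selfsnoop; /* models static_cpu_has(X86_FEATURE_SELFSNOOP) */
};

/*
 * Pre-patch check in the vmx_get_mt_mask() hunk: force WB and ignore
 * guest PAT whenever the VM has no non-coherent DMA.
 */
static bool force_wb_before(const struct host_model *h, bool vm_has_noncoherent_dma)
{
	(void)h; /* host CPU features were not consulted before the patch */
	return !vm_has_noncoherent_dma;
}

/*
 * Post-patch check: force WB only if the CPU lacks self-snoop AND the VM
 * has no non-coherent DMA.
 */
static bool force_wb_after(const struct host_model *h, bool vm_has_noncoherent_dma)
{
	return !h->cpu_has_selfsnoop && !vm_has_noncoherent_dma;
}

/* Post-patch kvm_mmu_may_ignore_guest_pat(), modeled on the mmu.c hunk. */
static bool may_ignore_guest_pat_after(const struct host_model *h)
{
	return !h->cpu_has_selfsnoop && h->ept_enabled;
}

/* Walks the case raised above: self-snoop host, no pass-through device. */
static void model_self_check(void)
{
	struct host_model snoop = { .ept_enabled = true, .cpu_has_selfsnoop = true };

	assert(force_wb_before(&snoop, false));  /* before: guest PAT ignored */
	assert(!force_wb_after(&snoop, false));  /* after: guest PAT honored */
	assert(!may_ignore_guest_pat_after(&snoop));
}
```

In this model, the case in question is a self-snoop host with vm_has_noncoherent_dma == false: force_wb_before() returns true while force_wb_after() returns false, i.e. guest PAT, and therefore guest-selected UC, is now honored for that VM.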