Message-Id: <20220423034752.1161007-11-seanjc@google.com>
Date: Sat, 23 Apr 2022 03:47:50 +0000
From: Sean Christopherson <seanjc@...gle.com>
To: Paolo Bonzini <pbonzini@...hat.com>
Cc: Sean Christopherson <seanjc@...gle.com>,
Vitaly Kuznetsov <vkuznets@...hat.com>,
Wanpeng Li <wanpengli@...cent.com>,
Jim Mattson <jmattson@...gle.com>,
Joerg Roedel <joro@...tes.org>, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org, Ben Gardon <bgardon@...gle.com>,
David Matlack <dmatlack@...gle.com>,
Venkatesh Srinivas <venkateshs@...gle.com>,
Chao Peng <chao.p.peng@...ux.intel.com>
Subject: [PATCH 10/12] DO NOT MERGE: KVM: x86/mmu: Always send !PRESENT faults
down the fast path
Posted for posterity, and to show that it's possible to funnel indirect
page faults down the fast path.
Not-signed-off-by: Sean Christopherson <seanjc@...gle.com>
---
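For readers skimming the hunk below, here is a rough standalone sketch of
how the eligibility check changes. It is not the kernel code: the struct is
a hypothetical, trimmed-down stand-in for struct kvm_page_fault, the "rsvd"
early-out is an assumption about the context above the hunk (only its
"return false" is visible in the diff), and kvm_ad_enabled() is modeled as
a plain bool parameter.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct kvm_page_fault; only the bits used here. */
struct fault_sketch {
	bool rsvd;    /* assumed: reserved bits set, i.e. emulated MMIO */
	bool present; /* the SPTE was present when the fault occurred */
	bool write;   /* the access was a write */
};

/* Before the patch: !PRESENT faults are fast only if KVM disabled A/D bits. */
static bool can_be_fast_before(const struct fault_sketch *f, bool ad_enabled)
{
	if (f->rsvd)
		return false;
	if (!f->present)
		return !ad_enabled;
	return f->write;
}

/* After the patch: every non-MMIO !PRESENT fault tries the fast path. */
static bool can_be_fast_after(const struct fault_sketch *f)
{
	if (f->rsvd)
		return false;
	if (!f->present)
		return true;
	return f->write;
}
```

The only behavioral difference in this sketch is the !PRESENT case with A/D
bits enabled, which previously skipped the fast path and now attempts it.
Present faults are still fast-path candidates only for writes.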
arch/x86/kvm/mmu/mmu.c | 44 +++++++++++++++++++++---------------------
1 file changed, 22 insertions(+), 22 deletions(-)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 744c06bd7017..7ba88907d032 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3006,26 +3006,25 @@ static bool page_fault_can_be_fast(struct kvm_page_fault *fault)
return false;
/*
- * #PF can be fast if:
- *
- * 1. The shadow page table entry is not present and A/D bits are
- * disabled _by KVM_, which could mean that the fault is potentially
- * caused by access tracking (if enabled). If A/D bits are enabled
- * by KVM, but disabled by L1 for L2, KVM is forced to disable A/D
- * bits for L2 and employ access tracking, but the fast page fault
- * mechanism only supports direct MMUs.
- * 2. The shadow page table entry is present, the access is a write,
- * and no reserved bits are set (MMIO SPTEs cannot be "fixed"), i.e.
- * the fault was caused by a write-protection violation. If the
- * SPTE is MMU-writable (determined later), the fault can be fixed
- * by setting the Writable bit, which can be done out of mmu_lock.
+ * Unconditionally send !PRESENT page faults (except for emulated MMIO)
+ * through the fast path. There are two scenarios where the fast path
+ * can resolve the fault. The first is if the fault is spurious, i.e.
+ * a different vCPU has faulted in the page, which applies to all MMUs.
+ * The second scenario is if KVM marked the SPTE !PRESENT for access
+ * tracking (due to lack of EPT A/D bits), in which case KVM can fix
+ * the fault after logging the access.
*/
if (!fault->present)
- return !kvm_ad_enabled();
+ return true;
/*
- * Note, instruction fetches and writes are mutually exclusive, ignore
- * the "exec" flag.
+ * Skip the fast path if the fault is due to a protection violation and
+ * the access isn't a write. Write-protection violations can be fixed
+ * by KVM, e.g. if memory is write-protected for dirty logging, but all
+ * other protection violations are in the domain of a third party, i.e.
+ * either the primary MMU or the guest's page tables, and thus are
+ * extremely unlikely to be resolved by KVM. Note, instruction fetches
+ * and writes are mutually exclusive, ignore the "exec" flag.
*/
return fault->write;
}
@@ -3041,12 +3040,13 @@ fast_pf_fix_direct_spte(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault,
/*
* Theoretically we could also set dirty bit (and flush TLB) here in
* order to eliminate unnecessary PML logging. See comments in
- * set_spte. But fast_page_fault is very unlikely to happen with PML
- * enabled, so we do not do this. This might result in the same GPA
- * to be logged in PML buffer again when the write really happens, and
- * eventually to be called by mark_page_dirty twice. But it's also no
- * harm. This also avoids the TLB flush needed after setting dirty bit
- * so non-PML cases won't be impacted.
+ * set_spte. But a write-protection violation that can be fixed outside
+ * of mmu_lock is very unlikely to happen with PML enabled, so we don't
+ * do this. This might result in the same GPA to be logged in the PML
+ * buffer again when the write really happens, and eventually to be
+ * sent to mark_page_dirty() twice, but that's a minor performance blip
+ * and not a functional issue. This also avoids the TLB flush needed
+ * after setting dirty bit so non-PML cases won't be impacted.
*
* Compare with set_spte where instead shadow_dirty_mask is set.
*/
--
2.36.0.rc2.479.g8af0fa9b8e-goog