Date: Thu, 23 Sep 2021 18:27:03 +0200
From: Paolo Bonzini <pbonzini@...hat.com>
To: Sean Christopherson <seanjc@...gle.com>
Cc: Vitaly Kuznetsov <vkuznets@...hat.com>, Wanpeng Li <wanpengli@...cent.com>,
	Jim Mattson <jmattson@...gle.com>, Joerg Roedel <joro@...tes.org>,
	kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
	Sergey Senozhatsky <senozhatsky@...gle.com>, Ben Gardon <bgardon@...gle.com>
Subject: Re: [PATCH] KVM: x86/mmu: Complete prefetch for trailing SPTEs for direct, legacy MMU

On 19/08/21 01:56, Sean Christopherson wrote:
> Make a final call to direct_pte_prefetch_many() if there are "trailing"
> SPTEs to prefetch, i.e. SPTEs for GFNs following the faulting GFN. The
> call to direct_pte_prefetch_many() in the loop only handles the case
> where there are !PRESENT SPTEs preceding a PRESENT SPTE.
>
> E.g. if the faulting GFN is a multiple of 8 (the prefetch size) and all
> SPTEs for the following GFNs are !PRESENT, the loop will terminate with
> "start = sptep+1" and not prefetch any SPTEs.
>
> Prefetching trailing SPTEs as intended can drastically reduce the number
> of guest page faults, e.g. when accessing the first byte of every 4kb
> page in a 6gb chunk of virtual memory in a VM with 8gb of preallocated
> memory, the number of pf_fixed events observed in L0 drops from ~1.75M
> to <0.27M.
>
> Note, this only affects memory that is backed by 4kb pages as KVM doesn't
> prefetch when installing hugepages. Shadow paging prefetching is not
> affected as it does not batch the prefetches due to the need to process
> the corresponding guest PTE. The TDP MMU is not affected because it
> doesn't have prefetching, yet...
>
> Fixes: 957ed9effd80 ("KVM: MMU: prefetch ptes when intercepted guest #PF")
> Cc: Sergey Senozhatsky <senozhatsky@...gle.com>
> Cc: Ben Gardon <bgardon@...gle.com>
> Signed-off-by: Sean Christopherson <seanjc@...gle.com>
> ---
>
> Cc'd Ben as this highlights a potential gap with the TDP MMU, which lacks
> prefetching of any sort. For large VMs, which are likely backed by
> hugepages anyway, this is a non-issue as the benefits of holding mmu_lock
> for read likely mask the cost of taking more VM-Exits. But VMs with a
> small number of vCPUs won't benefit as much from parallel page faults,
> e.g. there's no benefit at all if there's a single vCPU.
>
>  arch/x86/kvm/mmu/mmu.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index a272ccbddfa1..daf7df35f788 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -2818,11 +2818,13 @@ static void __direct_pte_prefetch(struct kvm_vcpu *vcpu,
>  			if (!start)
>  				continue;
>  			if (direct_pte_prefetch_many(vcpu, sp, start, spte) < 0)
> -				break;
> +				return;
>  			start = NULL;
>  		} else if (!start)
>  			start = spte;
>  	}
> +	if (start)
> +		direct_pte_prefetch_many(vcpu, sp, start, spte);
>  }
>
>  static void direct_pte_prefetch(struct kvm_vcpu *vcpu, u64 *sptep)
>

Queued, thanks.

Paolo
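
The fix is an instance of a classic run-length batching pattern: the loop
accumulates a run of entries to process in one batch, flushes the run
whenever it is interrupted, and must also flush any run still open when the
loop exits. Below is a minimal, self-contained C sketch of that pattern; it
is not KVM code, and the names (WINDOW_SIZE, flush_run, prefetch_window) are
hypothetical stand-ins for PTE_PREFETCH_NUM, direct_pte_prefetch_many() and
__direct_pte_prefetch(), with an int array modeling SPTE presence.

/*
 * Standalone sketch of the batching pattern; hypothetical names, not
 * kernel code. window[i] != 0 models a PRESENT SPTE (or the faulting
 * sptep itself); 0 models a !PRESENT SPTE that is eligible for prefetch.
 */
#include <stdio.h>

#define WINDOW_SIZE 8	/* stand-in for PTE_PREFETCH_NUM */

/* Stand-in for direct_pte_prefetch_many(): handle entries [start, end). */
static void flush_run(int start, int end)
{
	printf("prefetch entries [%d, %d)\n", start, end);
}

static void prefetch_window(const int *window)
{
	int start = -1;	/* -1 plays the role of start == NULL */

	for (int i = 0; i < WINDOW_SIZE; i++) {
		if (window[i]) {		/* PRESENT: close any open run */
			if (start >= 0) {
				flush_run(start, i);
				start = -1;
			}
		} else if (start < 0) {		/* !PRESENT: open a new run */
			start = i;
		}
	}

	/*
	 * What the patch adds, conceptually: without this final flush, a
	 * run of !PRESENT entries at the tail of the window (e.g. all GFNs
	 * after a faulting GFN that is a multiple of 8) is silently dropped.
	 */
	if (start >= 0)
		flush_run(start, WINDOW_SIZE);
}

int main(void)
{
	/* Faulting entry at index 0; the seven trailing entries are empty. */
	int window[WINDOW_SIZE] = { 1, 0, 0, 0, 0, 0, 0, 0 };

	prefetch_window(window);	/* prints: prefetch entries [1, 8) */
	return 0;
}

With the trailing flush removed, this example prints nothing, which mirrors
the pre-patch behavior the changelog describes.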