lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 17 Feb 2022 16:03:40 -0500
From:   Paolo Bonzini <pbonzini@...hat.com>
To:     linux-kernel@...r.kernel.org, kvm@...r.kernel.org
Cc:     seanjc@...gle.com
Subject: [PATCH v2 18/18] KVM: x86: do not unload MMU roots on all role changes

kvm_mmu_reset_context is called on all role changes and right now it
calls kvm_mmu_unload.  With the legacy MMU this is a relatively cheap
operation; the previous PGDs remains in the hash table and is picked
up immediately on the next page fault.  With the TDP MMU, however, the
roots are thrown away for good and a full rebuild of the page tables is
necessary, which is many times more expensive.

Fortunately, throwing away the roots is not necessary except when
the manual says a TLB flush is required:

- changing CR0.PG from 1 to 0 (because it flushes the TLB according to
  the x86 architecture specification)

- changing CPUID (which changes the interpretation of page tables in
  ways not reflected by the role).

- changing CR4.SMEP from 0 to 1 (not doing so actually breaks access.c)

Except for these cases, once the MMU has updated the CPU/MMU roles
and metadata it is enough to force-reload the current value of CR3.
KVM will look up the cached roots for an entry with the right role and
PGD, and only if the cache misses a new root will be created.

Measuring with vmexit.flat from kvm-unit-tests shows the following
improvement:

             TDP         legacy       shadow
   before    46754       5096         5150
   after     4879        4875         5006

which is for very small page tables.  The impact is however much larger
when running as an L1 hypervisor, because the new page tables cause
extra work for L0 to shadow them.

Signed-off-by: Paolo Bonzini <pbonzini@...hat.com>
---
 arch/x86/kvm/mmu/mmu.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index c44b5114f947..913cc7229bf4 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5043,8 +5043,8 @@ EXPORT_SYMBOL_GPL(kvm_init_mmu);
 void kvm_mmu_after_set_cpuid(struct kvm_vcpu *vcpu)
 {
 	/*
-	 * Invalidate all MMU roles to force them to reinitialize as CPUID
-	 * information is factored into reserved bit calculations.
+	 * Invalidate all MMU roles and roots to force them to reinitialize,
+	 * as CPUID information is factored into reserved bit calculations.
 	 *
 	 * Correctly handling multiple vCPU models with respect to paging and
 	 * physical address properties) in a single VM would require tracking
@@ -5057,6 +5057,7 @@ void kvm_mmu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	vcpu->arch.root_mmu.mmu_role.ext.valid = 0;
 	vcpu->arch.guest_mmu.mmu_role.ext.valid = 0;
 	vcpu->arch.nested_mmu.mmu_role.ext.valid = 0;
+	kvm_mmu_unload(vcpu);
 	kvm_mmu_reset_context(vcpu);
 
 	/*
@@ -5068,8 +5069,8 @@ void kvm_mmu_after_set_cpuid(struct kvm_vcpu *vcpu)
 
 void kvm_mmu_reset_context(struct kvm_vcpu *vcpu)
 {
-	kvm_mmu_unload(vcpu);
 	kvm_init_mmu(vcpu);
+	kvm_make_request(KVM_REQ_MMU_UPDATE_ROOT, vcpu);
 }
 EXPORT_SYMBOL_GPL(kvm_mmu_reset_context);
 
-- 
2.31.1

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ