[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20230322013731.102955-1-minipli@grsecurity.net>
Date: Wed, 22 Mar 2023 02:37:25 +0100
From: Mathias Krause <minipli@...ecurity.net>
To: kvm@...r.kernel.org
Cc: linux-kernel@...r.kernel.org,
Sean Christopherson <seanjc@...gle.com>,
Paolo Bonzini <pbonzini@...hat.com>,
Mathias Krause <minipli@...ecurity.net>
Subject: [PATCH v4 0/6] KVM: MMU: performance tweaks for heavy CR0.WP users
v3: https://lore.kernel.org/kvm/20230201194604.11135-1-minipli@grsecurity.net/
This series is the fourth iteration of resurrecting the missing pieces of
Paolo's previous attempt[1] to avoid needless MMU roots unloading.
It's incorporating Sean's feedback to v3 and rebased on top of
kvm-x86/next, namely commit d8708b80fa0e ("KVM: Change return type of
kvm_arch_vm_ioctl() to "int"").
The performance gap between TDP and legacy MMU is still existent,
especially noticeable under grsecurity which implements kernel W^X by
toggling CR0.WP, which happens very frequently. This series tries to fix
this needless performance loss.
Patch 1 is a v3 of [3], addressing Sean's feedback.
Patch 2 implements Sean's feedback[2] to Paolo's original approach and
skips unloading the MMU roots for CR0.WP toggling under TDP.
Patch 3 further micro-optimizes this for non-paging guests -- anyone still
running MS Singularity? ;)
Sean was suggesting another change on top of v2 of this series (and then
again a more elaborate version on top of v3), to skip intercepting CR0.WP
writes completely for VMX[4]. That turned out to be yet another
performance boost and is implemented in patch 6.
Patches 2 and 6 are the most important ones, as they bring the big
performance gains.
I used 'ssdd 10 50000' from rt-tests[5] as a micro-benchmark, running on a
grsecurity L1 VM. Below table shows the results (runtime in seconds, lower
is better):
legacy TDP shadow
kvm-x86/next@...08b 8.43s 9.45s 70.3s
+ patches 1-3 5.39s 5.63s 70.2s
+ patches 4-6 3.51s 3.47s 67.8s
I re-ran the benchmarks (again) and the numbers changed a little from the
ones in v3. We got a better baseline which is likely caused by the rebase
to a v6.3-rc2 based tree (was v6.2-rc3 based before).
Patches 1, 4 and 5 didn't change from v3, beside minor changelog tweaks.
Patches 2 and 6 have been rewritten.
Patch 3 is new to this series.
Bonus rant^Wbug report:
Actually re-running the benchmarks took me a while because my VM was
constantly crashing on me with a #GP during scheduling. Looking a little
closer, I noticed it was for a IA32_PRED_CMD MSR write which was puzzling,
as the VM's kernel didn't change for my tests (built it more than a month
ago!), so the old test runs should have triggered that code path (and #GP)
too! Digging through some kernel code let me see it's all tied to the x86
feature flag X86_FEATURE_USE_IBPB which gets set when X86_FEATURE_IBPB is,
i.e. the CPU supports IBPB.
*head-scratching pause*
Foolish me probably did a system update of the host and got a microcode
update that added IBPB support to my CPU. Yayyy... NOT! As this implies
announcing IBPB support to the VM as well and, in turn, makes the guest
kernel try to use it, I'm doomed to hit that bug. *bummer*
Something must not be right with KVM / QEMU / the kernel, as this guest
behaviour clearly shouldn't cause KVM to inject a #GP into the guest.
The bugging call chain in the guest kernel is:
-> switch_mm_irqs_off()
-> cond_mitigation()
-> indirect_branch_prediction_barrier()
-> alternative_msr_write(MSR_IA32_PRED_CMD, val, X86_FEATURE_USE_IBPB))
So far I'm working around this by passing 'clearcpuid=ibpb' on the kernel
commandline. But this should probably be fixed, that's why I'm mentioning
it. It's just too late here already to debug this further today. Well, for
some definition of "today."
Thanks,
Mathias
[1] https://lore.kernel.org/kvm/20220217210340.312449-1-pbonzini@redhat.com/
[2] https://lore.kernel.org/kvm/YhATewkkO%2Fl4P9UN@google.com/
[3] https://lore.kernel.org/kvm/YhAB1d1%2FnQbx6yvk@google.com/
[4] https://lore.kernel.org/kvm/Y8cTMnyBzNdO5dY3@google.com/
[5] https://git.kernel.org/pub/scm/utils/rt-tests/rt-tests.git
Mathias Krause (5):
KVM: x86: Do not unload MMU roots when only toggling CR0.WP with TDP
enabled
KVM: x86: Ignore CR0.WP toggles in non-paging mode
KVM: x86: Make use of kvm_read_cr*_bits() when testing bits
KVM: x86/mmu: Fix comment typo
KVM: VMX: Make CR0.WP a guest owned bit
Paolo Bonzini (1):
KVM: x86/mmu: Avoid indirect call for get_cr3
arch/x86/kvm/kvm_cache_regs.h | 2 +-
arch/x86/kvm/mmu/mmu.c | 31 ++++++++++++++++++++-----------
arch/x86/kvm/mmu/paging_tmpl.h | 2 +-
arch/x86/kvm/mmu/spte.c | 2 +-
arch/x86/kvm/pmu.c | 4 ++--
arch/x86/kvm/vmx/nested.c | 4 ++--
arch/x86/kvm/vmx/vmx.c | 6 +++---
arch/x86/kvm/vmx/vmx.h | 18 ++++++++++++++++++
arch/x86/kvm/x86.c | 18 ++++++++++++++++++
9 files changed, 66 insertions(+), 21 deletions(-)
--
2.39.2
Powered by blists - more mailing lists