Message-ID: <20251030224246.3456492-1-seanjc@google.com>
Date: Thu, 30 Oct 2025 15:42:42 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Sean Christopherson <seanjc@...gle.com>, Paolo Bonzini <pbonzini@...hat.com>
Cc: kvm@...r.kernel.org, linux-kernel@...r.kernel.org, 
	Jon Kohler <jon@...anix.com>
Subject: [PATCH 0/4] KVM: x86: Cleanup #MC and XCR0/XSS/PKRU handling

This series is the result of the recent PUCK discussion[*] on optimizing the
XCR0/XSS loads that are currently done on every VM-Enter and VM-Exit.  My
initial thought, that XCR0/XSS can be swapped outside of the fastpath, was spot
on; turns out the only reason they're swapped in the fastpath is because of a
hack-a-fix that papered over an egregious #MC handling bug where the kernel #MC
handler would call schedule() from an atomic context.  The resulting #GP due to
trying to swap FPU state with a guest XCR0/XSS was "fixed" by loading the host
values before handling #MCs from the guest.

Thankfully, the #MC mess has long since been cleaned up, so it's once again
safe to swap XCR0/XSS outside of the fastpath (but when IRQs are disabled!).
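
To make the effect of moving the swap concrete, here is a toy model (not KVM
code) that simply counts state loads.  Everything in it is hypothetical:
struct model_vcpu, the load_*_xstate() helpers, and enter_guest_once() are
stand-ins invented for illustration, and the model ignores IRQ state entirely.

```c
#include <stdbool.h>

/* Hypothetical model: counters stand in for the XCR0/XSS writes. */
struct model_vcpu {
	unsigned long xstate_swaps;	/* guest<->host XCR0/XSS loads */
	unsigned long exits;		/* simulated VM-Enter/VM-Exit round trips */
};

static void load_guest_xstate(struct model_vcpu *v) { v->xstate_swaps++; }
static void load_host_xstate(struct model_vcpu *v)  { v->xstate_swaps++; }

/* One simulated round trip; returns true if the exit is a fastpath exit. */
static bool enter_guest_once(struct model_vcpu *v, unsigned long fastpath_left)
{
	v->exits++;
	return fastpath_left > 0;
}

/* Swap on every VM-Enter and VM-Exit, i.e. inside the fastpath loop. */
static void run_swap_inside(struct model_vcpu *v, unsigned long nr_fastpath)
{
	bool again;

	do {
		load_guest_xstate(v);
		again = enter_guest_once(v, nr_fastpath);
		load_host_xstate(v);
		if (again)
			nr_fastpath--;
	} while (again);
}

/* Swap once around the whole fastpath loop. */
static void run_swap_outside(struct model_vcpu *v, unsigned long nr_fastpath)
{
	bool again;

	load_guest_xstate(v);
	do {
		again = enter_guest_once(v, nr_fastpath);
		if (again)
			nr_fastpath--;
	} while (again);
	load_host_xstate(v);
}
```

In this model, three fastpath exits followed by one slow exit cost eight loads
with the swap inside the loop and two with the swap outside it.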

As for what may be contributing to the SAP HANA performance improvements when
enabling PKU, my instincts again appear to be spot on.  As predicted, the
fastpath savings are ~300 cycles on Intel (~500 on AMD).  I.e. if the guest
is literally doing _nothing_ but generating fastpath exits, it will see a
~25% improvement.  There's basically zero chance the uplift seen with enabling
PKU is due to eliding XCR0 loads; my guess is that the guest actually uses
protection keys to optimize something.
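
For anyone wanting to sanity check the ratio, a trivial helper is below.  It
reads "improvement" as the reduction in per-exit cost, which is an assumption
on my part, and percent_saved() is a made-up name.  Note that a ~1200 cycle
baseline round trip is merely what ~300 cycles and ~25% imply under that
reading; it is back-solved, not measured.

```c
/*
 * Percent reduction in per-exit cost when `saved` cycles are removed from a
 * round trip that previously cost `baseline` cycles.  Integer math, rounds
 * down; returns 0 for a zero baseline.
 */
static unsigned int percent_saved(unsigned int saved, unsigned int baseline)
{
	return baseline ? (saved * 100u) / baseline : 0;
}
```

E.g. percent_saved(300, 1200) evaluates to 25, where 1200 is the back-solved
(assumed) baseline.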

Why does kvm_load_guest_xsave_state() show up in perf?  Probably because it's
the only visible symbol other than vmx_vmexit() (and vmx_vcpu_run() when not
hammering the fastpath).  E.g. running perf top against a live VM instance
yields these numbers with various guest workloads (the middle one is running
mmu_stress_test in the guest, which hammers on mmu_lock in L0).  But other than
doing INVD (handled in the fastpath) in a tight loop, there's no perceived perf
improvement from the guest's perspective.

Overhead  Shared Object       Symbol
  15.65%  [kernel]            [k] vmx_vmexit
   6.78%  [kernel]            [k] kvm_vcpu_halt
   5.15%  [kernel]            [k] __srcu_read_lock
   4.73%  [kernel]            [k] kvm_load_guest_xsave_state
   4.69%  [kernel]            [k] __srcu_read_unlock
   4.65%  [kernel]            [k] read_tsc
   4.44%  [kernel]            [k] vmx_sync_pir_to_irr
   4.03%  [kernel]            [k] kvm_apic_has_interrupt

  45.52%  [kernel]            [k] queued_spin_lock_slowpath
  24.40%  [kernel]            [k] vmx_vmexit
   2.84%  [kernel]            [k] queued_write_lock_slowpath
   1.92%  [kernel]            [k] vmx_vcpu_run
   1.40%  [kernel]            [k] vcpu_run
   1.00%  [kernel]            [k] kvm_load_guest_xsave_state
   0.84%  [kernel]            [k] kvm_load_host_xsave_state
   0.72%  [kernel]            [k] mmu_try_to_unsync_pages
   0.68%  [kernel]            [k] __srcu_read_lock
   0.65%  [kernel]            [k] try_get_folio

  17.78%  [kernel]            [k] vmx_vmexit
   5.08%  [kernel]            [k] vmx_vcpu_run
   4.24%  [kernel]            [k] vcpu_run
   4.21%  [kernel]            [k] _raw_spin_lock_irqsave
   2.99%  [kernel]            [k] kvm_load_guest_xsave_state
   2.51%  [kernel]            [k] rcu_note_context_switch
   2.47%  [kernel]            [k] ktime_get_update_offsets_now
   2.21%  [kernel]            [k] kvm_load_host_xsave_state
   2.16%  [kernel]            [k] fput

[*] https://drive.google.com/corp/drive/folders/1DCdvqFGudQc7pxXjM7f35vXogTf9uhD4

Sean Christopherson (4):
  KVM: SVM: Handle #MCs in guest outside of fastpath
  KVM: VMX: Handle #MCs on VM-Enter/TD-Enter outside of the fastpath
  KVM: x86: Load guest/host XCR0 and XSS outside of the fastpath run
    loop
  KVM: x86: Load guest/host PKRU outside of the fastpath run loop

 arch/x86/kvm/svm/svm.c  | 20 ++++++++--------
 arch/x86/kvm/vmx/main.c | 13 ++++++++++-
 arch/x86/kvm/vmx/tdx.c  |  3 ---
 arch/x86/kvm/vmx/vmx.c  |  7 ------
 arch/x86/kvm/x86.c      | 51 ++++++++++++++++++++++++++++-------------
 arch/x86/kvm/x86.h      |  2 --
 6 files changed, 56 insertions(+), 40 deletions(-)


base-commit: 4cc167c50eb19d44ac7e204938724e685e3d8057
-- 
2.51.1.930.gacf6e81ea2-goog