Message-ID: <20250725220713.264711-6-seanjc@google.com>
Date: Fri, 25 Jul 2025 15:07:05 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Paolo Bonzini <pbonzini@...hat.com>
Cc: kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
Sean Christopherson <seanjc@...gle.com>
Subject: [GIT PULL] KVM: x86: Misc changes for 6.17
The highlights are the DEBUGCTL.FREEZE_IN_SMM fix from Maxim, Jim's APERF/MPERF
support that has probably made him question the meaning of life, and a big
cleanup of the MSR interception code to ease the pain of adding support for
CET, FRED, and the mediated PMU (and any other features that deal with MSRs).
But the one change that I really want your eyeballs on is that last commit,
"Reject KVM_SET_TSC_KHZ VM ioctl when vCPUs have been created"; it's an ABI
change that could break userspace. AFAICT, it won't affect any (known)
userspace, and restricting the ioctl for all VM types is much simpler than
special casing "secure" TSC guests. Holler if you want a new tag/pull request
without that change; I deliberately kept it dead last specifically so it could
be omitted without any fuss.

The following changes since commit 28224ef02b56fceee2c161fe2a49a0bb197e44f5:

  KVM: TDX: Report supported optional TDVMCALLs in TDX capabilities (2025-06-20 14:20:20 -0400)

are available in the Git repository at:

  https://github.com/kvm-x86/linux.git tags/kvm-x86-misc-6.17

for you to fetch changes up to dcbe5a466c123a475bb66492749549f09b5cab00:

  KVM: x86: Reject KVM_SET_TSC_KHZ VM ioctl when vCPUs have been created (2025-07-14 15:29:33 -0700)

----------------------------------------------------------------
KVM x86 misc changes for 6.17
- Preserve the host's DEBUGCTL.FREEZE_IN_SMM (Intel only) when running the
guest. Failure to honor FREEZE_IN_SMM can bleed host state into the guest.
- Explicitly check vmcs12.GUEST_DEBUGCTL on nested VM-Enter (Intel only) to
prevent L1 from running L2 with features that KVM doesn't support, e.g. BTF.
- Intercept SPEC_CTRL on AMD if the MSR shouldn't exist according to the
vCPU's CPUID model.
- Rework the MSR interception code so that the SVM and VMX APIs are more or
less identical.
- Recalculate all MSR intercepts from the "source" on MSR filter changes, and
drop the dedicated "shadow" bitmaps (and their awful "max" size defines).
- WARN and reject loading kvm-amd.ko instead of panicking the kernel if the
nested SVM MSRPM offsets tracker can't handle an MSR.
- Advertise support for LKGS (Load Kernel GS base), a new instruction that's
loosely related to FRED, but is supported and enumerated independently.
- Fix a user-triggerable WARN that syzkaller found by stuffing INIT_RECEIVED,
a.k.a. WFS, and then putting the vCPU into VMX Root Mode (post-VMXON). Use
the same approach KVM uses for dealing with "impossible" emulation when
running a !URG guest, and simply wait until KVM_RUN to detect that the vCPU
has architecturally impossible state.
- Add KVM_X86_DISABLE_EXITS_APERFMPERF to allow disabling interception of
APERF/MPERF reads, so that a "properly" configured VM can "virtualize"
APERF/MPERF (with many caveats).
- Reject KVM_SET_TSC_KHZ if vCPUs have been created, as changing the "default"
frequency is unsupported for VMs with a "secure" TSC, and there's no known
use case for changing the default frequency for other VM types.
----------------------------------------------------------------
Chao Gao (2):
KVM: x86: Deduplicate MSR interception enabling and disabling
KVM: SVM: Simplify MSR interception logic for IA32_XSS MSR
Jim Mattson (3):
KVM: x86: Replace growing set of *_in_guest bools with a u64
KVM: x86: Provide a capability to disable APERF/MPERF read intercepts
KVM: selftests: Test behavior of KVM_X86_DISABLE_EXITS_APERFMPERF
Kai Huang (1):
KVM: x86: Reject KVM_SET_TSC_KHZ VM ioctl when vCPUs have been created
Maxim Levitsky (3):
KVM: nVMX: Check vmcs12->guest_ia32_debugctl on nested VM-Enter
KVM: VMX: Wrap all accesses to IA32_DEBUGCTL with getter/setter APIs
KVM: VMX: Preserve host's DEBUGCTLMSR_FREEZE_IN_SMM while running the guest
Sean Christopherson (44):
KVM: TDX: Use kvm_arch_vcpu.host_debugctl to restore the host's DEBUGCTL
KVM: x86: Convert vcpu_run()'s immediate exit param into a generic bitmap
KVM: x86: Drop kvm_x86_ops.set_dr6() in favor of a new KVM_RUN flag
KVM: VMX: Allow guest to set DEBUGCTL.RTM_DEBUG if RTM is supported
KVM: VMX: Extract checking of guest's DEBUGCTL into helper
KVM: SVM: Disable interception of SPEC_CTRL iff the MSR exists for the guest
KVM: SVM: Allocate IOPM pages after initial setup in svm_hardware_setup()
KVM: SVM: Don't BUG if setting up the MSR intercept bitmaps fails
KVM: SVM: Tag MSR bitmap initialization helpers with __init
KVM: SVM: Use ARRAY_SIZE() to iterate over direct_access_msrs
KVM: SVM: Kill the VM instead of the host if MSR interception is buggy
KVM: x86: Use non-atomic bit ops to manipulate "shadow" MSR intercepts
KVM: SVM: Massage name and param of helper that merges vmcb01 and vmcb12 MSRPMs
KVM: SVM: Clean up macros related to architectural MSRPM definitions
KVM: nSVM: Use dedicated array of MSRPM offsets to merge L0 and L1 bitmaps
KVM: nSVM: Omit SEV-ES specific passthrough MSRs from L0+L1 bitmap merge
KVM: nSVM: Don't initialize vmcb02 MSRPM with vmcb01's "always passthrough"
KVM: SVM: Add helpers for accessing MSR bitmap that don't rely on offsets
KVM: SVM: Implement and adopt VMX style MSR intercepts APIs
KVM: SVM: Pass through GHCB MSR if and only if VM is an SEV-ES guest
KVM: SVM: Drop "always" flag from list of possible passthrough MSRs
KVM: x86: Move definition of X2APIC_MSR() to lapic.h
KVM: VMX: Manually recalc all MSR intercepts on userspace MSR filter change
KVM: SVM: Manually recalc all MSR intercepts on userspace MSR filter change
KVM: x86: Rename msr_filter_changed() => recalc_msr_intercepts()
KVM: SVM: Rename init_vmcb_after_set_cpuid() to make it intercepts specific
KVM: SVM: Fold svm_vcpu_init_msrpm() into its sole caller
KVM: SVM: Merge "after set CPUID" intercept recalc helpers
KVM: SVM: Drop explicit check on MSRPM offset when emulating SEV-ES accesses
KVM: SVM: Move svm_msrpm_offset() to nested.c
KVM: SVM: Store MSRPM pointer as "void *" instead of "u32 *"
KVM: nSVM: Access MSRPM in 4-byte chunks only for merging L0 and L1 bitmaps
KVM: SVM: Return -EINVAL instead of MSR_INVALID to signal out-of-range MSR
KVM: nSVM: Merge MSRPM in 64-bit chunks on 64-bit kernels
KVM: SVM: Add a helper to allocate and initialize permissions bitmaps
KVM: x86: Simplify userspace filter logic when disabling MSR interception
KVM: selftests: Verify KVM disable interception (for userspace) on filter change
KVM: x86: Drop pending_smi vs. INIT_RECEIVED check when setting MP_STATE
KVM: x86: WARN and reject KVM_RUN if vCPU's MP_STATE is SIPI_RECEIVED
KVM: x86: Move INIT_RECEIVED vs. INIT/SIPI blocked check to KVM_RUN
KVM: x86: Refactor handling of SIPI_RECEIVED when setting MP_STATE
KVM: VMX: Add a macro to track which DEBUGCTL bits are host-owned
KVM: selftests: Expand set of APIs for pinning tasks to a single CPU
KVM: selftests: Convert arch_timer tests to common helpers to pin task
Xin Li (1):
KVM: x86: Advertise support for LKGS
Documentation/virt/kvm/api.rst | 25 +-
arch/x86/include/asm/kvm-x86-ops.h | 3 +-
arch/x86/include/asm/kvm_host.h | 22 +-
arch/x86/include/asm/msr-index.h | 1 +
arch/x86/kvm/cpuid.c | 1 +
arch/x86/kvm/lapic.h | 2 +
arch/x86/kvm/svm/nested.c | 128 ++++--
arch/x86/kvm/svm/sev.c | 33 +-
arch/x86/kvm/svm/svm.c | 500 +++++++--------------
arch/x86/kvm/svm/svm.h | 104 ++++-
arch/x86/kvm/vmx/common.h | 2 -
arch/x86/kvm/vmx/main.c | 23 +-
arch/x86/kvm/vmx/nested.c | 27 +-
arch/x86/kvm/vmx/pmu_intel.c | 8 +-
arch/x86/kvm/vmx/tdx.c | 24 +-
arch/x86/kvm/vmx/vmx.c | 284 ++++--------
arch/x86/kvm/vmx/vmx.h | 61 ++-
arch/x86/kvm/vmx/x86_ops.h | 6 +-
arch/x86/kvm/x86.c | 106 +++--
arch/x86/kvm/x86.h | 18 +-
include/uapi/linux/kvm.h | 1 +
tools/include/uapi/linux/kvm.h | 1 +
tools/testing/selftests/kvm/Makefile.kvm | 1 +
tools/testing/selftests/kvm/arch_timer.c | 7 +-
.../selftests/kvm/arm64/arch_timer_edge_cases.c | 23 +-
tools/testing/selftests/kvm/include/kvm_util.h | 31 +-
tools/testing/selftests/kvm/lib/kvm_util.c | 15 +-
tools/testing/selftests/kvm/lib/memstress.c | 2 +-
tools/testing/selftests/kvm/x86/aperfmperf_test.c | 213 +++++++++
.../selftests/kvm/x86/userspace_msr_exit_test.c | 8 +
30 files changed, 930 insertions(+), 750 deletions(-)
create mode 100644 tools/testing/selftests/kvm/x86/aperfmperf_test.c