Message-ID: <20240928153302.92406-1-pbonzini@redhat.com>
Date: Sat, 28 Sep 2024 11:33:02 -0400
From: Paolo Bonzini <pbonzini@...hat.com>
To: torvalds@...ux-foundation.org
Cc: linux-kernel@...r.kernel.org,
kvm@...r.kernel.org
Subject: [GIT PULL] KVM/x86 changes for Linux 6.12
Linus,
The following changes since commit da3ea35007d0af457a0afc87e84fddaebc4e0b63:
Linux 6.11-rc7 (2024-09-08 14:50:28 -0700)
are available in the Git repository at:
https://git.kernel.org/pub/scm/virt/kvm/kvm.git tags/for-linus
for you to fetch changes up to efbc6bd090f48ccf64f7a8dd5daea775821d57ec:
Documentation: KVM: fix warning in "make htmldocs" (2024-09-27 11:45:50 -0400)
Apologies for the late pull request; all the traveling made things a
bit messy. Also, we have a known regression here on ancient processors
and will fix it next week.
Paolo
----------------------------------------------------------------
x86:
* KVM currently invalidates the entirety of the page tables, not just
those for the memslot being touched, when a memslot is moved or deleted.
Invalidating everything does not have particularly noticeable overhead,
but Intel's TDX will require the guest to re-accept private pages if they
are dropped from the secure EPT, which is a non-starter. Actually, the
only reason why KVM does not already zap just the affected memslot is a
bug which was never fully investigated and caused VM instability with
assigned GeForce GPUs, so allow userspace to opt into the new behavior.
* Advertise AVX10.1 to userspace (effectively prep work for the "real" AVX10
functionality that is on the horizon).
* Rework common MSR handling code to suppress errors on userspace accesses to
unsupported-but-advertised MSRs. This will allow removing (almost?) all of
KVM's exemptions for userspace access to MSRs that shouldn't exist based on
the vCPU model (the actual cleanup is non-trivial future work).
* Rework KVM's handling of x2APIC ICR, again, because AMD (x2AVIC) splits the
64-bit value into the legacy ICR and ICR2 storage, whereas Intel (APICv)
stores the entire 64-bit value at the ICR offset.
* Fix a bug where KVM would fail to exit to userspace if one was triggered by
a fastpath exit handler.
* Add fastpath handling of HLT VM-Exit to expedite re-entering the guest when
there's already a pending wake event at the time of the exit.
* Fix a WARN caused by RSM entering a nested guest from SMM with invalid guest
state, by forcing the vCPU out of guest mode prior to signalling SHUTDOWN
(the SHUTDOWN hits the VM altogether, not the nested guest).
* Overhaul the "unprotect and retry" logic to more precisely identify cases
where retrying is actually helpful, and to harden all retry paths against
putting the guest into an infinite retry loop.
* Add support for yielding, e.g. to honor NEED_RESCHED, when zapping rmaps in
the shadow MMU.
* Refactor pieces of the shadow MMU related to aging SPTEs in preparation for
adding multi-generation LRU support in KVM.
* Don't stuff the RSB after VM-Exit when RETPOLINE=y and AutoIBRS is enabled,
i.e. when the CPU has already flushed the RSB.
* Trace the per-CPU host save area as a VMCB pointer to improve readability
and clean up the retrieval of the SEV-ES host save area.
* Remove unnecessary accounting of temporary nested VMCB related allocations.
* Set FINAL/PAGE in the page fault error code for EPT violations if and only
if the GVA is valid. If the GVA is NOT valid, there is no guest-side page
table walk and so stuffing paging related metadata is nonsensical.
* Fix a bug where KVM would incorrectly synthesize a nested VM-Exit instead of
emulating posted interrupt delivery to L2.
* Add a lockdep assertion to detect unsafe accesses of vmcs12 structures.
* Harden eVMCS loading against an impossible NULL pointer deref (really truly
should be impossible).
* Minor SGX fix and a cleanup.
* Misc cleanups
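
To illustrate the x2APIC ICR item in the list above, here is a minimal
stand-alone sketch of the two storage layouts. It is not KVM's code: the
byte buffer standing in for the virtual APIC page and all helper names are
hypothetical, and only the 0x300/0x310 register offsets are the
architectural ICR/ICR2 offsets.

```c
#include <stdint.h>
#include <string.h>

#define APIC_ICR	0x300	/* architectural offset of ICR (low 32 bits) */
#define APIC_ICR2	0x310	/* architectural offset of ICR2 (high 32 bits) */

/* Intel (APICv) style: the entire 64-bit value lives at the ICR offset. */
static void icr_store_whole(uint8_t *regs, uint64_t icr)
{
	memcpy(regs + APIC_ICR, &icr, sizeof(icr));
}

static uint64_t icr_load_whole(const uint8_t *regs)
{
	uint64_t icr;

	memcpy(&icr, regs + APIC_ICR, sizeof(icr));
	return icr;
}

/* AMD (x2AVIC) style: the value is split across the legacy ICR and ICR2. */
static void icr_store_split(uint8_t *regs, uint64_t icr)
{
	uint32_t lo = (uint32_t)icr;
	uint32_t hi = (uint32_t)(icr >> 32);

	memcpy(regs + APIC_ICR, &lo, sizeof(lo));
	memcpy(regs + APIC_ICR2, &hi, sizeof(hi));
}

static uint64_t icr_load_split(const uint8_t *regs)
{
	uint32_t lo, hi;

	memcpy(&lo, regs + APIC_ICR, sizeof(lo));
	memcpy(&hi, regs + APIC_ICR2, sizeof(hi));
	return ((uint64_t)hi << 32) | lo;
}
```

Both pairs of helpers round-trip the same 64-bit value; the only difference
is where the high half ends up, which is why code that reads the register
page has to know which layout the hardware uses.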
Generic:
* Register KVM's cpuhp and syscore callbacks when enabling virtualization in
hardware, as the sole purpose of said callbacks is to disable and re-enable
virtualization as needed.
* Enable virtualization when KVM is loaded, not right before the first VM
is created. Together with the previous change, this greatly simplifies
the logic of the callbacks, because their very existence implies
virtualization is enabled.
* Fix a bug that results in KVM prematurely exiting to userspace for coalesced
MMIO/PIO in many cases, clean up the related code, and add a testcase.
* Fix a bug in kvm_clear_guest() where it would trigger a buffer overflow _if_
the gpa+len crosses a page boundary, which thankfully is guaranteed to not
happen in the current code base. Add WARNs in more helpers that read/write
guest memory to detect similar bugs.
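
The page-boundary hazard in the kvm_clear_guest() item above can be sketched
as follows. This is a user-space model under stated assumptions, not the
kernel code: a flat buffer stands in for guest memory, and clear_range() and
clear_page_segment() are hypothetical names. The point is that each per-page
call must receive the length of the segment that fits in that page, not the
total remaining length.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096u

/* Bytes that can be touched at @offset before reaching the end of the page. */
static size_t next_segment(size_t len, size_t offset)
{
	size_t room = PAGE_SIZE - offset;

	return len > room ? room : len;
}

/* Hypothetical per-page helper: rejects accesses that would leave the page. */
static int clear_page_segment(uint8_t *mem, uint64_t gfn, size_t offset,
			      size_t seg)
{
	if (offset + seg > PAGE_SIZE)
		return -1;	/* the kind of bug the added WARNs are meant to catch */

	memset(mem + gfn * PAGE_SIZE + offset, 0, seg);
	return 0;
}

/* Clear [gpa, gpa + len) one page-sized segment at a time. */
static int clear_range(uint8_t *mem, uint64_t gpa, size_t len)
{
	uint64_t gfn = gpa / PAGE_SIZE;
	size_t offset = gpa % PAGE_SIZE;
	size_t seg;

	while ((seg = next_segment(len, offset)) != 0) {
		/* Pass the per-page segment, not the total remaining length. */
		if (clear_page_segment(mem, gfn, offset, seg))
			return -1;
		offset = 0;
		len -= seg;
		gfn++;
	}
	return 0;
}
```

With a 32-byte range starting 16 bytes before a page boundary, clear_range()
issues two 16-byte calls; passing the full 32 bytes to the first call would
be rejected by the bounds check in clear_page_segment().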
Selftests:
* Fix a goof that caused some Hyper-V tests to be skipped when run on bare
metal, i.e. NOT in a VM.
* Add a regression test for KVM's handling of SHUTDOWN for an SEV-ES guest.
* Explicitly include one-off assets in .gitignore. Past Sean was completely
wrong about not being able to detect missing .gitignore entries.
* Verify userspace single-stepping works when KVM happens to handle a VM-Exit
in its fastpath.
* Misc cleanups
----------------------------------------------------------------
Amit Shah (1):
KVM: SVM: let alternatives handle the cases when RSB filling is required
Christoph Schlameuss (7):
selftests: kvm: s390: Define page sizes in shared header
selftests: kvm: s390: Add kvm_s390_sie_block definition for userspace tests
selftests: kvm: s390: Add s390x ucontrol test suite with hpage test
selftests: kvm: s390: Add test fixture and simple VM setup tests
selftests: kvm: s390: Add debug print functions
selftests: kvm: s390: Add VM run test case
s390: Enable KVM_S390_UCONTROL config in debug_defconfig
Hariharan Mari (1):
KVM: s390: Fix SORTL and DFLTCC instruction format error in __insn32_query
Ilias Stamatis (1):
KVM: Fix coalesced_mmio_has_room() to avoid premature userspace exit
Kai Huang (2):
KVM: VMX: Do not account for temporary memory allocation in ECREATE emulation
KVM: VMX: Also clear SGX EDECCSSA in KVM CPU caps when SGX is disabled
Li Chen (1):
KVM: x86: Use this_cpu_ptr() in kvm_user_return_msr_cpu_online
Maxim Levitsky (1):
KVM: nVMX: Use vmx_segment_cache_clear() instead of open coded equivalent
Paolo Bonzini (12):
Merge tag 'kvm-s390-next-6.12-1' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD
Merge branch 'kvm-memslot-zap-quirk' into HEAD
Merge branch 'kvm-redo-enable-virt' into HEAD
Merge tag 'kvm-x86-generic-6.12' of https://github.com/kvm-x86/linux into HEAD
Merge tag 'kvm-x86-misc-6.12' of https://github.com/kvm-x86/linux into HEAD
Merge tag 'kvm-x86-selftests-6.12' of https://github.com/kvm-x86/linux into HEAD
Merge tag 'kvm-x86-mmu-6.12' of https://github.com/kvm-x86/linux into HEAD
Merge tag 'kvm-x86-pat_vmx_msrs-6.12' of https://github.com/kvm-x86/linux into HEAD
Merge tag 'kvm-x86-svm-6.12' of https://github.com/kvm-x86/linux into HEAD
Merge tag 'kvm-x86-vmx-6.12' of https://github.com/kvm-x86/linux into HEAD
Documentation: KVM: fix warning in "make htmldocs"
Merge remote-tracking branch 'origin/master' into HEAD
Peter Gonda (1):
KVM: selftests: Add SEV-ES shutdown test
Qiang Liu (1):
KVM: VMX: Modify the BUILD_BUG_ON_MSG of the 32-bit field in the vmcs_check16 function
Sean Christopherson (94):
x86/cpu: KVM: Add common defines for architectural memory types (PAT, MTRRs, etc.)
x86/cpu: KVM: Move macro to encode PAT value to common header
KVM: x86: Stuff vCPU's PAT with default value at RESET, not creation
KVM: nVMX: Add a helper to encode VMCS info in MSR_IA32_VMX_BASIC
KVM VMX: Move MSR_IA32_VMX_MISC bit defines to asm/vmx.h
KVM: nVMX: Honor userspace MSR filter lists for nested VM-Enter/VM-Exit
KVM: x86/mmu: Clean up function comments for dirty logging APIs
KVM: SVM: Disallow guest from changing userspace's MSR_AMD64_DE_CFG value
KVM: x86: Move MSR_TYPE_{R,W,RW} values from VMX to x86, as enums
KVM: x86: Rename KVM_MSR_RET_INVALID to KVM_MSR_RET_UNSUPPORTED
KVM: x86: Refactor kvm_x86_ops.get_msr_feature() to avoid kvm_msr_entry
KVM: x86: Rename get_msr_feature() APIs to get_feature_msr()
KVM: x86: Refactor kvm_get_feature_msr() to avoid struct kvm_msr_entry
KVM: x86: Funnel all fancy MSR return value handling into a common helper
KVM: x86: Hoist x86.c's global msr_* variables up above kvm_do_msr_access()
KVM: x86: Suppress failures on userspace access to advertised, unsupported MSRs
KVM: x86: Suppress userspace access failures on unsupported, "emulated" MSRs
KVM: x86: Enforce x2APIC's must-be-zero reserved ICR bits
KVM: x86: Move x2APIC ICR helper above kvm_apic_write_nodecode()
KVM: x86: Re-split x2APIC ICR into ICR+ICR2 for AMD (x2AVIC)
KVM: selftests: Open code vcpu_run() equivalent in guest_printf test
KVM: selftests: Report unhandled exceptions on x86 as regular guest asserts
KVM: selftests: Add x86 helpers to play nice with x2APIC MSR #GPs
KVM: selftests: Skip ICR.BUSY test in xapic_state_test if x2APIC is enabled
KVM: selftests: Test x2APIC ICR reserved bits
KVM: selftests: Verify the guest can read back the x2APIC ICR it wrote
KVM: selftests: Play nice with AMD's AVIC errata
KVM: selftests: Remove unused kvm_memcmp_hva_gva()
KVM: selftests: Always unlink memory regions when deleting (VM free)
KVM: x86/mmu: Decrease indentation in logic to sync new indirect shadow page
KVM: x86/mmu: Drop pointless "return" wrapper label in FNAME(fetch)
KVM: x86/mmu: Reword a misleading comment about checking gpte_changed()
KVM: SVM: Add a helper to convert a SME-aware PA back to a struct page
KVM: SVM: Add host SEV-ES save area structure into VMCB via a union
KVM: SVM: Track the per-CPU host save area as a VMCB pointer
KVM: selftests: Add a test for coalesced MMIO (and PIO on x86)
KVM: Clean up coalesced MMIO ring full check
KVM: selftests: Explicitly include committed one-off assets in .gitignore
KVM: x86: Re-enter guest if WRMSR(X2APIC_ICR) fastpath is successful
KVM: x86: Dedup fastpath MSR post-handling logic
KVM: x86: Exit to userspace if fastpath triggers one on instruction skip
KVM: x86: Reorganize code in x86.c to co-locate vCPU blocking/running helpers
KVM: x86: Add fastpath handling of HLT VM-Exits
KVM: Use dedicated mutex to protect kvm_usage_count to avoid deadlock
KVM: Register cpuhp and syscore callbacks when enabling hardware
KVM: Rename symbols related to enabling virtualization hardware
KVM: Rename arch hooks related to per-CPU virtualization enabling
KVM: MIPS: Rename virtualization {en,dis}abling APIs to match common KVM
KVM: x86: Rename virtualization {en,dis}abling APIs to match common KVM
KVM: Add a module param to allow enabling virtualization when KVM is loaded
KVM: Add arch hooks for enabling/disabling virtualization
x86/reboot: Unconditionally define cpu_emergency_virt_cb typedef
KVM: x86: Register "emergency disable" callbacks when virt is enabled
KVM: x86: Forcibly leave nested if RSM to L2 hits shutdown
KVM: selftests: Verify single-stepping a fastpath VM-Exit exits to userspace
KVM: x86: Move "ack" phase of local APIC IRQ delivery to separate API
KVM: nVMX: Get to-be-acknowledge IRQ for nested VM-Exit at injection site
KVM: nVMX: Suppress external interrupt VM-Exit injection if there's no IRQ
KVM: nVMX: Detect nested posted interrupt NV at nested VM-Exit injection
KVM: x86: Fold kvm_get_apic_interrupt() into kvm_cpu_get_interrupt()
KVM: nVMX: Explicitly invalidate posted_intr_nv if PI is disabled at VM-Enter
KVM: nVMX: Assert that vcpu->mutex is held when accessing secondary VMCSes
KVM: Write the per-page "segment" when clearing (part of) a guest page
KVM: Harden guest memory APIs against out-of-bounds accesses
KVM: x86/mmu: Replace PFERR_NESTED_GUEST_PAGE with a more descriptive helper
KVM: x86/mmu: Trigger unprotect logic only on write-protection page faults
KVM: x86/mmu: Skip emulation on page fault iff 1+ SPs were unprotected
KVM: x86: Retry to-be-emulated insn in "slow" unprotect path iff sp is zapped
KVM: x86: Get RIP from vCPU state when storing it to last_retry_eip
KVM: x86: Store gpa as gpa_t, not unsigned long, when unprotecting for retry
KVM: x86/mmu: Apply retry protection to "fast nTDP unprotect" path
KVM: x86/mmu: Try "unprotect for retry" iff there are indirect SPs
KVM: x86: Move EMULTYPE_ALLOW_RETRY_PF to x86_emulate_instruction()
KVM: x86: Fold retry_instruction() into x86_emulate_instruction()
KVM: x86/mmu: Don't try to unprotect an INVALID_GPA
KVM: x86/mmu: Always walk guest PTEs with WRITE access when unprotecting
KVM: x86/mmu: Move event re-injection unprotect+retry into common path
KVM: x86: Remove manual pfn lookup when retrying #PF after failed emulation
KVM: x86: Check EMULTYPE_WRITE_PF_TO_SP before unprotecting gfn
KVM: x86: Apply retry protection to "unprotect on failure" path
KVM: x86: Update retry protection fields when forcing retry on emulation failure
KVM: x86: Rename reexecute_instruction()=>kvm_unprotect_and_retry_on_failure()
KVM: x86/mmu: Subsume kvm_mmu_unprotect_page() into the and_retry() version
KVM: x86/mmu: Detect if unprotect will do anything based on invalid_list
KVM: x86/mmu: WARN on MMIO cache hit when emulating write-protected gfn
KVM: x86/mmu: Move walk_slot_rmaps() up near for_each_slot_rmap_range()
KVM: x86/mmu: Plumb a @can_yield parameter into __walk_slot_rmaps()
KVM: x86/mmu: Add a helper to walk and zap rmaps for a memslot
KVM: x86/mmu: Honor NEED_RESCHED when zapping rmaps and blocking is allowed
KVM: x86/mmu: Morph kvm_handle_gfn_range() into an aging specific helper
KVM: x86/mmu: Fold mmu_spte_age() into kvm_rmap_age_gfn_range()
KVM: x86/mmu: Add KVM_RMAP_MANY to replace open coded '1' and '1ul' literals
KVM: x86/mmu: Use KVM_PAGES_PER_HPAGE() instead of an open coded equivalent
KVM: VMX: Set PFERR_GUEST_{FINAL,PAGE}_MASK if and only if the GVA is valid
Tao Su (1):
KVM: x86: Advertise AVX10.1 CPUID to userspace
Thorsten Blum (1):
KVM: x86: Optimize local variable in start_sw_tscdeadline()
Vitaly Kuznetsov (3):
KVM: VMX: hyper-v: Prevent impossible NULL pointer dereference in evmcs_load()
KVM: selftests: Move Hyper-V specific functions out of processor.c
KVM: selftests: Re-enable hyperv_evmcs/hyperv_svm_test on bare metal
Xin Li (5):
KVM: VMX: Move MSR_IA32_VMX_BASIC bit defines to asm/vmx.h
KVM: VMX: Track CPU's MSR_IA32_VMX_BASIC as a single 64-bit value
KVM: nVMX: Use macros and #defines in vmx_restore_vmx_basic()
KVM: VMX: Open code VMX preemption timer rate mask in its accessor
KVM: nVMX: Use macros and #defines in vmx_restore_vmx_misc()
Yan Zhao (4):
KVM: x86/mmu: Introduce a quirk to control memslot zap behavior
KVM: selftests: Test slot move/delete with slot zap quirk enabled/disabled
KVM: selftests: Allow slot modification stress test with quirk disabled
KVM: selftests: Test memslot move in memslot_perf_test with quirk disabled
Yongqiang Liu (1):
KVM: SVM: Remove unnecessary GFP_KERNEL_ACCOUNT in svm_set_nested_state()
Yue Haibing (1):
KVM: x86: Remove some unused declarations
Documentation/admin-guide/kernel-parameters.txt | 17 +
Documentation/virt/kvm/api.rst | 31 +-
Documentation/virt/kvm/locking.rst | 32 +-
arch/arm64/kvm/arm.c | 6 +-
arch/loongarch/kvm/main.c | 4 +-
arch/mips/include/asm/kvm_host.h | 4 +-
arch/mips/kvm/mips.c | 8 +-
arch/mips/kvm/vz.c | 8 +-
arch/riscv/kvm/main.c | 4 +-
arch/s390/configs/debug_defconfig | 1 +
arch/s390/kvm/kvm-s390.c | 27 +-
arch/x86/include/asm/cpuid.h | 1 +
arch/x86/include/asm/kvm-x86-ops.h | 6 +-
arch/x86/include/asm/kvm_host.h | 32 +-
arch/x86/include/asm/msr-index.h | 34 +-
arch/x86/include/asm/reboot.h | 2 +-
arch/x86/include/asm/svm.h | 20 +-
arch/x86/include/asm/vmx.h | 40 +-
arch/x86/include/uapi/asm/kvm.h | 1 +
arch/x86/kernel/cpu/mtrr/mtrr.c | 6 +
arch/x86/kvm/cpuid.c | 30 +-
arch/x86/kvm/irq.c | 10 +-
arch/x86/kvm/lapic.c | 84 +-
arch/x86/kvm/lapic.h | 3 +-
arch/x86/kvm/mmu.h | 2 -
arch/x86/kvm/mmu/mmu.c | 558 ++++++-----
arch/x86/kvm/mmu/mmu_internal.h | 5 +-
arch/x86/kvm/mmu/mmutrace.h | 1 +
arch/x86/kvm/mmu/paging_tmpl.h | 63 +-
arch/x86/kvm/mmu/tdp_mmu.c | 6 +-
arch/x86/kvm/reverse_cpuid.h | 8 +
arch/x86/kvm/smm.c | 24 +-
arch/x86/kvm/svm/nested.c | 4 +-
arch/x86/kvm/svm/svm.c | 87 +-
arch/x86/kvm/svm/svm.h | 18 +-
arch/x86/kvm/svm/vmenter.S | 8 +-
arch/x86/kvm/vmx/capabilities.h | 10 +-
arch/x86/kvm/vmx/main.c | 10 +-
arch/x86/kvm/vmx/nested.c | 134 ++-
arch/x86/kvm/vmx/nested.h | 8 +-
arch/x86/kvm/vmx/sgx.c | 2 +-
arch/x86/kvm/vmx/vmx.c | 67 +-
arch/x86/kvm/vmx/vmx.h | 9 +-
arch/x86/kvm/vmx/vmx_onhyperv.h | 8 +
arch/x86/kvm/vmx/vmx_ops.h | 2 +-
arch/x86/kvm/vmx/x86_ops.h | 7 +-
arch/x86/kvm/x86.c | 1006 ++++++++++----------
arch/x86/kvm/x86.h | 31 +-
arch/x86/mm/pat/memtype.c | 36 +-
include/linux/kvm_host.h | 18 +-
tools/testing/selftests/kvm/.gitignore | 4 +
tools/testing/selftests/kvm/Makefile | 4 +
tools/testing/selftests/kvm/coalesced_io_test.c | 236 +++++
tools/testing/selftests/kvm/guest_print_test.c | 19 +-
tools/testing/selftests/kvm/include/kvm_util.h | 28 +-
.../selftests/kvm/include/s390x/debug_print.h | 69 ++
.../selftests/kvm/include/s390x/processor.h | 5 +
tools/testing/selftests/kvm/include/s390x/sie.h | 240 +++++
tools/testing/selftests/kvm/include/x86_64/apic.h | 23 +-
.../testing/selftests/kvm/include/x86_64/hyperv.h | 18 +
.../selftests/kvm/include/x86_64/processor.h | 7 +-
tools/testing/selftests/kvm/lib/kvm_util.c | 85 +-
tools/testing/selftests/kvm/lib/s390x/processor.c | 10 +-
tools/testing/selftests/kvm/lib/x86_64/hyperv.c | 67 ++
tools/testing/selftests/kvm/lib/x86_64/processor.c | 69 +-
.../kvm/memslot_modification_stress_test.c | 19 +-
tools/testing/selftests/kvm/memslot_perf_test.c | 12 +-
tools/testing/selftests/kvm/s390x/cmma_test.c | 7 +-
tools/testing/selftests/kvm/s390x/config | 2 +
tools/testing/selftests/kvm/s390x/debug_test.c | 4 +-
tools/testing/selftests/kvm/s390x/memop.c | 4 +-
tools/testing/selftests/kvm/s390x/tprot.c | 5 +-
tools/testing/selftests/kvm/s390x/ucontrol_test.c | 332 +++++++
.../testing/selftests/kvm/set_memory_region_test.c | 29 +-
tools/testing/selftests/kvm/x86_64/debug_regs.c | 11 +-
tools/testing/selftests/kvm/x86_64/hyperv_evmcs.c | 2 +-
.../testing/selftests/kvm/x86_64/hyperv_svm_test.c | 2 +-
.../testing/selftests/kvm/x86_64/sev_smoke_test.c | 32 +
.../selftests/kvm/x86_64/xapic_state_test.c | 54 +-
.../testing/selftests/kvm/x86_64/xen_vmcall_test.c | 1 +
virt/kvm/coalesced_mmio.c | 31 +-
virt/kvm/kvm_main.c | 281 +++---
82 files changed, 2803 insertions(+), 1452 deletions(-)