Message-ID: <cover.1770116050.git.isaku.yamahata@intel.com>
Date: Tue, 3 Feb 2026 10:16:43 -0800
From: isaku.yamahata@...el.com
To: kvm@...r.kernel.org
Cc: isaku.yamahata@...el.com,
isaku.yamahata@...il.com,
Paolo Bonzini <pbonzini@...hat.com>,
Sean Christopherson <seanjc@...gle.com>,
linux-kernel@...r.kernel.org
Subject: [PATCH 00/32] KVM: VMX APIC timer virtualization support
From: Isaku Yamahata <isaku.yamahata@...el.com>
This patch series implements support for APIC timer virtualization for
VMX and nVMX.
Background
==========
x86 provides the TSC deadline timer as the primary local timer
interrupt source. Currently, KVM intercepts the guest programming of
the timer and emulates it using either the host OS timer or the VMX
preemption timer.
Problem
=======
VMM emulation causes high latency, and some workloads, such as gaming
applications, require lower latency. While there have been efforts to
reduce latency in the past, a hardware extension can reduce it further
by eliminating VM Exits.
Solution
========
Hardware Extension
------------------
APIC timer virtualization [1] allows the guest to directly access
the TSC deadline MSR (IA32_TSC_DEADLINE) and receive timer interrupts
without VM Exits. It introduces:
- A feature bit in the tertiary processor-based VM-execution controls
- Guest deadline: 64-bit physical deadline (host TSC value)
- Guest deadline shadow: 64-bit virtual deadline (virtualized TSC
  value with TSC offset and multiplier)
- Virtual timer vector: interrupt vector to inject on timeout
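The relationship between the virtual deadline (guest TSC domain) and the
physical deadline (host TSC domain) can be sketched as follows. This is
only an illustration, not code from the series: the function names are
hypothetical, and it assumes the usual VMX TSC scaling rule, guest_tsc =
((host_tsc * multiplier) >> 48) + offset, with a non-zero multiplier.

```c
#include <stdint.h>

/* VMX TSC multiplier is a fixed-point value with 48 fractional bits. */
#define TSC_RATIO_FRAC_BITS 48

/* Hypothetical helper: host TSC value -> guest (virtualized) TSC value. */
static inline uint64_t host_to_guest_tsc(uint64_t host_tsc, uint64_t ratio,
					 uint64_t offset)
{
	unsigned __int128 scaled = (unsigned __int128)host_tsc * ratio;

	return (uint64_t)(scaled >> TSC_RATIO_FRAC_BITS) + offset;
}

/*
 * Hypothetical helper: virtual deadline -> physical deadline.
 * Inverts the scaling above. The division truncates, so the result may
 * be up to one host tick early. ratio must be non-zero.
 */
static inline uint64_t guest_to_host_deadline(uint64_t guest_deadline,
					      uint64_t ratio, uint64_t offset)
{
	unsigned __int128 unscaled =
		(unsigned __int128)(guest_deadline - offset) << TSC_RATIO_FRAC_BITS;

	return (uint64_t)(unscaled / ratio);
}
```

With a 1:1 ratio (1ULL << 48) and an offset of -1000, a virtual deadline
of 5000 maps to a physical deadline of 6000.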
Implementation
--------------
Add hooks to the LAPIC timer emulation and implement them in the VMX
backend. Enable the feature when available, falling back to
software/preemption timer in the following cases:
- One-shot or periodic APIC timer:
  The hardware supports only the TSC deadline timer.
- Timer interrupt masked in LVTT:
  The hardware doesn't respect the emulated LVTT and always generates an
  interrupt on timeout.
- vCPU blocking/unblocking:
  The hardware generates the interrupt only while the vCPU is running.
  Before blocking the vCPU, KVM must read the latest TSC deadline and
  arm a software timer so that the vCPU is woken up on timeout.
- VM Entry to L2 vCPU:
  If the L1 timer interrupt fires while the L2 vCPU is running, the
  expected behavior is a VM Exit from L2 to L1, followed by an interrupt
  injection into the L1 vCPU.
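The fallback conditions above can be summarized in a small predicate. This
is a simplified sketch only: struct timer_state, use_hw_apic_timer() and
tsc_delta_to_ns() are hypothetical names for illustration and do not
appear in the series. The second helper shows the arithmetic for arming a
software timer before blocking, assuming the TSC frequency is known in
kHz.

```c
#include <stdbool.h>
#include <stdint.h>

enum timer_mode { TIMER_ONESHOT, TIMER_PERIODIC, TIMER_TSC_DEADLINE };

/* Hypothetical snapshot of the state that decides the timer backend. */
struct timer_state {
	bool hw_supported;          /* APIC timer virtualization available */
	enum timer_mode mode;       /* mode programmed in LVTT */
	bool lvtt_masked;           /* timer interrupt masked in LVTT */
	bool vcpu_blocked;          /* vCPU is (about to be) blocked */
	bool l1_timer_while_in_l2;  /* L1's timer while L2 is running */
};

/* Returns true if the hardware timer can be used, false to fall back. */
static inline bool use_hw_apic_timer(const struct timer_state *s)
{
	if (!s->hw_supported)
		return false;
	if (s->mode != TIMER_TSC_DEADLINE)
		return false;	/* hardware handles TSC deadline mode only */
	if (s->lvtt_masked)
		return false;	/* hardware ignores the emulated LVTT mask */
	if (s->vcpu_blocked)
		return false;	/* interrupt fires only while vCPU runs */
	if (s->l1_timer_while_in_l2)
		return false;	/* needs VM Exit from L2 to L1 first */
	return true;
}

/* Nanoseconds until the deadline; 0 if it has already passed. */
static inline uint64_t tsc_delta_to_ns(uint64_t now_tsc, uint64_t deadline_tsc,
				       uint64_t tsc_khz)
{
	if (deadline_tsc <= now_tsc)
		return 0;
	return (uint64_t)((unsigned __int128)(deadline_tsc - now_tsc) *
			  1000000 / tsc_khz);
}
```

For example, with a 2 GHz TSC (tsc_khz = 2000000), a deadline 2000 ticks
in the future corresponds to a 1000 ns software timer.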
nVMX Support
------------
Support nVMX because, as the benchmark results below show, L2 gains
little or even regresses unless the feature is also exposed to L1.
Emulate the related MSRs and VMCS fields individually.
MSRs: capability reporting registers of primary/tertiary processor-based
VM-execution controls.
VMCS fields: primary/tertiary VM-execution controls, guest deadline,
guest deadline shadow, and virtual timer vector.
Patch Organization
------------------
The patch series is organized into 5 parts as follows.
Patches  1- 8: VMX support (feature probe, hooks to KVM LAPIC, VMX hooks)
Patches  9-18: nVMX support (implement emulation of MSR and VMCS fields)
Patches 19-23: Expose the feature to the user
Patches 24-31: KVM selftests
Patch      32: Documentation update
Patches for QEMU and KVM unit tests will be posted separately.
(The KVM unit tests turned out to have a test case issue that needs to
be fixed.)
Test
====
The following tests were conducted: the newly added test cases in KVM
selftests, KVM unit tests, and cyclictest from rt-tests [2]. Selftests
and KVM unit tests were run on platforms with and without APIC timer
virtualization.
Benchmark Results
=================
cyclictest
----------
10-minute run of
cyclictest --quiet --nsecs --smp --mlockall --priority=95 --policy=fifo
# of vCPU: host 256, L1 and L2: 16
Legends:
L1 or L2: cyclictest run as an L1/L2 process
Y: feature enabled
N: feature disabled
Run in
|  APIC timer virtualization
|  |  nested APIC timer virtualization
|  |  |  min reduction %
|  |  |  |     avg reduction %
|  |  |  |     |
L1 N  -
L1 Y  -   21%   21%  (compared to L1 N)
L2 N  N
L2 Y  N    4%   -2%  (compared to L2 N N)
L2 Y  Y   75%   51%  (compared to L2 N N)
Micro benchmark: Timer latency
------------------------------
10-minute run of custom micro benchmark, timer_latency.
# of vCPU: host 256, L1 and L2: 16
Legends:
L1: the benchmark run in L0 Linux.
L2: the benchmark run in L1 Linux.
Y: feature enabled
N: feature disabled
Run as
|  APIC timer virtualization
|  |  nested APIC timer virtualization
|  |  |  HLT or busy
|  |  |  |     min reduction %
|  |  |  |     |     avg reduction %
|  |  |  |     |     |
L1 N  -  HLT
L1 Y  -  HLT    49%   24%  (compared to L1 N HLT)
L1 N  -  busy
L1 Y  -  busy   63%   61%  (compared to L1 N busy)
L2 N  N  HLT
L2 Y  N  HLT   -19%   -3%  (compared to L2 N N HLT)
L2 Y  Y  HLT    99%   27%  (compared to L2 N N HLT)
L2 N  N  busy
L2 Y  N  busy   -5%   -4%  (compared to L2 N N busy)
L2 Y  Y  busy   99%   97%  (compared to L2 N N busy)
[1] Intel Architecture Instruction Set Extensions and Future Features
September 2025 319433-059
Chapter 8 APIC-TIMER VIRTUALIZATION
https://cdrdv2.intel.com/v1/dl/getContent/671368
[2] rt-tests
https://git.kernel.org/pub/scm/utils/rt-tests/rt-tests.git/
Isaku Yamahata (25):
KVM: x86/lapic: Wire DEADLINE MSR update to guest virtual TSC deadline
KVM: VMX: Update APIC timer virtualization on apicv changed
KVM: nVMX: Disallow/allow guest APIC timer virtualization switch
to/from L2
KVM: nVMX: Pass struct msr_data to VMX MSRs emulation
KVM: nVMX: Supports VMX tertiary controls and GUEST_APIC_TIMER bit
KVM: nVMX: Add tertiary VM-execution control VMCS support
KVM: nVMX: Update intercept on TSC deadline MSR
KVM: nVMX: Handle virtual timer vector VMCS field
KVM: VMX: Make vmx_calc_deadline_l1_to_host() non-static
KVM: nVMX: Enable guest deadline and its shadow VMCS field
KVM: nVMX: Add VM entry checks related to APIC timer virtualization
KVM: nVMX: Add check vmread/vmwrite on tertiary control
KVM: nVMX: Add check VMCS index for guest timer virtualization
KVM: VMX: Advertise tertiary controls to the user space
KVM: VMX: Enable APIC timer virtualization
KVM: nVMX: Introduce module parameter for nested APIC timer
virtualization
KVM: selftests: Add a test to measure local timer latency
KVM: selftests: Add nVMX support to timer_latency test case
KVM: selftests: Add test for nVMX MSR_IA32_VMX_PROCBASED_CTLS3
KVM: selftests: Add test vmx_set_nested_state_test with EVMCS disabled
KVM: selftests: Add tests nested state of APIC timer virtualization
KVM: selftests: Add VMCS access test to APIC timer virtualization
KVM: selftests: Test cases for L1 APIC timer virtualization
KVM: selftests: Add tests for nVMX to vmx_apic_timer_virt
Documentation: KVM: x86: Update documentation of struct vmcs12
Yang Zhong (7):
KVM: VMX: Detect APIC timer virtualization bit
KVM: x86: Implement APIC virt timer helpers with callbacks
KVM: x86/lapic: Start/stop sw/hv timer on vCPU un/block
KVM: x86/lapic: Add a trace point for guest virtual timer
KVM: VMX: Implement the hooks for VMX guest virtual deadline timer
KVM: VMX: dump_vmcs() support the guest virt timer
KVM: VMX: Introduce module parameter for APIC virt timer support
Documentation/virt/kvm/x86/nested-vmx.rst | 13 +-
arch/x86/include/asm/kvm-x86-ops.h | 5 +
arch/x86/include/asm/kvm_host.h | 6 +
arch/x86/include/asm/vmx.h | 6 +
arch/x86/include/asm/vmxfeatures.h | 1 +
arch/x86/kvm/lapic.c | 147 +++-
arch/x86/kvm/lapic.h | 15 +
arch/x86/kvm/trace.h | 16 +
arch/x86/kvm/vmx/capabilities.h | 8 +
arch/x86/kvm/vmx/hyperv.c | 17 +
arch/x86/kvm/vmx/main.c | 5 +
arch/x86/kvm/vmx/nested.c | 215 +++++-
arch/x86/kvm/vmx/nested.h | 33 +-
arch/x86/kvm/vmx/vmcs12.c | 6 +
arch/x86/kvm/vmx/vmcs12.h | 11 +-
arch/x86/kvm/vmx/vmcs_shadow_fields.h | 1 +
arch/x86/kvm/vmx/vmx.c | 142 +++-
arch/x86/kvm/vmx/vmx.h | 7 +-
arch/x86/kvm/vmx/x86_ops.h | 5 +
arch/x86/kvm/x86.c | 8 +-
arch/x86/kvm/x86.h | 2 +-
tools/testing/selftests/kvm/Makefile.kvm | 3 +
.../testing/selftests/kvm/include/x86/apic.h | 2 +
.../selftests/kvm/include/x86/processor.h | 6 +
tools/testing/selftests/kvm/include/x86/vmx.h | 14 +
.../testing/selftests/kvm/x86/timer_latency.c | 700 ++++++++++++++++++
.../kvm/x86/vmx_apic_timer_virt_test.c | 508 +++++++++++++
.../kvm/x86/vmx_apic_timer_virt_vmcs_test.c | 461 ++++++++++++
.../testing/selftests/kvm/x86/vmx_msrs_test.c | 53 ++
.../kvm/x86/vmx_set_nested_state_test.c | 249 +++++++
30 files changed, 2644 insertions(+), 21 deletions(-)
create mode 100644 tools/testing/selftests/kvm/x86/timer_latency.c
create mode 100644 tools/testing/selftests/kvm/x86/vmx_apic_timer_virt_test.c
create mode 100644 tools/testing/selftests/kvm/x86/vmx_apic_timer_virt_vmcs_test.c
base-commit: 63804fed149a6750ffd28610c5c1c98cce6bd377
--
2.45.2