[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20250919005955.1366256-10-seanjc@google.com>
Date: Thu, 18 Sep 2025 17:59:55 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Sean Christopherson <seanjc@...gle.com>, Paolo Bonzini <pbonzini@...hat.com>
Cc: kvm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: [PATCH 9/9] KVM: nVMX: Add an off-by-default module param to WARN on
missed consistency checks
Add an off-by-default param, "warn_on_missed_cc", to have KVM WARN on a
missed VMX Consistency Check on nested VM-Enter, specifically so that KVM
developers and maintainers can more easily detect missing checks. KVM's
goal/intent is that KVM detect *all* VM-Fail conditions in software, as
relying on hardware leads to false passes when KVM's nested support is a
subset of hardware support, e.g. see commit 095686e6fcb4 ("KVM: nVMX:
Check vmcs12->guest_ia32_debugctl on nested VM-Enter").
With one notable exception, KVM now detects all VM-Fail scenarios for
which there is known test coverage, i.e. KVM developers can enable the
param and expect a clean run, and thus can use the param to detect missed
checks, e.g. when enabling new features, when writing new tests, etc.
The one exception is an unfortunate consistency check on vTPR. Because
the vTPR for L2 comes from the virtual APIC page provided by L1, L2's vTPR
is fully writable at all times, i.e. is inherently subject to TOCTOU
issues with respect to checks in software versus consumption in hardware.
Further complicating matters is KVM's deferred handling of vmcs12 pages
when loading nested state; KVM flat out cannot check vTPR during
KVM_SET_NESTED_STATE without breaking setups that do on-demand paging,
e.g. for live migration and/or live update.
To fudge around the vTPR issue, add a "late" controls check for vTPR and
also treat an invalid virtual APIC as VM-Fail, but gate the check on
warn_on_missed_cc being enabled to avoid unwanted false positives, i.e. to
avoid breaking KVM in production.
Cc: Jim Mattson <jmattson@...gle.com>
Signed-off-by: Sean Christopherson <seanjc@...gle.com>
---
arch/x86/kvm/vmx/nested.c | 43 +++++++++++++++++++++++++++++++++++++++
1 file changed, 43 insertions(+)
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index a1ffaccf317d..a9f48493ad72 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -23,6 +23,9 @@
static bool __read_mostly enable_shadow_vmcs = 1;
module_param_named(enable_shadow_vmcs, enable_shadow_vmcs, bool, S_IRUGO);
+static bool __ro_after_init warn_on_missed_cc;
+module_param(warn_on_missed_cc, bool, 0444);
+
#define CC KVM_NESTED_VMENTER_CONSISTENCY_CHECK
/*
@@ -3011,6 +3014,38 @@ static int nested_vmx_check_controls(struct kvm_vcpu *vcpu,
return 0;
}
+static int nested_vmx_check_controls_late(struct kvm_vcpu *vcpu,
+ struct vmcs12 *vmcs12)
+{
+ void *vapic = to_vmx(vcpu)->nested.virtual_apic_map.hva;
+ u32 vtpr = vapic ? (*(u32 *)(vapic + APIC_TASKPRI)) >> 4 : 0;
+
+ /*
+ * Don't bother with the consistency checks if KVM isn't configured to
+ * WARN on missed consistency checks, as KVM needs to rely on hardware
+ * to fully detect an illegal vTPR vs. TRP Threshold combination due to
+ * the vTPR being writable by L1 at all times (it's an in-memory value,
+ * not a VMCS field). I.e. even if the check passes now, it might fail
+ * at the actual VM-Enter.
+ *
+ * Keying off the module param also allows treating an invalid vAPIC
+ * mapping as a consistency check failure without increasing the risk
+ * of breaking a "real" VM.
+ */
+ if (!warn_on_missed_cc)
+ return 0;
+
+ if ((exec_controls_get(to_vmx(vcpu)) & CPU_BASED_TPR_SHADOW) &&
+ nested_cpu_has(vmcs12, CPU_BASED_TPR_SHADOW) &&
+ !nested_cpu_has_vid(vmcs12) &&
+ !nested_cpu_has2(vmcs12, SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES) &&
+ (CC(!vapic) ||
+ CC((vmcs12->tpr_threshold & GENMASK(3, 0)) > (vtpr & GENMASK(3, 0)))))
+ return -EINVAL;
+
+ return 0;
+}
+
static int nested_vmx_check_address_space_size(struct kvm_vcpu *vcpu,
struct vmcs12 *vmcs12)
{
@@ -3486,6 +3521,11 @@ enum nvmx_vmentry_status nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu,
return NVMX_VMENTRY_KVM_INTERNAL_ERROR;
}
+ if (nested_vmx_check_controls_late(vcpu, vmcs12)) {
+ vmx_switch_vmcs(vcpu, &vmx->vmcs01);
+ return NVMX_VMENTRY_VMFAIL;
+ }
+
if (nested_vmx_check_guest_state(vcpu, vmcs12,
&entry_failure_code)) {
exit_reason.basic = EXIT_REASON_INVALID_STATE;
@@ -4938,6 +4978,9 @@ void __nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason,
*/
WARN_ON_ONCE(vmcs_read32(VM_INSTRUCTION_ERROR) !=
VMXERR_ENTRY_INVALID_CONTROL_FIELD);
+
+ /* VM-Fail at VM-Entry means KVM missed a consistency check. */
+ WARN_ON_ONCE(warn_on_missed_cc);
}
/*
--
2.51.0.470.ga7dc726c21-goog
Powered by blists - more mailing lists