Message-ID: <20250918162529.640943-1-jon@nutanix.com>
Date: Thu, 18 Sep 2025 09:25:28 -0700
From: Jon Kohler <jon@...anix.com>
To: Sean Christopherson <seanjc@...gle.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>,
        Borislav Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
        "H. Peter Anvin" <hpa@...or.com>, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org
Cc: jon@...anix.com, Khushit Shah <khushit.shah@...anix.com>
Subject: [PATCH] KVM: x86: skip userspace IOAPIC EOI exit when Directed EOI is enabled

From: Khushit Shah <khushit.shah@...anix.com>

Problem:
We observed Windows guests with Hyper-V enabled getting stuck during
boot because of a level-triggered interrupt storm. This happens because
KVM currently does not respect the Directed EOI bit set by the guest in
split-irqchip mode.

We observed the following ACTUAL sequence on Windows guests with
Directed EOI enabled:
  1. Guest issues an APIC EOI.
  2. The interrupt is injected into L2 and serviced.
  3. Guest issues an IOAPIC EOI.

But, with the current behavior in split-irqchip mode:
  1. Guest issues an APIC EOI.
  2. KVM exits to userspace and QEMU's ioapic_service reasserts the
     interrupt because the line is not yet deasserted.
  3. Steps 1 and 2 keep looping, and hence no progress is made.
(Logs are in the report linked below.)

This is because in split-irqchip mode, KVM requests a userspace IOAPIC
EOI exit on every APIC EOI. However, if the guest sets the Directed EOI
bit in the APIC Spurious Interrupt Vector Register (SPIV, bit 12), per
the x2APIC specification, the APIC does not broadcast EOIs to the IOAPIC.
In this case, it is the guest's responsibility to explicitly EOI the
IOAPIC by writing to its EOI register.

kernel-irqchip mode already handles this similarly in
kvm_ioapic_update_eoi_one().

Link: https://lore.kernel.org/kvm/7D497EF1-607D-4D37-98E7-DAF95F099342@nutanix.com/

Signed-off-by: Khushit Shah <khushit.shah@...anix.com>
---
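Illustration only, not part of the change: a minimal standalone sketch
of the decision described in the commit message above, written as plain
userspace C rather than kernel code. The names SPIV_DIRECTED_EOI and
eoi_needs_userspace_exit() are hypothetical stand-ins chosen for this
sketch; the bit position (12) is taken from the description above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the Directed EOI bit in SPIV (bit 12). */
#define SPIV_DIRECTED_EOI (1u << 12)

/*
 * Model of the check: in split-irqchip mode an APIC EOI is forwarded
 * to the userspace IOAPIC only when the guest has NOT enabled
 * Directed EOI. With Directed EOI set, the guest is expected to EOI
 * the IOAPIC itself, so no userspace exit is requested.
 */
static bool eoi_needs_userspace_exit(bool split_irqchip, uint32_t spiv)
{
	if (!split_irqchip)
		return false;	/* the userspace IOAPIC is not involved */

	return !(spiv & SPIV_DIRECTED_EOI);
}
```

Under these assumptions, a guest that sets bit 12 of SPIV gets no
userspace exit from an APIC EOI in split-irqchip mode, while a guest
that leaves it clear keeps the existing behavior.
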
 arch/x86/kvm/lapic.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 0725d2cae742..a81e71ad5bda 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1473,6 +1473,10 @@ static void kvm_ioapic_send_eoi(struct kvm_lapic *apic, int vector)
 
 	/* Request a KVM exit to inform the userspace IOAPIC. */
 	if (irqchip_split(apic->vcpu->kvm)) {
+		/* EOI the ioapic only if the Directed EOI is disabled. */
+		if (kvm_lapic_get_reg(apic, APIC_SPIV) & APIC_SPIV_DIRECTED_EOI)
+			return;
+
 		apic->vcpu->arch.pending_ioapic_eoi = vector;
 		kvm_make_request(KVM_REQ_IOAPIC_EOI_EXIT, apic->vcpu);
 		return;
-- 
2.43.0

