Message-ID: <87ms72g0zk.fsf@redhat.com>
Date: Wed, 10 Sep 2025 10:34:07 +0200
From: Vitaly Kuznetsov <vkuznets@...hat.com>
To: Khushit Shah <khushit.shah@...anix.com>
Cc: "seanjc@...gle.com" <seanjc@...gle.com>, "pbonzini@...hat.com"
<pbonzini@...hat.com>, "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, Shaju
Abraham <shaju.abraham@...anix.com>
Subject: Re: [BUG] [KVM/VMX] Level triggered interrupts mishandled on
Windows w/ nested virt(Credential Guard) when using split irqchip

Khushit Shah <khushit.shah@...anix.com> writes:

>> On 8 Sep 2025, at 5:12 PM, Vitaly Kuznetsov <vkuznets@...hat.com> wrote:
>>
...
>> Also, I've just recalled I fixed (well, 'workarounded') an issue similar
>> to yours a while ago in QEMU:
>>
>> commit 958a01dab8e02fc49f4fd619fad8c82a1108afdb
>> Author: Vitaly Kuznetsov <vkuznets@...hat.com>
>> Date: Tue Apr 2 10:02:15 2019 +0200
>>
>> ioapic: allow buggy guests mishandling level-triggered interrupts to make progress
>>
>> maybe something has changed and it doesn't work anymore?
>
> This is really interesting, we are facing a very similar issue, but the interrupt storm only occurs when using split-irqchip.
> Using kernel-irqchip, we do not even see consecutive level triggered interrupts of the same vector. From the logs it is
> clear that somehow with kernel-irqchip, L1 passes the interrupt to L2 to service, but with split-irqchip, L1 EOI’s without
> servicing the interrupt. As it is working properly on kernel-irqchip, we can’t really point to it as a Hyper-V issue. AFAIK,
> kernel-irqchip setting should be transparent to the guest, can you think of anything that can change this?

The problem I've fixed back then was also only visible with split
irqchip. The reason was:

"""
in-kernel IOAPIC implementation has commit 184564efae4d ("kvm: ioapic: conditionally delay
irq delivery duringeoi broadcast")
"""

so even though the guest cannot really distinguish between in-kernel and
split irqchips, the small differences in implementation can make a big
difference in the observed behavior. If we re-assert an improperly
handled level-triggered interrupt too fast, the guest is not able to
make much progress, but if we let it execute for even the tiniest
fraction of time, forward progress happens.

I don't exactly know what happens in this particular case but I'd
suggest you try to artificially delay re-asserting level-triggered
interrupts and see what happens.
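
To make the suggestion a bit more concrete, here is a rough, standalone
sketch of the idea: count EOIs that find the level-triggered line still
asserted and, past a threshold, defer the re-injection to a timer instead
of re-asserting immediately. This is a toy model only, not the QEMU or KVM
code; every name in it (toy_pin, toy_inject, toy_eoi, toy_timer_fired,
TOY_SUCCESSIVE_IRQ_MAX) and the threshold value are made up for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical threshold, kept tiny so the behaviour is easy to trace. */
#define TOY_SUCCESSIVE_IRQ_MAX 3

/* Toy model of a single level-triggered IOAPIC pin (illustrative only). */
struct toy_pin {
    bool line_asserted;    /* device still holds the line high */
    bool remote_irr;       /* delivered to the guest, EOI not yet seen */
    bool delayed_pending;  /* re-injection was deferred to a timer */
    uint32_t successive;   /* EOIs in a row that found the line asserted */
    uint32_t injections;   /* total deliveries, for observation */
};

/* Deliver the interrupt if the line is high and none is in flight. */
static void toy_inject(struct toy_pin *p)
{
    assert(p);
    if (p->line_asserted && !p->remote_irr) {
        p->remote_irr = true;
        p->injections++;
    }
}

/* Guest EOI: re-assert immediately, unless this looks like a storm. */
static void toy_eoi(struct toy_pin *p)
{
    assert(p);
    p->remote_irr = false;

    if (!p->line_asserted) {
        /* The interrupt was actually serviced: reset the storm counter. */
        p->successive = 0;
        return;
    }

    if (++p->successive >= TOY_SUCCESSIVE_IRQ_MAX) {
        /* Too many back-to-back re-assertions: let the guest run a bit. */
        p->delayed_pending = true;
        p->successive = 0;
        return;
    }

    toy_inject(p);
}

/* Timer callback: perform the re-injection that toy_eoi() deferred. */
static void toy_timer_fired(struct toy_pin *p)
{
    assert(p);
    if (p->delayed_pending) {
        p->delayed_pending = false;
        toy_inject(p);
    }
}
```

In a real experiment the "timer" would be whatever delayed-work mechanism
the irqchip implementation already has; the point is only that the guest
gets a window to execute between the EOI and the next delivery.
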
--
Vitaly