Message-ID: <73e55aba-dc71-ae36-d491-a6afed844c9a@redhat.com>
Date: Thu, 21 May 2020 23:04:10 +0200
From: Paolo Bonzini <pbonzini@...hat.com>
To: Vitaly Kuznetsov <vkuznets@...hat.com>
Cc: wei.huang2@....com, cavery@...hat.com,
Sean Christopherson <sean.j.christopherson@...el.com>,
Oliver Upton <oupton@...gle.com>,
Jim Mattson <jmattson@...gle.com>,
linux-kernel@...r.kernel.org, kvm@...r.kernel.org
Subject: Re: [PATCH v2 03/22] KVM: SVM: immediately inject INTR vmexit
On 21/05/20 16:08, Paolo Bonzini wrote:
> On 21/05/20 14:50, Vitaly Kuznetsov wrote:
>> Sorry for reporting this late but I just found out that this commit
>> breaks Hyper-V 2016 on KVM on SVM completely (always hangs on boot). I
>> haven't investigated it yet (well, this is Windows, you know...) but
>> what's usually different about Hyper-V is that unlike KVM/Linux it has
>> handlers for some hardware interrupts in the guest and not in the
>> hypervisor.
>
> "Always hangs on boot" is easy. :) At this point I think it's easiest
> to debug it on top of the whole pending SVM patches that remove
> exit_required completely (and exit_required is not coming back anyway).
Ok, so there could be two bugs, as the hang seems to happen much earlier
with the later patches in the series (try "grep int_ctl:.0x.....1.." on the trace).
As one could guess from the grep, one thing that is certainly different
between KVM and Hyper-V is that Hyper-V injects interrupts using
int_ctl (sometimes it also uses eventinj, but presumably it's just
copying it from exitintinfo).
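For reference, the grep above picks out trace lines where bits 11:8 of int_ctl read 1, i.e. V_IRQ is set. A minimal sketch of the same filter plus a field decoder follows. It assumes the bit layout from the kernel's arch/x86/include/asm/svm.h (V_TPR in bits 3:0, V_IRQ at bit 8, V_INTR_PRIO in bits 19:16, V_IGN_TPR at bit 20, V_INTR_MASKING at bit 24) and that the trace prints int_ctl as eight hex digits; the function names and the sample line format are hypothetical.

```python
import re

# Bit layout assumed from arch/x86/include/asm/svm.h.
V_TPR_MASK = 0x0F
V_IRQ_MASK = 1 << 8
V_INTR_PRIO_SHIFT = 16
V_INTR_PRIO_MASK = 0x0F << V_INTR_PRIO_SHIFT
V_IGN_TPR_MASK = 1 << 20
V_INTR_MASKING_MASK = 1 << 24

# Same expression as the grep: five hex digits, then a nibble equal to 1
# (bits 11:8, so V_IRQ set and bits 11:9 clear), then two more digits.
PATTERN = re.compile(r"int_ctl:.0x.....1..")


def lines_with_virq(lines):
    """Return the trace lines that the grep expression would print."""
    return [line for line in lines if PATTERN.search(line)]


def decode_int_ctl(value):
    """Split a raw int_ctl value into the fields relevant to V_IRQ."""
    return {
        "v_tpr": value & V_TPR_MASK,
        "v_irq": bool(value & V_IRQ_MASK),
        "v_intr_prio": (value & V_INTR_PRIO_MASK) >> V_INTR_PRIO_SHIFT,
        "v_ign_tpr": bool(value & V_IGN_TPR_MASK),
        "v_intr_masking": bool(value & V_INTR_MASKING_MASK),
    }
```

Note that the expression only matches when the nibble is exactly 1, so a value with other bits set in 11:9 would be missed.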
This could cause problems: for example, when L1 wants to inject a
virtual interrupt into L2 that has interrupts disabled or V_TPR >=
V_INTR_PRIO, and KVM also wants to inject an interrupt to L1, then KVM
might end up stomping on Hyper-V's int_ctl. However, I cannot think
off-hand of a scenario where this could happen in this case, because
Hyper-V does set EXIT_INTR and therefore we should never get into
enable_irq_window while L2 is running. Still, that's one place where
I'd start adding some trace_printk's.
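To make the "pending but not deliverable" condition concrete, here is a rough sketch of the check described above: a virtual interrupt stays pending when L2 has interrupts disabled or V_TPR >= V_INTR_PRIO. It assumes the same int_ctl bit layout as arch/x86/include/asm/svm.h and, as my reading of the AMD manual, that V_IGN_TPR bypasses the priority comparison. The helper name is hypothetical, and the sketch ignores the V_INTR_MASKING subtleties about which interrupt flag applies.

```python
# Bit layout assumed from arch/x86/include/asm/svm.h.
V_TPR_MASK = 0x0F
V_IRQ_MASK = 1 << 8
V_INTR_PRIO_SHIFT = 16
V_IGN_TPR_MASK = 1 << 20


def virq_is_blocked(int_ctl, guest_if):
    """True when V_IRQ is pending in int_ctl but cannot be taken yet.

    guest_if is the interrupt flag seen by the guest that would take the
    virtual interrupt (a simplification, see the note above).
    """
    if not int_ctl & V_IRQ_MASK:
        return False  # nothing pending, so nothing is blocked
    if not guest_if:
        return True  # interrupts disabled in the guest
    if int_ctl & V_IGN_TPR_MASK:
        return False  # assumed: priority check is skipped
    v_tpr = int_ctl & V_TPR_MASK
    v_intr_prio = (int_ctl >> V_INTR_PRIO_SHIFT) & 0x0F
    return v_tpr >= v_intr_prio
```

A trace_printk that dumps int_ctl and RFLAGS.IF around enable_irq_window would show whether a state like this is present when KVM rewrites int_ctl.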
Also, if a uniprocessor guest fails too, that setup might be easier to debug.
Paolo