Date:   Fri, 29 Mar 2019 16:14:34 +0100
From:   Vitaly Kuznetsov <vkuznets@...hat.com>
To:     Paolo Bonzini <pbonzini@...hat.com>, kvm@...r.kernel.org
Cc:     Radim Krčmář <rkrcmar@...hat.com>,
        Liran Alon <liran.alon@...cle.com>,
        Sean Christopherson <sean.j.christopherson@...el.com>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH RFC] KVM: x86: vmx: throttle immediate exit through preemtion timer to assist buggy guests

Paolo Bonzini <pbonzini@...hat.com> writes:

> On 29/03/19 15:40, Vitaly Kuznetsov wrote:
>> Paolo Bonzini <pbonzini@...hat.com> writes:
>> 
>>> On 28/03/19 21:31, Vitaly Kuznetsov wrote:
>>>>
>>>> The 'hang' scenario develops like this:
>>>> 1) Hyper-V boots and QEMU is trying to inject two irq simultaneously. One
>>>>  of them is level-triggered. KVM injects the edge-triggered one and
>>>>  requests immediate exit to inject the level-triggered:
>>>>
>>>>  kvm_set_irq:          gsi 23 level 1 source 0
>>>>  kvm_msi_set_irq:      dst 0 vec 80 (Fixed|physical|level)
>>>>  kvm_apic_accept_irq:  apicid 0 vec 80 (Fixed|edge)
>>>>  kvm_msi_set_irq:      dst 0 vec 96 (Fixed|physical|edge)
>>>>  kvm_apic_accept_irq:  apicid 0 vec 96 (Fixed|edge)
>>>>  kvm_nested_vmexit_inject: reason EXTERNAL_INTERRUPT info1 0 info2 0 int_info 80000060 int_info_err 0
>>>>
>>>> 2) Hyper-V requires one of its VMs to run to handle the situation but
>>>>  immediate exit happens:
>>>>
>>>>  kvm_entry:            vcpu 0
>>>>  kvm_exit:             reason VMRESUME rip 0xfffff80006a40115 info 0 0
>>>>  kvm_entry:            vcpu 0
>>>>  kvm_exit:             reason PREEMPTION_TIMER rip 0xfffff8022f3d8350 info 0 0
>>>>  kvm_nested_vmexit:    rip fffff8022f3d8350 reason PREEMPTION_TIMER info1 0 info2 0 int_info 0 int_info_err 0
>>>>  kvm_nested_vmexit_inject: reason EXTERNAL_INTERRUPT info1 0 info2 0 int_info 80000050 int_info_err 0
>>>
>>> I suppose there was an EOI for vector 96 before this?
>> 
>> AFAIR: no, it seems that it is actually the VM it is trying to resume
>> (Windows partition?) which needs to do some work, and with a preemption
>> timer of 0 we don't allow it to.
>
> kvm_apic_accept_irq placed IRQ 96 in IRR, and Hyper-V should be running
> with "acknowledge interrupt on exit" since int_info is nonzero in
> kvm_nested_vmexit_inject.
>
> Therefore, at the kvm_nested_vmexit_inject tracepoint KVM should have
> set bit 96 in ISR; and because PPR is now 96, interrupt 80 should have
> never been delivered.  Unless 96 is an auto-EOI interrupt, in which case
> this comment would apply
>
>           /*
>            * For auto-EOI interrupts, there might be another pending
>            * interrupt above PPR, so check whether to raise another
>            * KVM_REQ_EVENT.
>            */
>
> IIRC there was an enlightenment to tell Windows "I support auto-EOI but
> please don't use it".  If this is what's happening, that would also fix it.
>
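
The priority argument above can be sketched in a few lines of C. This is only an illustration of the architectural PPR rule (priority class = vector bits 7:4), not KVM's actual code; the function names apic_ppr and apic_can_deliver are hypothetical, and TPR is assumed to be 0.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative helper (hypothetical name): compute the processor-priority
 * register from TPR and the highest in-service vector.
 * isrv < 0 means "no bit set in ISR", e.g. after an auto-EOI interrupt.
 */
static uint8_t apic_ppr(uint8_t tpr, int isrv)
{
    uint8_t isr_class = (isrv < 0) ? 0 : (uint8_t)(isrv & 0xF0);

    if ((tpr & 0xF0) >= isr_class)
        return tpr;
    return isr_class;
}

/* A pending vector is deliverable only if its class is above PPR's class. */
static bool apic_can_deliver(uint8_t vector, uint8_t ppr)
{
    return (vector & 0xF0) > (ppr & 0xF0);
}
```

With vector 96 in ISR, apic_ppr(0, 96) yields 0x60, so vector 80 (0x50) is held back. If 96 was auto-EOI, ISR stays empty, PPR is 0, and vector 80 becomes deliverable immediately, which matches the second kvm_nested_vmexit_inject in the trace.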

Oh, that's actually an interesting thought, thanks!

Indeed, there is CPUID 0x40000004.EAX Bit 9: Recommend deprecating
AutoEOI which we don't currently set. I'll try and report back.
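
For reference, a minimal sketch of what testing and setting that bit looks like, assuming the caller already has the EAX value of leaf 0x40000004. The macro and function names here are made up for illustration and are not taken from KVM or QEMU.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; bit position as described above (EAX bit 9). */
#define HV_CPUID_RECOMMENDATIONS_LEAF  0x40000004u
#define HV_RECOMMEND_DEPRECATE_AEOI    (UINT32_C(1) << 9)

/* True if the hypervisor recommends that the guest not use AutoEOI. */
static bool hv_recommends_deprecating_autoeoi(uint32_t eax)
{
    return (eax & HV_RECOMMEND_DEPRECATE_AEOI) != 0;
}

/* Host side: return EAX with the recommendation bit added. */
static uint32_t hv_set_deprecate_autoeoi(uint32_t eax)
{
    return eax | HV_RECOMMEND_DEPRECATE_AEOI;
}
```

How EAX is obtained (guest CPUID instruction, or the CPUID table the host exposes) is left out on purpose; the sketch only covers the bit manipulation.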

-- 
Vitaly
