linux-kernel - Re: [PATCH RFC] KVM: x86: vmx: throttle immediate exit through preemtion timer to assist buggy guests

Message-ID: <87d0m93frp.fsf@vitty.brq.redhat.com>
Date:   Fri, 29 Mar 2019 15:40:42 +0100
From:   Vitaly Kuznetsov <vkuznets@...hat.com>
To:     Paolo Bonzini <pbonzini@...hat.com>, kvm@...r.kernel.org
Cc:     Radim Krčmář <rkrcmar@...hat.com>,
        Liran Alon <liran.alon@...cle.com>,
        Sean Christopherson <sean.j.christopherson@...el.com>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH RFC] KVM: x86: vmx: throttle immediate exit through preemtion timer to assist buggy guests

Paolo Bonzini <pbonzini@...hat.com> writes:

> On 28/03/19 21:31, Vitaly Kuznetsov wrote:
>> 
>> The 'hang' scenario develops like this:
>> 1) Hyper-V boots and QEMU tries to inject two IRQs simultaneously. One
>>  of them is level-triggered. KVM injects the edge-triggered one and
>>  requests an immediate exit to inject the level-triggered one:
>> 
>>  kvm_set_irq:          gsi 23 level 1 source 0
>>  kvm_msi_set_irq:      dst 0 vec 80 (Fixed|physical|level)
>>  kvm_apic_accept_irq:  apicid 0 vec 80 (Fixed|edge)
>>  kvm_msi_set_irq:      dst 0 vec 96 (Fixed|physical|edge)
>>  kvm_apic_accept_irq:  apicid 0 vec 96 (Fixed|edge)
>>  kvm_nested_vmexit_inject: reason EXTERNAL_INTERRUPT info1 0 info2 0 int_info 80000060 int_info_err 0
>> 
>> 2) Hyper-V requires one of its VMs to run to handle the situation, but
>>  an immediate exit happens:
>> 
>>  kvm_entry:            vcpu 0
>>  kvm_exit:             reason VMRESUME rip 0xfffff80006a40115 info 0 0
>>  kvm_entry:            vcpu 0
>>  kvm_exit:             reason PREEMPTION_TIMER rip 0xfffff8022f3d8350 info 0 0
>>  kvm_nested_vmexit:    rip fffff8022f3d8350 reason PREEMPTION_TIMER info1 0 info2 0 int_info 0 int_info_err 0
>>  kvm_nested_vmexit_inject: reason EXTERNAL_INTERRUPT info1 0 info2 0 int_info 80000050 int_info_err 0
>
> I suppose there was an EOI for vector 96 before this?

AFAIR: no. It seems that it is actually the VM that Hyper-V is trying to
resume (the Windows partition?) which needs to do some work, and with a
preemption timer of 0 we don't allow it to.
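
To make the suspected livelock concrete, here is a toy Python model. It is
only a sketch and not KVM code: run_guest(), work_needed and timer_budget
are hypothetical names, and "work units" stand in for whatever the resumed
VM has to do before the level-triggered line is deasserted. It assumes each
forced exit re-injects the still-asserted interrupt.

```python
def run_guest(work_needed, timer_budget, max_entries=1000):
    """Toy model of repeated VM entries (illustrative only, not KVM code).

    work_needed:  units of work the guest must finish before the
                  level-triggered line is deasserted (hypothetical).
    timer_budget: units the guest may run per entry before the modelled
                  preemption timer forces an exit (0 = immediate exit).
    Returns (entries, injections, done).
    """
    remaining = work_needed
    injections = 0
    for entry in range(1, max_entries + 1):
        # The guest runs until the work is done or the timer fires.
        remaining -= min(timer_budget, remaining)
        if remaining == 0:
            return entry, injections, True
        # Line is still asserted, so the interrupt is injected again.
        injections += 1
    return max_entries, injections, False


if __name__ == "__main__":
    print(run_guest(work_needed=20, timer_budget=0))  # never completes
    print(run_guest(work_needed=20, timer_budget=5))  # completes
```

With a budget of 0 the model never makes progress no matter how many
entries it gets, while any nonzero budget eventually lets the work finish.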

>
> The main issue with your patch is that the preemption timer is buggy on
> some processors (it runs too fast) and on those processors we shouldn't
> use it with a nonzero deadline.  In particular, because it runs too fast,
> it may not hide the bug.
>
> I think level-triggered interrupts are required for the bug to show.
> Edge-triggered interrupts usually have to be acknowledged with a device
> register before the host device will trigger another interrupt; or at
> least the interrupt event, for example an incoming network packet, must
> happen again.  This way, when the guest hangs it puts some back pressure
> on the host.
>
> I think we should apply the same fix in QEMU that was made in the
> in-kernel IOAPIC.

Yes, I have this in my plan. Stay tuned!

-- 
Vitaly