linux-kernel - Re: [PATCH v2] KVM: x86: avoid redundant REQ

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Date:   Sun, 18 Dec 2016 22:03:57 +0100
From:   Paolo Bonzini <pbonzini@...hat.com>
To:     Roman Kagan <rkagan@...tuozzo.com>,
        Radim Krčmář <rkrcmar@...hat.com>,
        Denis Plotnikov <dplotnikov@...tuozzo.com>, den@...tuozzo.com,
        kvm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] KVM: x86: avoid redundant REQ_EVENT



On 15/12/2016 15:56, Roman Kagan wrote:
> On Thu, Dec 15, 2016 at 03:32:45PM +0100, Paolo Bonzini wrote:
>>
>>
>> On 15/12/2016 15:30, Radim Krčmář wrote:
>>>
>>> One useless round of KVM_REQ_EVENT is not going change nested
>>> performance by much and it is not the only thing we could improve wrt.
>>> TPR ... I would just leave it for now and take care of it when we
>>>  * don't to update PPR at all with APICv -- it is already correct
>>>  * drop the KVM_REQ_EVENT with flex priority, because lower TPR cannot
>>>    unmask an interrupt
>>
>> I agree.  I still don't like the patch very much, because I feel like an
>> explicit state machine ("can KVM_REQ_EVENT do anything?") would be more
>> maintainable.
> 
> We all seem to share that feeling towards this patch :)  That's the
> reason why it was baking here internally for a long time: Denis
> discovered this scenario over a month ago while analyzing the
> performance regressions in KVM against our proprietary hypervisor, but
> pinning down a palatable and safe fix turned out to be a challenge.  
> 
> I think we did our best to stay safe; I agree that it ended up no so
> beautiful indeed.

Yes, and it's fine.  Now the thing to do is to find something that
perhaps looks less safe, but actually shows the underlying invariants
better.  It doesn't even need any kind of state machine, because the
state machine is already implicit in the vmexits.

In particular, KVM_REQ_EVENT is only needed in the following cases:

1) an interrupt is injected (we could optimize self-IPIs but we don't)

2) PPR is lowered so that old-PPR >= IRR and new-PPR < IRR; this can
only happen:

2.1) from a TPR write (CR8-write vmexit, TPR-below-threshold vmexit, TPR
write through MMIO or MSR)

2.2) from an EOI's clearing of a bit in ISR

3) the interrupt window opens


In particular, this list does not include:

- setting a bit in ISR, as in kvm_get_apic_interrupt

- clearing a bit in IRR, again as in kvm_get_apic_interrupt

- anything on EOI besides what apic_update_ppr already does

- changes to PPR that didn't cause a vmexit: if the TPR changes but it
doesn't trigger a vmexit---because of TPR_THRESHOLD on vmx or
update_cr8_intercept on svm---then the TPR change must not have unmasked
an interrupt, and KVM_REQ_EVENT is unnecessary.


So in the end, as in the APICv case but hopefully with less pitfalls,
the extra KVM_REQ_EVENT are not only slowing down the code, but the
apparent extra safety has the cost of obfuscating it.  I'll see if I can
come up with something not too ugly.

Paolo