Date:   Wed, 9 Dec 2020 13:26:55 +0000
From:   Joao Martins <joao.m.martins@...cle.com>
To:     David Woodhouse <dwmw2@...radead.org>,
        Ankur Arora <ankur.a.arora@...cle.com>, karahmed@...zon.de
Cc:     Boris Ostrovsky <boris.ostrovsky@...cle.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Radim Krčmář <rkrcmar@...hat.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        "H. Peter Anvin" <hpa@...or.com>, x86@...nel.org,
        kvm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH RFC 10/39] KVM: x86/xen: support upcall vector

On 12/9/20 11:39 AM, David Woodhouse wrote:
> On Wed, 2020-12-09 at 10:51 +0000, Joao Martins wrote:
>> Isn't this what the first half of this patch was doing initially (minus the
>> irq routing) ? Looks really similar:
>>
>> https://lore.kernel.org/kvm/20190220201609.28290-11-joao.m.martins@oracle.com/
> 
> Absolutely! This thread is in reply to your original posting of
> precisely that patch, and I've had your tree open in gitk to crib from
> for most of the last week.
> 
I forgot about this patch given all the discussion so far, and had to go look again because
your snippet reminded me of it. But I ended up being a little pedantic -- sorry about that.

> There's a *reason* why my tree looks like yours, and why more than half
> of the patches in it still show you as being the author :)
> 
Btw, in this patch it would be Ankur's.

More importantly, thanks a lot for picking it up and for all the awesome stuff you're
doing with it.

>> Albeit, I gotta say that after seeing the irq routing removed it ends up much simpler, if
>> we just replace the irq routing with a domain-wide upcall vector ;)
> 
> I like "simpler".
> 
> I also killed the ->cb.queued flag you had because I think it's
> redundant with evtchn_upcall_pending anyway.
> 
Yeap, indeed.
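To make the redundancy concrete, here's a toy, self-contained model (all names invented; not the real KVM or Xen structures) of why the guest-visible evtchn_upcall_pending bit already carries the "queued" state, so a separate ->cb.queued flag would only shadow it:

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for the per-vCPU shared-info state; NOT the real layout. */
struct toy_vcpu_info {
	uint8_t evtchn_upcall_pending;	/* guest-visible "upcall needed" bit */
	uint8_t evtchn_upcall_mask;	/* guest-visible upcall mask */
};

/* Delivery side: marking an event pending is what "queues" the upcall. */
static void toy_queue_upcall(struct toy_vcpu_info *vi)
{
	vi->evtchn_upcall_pending = 1;
}

/* Injection side: the same bit answers "is an upcall queued?", so there
 * is nothing left for a separate ->cb.queued flag to track. */
static int toy_upcall_queued(const struct toy_vcpu_info *vi)
{
	return vi->evtchn_upcall_pending && !vi->evtchn_upcall_mask;
}
```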

>> Albeit it wouldn't cover the Windows Xen open source drivers, which use the EVTCHN method
>> (which is a regular LAPIC vector) [to answer your question on what uses it]. For EVTCHN
>> you can just inject a regular vector through LAPIC delivery, and the guest acks it. Sadly
>> Linux doesn't use it, and if it did we would probably get faster upcalls with APICv/AVIC.
> 
> It doesn't need to, because those can just be injected with
> KVM_SIGNAL_MSI.
> 
/me nods
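For reference, a sketch of the MSI encoding userspace could hand to the KVM_SIGNAL_MSI ioctl to deliver such a plain LAPIC vector. The struct here is a stand-in rather than the real struct kvm_msi, and the encoding assumes x86 physical destination mode with fixed delivery, edge-triggered:

```c
#include <stdint.h>

#define MSI_ADDR_BASE	0xfee00000u	/* x86 MSI address window */

/* Toy stand-in for struct kvm_msi (address_lo/address_hi/data fields). */
struct toy_msi {
	uint32_t address_lo;
	uint32_t address_hi;
	uint32_t data;
};

static struct toy_msi toy_encode_msi(uint8_t apic_id, uint8_t vector)
{
	struct toy_msi msi = {
		/* destination APIC ID lives in bits 19:12 of the address */
		.address_lo = MSI_ADDR_BASE | ((uint32_t)apic_id << 12),
		.address_hi = 0,
		/* vector in bits 7:0; zero delivery-mode/trigger bits mean
		 * fixed delivery, edge-triggered */
		.data = vector,
	};
	return msi;
}

/* The result would then go to ioctl(vm_fd, KVM_SIGNAL_MSI, &msi) using
 * the real struct kvm_msi; only the encoding is shown here. */
```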

> At most, we just need to make sure that kvm_xen_has_interrupt() returns
> false if the per-vCPU LAPIC vector is configured. But I didn't do that
> because I checked Xen and it doesn't do it either.
> 
Oh! I have this strange recollection that it did, from when we were looking at the Xen
implementation.

> As far as I can tell, Xen's hvm_vcpu_has_pending_irq() will still
> return the domain-wide vector in preference to the one in the LAPIC, if
> it actually gets invoked. 

Only if the callback installed is HVMIRQ_callback_vector IIUC.

Otherwise the vector would be pending like any other LAPIC vector.

> And if the guest sets ->evtchn_upcall_pending
> to reinject the IRQ (as Linux does on unmask) Xen won't use the per-
> vCPU vector to inject that; it'll use the domain-wide vector.
> 
Right -- I don't think Linux even installs a per-CPU upcall LAPIC vector other than the
domain's callback irq.
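The priority David describes could be modelled roughly like this -- a toy, self-contained sketch with invented names (the real logic lives in Xen's hvm_vcpu_has_pending_irq()), showing the domain-wide callback vector winning over a per-vCPU LAPIC vector whenever the callback type is the vector method and an event channel is pending:

```c
#include <stdint.h>

/* Toy stand-in for the domain callback type (cf. HVMIRQ_callback_vector). */
enum toy_callback_via { TOY_CB_NONE, TOY_CB_VECTOR, TOY_CB_GSI };

struct toy_vcpu {
	enum toy_callback_via via;	/* domain-wide callback method */
	uint8_t domain_vector;		/* domain-wide upcall vector */
	uint8_t lapic_vector;		/* per-vCPU LAPIC vector, 0 = none */
	uint8_t evtchn_upcall_pending;
};

/* Returns the vector that would be reported first, 0 if none. */
static uint8_t toy_pending_vector(const struct toy_vcpu *v)
{
	if (v->via == TOY_CB_VECTOR && v->evtchn_upcall_pending)
		return v->domain_vector;	/* wins over the LAPIC vector */
	return v->lapic_vector;			/* plain LAPIC delivery path */
}
```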

>>> I'm not sure that condition wasn't *already* broken some cases for
>>> KVM_INTERRUPT anyway. In kvm_vcpu_ioctl_interrupt() we set
>>> vcpu->arch.pending_userspace_vector and we *do* request KVM_REQ_EVENT,
>>> sure.
>>>
>>> But what if vcpu_enter_guest() processes that the first time (and
>>> clears KVM_REQ_EVENT), and then exits for some *other* reason with
>>> interrupts still disabled? Next time through vcpu_enter_guest(), even
>>> though kvm_cpu_has_injectable_intr() is still true, we don't enable the
>>> IRQ window vmexit because KVM_REQ_EVENT got lost, so we don't even call
>>> inject_pending_event().
>>>
>>> So instead of just kvm_xen_has_interrupt() in my patch below, I wonder
>>> if we need to use kvm_cpu_has_injectable_intr() to fix the existing
>>> bug? Or am I missing something there and there isn't a bug after all?
>>
>> Given that event channel pending state is Xen-specific handling, I am not sure we can
>> remove the kvm_xen_has_interrupt()/kvm_xen_get_interrupt() logic. That's much of the
>> reason we ended up with a check on vmenter for pending event channels.
> 
> Sure, we definitely need the check I added in vcpu_enter_guest() for
> Xen unless I'm going to come up with a way to set KVM_REQ_EVENT at the
> appropriate time.
> 
> But I'm looking at the ExtINT handling and as far as I can tell it's
> buggy. So I didn't want to use it as a model for setting KVM_REQ_EVENT
> for Xen events.
> 
/me nods
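To illustrate the lost-KVM_REQ_EVENT scenario David describes, here's a toy, self-contained control-flow model (all names invented; none of this is the actual vcpu_enter_guest() code). It starts from the state after the request was already consumed with guest IRQs disabled, and shows that gating only on the request misses the still-injectable interrupt, while additionally checking kvm_cpu_has_injectable_intr() would re-request the IRQ window:

```c
#include <stdbool.h>

/* Toy per-vCPU state, after KVM_REQ_EVENT was consumed on an earlier pass. */
struct toy_vcpu_state {
	bool req_event;		/* models KVM_REQ_EVENT */
	bool injectable_intr;	/* models kvm_cpu_has_injectable_intr() */
	bool guest_irqs_enabled;
	bool irq_window_requested;
};

/* One simplified pass of guest entry; check_injectable selects between
 * the existing gate (request only) and the suggested fix. */
static void toy_enter_guest(struct toy_vcpu_state *v, bool check_injectable)
{
	bool should_check = v->req_event ||
			    (check_injectable && v->injectable_intr);

	v->req_event = false;			/* the request is consumed */
	if (!should_check)
		return;				/* interrupt silently stuck */

	if (v->guest_irqs_enabled && v->injectable_intr)
		v->injectable_intr = false;	/* inject it now */
	else if (v->injectable_intr)
		v->irq_window_requested = true;	/* ask for an IRQ window */
}
```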
