linux-kernel - Re: [PATCH RFC 10/39] KVM: x86/xen: support upcall vector

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1af00fa4-03b8-a059-d859-5cfd71ef10f4@oracle.com>
Date:   Tue, 8 Dec 2020 22:35:45 -0800
From:   Ankur Arora <ankur.a.arora@...cle.com>
To:     David Woodhouse <dwmw2@...radead.org>,
        Joao Martins <joao.m.martins@...cle.com>, karahmed@...zon.de
Cc:     Boris Ostrovsky <boris.ostrovsky@...cle.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Radim Krčmář <rkrcmar@...hat.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        "H. Peter Anvin" <hpa@...or.com>, x86@...nel.org,
        kvm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH RFC 10/39] KVM: x86/xen: support upcall vector

On 2020-12-08 8:08 a.m., David Woodhouse wrote:
> On Wed, 2020-12-02 at 19:02 +0000, David Woodhouse wrote:
>>
>>> I feel we could just accommodate it as subtype in KVM_XEN_ATTR_TYPE_CALLBACK_VIA.
>>> Don't see the adavantage in having another xen attr type.
>>
>> Yeah, fair enough.
>>
>>> But kinda have mixed feelings in having kernel handling all event channels ABI,
>>> as opposed to only the ones userspace asked to offload. It looks a tad unncessary besides
>>> the added gain to VMMs that don't need to care about how the internals of event channels.
>>> But performance-wise it wouldn't bring anything better. But maybe, the former is reason
>>> enough to consider it.
>>
>> Yeah, we'll see. Especially when it comes to implementing FIFO event
>> channels, I'd rather just do it in one place — and if the kernel does
>> it anyway then it's hardly difficult to hook into that.
>>
>> But I've been about as coherent as I can be in email, and I think we're
>> generally aligned on the direction. I'll do some more experiments and
>> see what I can get working, and what it looks like.
> 
> 
> So... I did some more typing, and revived our completely userspace
> based implementation of event channels. I wanted to declare that such
> was *possible*, and that using the kernel for IPI and VIRQ was just a
> very desirable optimisation.
> 
> It looks like Linux doesn't use the per-vCPU upcall vector that you
> called 'KVM_XEN_CALLBACK_VIA_EVTCHN'. So I'm delivering interrupts via
> KVM_INTERRUPT as if they were ExtINT....
> 
> ... except I'm not. Because the kernel really does expect that to be an
> ExtINT from a legacy PIC, and kvm_apic_accept_pic_intr() only returns
> true if LVT0 is set up for EXTINT and unmasked.
> 
> I messed around with this hack and increasingly desperate variations on
> the theme (since this one doesn't cause the IRQ window to be opened to
> userspace in the first place), but couldn't get anything working:

Increasingly desperate variations,  about sums up my process as well while
trying to get the upcall vector working.

> 
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -2380,6 +2380,9 @@ int kvm_apic_accept_pic_intr(struct kvm_vcpu *vcpu)
>          if ((lvt0 & APIC_LVT_MASKED) == 0 &&
>              GET_APIC_DELIVERY_MODE(lvt0) == APIC_MODE_EXTINT)
>                  r = 1;
> +       /* Shoot me. */
> +       if (vcpu->arch.pending_external_vector == 243)
> +               r = 1;
>          return r;
>   }
>   
> 
> Eventually I resorted to delivering the interrupt through the lapic
> *anyway* (through KVM_SIGNAL_MSI with an MSI message constructed for
> the appropriate vCPU/vector) and the following hack to auto-EOI:
> 
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -2416,7 +2419,7 @@ int kvm_get_apic_interrupt(struct kvm_vcpu *vcpu)
>           */
>   
>          apic_clear_irr(vector, apic);
> -       if (test_bit(vector, vcpu_to_synic(vcpu)->auto_eoi_bitmap)) {
> +       if (vector == 243 || test_bit(vector, vcpu_to_synic(vcpu)->auto_eoi_bitmap)) {
>                  /*
>                   * For auto-EOI interrupts, there might be another pending
>                   * interrupt above PPR, so check whether to raise another
> 
> 
> That works, and now my guest finishes the SMP bringup (and gets as far
> as waiting on the XenStore implementation that I haven't put back yet).
> 
> So I think we need at least a tiny amount of support in-kernel for
> delivering event channel interrupt vectors, even if we wanted to allow
> for a completely userspace implementation.
> 
> Unless I'm missing something?

I did use the auto_eoi hack as well. So, yeah, I don't see any way of
getting around this.

Also, IIRC we had eventually gotten rid of the auto_eoi approach
because that wouldn't work with APICv. At that point we resorted to
direct queuing for vectored callbacks which was a hack that I never
grew fond of...
  
> I will get on with implementing the in-kernel handling with IRQ routing
> entries targeting a given { port, vcpu }. And I'm kind of vacillating
> about whether the mode/vector should be separately configured, or
> whether they might as well be in the IRQ routing table too, even if
> it's kind of redundant because it's specified the same for *every* port
> targeting the same vCPU. I *think* I prefer that redundancy over having
> a separate configuration mechanism to set the vector for each vCPU. But
> we'll see what happens when my fingers do the typing...
> 

Good luck to your fingers!

Ankur