Message-ID: <alpine.DEB.2.02.1305212242540.4799@kaball.uk.xensource.com>
Date: Tue, 21 May 2013 22:50:09 +0100
From: Stefano Stabellini <stefano.stabellini@...citrix.com>
To: Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>
CC: Stefano Stabellini <stefano.stabellini@...citrix.com>,
David Vrabel <david.vrabel@...rix.com>,
"xen-devel@...ts.xensource.com" <xen-devel@...ts.xensource.com>,
Feng Jin <joe.jin@...cle.com>,
Zhenzhong Duan <zhenzhong.duan@...cle.com>,
Yuval Shaia <yuval.shaia@...cle.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Chien Yen <chien.yen@...cle.com>,
Ingo Molnar <mingo@...hat.com>,
"H. Peter Anvin" <hpa@...or.com>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [Xen-devel] [PATCH] xen: reuse the same pirq allocated when
driver load first time
On Tue, 21 May 2013, Konrad Rzeszutek Wilk wrote:
> On Tue, May 21, 2013 at 05:51:02PM +0100, Stefano Stabellini wrote:
> > On Tue, 21 May 2013, Konrad Rzeszutek Wilk wrote:
> > > > Looking at the hypervisor code I couldn't see anything obviously wrong.
> > >
> > > I think the culprit is "physdev_unmap_pirq":
> > >
> > >     if ( is_hvm_domain(d) )
> > >     {
> > >         spin_lock(&d->event_lock);
> > >         gdprintk(XENLOG_WARNING,"d%d, pirq: %d is %x %s, irq: %d\n",
> > >                  d->domain_id, pirq, domain_pirq_to_emuirq(d, pirq),
> > >                  domain_pirq_to_emuirq(d, pirq) == IRQ_UNBOUND ? "unbound" : "",
> > >                  domain_pirq_to_irq(d, pirq));
> > >
> > >         if ( domain_pirq_to_emuirq(d, pirq) != IRQ_UNBOUND )
> > >             ret = unmap_domain_pirq_emuirq(d, pirq);
> > >         spin_unlock(&d->event_lock);
> > >         if ( domid == DOMID_SELF || ret )
> > >             goto free_domain;
> > >
> > > It always tells me unbound:
> > >
> > > (XEN) physdev.c:237:d14 14, pirq: 54 is ffffffff
> > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > (XEN) physdev.c:237:d14 14, pirq: 53 is ffffffff
> > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > (XEN) physdev.c:237:d14 14, pirq: 52 is ffffffff
> > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > (XEN) physdev.c:237:d14 14, pirq: 51 is ffffffff
> > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > (XEN) physdev.c:237:d14 14, pirq: 50 is ffffffff
> > > (XEN) irq.c:1873:d14 14, nr_pirqs: 56
> > > (a bit older debug code, so the 'unbound' does not show up here).
> > >
> > > Which means that the call to unmap_domain_pirq_emuirq does not happen.
> > > The checks in unmap_domain_pirq_emuirq also look to depend
> > > on the code being IRQ_UNBOUND.
> > >
> > > In other words, all of that code looks to only clear things when
> > > they are !IRQ_UNBOUND.
> > >
> > > But the other logic (IRQ_UNBOUND) looks to be missing a removal
> > > in the radix tree:
> > >
> > >     if ( emuirq != IRQ_PT )
> > >         radix_tree_delete(&d->arch.hvm_domain.emuirq_pirq, emuirq);
> > >
> > > And I think that is what is causing the leak - the radix tree
> > > needs to be pruned? Or perhaps the allocate_pirq should check
> > > the radix tree for IRQ_UNBOUND ones and re-use them?
> >
> > I think that you are looking in the wrong place.
> > The issue is that QEMU doesn't call pt_msi_disable in
> > pt_msgctrl_reg_write if (!(val & PCI_MSI_FLAGS_ENABLE)).
>
> In my test-case I am not even calling QEMU. I am just doing two
> hypercalls - get_free_pirq and unmap.
> >
> > The code above is correct as is because it is trying to handle emulated
> > IRQs and MSIs, not real passthrough MSIs. The latter are not added to
> > that radix tree, see physdev_hvm_map_pirq and physdev_map_pirq.
>
> The bug is in the hypervisor. This little patch solves the test-case
> (I hadn't tried to do the PCI passthrough yet)
>
>
> diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
> index b0b0c65..b78717a 100644
> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -1851,8 +1851,8 @@ static int pirq_guest_force_unbind(struct domain *d, struct pirq *pirq)
> static inline bool_t is_free_pirq(const struct domain *d,
> const struct pirq *pirq)
> {
> - return !pirq || (!pirq->arch.irq && (!is_hvm_domain(d) ||
> - pirq->arch.hvm.emuirq == IRQ_UNBOUND));
> + return !pirq || ((pirq->arch.irq == 0 || (pirq->arch.irq == PIRQ_ALLOCATED)) &&
> + (!is_hvm_domain(d) || pirq->arch.hvm.emuirq == IRQ_UNBOUND));
> }
>
> int get_free_pirq(struct domain *d, int type)
>
>
> The reason is that pirq->arch.irq in PHYSDEVOP_get_free_pirq is changed
> from the value of zero to -1 (PIRQ_ALLOCATED). Then in map_domain_pirq
> we check it first:
>
> 904 old_irq = domain_pirq_to_irq(d, pirq);
> .. snip..
> 1907 if ( (old_irq > 0 && (old_irq != irq) ) ||
>
> and since the 'old_irq' is -1 (or zero), and the irq passed in
> is different, then all checks pass and the value is over-written:
>
> 1988 set_domain_irq_pirq(d, irq, info);
>
> And that is it.
We have to be careful about this: the point of PHYSDEVOP_get_free_pirq is
that Linux can know for sure which pirq QEMU is going to use to map the
MSI. If you modify is_free_pirq that way, the pirq could suddenly be
allocated for something else after Linux has called
PHYSDEVOP_get_free_pirq and before QEMU has called xc_physdev_map_pirq_msi.

Right now the unmap is supposed to be done by QEMU, not Linux. So I
think that it is "normal" (although counterintuitive) that your little
test works that way.

A pirq allocated via PHYSDEVOP_get_free_pirq should be passed to QEMU,
mapped by QEMU, unmapped by QEMU and eventually freed by QEMU.

This is not the bestest interface ever written, of course, but that's how
it works now.
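
To make the concern concrete, here is a minimal sketch of the window I am
worried about. It is a toy model, not the hypervisor code: the table
pirq_irq[], the constants NR_PIRQS and PIRQ_FREE, and the helpers
is_free_current, is_free_patched, toy_get_free_pirq and
second_alloc_collides are all made up for illustration, and the downward
scan is only assumed from the decreasing pirq numbers in the log above.
Only the idea that get_free_pirq marks the entry as -1 (PIRQ_ALLOCATED)
comes from the discussion.

```c
#include <stdbool.h>
#include <string.h>

#define NR_PIRQS        8     /* hypothetical, kept small for the demo */
#define PIRQ_FREE       0     /* stand-in for pirq->arch.irq == 0 */
#define PIRQ_ALLOCATED  (-1)  /* reserved by get_free_pirq, not mapped yet */

/* Toy stand-in for the per-domain pirq->arch.irq values. */
static int pirq_irq[NR_PIRQS];

/* Current behaviour: a reserved (-1) pirq is NOT considered free. */
bool is_free_current(int pirq)
{
    return pirq_irq[pirq] == PIRQ_FREE;
}

/* Behaviour with the proposed change: a reserved pirq counts as free. */
bool is_free_patched(int pirq)
{
    return pirq_irq[pirq] == PIRQ_FREE ||
           pirq_irq[pirq] == PIRQ_ALLOCATED;
}

/* Scan downward for a free pirq and reserve it; -1 if none is left. */
int toy_get_free_pirq(bool (*is_free)(int))
{
    for (int i = NR_PIRQS - 1; i >= 1; i--) {
        if (is_free(i)) {
            pirq_irq[i] = PIRQ_ALLOCATED;
            return i;
        }
    }
    return -1;
}

/*
 * Linux reserves a pirq, then something else asks for one before QEMU
 * has mapped the first. Returns true if both callers got the same pirq.
 */
bool second_alloc_collides(bool (*is_free)(int))
{
    memset(pirq_irq, 0, sizeof(pirq_irq));     /* start with all free */
    int for_linux = toy_get_free_pirq(is_free);
    int for_other = toy_get_free_pirq(is_free);
    return for_linux == for_other;
}
```

In this model second_alloc_collides(is_free_current) is false, because the
reserved entry is skipped, while second_alloc_collides(is_free_patched) is
true: the second caller is handed the pirq that Linux believes it owns.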