[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.DEB.2.21.1908151622410.1908@nanos.tec.linutronix.de>
Date: Thu, 15 Aug 2019 16:58:27 +0200 (CEST)
From: Thomas Gleixner <tglx@...utronix.de>
To: Ben Luo <luoben@...ux.alibaba.com>
cc: alex.williamson@...hat.com, linux-kernel@...r.kernel.org,
tao.ma@...ux.alibaba.com, gerry@...ux.alibaba.com,
nanhai.zou@...ux.alibaba.com, linyunsheng@...wei.com
Subject: Re: [PATCH v3 2/3] genirq: introduce update_irq_devid()
Ben,
On Thu, 15 Aug 2019, Ben Luo wrote:
> Sometimes, only the dev_id field of irqaction need to be changed.
> E.g. KVM VM with device passthru via VFIO may switch irq injection
> path between KVM irqfd and userspace eventfd. These two paths
> share the same irq num and handler for the same vector of a device,
s/irq num/interrupt number/
Changelogs are text and should not contain cryptic abbreviations.
> only with different dev_id referencing to different fds' contexts.
>
> So, instead of free/request irq, only update dev_id of irqaction.
Please write functions like this: function_name() so they can be easily
identified in the text as such.
> This narrows the gap for setting up new irq (and irte, if has)
What does that mean: "narrows the gap"
What's the gap and why is it only made smaller and not closed?
> and also gains some performance benefit.
>
> Signed-off-by: Ben Luo <luoben@...ux.alibaba.com>
> Reviewed-by: Liu Jiang <gerry@...ux.alibaba.com>
I prefer to see the 'reviewed-by' as a reply on LKML rather than coming
from some internal process.
> Reviewed-by: Thomas Gleixner <tglx@...utronix.de>
While I reviewed the previous version, I surely did not give a
'Reviewed-by' tag. That tag means that the person did review the patch and
did not find an issue. I surely found issues, right?
> diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
> index 6f9b20f..a76ef61 100644
> --- a/kernel/irq/manage.c
> +++ b/kernel/irq/manage.c
> @@ -2064,6 +2064,84 @@ int request_threaded_irq(unsigned int irq, irq_handler_t handler,
> EXPORT_SYMBOL(request_threaded_irq);
>
> /**
> + * update_irq_devid - update irq dev_id to a new one
Can you please name this: irq_update_devid(). We try to have a consistent
name space for new functions.
> + * @irq: Interrupt line to update
> + * @dev_id: A cookie to find the irqaction to update
> + * @new_dev_id: New cookie passed to the handler function
Can you please arrange these in tabular fashion:
* @irq: Interrupt line to update
* @dev_id: A cookie to find the irqaction to update
* @new_dev_id: New cookie passed to the handler function
> + * Sometimes, only the cookie data need to be changed.
> + * Instead of free/request irq, only update dev_id here
> + * Not only to gain some performance benefit, but also
> + * reduce the risk of losing interrupt.
> + *
> + * E.g. irq affinity setting in a VM with VFIO passthru
Again. Please write it out 'interrupt' and everything else.
> + * device is carried out in a free-then-request-irq way
> + * since lack of this very function. The free_irq()
That does not make sense. The function is there for such a use case. So
immediately when the VFIO change is merged this comment is stale and bogus.
> + * eventually triggers deactivation of IR domain, which
> + * will cleanup IRTE. There is a gap before request_irq()
> + * finally setup the IRTE again. In this gap, a in-flight
> + * irq buffering in hardware layer may trigger DMAR fault
> + * and be lost.
Exactly this information wants to be in the changelog.
> + *
> + * On failure, it returns a negative value. On success,
> + * it returns 0
> + */
> +int update_irq_devid(unsigned int irq, void *dev_id, void *new_dev_id)
> +{
> + struct irq_desc *desc = irq_to_desc(irq);
> + struct irqaction *action, **action_ptr;
> + unsigned long flags;
> +
> + if (in_interrupt()) {
> + WARN(1, "Trying to update IRQ %d (dev_id %p to %p) from IRQ context!\n",
> + irq, dev_id, new_dev_id);
> + return -EPERM;
> + }
if (WARN(....)
> +
> + if (!desc)
> + return -EINVAL;
> +
> + chip_bus_lock(desc);
This does not need to take bus lock. The action pointer is sufficiently
protected by desc->lock.
> + raw_spin_lock_irqsave(&desc->lock, flags);
> +
> + /*
> + * There can be multiple actions per IRQ descriptor, find the right
> + * one based on the dev_id:
> + */
> + action_ptr = &desc->action;
> + for (;;) {
> + action = *action_ptr;
> +
> + if (!action) {
> + raw_spin_unlock_irqrestore(&desc->lock, flags);
> + chip_bus_sync_unlock(desc);
> + WARN(1, "Trying to update already-free IRQ %d (dev_id %p to %p)\n",
> + irq, dev_id, new_dev_id);
> + return -ENXIO;
> + }
> +
> + if (action->dev_id == dev_id) {
> + action->dev_id = new_dev_id;
> + break;
> + }
> + action_ptr = &action->next;
> + }
> +
> + raw_spin_unlock_irqrestore(&desc->lock, flags);
> + chip_bus_sync_unlock(desc);
> +
> + /*
> + * Make sure it's not being used on another CPU:
> + * There is a risk of UAF for old *dev_id, if it is
> + * freed in a short time after this func returns
function please. Also it does not matter whether the time is short or
not. The point is:
Ensure that an interrupt in flight on another CPU which uses the
old 'dev_id' has completed because the caller can free the memory
to which it points after this function returns.
But this has another twist:
CPU0 CPU1
interrupt
primary_handler(old_dev_id)
do_stuff_on(old_dev_id)
return WAKE_THREAD; update_dev_id()
wakeup_thread();
action->dev_id = new_dev_id;
irq_thread()
secondary_handler(new_dev_id)
That's broken and synchronize_irq() does not protect against it.
> + */
> + synchronize_irq(irq);
Thanks,
tglx
Powered by blists - more mailing lists