linux-kernel - Re: [PATCH RT] vfio-pci: Set MSI/MSI-X ISR to non-threaded

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20171130210857.53b81ba3@gandalf.local.home>
Date:   Thu, 30 Nov 2017 21:08:57 -0500
From:   Steven Rostedt <rostedt@...dmis.org>
To:     David Su <david.w.su@...el.com>
Cc:     linux-kernel@...r.kernel.org, bigeasy@...utronix.de,
        tglx@...utronix.de
Subject: Re: [PATCH RT] vfio-pci: Set MSI/MSI-X ISR to non-threaded

On Thu, 30 Nov 2017 17:05:35 -0800
David Su <david.w.su@...el.com> wrote:

> Setting MSI/MSI-X ISR to be non-threaded will result in shorter and more
> deterministic IRQ delivery latencies to VFIO applications, because
> context switches to the ISR thread are eliminated.  This is important
> for applications with low latency requirement running in virtual
> machines on RT Linux host with assigned devices through vfio-pci.
> 
> A FPGA based interrupt testing device was used to compare latencies with
> threaded and non-threaded vfio-pci ISR.  The device has a free running
> time stamp counter and a register recording the time an interrupt was
> sent to the host.  With these registers the device driver and test
> application for the device are able to calculate and record the latency
> between the time an interrupt was sent and the time the ISR in the
> device's driver was invoked.
> 
> The result is with non-threaded vfio-pci ISR the average latency is
> reduced by about 54% and the maximum-minimum latency range is reduced by
> about 65%.
> 
> Non-threaded vfio-pci ISR:
> Minimum 4.18us, Average 4.47us, Maximum 10.26us
> 
> Threaded vfio-pci ISR:
> Minimum 8.97us, Average 9.65us, Maximum 26.11us
> 
> Signed-off-by: David Su <david.w.su@...el.com>
> ---
>  drivers/vfio/pci/vfio_pci_intrs.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/vfio/pci/vfio_pci_intrs.c b/drivers/vfio/pci/vfio_pci_intrs.c
> index 1c46045..4c54e56 100644
> --- a/drivers/vfio/pci/vfio_pci_intrs.c
> +++ b/drivers/vfio/pci/vfio_pci_intrs.c
> @@ -333,7 +333,7 @@ static int vfio_msi_set_vector_signal(struct vfio_pci_device *vdev,
>  		pci_write_msi_msg(irq, &msg);
>  	}
>  
> -	ret = request_irq(irq, vfio_msihandler, 0,
> +	ret = request_irq(irq, vfio_msihandler, IRQF_NO_THREAD,
>  			  vdev->ctx[vector].name, trigger);

Hmm, but we have this:
static irqreturn_t vfio_msihandler(int irq, void *arg)
{
	struct eventfd_ctx *trigger = arg;

	eventfd_signal(trigger, 1);
	return IRQ_HANDLED;
}

__u64 eventfd_signal(struct eventfd_ctx *ctx, __u64 n)
{
	unsigned long flags;

	spin_lock_irqsave(&ctx->wqh.lock, flags);
	if (ULLONG_MAX - ctx->count < n)
		n = ULLONG_MAX - ctx->count;
	ctx->count += n;
	if (waitqueue_active(&ctx->wqh))
		wake_up_locked_poll(&ctx->wqh, POLLIN);
	spin_unlock_irqrestore(&ctx->wqh.lock, flags);

	return n;
}

And spin_lock() turns into a mutex in PREEMPT_RT, which means it can
sleep. You can't sleep in hard interrupt context. This will eventually
crash the kernel.

And no, we are not going to convert the ctx->wqh.lock into a
raw_spin_lock.

-- Steve

>  	if (ret) {
>  		kfree(vdev->ctx[vector].name);