linux-kernel - Re: [PATCH 0/5] KVM&genirq: Enable adaptive IRQ sharing for passed-through devices

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <4CFA2763.4060102@web.de>
Date:	Sat, 04 Dec 2010 12:34:59 +0100
From:	Jan Kiszka <jan.kiszka@....de>
To:	Thomas Gleixner <tglx@...utronix.de>
CC:	Avi Kivity <avi@...hat.com>, Marcelo Tosatti <mtosatti@...hat.com>,
	linux-kernel@...r.kernel.org, kvm <kvm@...r.kernel.org>,
	Tom Lyon <pugs@...co.com>,
	Alex Williamson <alex.williamson@...hat.com>,
	"Michael S. Tsirkin" <mst@...hat.com>
Subject: Re: [PATCH 0/5] KVM&genirq: Enable adaptive IRQ sharing for passed-through
 devices

Am 04.12.2010 11:37, Thomas Gleixner wrote:
> On Sat, 4 Dec 2010, Jan Kiszka wrote:
> 
>> Besides 3 cleanup patches, this series consists of two major changes.
>> The first introduces an interrupt sharing notifier to the genirq
>> subsystem. It fires when an interrupt line is about to be use by more
>> than one driver or the last but one user called free_irq.
>>
>> The second major change makes use of this interface in KVM's PCI pass-
>> through subsystem. KVM has to keep the interrupt source disabled while
>> calling into the guest to handle the event. This can be done at device
>> or line level. The former is required to share the interrupt line, the
>> latter is an order of magnitude faster (see patch 3 for details).
>>
>> Beside pass-through support of KVM, further users of the IRQ notifier
>> could become VFIO (not yet mainline) and uio_pci_generic which have to
>> resolve the same conflict.
> 
> Hmm. You basically want to have the following functionality:
> 
> If interrupt is shared, then you want to keep the current behaviour:
> 
>    disable at line level (IRQF_ONESHOT)
>    run handler thread (PCI level masking)
>    reenable at line level in irq_finalize_oneshot()
>    reenable at PCI level when guest is done

If the interrupt is shared, we must mask at PCI level inside the primary
handler as we also have to support non-threaded users of the same line.
So we always have a transition line-level -> device-level
masking in a primary handler.

> 
> If interrupt is not shared:
> 
>    disable at line level (IRQF_ONESHOT)
>    run handler thread (no PCI level masking)
>    no reenable at line level
>    reenable at line level when guest is done

We only use threaded handlers so far for hairy lock-dependency reasons
and as certain injection complexity is too much guest-controllable. We
hope to move most injection cases (ie. any IRQ directed to a single VCPU
/ user space task) directly into a primary handler in the future to
reduce the latency. So both threaded and non-threaded cases should be
addressable by any approach.

> 
> I think the whole notifier approach including the extra irq handlers
> plus the requirement to call disable_irq_nosync() from the non shared
> handler is overkill. We can be more clever.

Always welcome.

> 
> The genirq code knows whether you have one or more handler
> registered. So we can add IRQF_ONESHOT_UNMASK_SHARED and add a status
> field to irq_data (which I was going to do anyway for other
> reasons). In that status field you get a bit which says IRQ_MASK_DEVICE.
> 
> So with IRQF_ONESHOT_UNMASK_SHARED == 0 we keep the current behaviour.
> 
> If IRQF_ONESHOT_UNMASK_SHARED== 1 and IRQ_MASK_DEVICE == 1 we keep the
> current behaviour.
> 
> If IRQF_ONESHOT_UNMASK_SHARED== 1 and IRQ_MASK_DEVICE == 0 then then
> irq_finalize_oneshot() simply marks the interrupt disabled (equivalent
> to disable_irq_nosync()) and returns.

That sounds good.

> 
> Now in your primary irq handler you simply check the IRQ_MASK_DEVICE
> status flag and decide whether you need to mask at PCI level or not.
> 
> Your threaded handler gets the same information via IRQ_MASK_DEVICE so
> it can issue the appropriate user space notification depending on that
> flag.
> 
> This works fully transparent across adding and removing handlers. On
> request_irq/free_irq we update the IRQ_MASK_DEVICE flag with the
> following logic:
> 
>     nr_actions	IRQF_ONESHOT_UNMASK_SHARED	IRQ_MASK_DEVICE
>     1		0				1
>     1		1				0
>     >1		don't care			1
> 
> If interrupts are in flight accross request/free then this change
> takes effect when the next interrupt comes in.

For registration, that might be too late. We need to synchronize on
in-flight interrupts for that line and then ensure that it gets enabled
independently of the registered user. That user may have applied
outdated information, thus would block the line for too long if user
space decides to do something else.

I think this will be the trickier part of the otherwise nice & simple
concept.

> 
> No notifiers, no disable_irq_nosync() magic, less and simpler code.
> 
> Thoughts ?

Pulling the effect of disable_irq_nosync into genirq also for threaded
handling would remove the need for re-registering handlers. That's good.
But we need to think about the hand-over scenarios again. Maybe
solvable, but non-trivial.

Jan


Download attachment "signature.asc" of type "application/pgp-signature" (260 bytes)