Message-ID: <e4efdf4e-6454-6d51-50fd-5113999137e7@redhat.com>
Date:	Fri, 12 Aug 2016 13:43:35 +0200
From:	Paolo Bonzini <pbonzini@...hat.com>
To:	alex.popov@...ux.com
Cc:	Thomas Gleixner <tglx@...utronix.de>,
	Christoph Hellwig <hch@...radead.org>,
	Ingo Molnar <mingo@...hat.com>,
	Marc Zyngier <marc.zyngier@....com>,
	"H. Peter Anvin" <hpa@...or.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Kees Cook <keescook@...omium.org>,
	Dmitry Vyukov <dvyukov@...gle.com>,
	Jiang Liu <jiang.liu@...ux.intel.com>,
	Jason Cooper <jason@...edaemon.net>,
	Radim Krcmar <rkrcmar@...hat.com>,
	Joerg Roedel <joro@...tes.org>, linux-kernel@...r.kernel.org,
	x86@...nel.org, kvm@...r.kernel.org
Subject: Re: [PATCH 1/1] x86/apic: Introduce paravirq irq_domain



On 12/08/2016 12:56, Alexander Popov wrote:
> Maybe the name "paravirq" is not very good, I'll try to describe the idea.
> 
> There is some kernel module for special interactions between guest VMs.
> Currently it has to register an MSI-capable PCI device to handle interrupts
> injected by the hypervisor. And the bare-metal hypervisor has to emulate
> such a device for guest VMs.
> 
> So I've implemented paravirq irq_domain to avoid this redundant emulation.
> With it we can just call:
>  - paravirq_alloc_irq() to allocate a LAPIC irq;
>  - request_irq() for it;
>  - irqd_cfg(irq_get_irq_data()) to get the corresponding interrupt vector
> and inform the hypervisor about it.
> Now we happily handle the irq from the hypervisor when it injects this vector.
> 
> The irq_mask/irq_unmask parameters of paravirq_init_chip() are the pointers
> to the functions from the interaction module which ask the hypervisor to
> start/stop injecting interrupts to the guest VM.
> 
> Paravirq irq_domain makes it possible to avoid the PCI device emulation in
> the hypervisor and to run slimmer Linux guests built without PCI and MSI
> support.
> 
> Did I manage to answer your questions?

It's a bit clearer.  My concern is that the caller of paravirq_init_chip
has to provide irq_mask and irq_unmask, but it doesn't know who will
call paravirq_alloc_irq.  So there are two cases:

1) there is only one device, and then your solution doesn't scale well
to multiple devices

2) there is some kind of commonality between all devices using
paravirq_alloc_irq, and then it should be abstracted in a bus.

The latter would be similar to what Xen and Hyper-V do, for example.
Using PCI is more similar to the KVM approach.
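
To make the two cases concrete, here is a rough userspace sketch (not
kernel code, and not a proposed implementation).  Every name in it is
hypothetical.  model_init_chip() only mirrors the EEXIST/EINVAL checks
of paravirq_init_chip() from the patch, and model_bus_alloc_irq() is my
own assumption about the kind of per-device bookkeeping a bus layer
could own instead.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct pv_irq_ops {
	void (*mask)(unsigned int irq);
	void (*unmask)(unsigned int irq);
};

/* Case 1: a single global callback pair, as paravirq_init_chip() keeps. */
static struct pv_irq_ops global_ops;

static int model_init_chip(void (*mask)(unsigned int),
			   void (*unmask)(unsigned int))
{
	/* Same check order as the patch: a second caller is rejected. */
	if (global_ops.mask || global_ops.unmask)
		return -EEXIST;
	if (!mask || !unmask)
		return -EINVAL;

	global_ops.mask = mask;
	global_ops.unmask = unmask;
	return 0;
}

/* Case 2 (assumption): a bus-like layer stores the ops per allocated irq. */
#define MODEL_MAX_IRQS 4

static struct pv_irq_ops per_irq_ops[MODEL_MAX_IRQS];
static unsigned int next_irq;

static int model_bus_alloc_irq(const struct pv_irq_ops *ops)
{
	if (!ops || !ops->mask || !ops->unmask)
		return -EINVAL;
	if (next_irq >= MODEL_MAX_IRQS)
		return -ENOSPC;

	per_irq_ops[next_irq] = *ops;
	return (int)next_irq++;
}

/* Two imaginary devices; the counters record whose mask hook ran. */
static int masked_by_a, masked_by_b;

static void dev_a_mask(unsigned int irq)   { (void)irq; masked_by_a++; }
static void dev_a_unmask(unsigned int irq) { (void)irq; }
static void dev_b_mask(unsigned int irq)   { (void)irq; masked_by_b++; }
static void dev_b_unmask(unsigned int irq) { (void)irq; }
```

In this model the second device's call to model_init_chip() gets
-EEXIST, while both devices can obtain an irq from
model_bus_alloc_irq() and each irq ends up with its own mask/unmask
pair.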

Paolo

> Please correct me if the idea is wrong or there's a better way to do that.
> Thanks.
> 
>>> Signed-off-by: Alexander Popov <alex.popov@...ux.com>
>>> ---
>>>  arch/x86/Kconfig                 |   8 +++
>>>  arch/x86/include/asm/irqdomain.h |   6 ++
>>>  arch/x86/include/asm/paravirq.h  |   9 +++
>>>  arch/x86/kernel/apic/Makefile    |   2 +
>>>  arch/x86/kernel/apic/paravirq.c  | 128 +++++++++++++++++++++++++++++++++++++++
>>>  arch/x86/kernel/apic/vector.c    |   1 +
>>>  6 files changed, 154 insertions(+)
>>>  create mode 100644 arch/x86/include/asm/paravirq.h
>>>  create mode 100644 arch/x86/kernel/apic/paravirq.c
>>>
>>> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
>>> index 5c6e747..209bd88 100644
>>> --- a/arch/x86/Kconfig
>>> +++ b/arch/x86/Kconfig
>>> @@ -760,6 +760,14 @@ config PARAVIRT_TIME_ACCOUNTING
>>>  
>>>  	  If in doubt, say N here.
>>>  
>>> +config X86_PARAVIRQ
>>> +	bool "Enable paravirq irq_domain"
>>> +	depends on PARAVIRT && X86_LOCAL_APIC
>>> +	default n
>>> +	---help---
>>> +	  This option enables paravirq irq_domain for interrupts injected
>>> +	  by the hypervisor using Intel VT-x technology.
>>> +
>>>  config PARAVIRT_CLOCK
>>>  	bool
>>>  
>>> diff --git a/arch/x86/include/asm/irqdomain.h b/arch/x86/include/asm/irqdomain.h
>>> index d26075b..e3192f6 100644
>>> --- a/arch/x86/include/asm/irqdomain.h
>>> +++ b/arch/x86/include/asm/irqdomain.h
>>> @@ -60,4 +60,10 @@ extern void arch_init_htirq_domain(struct irq_domain *domain);
>>>  static inline void arch_init_htirq_domain(struct irq_domain *domain) { }
>>>  #endif
>>>  
>>> +#ifdef CONFIG_X86_PARAVIRQ
>>> +extern void arch_init_paravirq_domain(struct irq_domain *domain);
>>> +#else
>>> +static inline void arch_init_paravirq_domain(struct irq_domain *domain) { }
>>> +#endif
>>> +
>>>  #endif
>>> diff --git a/arch/x86/include/asm/paravirq.h b/arch/x86/include/asm/paravirq.h
>>> new file mode 100644
>>> index 0000000..a137de2
>>> --- /dev/null
>>> +++ b/arch/x86/include/asm/paravirq.h
>>> @@ -0,0 +1,9 @@
>>> +#ifndef _ASM_X86_PARAVIRQ_H
>>> +#define _ASM_X86_PARAVIRQ_H
>>> +
>>> +int paravirq_init_chip(void (*irq_mask)(struct irq_data *data),
>>> +				void (*irq_unmask)(struct irq_data *data));
>>> +int paravirq_alloc_irq(void);
>>> +void paravirq_free_irq(unsigned int irq);
>>> +
>>> +#endif /* _ASM_X86_PARAVIRQ_H */
>>> diff --git a/arch/x86/kernel/apic/Makefile b/arch/x86/kernel/apic/Makefile
>>> index 8e63ebd..84f9ce0 100644
>>> --- a/arch/x86/kernel/apic/Makefile
>>> +++ b/arch/x86/kernel/apic/Makefile
>>> @@ -28,3 +28,5 @@ obj-$(CONFIG_X86_BIGSMP)	+= bigsmp_32.o
>>>  
>>>  # For 32bit, probe_32 need to be listed last
>>>  obj-$(CONFIG_X86_LOCAL_APIC)	+= probe_$(BITS).o
>>> +
>>> +obj-$(CONFIG_X86_PARAVIRQ)	+= paravirq.o
>>> diff --git a/arch/x86/kernel/apic/paravirq.c b/arch/x86/kernel/apic/paravirq.c
>>> new file mode 100644
>>> index 0000000..430b819
>>> --- /dev/null
>>> +++ b/arch/x86/kernel/apic/paravirq.c
>>> @@ -0,0 +1,128 @@
>>> +/*
>>> + * An irq_domain for interrupts injected by the hypervisor using
>>> + * Intel VT-x technology.
>>> + *
>>> + * Copyright (C) 2016 Alexander Popov <alex.popov@...ux.com>.
>>> + *
>>> + * This file is released under the GPLv2.
>>> + */
>>> +
>>> +#include <linux/init.h>
>>> +#include <linux/irq.h>
>>> +#include <asm/irqdomain.h>
>>> +#include <asm/paravirq.h>
>>> +
>>> +static struct irq_domain *paravirq_domain;
>>> +
>>> +static struct irq_chip paravirq_chip = {
>>> +	.name			= "PARAVIRQ",
>>> +	.irq_ack		= irq_chip_ack_parent,
>>> +};
>>> +
>>> +static int paravirq_domain_alloc(struct irq_domain *domain,
>>> +			unsigned int virq, unsigned int nr_irqs, void *arg)
>>> +{
>>> +	int ret = 0;
>>> +
>>> +	BUG_ON(domain != paravirq_domain);
>>> +
>>> +	if (nr_irqs != 1)
>>> +		return -EINVAL;
>>> +
>>> +	ret = irq_domain_set_hwirq_and_chip(paravirq_domain,
>>> +					virq, virq, &paravirq_chip, NULL);
>>> +	if (ret) {
>>> +		pr_warn("setting chip, hwirq for irq %u failed\n", virq);
>>> +		return ret;
>>> +	}
>>> +
>>> +	__irq_set_handler(virq, handle_edge_irq, 0, "edge");
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static void paravirq_domain_free(struct irq_domain *domain,
>>> +					unsigned int virq, unsigned int nr_irqs)
>>> +{
>>> +	struct irq_data *irq_data;
>>> +
>>> +	BUG_ON(domain != paravirq_domain);
>>> +	BUG_ON(nr_irqs != 1);
>>> +
>>> +	irq_data = irq_domain_get_irq_data(paravirq_domain, virq);
>>> +	if (irq_data)
>>> +		irq_domain_reset_irq_data(irq_data);
>>> +	else
>>> +		pr_warn("irq %u is not in paravirq irq_domain\n", virq);
>>> +}
>>> +
>>> +static const struct irq_domain_ops paravirq_domain_ops = {
>>> +	.alloc	= paravirq_domain_alloc,
>>> +	.free	= paravirq_domain_free,
>>> +};
>>> +
>>> +int paravirq_alloc_irq(void)
>>> +{
>>> +	struct irq_alloc_info info;
>>> +
>>> +	if (!paravirq_domain)
>>> +		return -ENODEV;
>>> +
>>> +	if (!paravirq_chip.irq_mask || !paravirq_chip.irq_unmask)
>>> +		return -EINVAL;
>>> +
>>> +	init_irq_alloc_info(&info, NULL);
>>> +
>>> +	return irq_domain_alloc_irqs(paravirq_domain, 1, NUMA_NO_NODE, &info);
>>> +}
>>> +EXPORT_SYMBOL(paravirq_alloc_irq);
>>> +
>>> +void paravirq_free_irq(unsigned int virq)
>>> +{
>>> +	struct irq_data *irq_data;
>>> +
>>> +	if (!paravirq_domain) {
>>> +		pr_warn("paravirq irq_domain is not initialized\n");
>>> +		return;
>>> +	}
>>> +
>>> +	irq_data = irq_domain_get_irq_data(paravirq_domain, virq);
>>> +	if (irq_data)
>>> +		irq_domain_free_irqs(virq, 1);
>>> +	else
>>> +		pr_warn("irq %u is not in paravirq irq_domain\n", virq);
>>> +}
>>> +EXPORT_SYMBOL(paravirq_free_irq);
>>> +
>>> +int paravirq_init_chip(void (*irq_mask)(struct irq_data *data),
>>> +				void (*irq_unmask)(struct irq_data *data))
>>> +{
>>> +	if (!paravirq_domain)
>>> +		return -ENODEV;
>>> +
>>> +	if (paravirq_chip.irq_mask || paravirq_chip.irq_unmask)
>>> +		return -EEXIST;
>>> +
>>> +	if (!irq_mask || !irq_unmask)
>>> +		return -EINVAL;
>>> +
>>> +	paravirq_chip.irq_mask = irq_mask;
>>> +	paravirq_chip.irq_unmask = irq_unmask;
>>> +
>>> +	return 0;
>>> +}
>>> +EXPORT_SYMBOL(paravirq_init_chip);
>>> +
>>> +void arch_init_paravirq_domain(struct irq_domain *parent)
>>> +{
>>> +	paravirq_domain = irq_domain_add_tree(NULL, &paravirq_domain_ops, NULL);
>>> +	if (!paravirq_domain) {
>>> +		pr_warn("failed to initialize paravirq irq_domain\n");
>>> +		return;
>>> +	}
>>> +
>>> +	paravirq_domain->name = paravirq_chip.name;
>>> +	paravirq_domain->parent = parent;
>>> +	paravirq_domain->flags |= IRQ_DOMAIN_FLAG_AUTO_RECURSIVE;
>>> +}
>>> +
>>> diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
>>> index 6066d94..878b440 100644
>>> --- a/arch/x86/kernel/apic/vector.c
>>> +++ b/arch/x86/kernel/apic/vector.c
>>> @@ -438,6 +438,7 @@ int __init arch_early_irq_init(void)
>>>  
>>>  	arch_init_msi_domain(x86_vector_domain);
>>>  	arch_init_htirq_domain(x86_vector_domain);
>>> +	arch_init_paravirq_domain(x86_vector_domain);
>>>  
>>>  	BUG_ON(!alloc_cpumask_var(&vector_cpumask, GFP_KERNEL));
>>>  	BUG_ON(!alloc_cpumask_var(&vector_searchmask, GFP_KERNEL));
>>> --
>>> 2.5.5
>>>
>>>
> 
