lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <97e63191-e2ba-34f6-ca6c-99b9e9841587@epam.com>
Date:   Thu, 13 Jul 2023 14:40:08 +0000
From:   Oleksandr Tyshchenko <Oleksandr_Tyshchenko@...m.com>
To:     Juergen Gross <jgross@...e.com>,
        Viresh Kumar <viresh.kumar@...aro.org>,
        Stefano Stabellini <sstabellini@...nel.org>
CC:     Vincent Guittot <vincent.guittot@...aro.org>,
        Alex Bennée <alex.bennee@...aro.org>,
        "stratos-dev@...lists.linaro.org" <stratos-dev@...lists.linaro.org>,
        Erik Schilling <erik.schilling@...aro.org>,
        Manos Pitsidianakis <manos.pitsidianakis@...aro.org>,
        Mathieu Poirier <mathieu.poirier@...aro.org>,
        "xen-devel@...ts.xenproject.org" <xen-devel@...ts.xenproject.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] xen: privcmd: Add support for irqfd



On 13.07.23 10:44, Juergen Gross wrote:

Hello all.


> On 12.07.23 10:48, Viresh Kumar wrote:
>> Xen provides support for injecting interrupts to the guests via the
>> HYPERVISOR_dm_op() hypercall. The same is used by the Virtio based
>> device backend implementations, in an inefficient manner currently.
>>
>> Generally, the Virtio backends are implemented to work with the Eventfd
>> based mechanism. In order to make such backends work with Xen, another
>> software layer needs to poll the Eventfds and raise an interrupt to the
>> guest using the Xen based mechanism. This results in an extra context
>> switch.
>>
>> This is not a new problem in Linux though. It is present with other
>> hypervisors like KVM, etc. as well. The generic solution implemented in
>> the kernel for them is to provide an IOCTL call to pass the interrupt
>> details and eventfd, which lets the kernel take care of polling the
>> eventfd and raising of the interrupt, instead of handling this in user
>> space (which involves an extra context switch).
>>
>> This patch adds support to inject a specific interrupt to guest using
>> the eventfd mechanism, by preventing the extra context switch.
>>
>> Inspired by existing implementations for KVM, etc..


Viresh, great work!

Do you perhaps have corresponding users-space (virtio backend) example 
adopted for that feature (I would like to take a look at it if possible)?



>>
>> Signed-off-by: Viresh Kumar <viresh.kumar@...aro.org>
>> ---
>>   drivers/xen/privcmd.c      | 285 ++++++++++++++++++++++++++++++++++++-
>>   include/uapi/xen/privcmd.h |  14 ++
>>   2 files changed, 297 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/xen/privcmd.c b/drivers/xen/privcmd.c
>> index e2f580e30a86..e8096b09c113 100644
>> --- a/drivers/xen/privcmd.c
>> +++ b/drivers/xen/privcmd.c
>> @@ -9,11 +9,16 @@
>>   #define pr_fmt(fmt) "xen:" KBUILD_MODNAME ": " fmt
>> +#include <linux/eventfd.h>
>> +#include <linux/file.h>
>>   #include <linux/kernel.h>
>>   #include <linux/module.h>
>> +#include <linux/mutex.h>
>> +#include <linux/poll.h>
>>   #include <linux/sched.h>
>>   #include <linux/slab.h>
>>   #include <linux/string.h>
>> +#include <linux/workqueue.h>
>>   #include <linux/errno.h>
>>   #include <linux/mm.h>
>>   #include <linux/mman.h>
>> @@ -833,6 +838,266 @@ static long privcmd_ioctl_mmap_resource(struct 
>> file *file,
>>       return rc;
>>   }
>> +/* Irqfd support */
>> +static struct workqueue_struct *irqfd_cleanup_wq;
>> +static DEFINE_MUTEX(irqfds_lock);
>> +static LIST_HEAD(irqfds_list);
>> +
>> +struct privcmd_kernel_irqfd {
>> +    domid_t dom;
>> +    u8 level;
>> +    u32 irq;
>> +    struct eventfd_ctx *eventfd;
>> +    struct work_struct shutdown;
>> +    wait_queue_entry_t wait;
>> +    struct list_head list;
>> +    poll_table pt;
>> +};
>> +
>> +/* From xen/include/public/hvm/dm_op.h */
>> +#define XEN_DMOP_set_irq_level 19
>> +
>> +struct xen_dm_op_set_irq_level {
>> +    u32 irq;
>> +    /* IN - Level: 0 -> deasserted, 1 -> asserted */
>> +    u8 level;
>> +    u8 pad[3];
>> +};
>> +
>> +struct xen_dm_op {
>> +    u32 op;
>> +    u32 pad;
>> +    union {
>> +        /*
>> +         * There are more structures here, we won't be using them, so
>> +         * can skip adding them here.
>> +         */
>> +        struct xen_dm_op_set_irq_level set_irq_level;
>> +    } u;
>> +};
> 
> Instead of copying definitions over from Xen into privcmd.c, please just 
> update
> the related linux header include/xen/interface/dm_op.h from the Xen public
> header.
> 
>> +
>> +static void irqfd_deactivate(struct privcmd_kernel_irqfd *kirqfd)
>> +{
>> +    lockdep_assert_held(&irqfds_lock);
>> +
>> +    list_del_init(&kirqfd->list);
>> +    queue_work(irqfd_cleanup_wq, &kirqfd->shutdown);
>> +}
>> +
>> +static void irqfd_shutdown(struct work_struct *work)
>> +{
>> +    struct privcmd_kernel_irqfd *kirqfd =
>> +        container_of(work, struct privcmd_kernel_irqfd, shutdown);
>> +    u64 cnt;
>> +
>> +    eventfd_ctx_remove_wait_queue(kirqfd->eventfd, &kirqfd->wait, &cnt);
>> +    eventfd_ctx_put(kirqfd->eventfd);
>> +    kfree(kirqfd);
>> +}
>> +
>> +static void irqfd_inject(struct privcmd_kernel_irqfd *kirqfd)
>> +{
>> +    struct xen_dm_op dm_op = {
>> +        .op = XEN_DMOP_set_irq_level,
>> +        .u.set_irq_level.irq = kirqfd->irq,
>> +        .u.set_irq_level.level = kirqfd->level,
>> +    };
>> +    struct xen_dm_op_buf xbufs = {
>> +        .size = sizeof(dm_op),
>> +    };
>> +    u64 cnt;
>> +
>> +    eventfd_ctx_do_read(kirqfd->eventfd, &cnt);
>> +    set_xen_guest_handle(xbufs.h, &dm_op);
>> +
>> +    xen_preemptible_hcall_begin();
>> +    HYPERVISOR_dm_op(kirqfd->dom, 1, &xbufs);
> 
> Please add some error handling, e.g. by issuing a message in case this 
> hypercall
> was failing. Adding a bool "error" to struct privcmd_kernel_irqfd in 
> order to
> avoid multiple error messages for the same device might be a good idea.


In addition to provided comments, I would like to mention that this 
particular dm_op has Arm implementation only in vanilla hypervisor.

So this feature cannot be immediately reused on x86 because of 
XEN_DMOP_set_irq_level at least. As I understand, the x86's variant is 
XEN_DMOP_set_isa_irq_level (there is also XEN_DMOP_set_pci_intx_level 
for legacy PCI interrupts).

Please note, I am not asking to wire on x86, but maybe it is worth 
considering to put this new feature under something like Kconfig's 
XEN_PRIVCMD_IRQFD which depends on Arm for now?  Or maybe to put dm_op 
specific part of irqfd_inject() under CONFIG_ARCH_XXX here or introduce 
per-ARCH helpers to form a dm_op (Arm's variant will use 
XEN_DMOP_set_irq_level like in current commit). If not, then at least a 
sentence in the description mentioning that this works on Arm only needs 
to be added, I think.

Also, I am wondering whether we should foresee the IOCTL_PRIVCMD_IRQFD 
interface to be suitable for all existing irq related dm_ops (not only 
XEN_DMOP_set_irq_level) right now or this can be extended/updated later 
on (when there is a real need)?



[snip]

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ