Date:   Wed, 3 Jul 2019 20:16:23 +0800
From:   Jason Wang <jasowang@...hat.com>
To:     Tiwei Bie <tiwei.bie@...el.com>
Cc:     mst@...hat.com, alex.williamson@...hat.com,
        maxime.coquelin@...hat.com, linux-kernel@...r.kernel.org,
        kvm@...r.kernel.org, virtualization@...ts.linux-foundation.org,
        netdev@...r.kernel.org, dan.daly@...el.com,
        cunming.liang@...el.com, zhihong.wang@...el.com
Subject: Re: [RFC v2] vhost: introduce mdev based hardware vhost backend


On 2019/7/3 7:52 PM, Tiwei Bie wrote:
> On Wed, Jul 03, 2019 at 06:09:51PM +0800, Jason Wang wrote:
>> On 2019/7/3 5:13 PM, Tiwei Bie wrote:
>>> Details about this can be found here:
>>>
>>> https://lwn.net/Articles/750770/
>>>
>>> What's new in this version
>>> ==========================
>>>
>>> A new VFIO device type is introduced - vfio-vhost. This addressed
>>> some comments from here: https://patchwork.ozlabs.org/cover/984763/
>>>
>>> Below is the updated device interface:
>>>
>>> Currently, there are two regions of this device: 1) CONFIG_REGION
>>> (VFIO_VHOST_CONFIG_REGION_INDEX), which can be used to set up the
>>> device; 2) NOTIFY_REGION (VFIO_VHOST_NOTIFY_REGION_INDEX), which
>>> can be used to notify the device.
>>>
>>> 1. CONFIG_REGION
>>>
>>> The region described by CONFIG_REGION is the main control interface.
>>> Messages will be written to or read from this region.
>>>
>>> The message type is determined by the `request` field in the message
>>> header. The message size is encoded in the message header too.
>>> The message format looks like this:
>>>
>>> struct vhost_vfio_op {
>>> 	__u64 request;
>>> 	__u32 flags;
>>> 	/* Flag values: */
>>> #define VHOST_VFIO_NEED_REPLY 0x1 /* set when a reply is required */
>>> 	__u32 size;
>>> 	union {
>>> 		__u64 u64;
>>> 		struct vhost_vring_state state;
>>> 		struct vhost_vring_addr addr;
>>> 	} payload;
>>> };
>>>
>>> The existing vhost-kernel ioctl commands are reused as the message
>>> requests in the above structure.
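
(Just to be concrete about the flow: a rough userspace sketch of sending one
such message through CONFIG_REGION could look like the below. The struct is
the one quoted above from this RFC's headers; the pwrite()-based access and
the exact encoding are my assumptions, with the region offset coming from
VFIO_DEVICE_GET_REGION_INFO as usual.)

#include <linux/vhost.h>
#include <string.h>
#include <unistd.h>

/* Sketch only: send VHOST_SET_VRING_NUM through CONFIG_REGION.
 * struct vhost_vfio_op is the message format quoted above (from this
 * RFC's uapi header); region_offset is the CONFIG_REGION offset
 * reported by VFIO_DEVICE_GET_REGION_INFO. */
static int vhost_vfio_set_vring_num(int device_fd, __u64 region_offset,
				    unsigned int index, unsigned int num)
{
	struct vhost_vfio_op op;

	memset(&op, 0, sizeof(op));
	op.request = VHOST_SET_VRING_NUM;   /* existing vhost ioctl cmd reused */
	op.flags = 0;                       /* no reply needed for this one */
	op.size = sizeof(op.payload.state);
	op.payload.state.index = index;
	op.payload.state.num = num;

	/* Messages are written to the device's CONFIG_REGION. */
	if (pwrite(device_fd, &op, sizeof(op), region_offset) != (ssize_t)sizeof(op))
		return -1;
	return 0;
}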
>>
>> Still the same comment as for v1: what's the advantage of inventing a
>> new protocol?
> I'm trying to make it work in VFIO's way..
>
>> I believe either of the following should be better:
>>
>> - using vhost ioctls, we can start from SET_VRING_KICK/SET_VRING_CALL and
>> extend them with e.g. a notify region. The advantage is that all existing
>> userspace programs could be reused without modification (or with minimal
>> modification). And the vhost API hides lots of details that the application
>> does not need to understand (e.g. in the case of containers).
> Do you mean reusing vhost's ioctls on the VFIO device fd directly,
> or introducing another mdev driver (i.e. vhost_mdev instead of
> using the existing vfio_mdev) for the mdev device?


Can we simply add them to the ioctl callback of mdev_parent_ops?
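
Something like the following rough sketch (completely untested; the my_vdpa_*
names are placeholders I am making up here, but struct mdev_parent_ops and its
ioctl callback already exist):

#include <linux/errno.h>
#include <linux/mdev.h>
#include <linux/module.h>
#include <linux/uaccess.h>
#include <linux/vhost.h>

struct my_vdpa_device;          /* hypothetical per-device state of the parent driver */
long my_vdpa_set_vring_file(struct my_vdpa_device *vdev,
			    unsigned int cmd, struct vhost_vring_file *f);

/* Forward the existing vhost ioctls through the mdev parent's ioctl
 * callback instead of defining a new message protocol. */
static long my_vdpa_ioctl(struct mdev_device *mdev,
			  unsigned int cmd, unsigned long arg)
{
	struct my_vdpa_device *vdev = mdev_get_drvdata(mdev);
	struct vhost_vring_file f;

	switch (cmd) {
	case VHOST_SET_VRING_KICK:
	case VHOST_SET_VRING_CALL:
		if (copy_from_user(&f, (void __user *)arg, sizeof(f)))
			return -EFAULT;
		/* Hook the eventfd up to the device's doorbell/interrupt. */
		return my_vdpa_set_vring_file(vdev, cmd, &f);
	default:
		return -ENOTTY;
	}
}

static const struct mdev_parent_ops my_vdpa_parent_ops = {
	.owner = THIS_MODULE,
	/* .create, .remove, .mmap, ... omitted */
	.ioctl = my_vdpa_ioctl,
};

With something like that, existing vhost userspace would only need to issue
the same ioctls on the VFIO device fd.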


>
>> - using the PCI layout, then you don't even need to re-invent the notify
>> region at all, and we can pass it through to the guest.
> As you said previously, virtio has transports other than PCI, and it
> will look a bit odd when using those other transports.


Yes.


>
>> Personally, I prefer vhost ioctl.
> +1
>
>>
> [...]
>>> 3. VFIO interrupt ioctl API
>>>
>>> The VFIO interrupt ioctl API is used to set up device interrupts.
>>> IRQ bypass can also be supported.
>>>
>>> Currently, the data path interrupt can be configured via the
>>> VFIO_VHOST_VQ_IRQ_INDEX with the virtqueue's callfd.
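
(So for each virtqueue, userspace would wire the callfd eventfd up with a
standard VFIO_DEVICE_SET_IRQS call like the sketch below?
VFIO_VHOST_VQ_IRQ_INDEX is the new index from this RFC; whether each
virtqueue is addressed as a sub-index via `start` is my guess.)

#include <linux/vfio.h>
#include <string.h>
#include <sys/ioctl.h>

/* Sketch: bind one virtqueue's callfd (an eventfd) to its interrupt
 * using the existing VFIO_DEVICE_SET_IRQS trigger action. */
static int set_vq_irq(int device_fd, unsigned int vq, int callfd)
{
	struct {
		struct vfio_irq_set set;
		int fd;
	} s;

	memset(&s, 0, sizeof(s));
	s.set.argsz = sizeof(s);
	s.set.flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER;
	s.set.index = VFIO_VHOST_VQ_IRQ_INDEX;  /* new index from this RFC */
	s.set.start = vq;                       /* assumed: one vector per vq */
	s.set.count = 1;
	s.fd = callfd;                          /* the virtqueue's callfd */

	return ioctl(device_fd, VFIO_DEVICE_SET_IRQS, &s.set);
}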
>>
>> What about the DMA API? Do you expect to use the VFIO IOMMU API or
>> vhost SET_MEM_TABLE? The VFIO IOMMU API is more generic for sure, but
>> with SET_MEM_TABLE, DMA can be done at the level of the parent device,
>> which means it can work for e.g. cards with an on-chip IOMMU.
> Agreed. This RFC assumes userspace will use the VFIO IOMMU API
> to do the DMA programming. But as you said, there could be
> a problem when using cards with an on-chip IOMMU.


Yes, another issue is that SET_MEM_TABLE cannot be used to update just a
part of the table. This seems less flexible than the VFIO API, but it
could be extended.
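
For reference, with the VFIO type1 IOMMU API each mapping is programmed
(and can be torn down) one IOVA range at a time, which is the flexibility
I mean. A minimal userspace sketch using only the existing API, with the
container/group setup omitted:

#include <linux/vfio.h>
#include <sys/ioctl.h>

/* Map one range of process memory (vaddr) at the given IOVA for the device. */
static int map_dma_range(int container_fd, void *vaddr, __u64 iova, __u64 size)
{
	struct vfio_iommu_type1_dma_map map = {
		.argsz = sizeof(map),
		.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE,
		.vaddr = (__u64)(unsigned long)vaddr,
		.iova  = iova,
		.size  = size,
	};

	/* Each call programs a single IOVA range in the container's IOMMU. */
	return ioctl(container_fd, VFIO_IOMMU_MAP_DMA, &map);
}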


>
>> And what's the plan for vIOMMU?
> As this RFC assumes userspace will use the VFIO IOMMU API, userspace
> just needs to follow the same approach as the vfio-pci device in QEMU
> to support vIOMMU.


Right, this is more of a question for the QEMU part. It means QEMU needs
to go through the ordinary VFIO path to get all the notifier/listener
support from the vIOMMU.


>
>>
>>> Signed-off-by: Tiwei Bie <tiwei.bie@...el.com>
>>> ---
>>>    drivers/vhost/Makefile     |   2 +
>>>    drivers/vhost/vdpa.c       | 770 +++++++++++++++++++++++++++++++++++++
>>>    include/linux/vdpa_mdev.h  |  72 ++++
>>>    include/uapi/linux/vfio.h  |  19 +
>>>    include/uapi/linux/vhost.h |  25 ++
>>>    5 files changed, 888 insertions(+)
>>>    create mode 100644 drivers/vhost/vdpa.c
>>>    create mode 100644 include/linux/vdpa_mdev.h
>>
>> We probably need a sample parent device implementation. It could be a
>> software datapath; e.g. we could start from a virtio-net device in the
>> guest or a vhost/tap on the host.
> Yeah, something like this would be interesting!


Do you plan to do something like that :) ?

Thanks


>
> Thanks,
> Tiwei
>
>> Thanks
>>
>>
