[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <181ae83d-edeb-9406-27cc-1195fe29ae95@ti.com>
Date: Tue, 15 Sep 2020 21:17:51 +0530
From: Kishon Vijay Abraham I <kishon@...com>
To: Jason Wang <jasowang@...hat.com>, Cornelia Huck <cohuck@...hat.com>
CC: "Michael S. Tsirkin" <mst@...hat.com>,
Ohad Ben-Cohen <ohad@...ery.com>,
Bjorn Andersson <bjorn.andersson@...aro.org>,
Jon Mason <jdmason@...zu.us>,
Dave Jiang <dave.jiang@...el.com>,
Allen Hubbe <allenbh@...il.com>,
Lorenzo Pieralisi <lorenzo.pieralisi@....com>,
Bjorn Helgaas <bhelgaas@...gle.com>,
Paolo Bonzini <pbonzini@...hat.com>,
Stefan Hajnoczi <stefanha@...hat.com>,
Stefano Garzarella <sgarzare@...hat.com>,
<linux-doc@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<linux-remoteproc@...r.kernel.org>, <linux-ntb@...glegroups.com>,
<linux-pci@...r.kernel.org>, <kvm@...r.kernel.org>,
<virtualization@...ts.linux-foundation.org>,
<netdev@...r.kernel.org>
Subject: Re: [RFC PATCH 00/22] Enhance VHOST to enable SoC-to-SoC
communication
Hi Jason,
On 15/09/20 1:48 pm, Jason Wang wrote:
> Hi Kishon:
>
> On 2020/9/14 下午3:23, Kishon Vijay Abraham I wrote:
>>> Then you need something that is functional equivalent to virtio PCI
>>> which is actually the concept of vDPA (e.g vDPA provides alternatives if
>>> the queue_sel is hard in the EP implementation).
>> Okay, I just tried to compare the 'struct vdpa_config_ops' and 'struct
>> vhost_config_ops' ( introduced in [RFC PATCH 03/22] vhost: Add ops for
>> the VHOST driver to configure VHOST device).
>>
>> struct vdpa_config_ops {
>> /* Virtqueue ops */
>> int (*set_vq_address)(struct vdpa_device *vdev,
>> u16 idx, u64 desc_area, u64 driver_area,
>> u64 device_area);
>> void (*set_vq_num)(struct vdpa_device *vdev, u16 idx, u32 num);
>> void (*kick_vq)(struct vdpa_device *vdev, u16 idx);
>> void (*set_vq_cb)(struct vdpa_device *vdev, u16 idx,
>> struct vdpa_callback *cb);
>> void (*set_vq_ready)(struct vdpa_device *vdev, u16 idx, bool ready);
>> bool (*get_vq_ready)(struct vdpa_device *vdev, u16 idx);
>> int (*set_vq_state)(struct vdpa_device *vdev, u16 idx,
>> const struct vdpa_vq_state *state);
>> int (*get_vq_state)(struct vdpa_device *vdev, u16 idx,
>> struct vdpa_vq_state *state);
>> struct vdpa_notification_area
>> (*get_vq_notification)(struct vdpa_device *vdev, u16 idx);
>> /* vq irq is not expected to be changed once DRIVER_OK is set */
>> int (*get_vq_irq)(struct vdpa_device *vdv, u16 idx);
>>
>> /* Device ops */
>> u32 (*get_vq_align)(struct vdpa_device *vdev);
>> u64 (*get_features)(struct vdpa_device *vdev);
>> int (*set_features)(struct vdpa_device *vdev, u64 features);
>> void (*set_config_cb)(struct vdpa_device *vdev,
>> struct vdpa_callback *cb);
>> u16 (*get_vq_num_max)(struct vdpa_device *vdev);
>> u32 (*get_device_id)(struct vdpa_device *vdev);
>> u32 (*get_vendor_id)(struct vdpa_device *vdev);
>> u8 (*get_status)(struct vdpa_device *vdev);
>> void (*set_status)(struct vdpa_device *vdev, u8 status);
>> void (*get_config)(struct vdpa_device *vdev, unsigned int offset,
>> void *buf, unsigned int len);
>> void (*set_config)(struct vdpa_device *vdev, unsigned int offset,
>> const void *buf, unsigned int len);
>> u32 (*get_generation)(struct vdpa_device *vdev);
>>
>> /* DMA ops */
>> int (*set_map)(struct vdpa_device *vdev, struct vhost_iotlb *iotlb);
>> int (*dma_map)(struct vdpa_device *vdev, u64 iova, u64 size,
>> u64 pa, u32 perm);
>> int (*dma_unmap)(struct vdpa_device *vdev, u64 iova, u64 size);
>>
>> /* Free device resources */
>> void (*free)(struct vdpa_device *vdev);
>> };
>>
>> +struct vhost_config_ops {
>> + int (*create_vqs)(struct vhost_dev *vdev, unsigned int nvqs,
>> + unsigned int num_bufs, struct vhost_virtqueue *vqs[],
>> + vhost_vq_callback_t *callbacks[],
>> + const char * const names[]);
>> + void (*del_vqs)(struct vhost_dev *vdev);
>> + int (*write)(struct vhost_dev *vdev, u64 vhost_dst, void *src,
>> int len);
>> + int (*read)(struct vhost_dev *vdev, void *dst, u64 vhost_src, int
>> len);
>> + int (*set_features)(struct vhost_dev *vdev, u64 device_features);
>> + int (*set_status)(struct vhost_dev *vdev, u8 status);
>> + u8 (*get_status)(struct vhost_dev *vdev);
>> +};
>> +
>> struct virtio_config_ops
>> I think there's some overlap here and some of the ops tries to do the
>> same thing.
>>
>> I think it differs in (*set_vq_address)() and (*create_vqs)().
>> [create_vqs() introduced in struct vhost_config_ops provides
>> complimentary functionality to (*find_vqs)() in struct
>> virtio_config_ops. It seemingly encapsulates the functionality of
>> (*set_vq_address)(), (*set_vq_num)(), (*set_vq_cb)(),..].
>>
>> Back to the difference between (*set_vq_address)() and (*create_vqs)(),
>> set_vq_address() directly provides the virtqueue address to the vdpa
>> device but create_vqs() only provides the parameters of the virtqueue
>> (like the number of virtqueues, number of buffers) but does not directly
>> provide the address. IMO the backend client drivers (like net or vhost)
>> shouldn't/cannot by itself know how to access the vring created on
>> virtio front-end. The vdpa device/vhost device should have logic for
>> that. That will help the client drivers to work with different types of
>> vdpa device/vhost device and can access the vring created by virtio
>> irrespective of whether the vring can be accessed via mmio or kernel
>> space or user space.
>>
>> I think vdpa always works with client drivers in userspace and providing
>> userspace address for vring.
>
>
> Sorry for being unclear. What I meant is not replacing vDPA with the
> vhost(bus) you proposed but the possibility of replacing virtio-pci-epf
> with vDPA in:
Okay, so the virtio back-end still use vhost and front end should use
vDPA. I see. So the host side PCI driver for EPF should populate
vdpa_config_ops and invoke vdpa_register_device().
>
> My question is basically for the part of virtio_pci_epf_send_command(),
> so it looks to me you have a vendor specific API to replace the
> virtio-pci layout of the BAR:
Even when we use vDPA, we have to use some sort of
virtio_pci_epf_send_command() to communicate with virtio backend right?
Right, the layout is slightly different from the standard layout.
This is the layout
struct epf_vhost_reg_queue {
u8 cmd;
u8 cmd_status;
u16 status;
u16 num_buffers;
u16 msix_vector;
u64 queue_addr;
} __packed;
struct epf_vhost_reg {
u64 host_features;
u64 guest_features;
u16 msix_config;
u16 num_queues;
u8 device_status;
u8 config_generation;
u32 isr;
u8 cmd;
u8 cmd_status;
struct epf_vhost_reg_queue vq[MAX_VQS];
} __packed;
>
>
> +static int virtio_pci_epf_send_command(struct virtio_pci_device *vp_dev,
> + u32 command)
> +{
> + struct virtio_pci_epf *pci_epf;
> + void __iomem *ioaddr;
> + ktime_t timeout;
> + bool timedout;
> + int ret = 0;
> + u8 status;
> +
> + pci_epf = to_virtio_pci_epf(vp_dev);
> + ioaddr = vp_dev->ioaddr;
> +
> + mutex_lock(&pci_epf->lock);
> + writeb(command, ioaddr + HOST_CMD);
> + timeout = ktime_add_ms(ktime_get(), COMMAND_TIMEOUT);
> + while (1) {
> + timedout = ktime_after(ktime_get(), timeout);
> + status = readb(ioaddr + HOST_CMD_STATUS);
> +
>
> Several questions:
>
> - It's not clear to me how the synchronization is done between the RC
> and EP. E.g how and when the value of HOST_CMD_STATUS can be changed.
The HOST_CMD (commands sent to the EP) is serialized by using mutex.
Once the EP reads the command, it resets the value in HOST_CMD. So
HOST_CMD is less likely an issue.
A sufficiently large time is given for the EP to complete it's operation
(1 Sec) where the EP provides the status in HOST_CMD_STATUS. After it
expires, HOST_CMD_STATUS_NONE is written to HOST_CMD_STATUS. There could
be case where EP updates HOST_CMD_STATUS after RC writes
HOST_CMD_STATUS_NONE, but by then HOST has already detected this as
failure and error-ed out.
> If you still want to introduce a new transport, a virtio spec patch
> would be helpful for us to understand the device API.
Okay, that should be on https://github.com/oasis-tcs/virtio-spec.git?
> - You have you vendor specific layout (according to
> virtio_pci_epb_table()), so I guess you it's better to have a vendor
> specific vDPA driver instead
Okay, with vDPA, we are free to define our own layouts.
> - The advantage of vendor specific vDPA driver is that it can 1) have
> less codes 2) support userspace drivers through vhost-vDPA (instead of
> inventing new APIs since we can't use vfio-pci here).
I see there's an additional level of indirection from virtio to vDPA and
probably no need for spec update but don't exactly see how it'll reduce
code.
For 2, Isn't vhost-vdpa supposed to run on virtio backend?
>From a high level, I think I should be able to use vDPA for
virtio_pci_epf.c. Would you also suggest using vDPA for ntb_virtio.c?
([RFC PATCH 20/22] NTB: Add a new NTB client driver to implement VIRTIO
functionality).
Thanks
Kishon
Powered by blists - more mailing lists