[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <366f2dcf-51ab-4d66-9c94-517349ef0bdd@redhat.com>
Date: Thu, 4 Mar 2021 14:39:49 +0800
From: Jason Wang <jasowang@...hat.com>
To: Xie Yongji <xieyongji@...edance.com>, mst@...hat.com,
stefanha@...hat.com, sgarzare@...hat.com, parav@...dia.com,
bob.liu@...cle.com, hch@...radead.org, rdunlap@...radead.org,
willy@...radead.org, viro@...iv.linux.org.uk, axboe@...nel.dk,
bcrl@...ck.org, corbet@....net
Cc: virtualization@...ts.linux-foundation.org, netdev@...r.kernel.org,
kvm@...r.kernel.org, linux-aio@...ck.org,
linux-fsdevel@...r.kernel.org
Subject: Re: [RFC v4 09/11] Documentation: Add documentation for VDUSE
On 2021/2/23 7:50 下午, Xie Yongji wrote:
> VDUSE (vDPA Device in Userspace) is a framework to support
> implementing software-emulated vDPA devices in userspace. This
> document is intended to clarify the VDUSE design and usage.
>
> Signed-off-by: Xie Yongji <xieyongji@...edance.com>
> ---
> Documentation/userspace-api/index.rst | 1 +
> Documentation/userspace-api/vduse.rst | 112 ++++++++++++++++++++++++++++++++++
> 2 files changed, 113 insertions(+)
> create mode 100644 Documentation/userspace-api/vduse.rst
>
> diff --git a/Documentation/userspace-api/index.rst b/Documentation/userspace-api/index.rst
> index acd2cc2a538d..f63119130898 100644
> --- a/Documentation/userspace-api/index.rst
> +++ b/Documentation/userspace-api/index.rst
> @@ -24,6 +24,7 @@ place where this information is gathered.
> ioctl/index
> iommu
> media/index
> + vduse
>
> .. only:: subproject and html
>
> diff --git a/Documentation/userspace-api/vduse.rst b/Documentation/userspace-api/vduse.rst
> new file mode 100644
> index 000000000000..2a20e686bb59
> --- /dev/null
> +++ b/Documentation/userspace-api/vduse.rst
> @@ -0,0 +1,112 @@
> +==================================
> +VDUSE - "vDPA Device in Userspace"
> +==================================
> +
> +vDPA (virtio data path acceleration) device is a device that uses a
> +datapath which complies with the virtio specifications with vendor
> +specific control path. vDPA devices can be both physically located on
> +the hardware or emulated by software. VDUSE is a framework that makes it
> +possible to implement software-emulated vDPA devices in userspace.
> +
> +How VDUSE works
> +------------
> +Each userspace vDPA device is created by the VDUSE_CREATE_DEV ioctl on
> +the character device (/dev/vduse/control). Then a device file with the
> +specified name (/dev/vduse/$NAME) will appear, which can be used to
> +implement the userspace vDPA device's control path and data path.
It's better to mention that in order to le thte device to be registered
on the bus, admin need to use the management API(netlink) to create the
vDPA device.
Some codes to demnonstrate how to create the device will be better.
> +
> +To implement control path, a message-based communication protocol and some
> +types of control messages are introduced in the VDUSE framework:
> +
> +- VDUSE_SET_VQ_ADDR: Set the vring address of virtqueue.
> +
> +- VDUSE_SET_VQ_NUM: Set the size of virtqueue
> +
> +- VDUSE_SET_VQ_READY: Set ready status of virtqueue
> +
> +- VDUSE_GET_VQ_READY: Get ready status of virtqueue
> +
> +- VDUSE_SET_VQ_STATE: Set the state for virtqueue
> +
> +- VDUSE_GET_VQ_STATE: Get the state for virtqueue
> +
> +- VDUSE_SET_FEATURES: Set virtio features supported by the driver
> +
> +- VDUSE_GET_FEATURES: Get virtio features supported by the device
> +
> +- VDUSE_SET_STATUS: Set the device status
> +
> +- VDUSE_GET_STATUS: Get the device status
> +
> +- VDUSE_SET_CONFIG: Write to device specific configuration space
> +
> +- VDUSE_GET_CONFIG: Read from device specific configuration space
> +
> +- VDUSE_UPDATE_IOTLB: Notify userspace to update the memory mapping in device IOTLB
> +
> +Those control messages are mostly based on the vdpa_config_ops in
> +include/linux/vdpa.h which defines a unified interface to control
> +different types of vdpa device. Userspace needs to read()/write()
> +on the VDUSE device file to receive/reply those control messages
> +from/to VDUSE kernel module as follows:
> +
> +.. code-block:: c
> +
> + static int vduse_message_handler(int dev_fd)
> + {
> + int len;
> + struct vduse_dev_request req;
> + struct vduse_dev_response resp;
> +
> + len = read(dev_fd, &req, sizeof(req));
> + if (len != sizeof(req))
> + return -1;
> +
> + resp.request_id = req.unique;
> +
> + switch (req.type) {
> +
> + /* handle different types of message */
> +
> + }
> +
> + len = write(dev_fd, &resp, sizeof(resp));
> + if (len != sizeof(resp))
> + return -1;
> +
> + return 0;
> + }
> +
> +In the deta path, vDPA device's iova regions will be mapped into userspace
> +with the help of VDUSE_IOTLB_GET_FD ioctl on the VDUSE device file:
> +
> +- VDUSE_IOTLB_GET_FD: get the file descriptor to iova region. Userspace can
> + access this iova region by passing the fd to mmap().
It would be better to have codes to explain how it is expected to work here.
> +
> +Besides, the following ioctls on the VDUSE device file are provided to support
> +interrupt injection and setting up eventfd for virtqueue kicks:
> +
> +- VDUSE_VQ_SETUP_KICKFD: set the kickfd for virtqueue, this eventfd is used
> + by VDUSE kernel module to notify userspace to consume the vring.
> +
> +- VDUSE_INJECT_VQ_IRQ: inject an interrupt for specific virtqueue
> +
> +- VDUSE_INJECT_CONFIG_IRQ: inject a config interrupt
> +
> +MMU-based IOMMU Driver
> +----------------------
> +In virtio-vdpa case, VDUSE framework implements an MMU-based on-chip IOMMU
> +driver to support mapping the kernel DMA buffer into the userspace iova
> +region dynamically.
> +
> +The basic idea behind this driver is treating MMU (VA->PA) as IOMMU (IOVA->PA).
> +The driver will set up MMU mapping instead of IOMMU mapping for the DMA transfer
> +so that the userspace process is able to use its virtual address to access
> +the DMA buffer in kernel.
> +
> +And to avoid security issue, a bounce-buffering mechanism is introduced to
> +prevent userspace accessing the original buffer directly which may contain other
> +kernel data.
It's worth to mention this is designed for virtio-vdpa (kernel virtio
drivers).
Thanks
> During the mapping, unmapping, the driver will copy the data from
> +the original buffer to the bounce buffer and back, depending on the direction of
> +the transfer. And the bounce-buffer addresses will be mapped into the user address
> +space instead of the original one.
Powered by blists - more mailing lists