Message-ID: <182708df-1082-0678-49b2-15d0199f20df@redhat.com>
Date: Mon, 30 Nov 2020 11:36:18 +0800
From: Jason Wang <jasowang@...hat.com>
To: Yongji Xie <xieyongji@...edance.com>,
Parav Pandit <parav@...dia.com>
Cc: virtualization@...ts.linux-foundation.org,
"Michael S. Tsirkin" <mst@...hat.com>, elic@...dia.com,
netdev@...r.kernel.org
Subject: Re: [External] Re: [PATCH 0/7] Introduce vdpa management tool
On 2020/11/27 1:52 PM, Yongji Xie wrote:
> On Fri, Nov 27, 2020 at 11:53 AM Jason Wang <jasowang@...hat.com> wrote:
>
>
> On 2020/11/12 2:39 PM, Parav Pandit wrote:
> > This patchset covers user requirements for managing existing vdpa devices
> > using a tool, along with internal design notes for kernel drivers.
> >
> > Background and user requirements:
> > ----------------------------------
> > (1) Currently a vdpa device is created by the driver when the driver is
> > loaded. However, the user should be able to choose when, or whether, to
> > create a vdpa device for the underlying parent device.
> >
> > For example, mlx5 PCI VF and subfunction devices support multiple classes
> > of device such as netdev, vdpa and rdma. However, it is not required to
> > always create a vdpa device for such a device.
> >
> > (2) In another use case, a device may support creating one or multiple vdpa
> > devices of the same or different classes, such as net and block.
> > Creating vdpa devices at driver load time further limits this use case.
> >
> > (3) A user should be able to monitor and query vdpa queue level or device
> > level statistics for a given vdpa device.
> >
> > (4) A user should be able to query what classes of vdpa devices are
> > supported by its parent device.
> >
> > (5) A user should be able to view supported features and negotiated
> > features of the vdpa device.
> >
> > (6) A user should be able to create a vdpa device in a vendor agnostic
> > manner using a single tool.
> >
> > Hence, a tool is required through which the user can create one or more
> > vdpa devices from a parent device and which addresses the above user
> > requirements.
> >
> > Example devices:
> > ----------------
> > +-----------+ +-----------+ +---------+ +--------+ +-----------+
> > |vdpa dev 0 | |vdpa dev 1 | |rdma dev | |netdev  | |vdpa dev 3 |
> > |type=net   | |type=block | |mlx5_0   | |ens3f0  | |type=net   |
> > +----+------+ +-----+-----+ +----+----+ +-----+--+ +----+------+
> >      |              |            |            |         |
> >      |              |            |            |         |
> > +----+-----+        |       +----+----+       |    +----+----+
> > |  mlx5    +--------+       |mlx5     +-------+    |mlx5     |
> > |pci vf 2  |                |pci vf 4 |            |pci sf 8 |
> > |03:00.2   |                |03:00.4  |            |mlx5_sf.8|
> > +----+-----+                +----+----+            +----+----+
> >      |                           |                      |
> >      |                      +----+-----+                |
> >      +----------------------+mlx5      +----------------+
> >                             |pci pf 0  |
> >                             |03:00.0   |
> >                             +----------+
> >
> > vdpa tool:
> > ----------
> > vdpa tool is a tool to create and delete vdpa devices from a parent device.
> > It also enables the user to query statistics, features and possibly more
> > attributes in the future.
> >
> > vdpa tool command draft:
> > ------------------------
> > (a) List parent devices which support creating vdpa devices.
> > It also shows which class types are supported by each parent device.
> > In the command example below, the following parent devices support vdpa
> > device creation, in addition to vdpasim.
> > First is a PCI VF whose BDF is 03:00.2.
> > Second is a PCI VF whose BDF is 03:00.4.
> > Third is a PCI SF whose name is mlx5_core.sf.8
> >
> > $ vdpa parentdev list
> > vdpasim
> > supported_classes
> > net
> > pci/0000:03:00.2
> > supported_classes
> > net block
> > pci/0000:03:00.4
> > supported_classes
> > net block
> > auxiliary/mlx5_core.sf.8
> > supported_classes
> > net
> >
> > (b) Now add a vdpa device of networking class and show the device.
> > $ vdpa dev add parentdev pci/0000:03:00.2 type net name foo0
> > $ vdpa dev show foo0
> > foo0: parentdev pci/0000:03:00.2 type network vendor_id 0 max_vqs 2 max_vq_size 256
> >
> > (c) Show features of a vdpa device
> > $ vdpa dev features show foo0
> > supported
> > iommu platform
> > version 1
> >
> > (d) Dump vdpa device statistics
> > $ vdpa dev stats show foo0
> > kickdoorbells 10
> > wqes 100
> >
> > (e) Now delete a vdpa device previously created.
> > $ vdpa dev del foo0
> >
> > vdpa tool support in this patchset:
> > -----------------------------------
> > vdpa tool is created to create, delete and query vdpa devices.
> > Examples:
> > Show vdpa parent devices that support creating and deleting vdpa devices.
> >
> > $ vdpa parentdev show
> > vdpasim:
> > supported_classes
> > net
> >
> > $ vdpa parentdev show -jp
> > {
> > "show": {
> > "vdpasim": {
> > "supported_classes": [
> > "net"
> > ]
> > }
> > }
> > }
> >
> > Create a vdpa device of type networking named "foo2" from the parent
> > device vdpasim:
> >
> > $ vdpa dev add parentdev vdpasim type net name foo2
> >
> > Show the newly created vdpa device by its name:
> > $ vdpa dev show foo2
> > foo2: type network parentdev vdpasim vendor_id 0 max_vqs 2 max_vq_size 256
> >
> > $ vdpa dev show foo2 -jp
> > {
> > "dev": {
> > "foo2": {
> > "type": "network",
> > "parentdev": "vdpasim",
> > "vendor_id": 0,
> > "max_vqs": 2,
> > "max_vq_size": 256
> > }
> > }
> > }
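
Since -jp emits JSON, a script could consume these attributes directly.
A minimal Python sketch, assuming the output has exactly the shape shown
above. The sample string is copied from the example rather than captured
from a real run, and dev_attrs() is a made-up helper name:

```python
import json

# Sample copied from the example output above, not from a live system.
SAMPLE = """
{
    "dev": {
        "foo2": {
            "type": "network",
            "parentdev": "vdpasim",
            "vendor_id": 0,
            "max_vqs": 2,
            "max_vq_size": 256
        }
    }
}
"""


def dev_attrs(output, name):
    """Return the attribute dict for device `name` (hypothetical helper)."""
    return json.loads(output)["dev"][name]


attrs = dev_attrs(SAMPLE, "foo2")
print(attrs["parentdev"], attrs["max_vqs"], attrs["max_vq_size"])
```

In practice the string would come from running the tool, but the layout
of the draft output may still change.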
> >
> > Delete the vdpa device after its use:
> > $ vdpa dev del foo2
> >
> > vdpa tool support by kernel:
> > ----------------------------
> > The vdpa tool user interface will be supported by the existing vdpa kernel
> > framework, i.e. drivers/vdpa/vdpa.c. It services user commands through a
> > netlink interface.
> >
> > Each parent device registers supported callback operations with the vdpa
> > subsystem, through which vdpa device(s) can be managed.
> >
> > FAQs:
> > -----
> > 1. Where does the userspace vdpa tool reside?
> > Ans: The vdpa tool can possibly reside in iproute2 [1], as it enables the
> > user to create vdpa net devices.
> >
> > 2. Why not create and delete vdpa devices using sysfs/configfs?
> > Ans:
> > (a) Device creation may involve passing one or more attributes.
> > Passing multiple attributes, and returning an error code and more verbose
> > information for invalid attributes, cannot be handled by sysfs/configfs.
> >
> > (b) The netlink framework is rich and enables user space and kernel drivers
> > to provide nested attributes.
> >
> > (c) Exposing device specific files under sysfs without net namespace
> > awareness exposes details to multiple containers. Exposing attributes via
> > a netlink socket instead secures the communication channel with the kernel.
> >
> > (d) The netlink socket interface makes it possible to run syzkaller kernel
> > tests.
> >
> > 3. Why not use an ioctl() interface?
> > Ans: An ioctl() interface would replicate the necessary plumbing which
> > already exists through the netlink socket.
> >
> > 4. What happens when one or more user created vdpa devices exist for a
> > parent PCI VF or SF and that parent device is removed?
> > Ans: All user created vdpa devices that belong to the parent are removed.
> >
> > [1] git://git.kernel.org/pub/scm/network/iproute2/iproute2-next.git
> >
> > Next steps:
> > -----------
> > (a) After this patchset and the iproute2/vdpa inclusion, the remaining two
> > drivers will be converted to support the vdpa tool instead of creating an
> > unmanaged default device on driver load.
> > (b) More net specific parameters such as mac and mtu will be added.
> > (c) A feature bits get and set interface will be added.
>
>
> Adding Yongji to share some thoughts from the viewpoint of a userspace
> vDPA device.
>
>
> Thanks for adding me, Jason!
>
> I'm now working on a v2 patchset for VDUSE (vDPA Device in Userspace)
> [1]. This tool is very useful for the vduse device, so I'm considering
> integrating it into my v2 patchset. But there is one problem:
>
> In this tool, the vdpa device config action and enable action are combined
> into one netlink msg: VDPA_CMD_DEV_NEW. But in the vduse case, they need to
> be split, because a chardev should be created and opened by a
> userspace process before we enable the vdpa device (call
> vdpa_register_device()).
>
> So I'd like to know whether it's possible (or whether there are plans) to
> add two new netlink msgs, something like VDPA_CMD_DEV_ENABLE and
> VDPA_CMD_DEV_DISABLE, to make the config path more flexible.
>
Actually, we discussed such an intermediate step in an earlier
discussion. It looks to me that VDUSE could be one of its users.
Alternatively, I wonder whether we could switch to an anonymous inode (fd)
for VDUSE and then fetch it via a VDUSE_GET_DEVICE_FD ioctl?
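
To make the intermediate step concrete, here is a rough Python model of
the split lifecycle. It is only a sketch under stated assumptions, not
kernel code: VDPA_CMD_DEV_ENABLE and VDPA_CMD_DEV_DISABLE are the names
proposed in this thread rather than existing commands, VDPA_CMD_DEV_DEL
is assumed as the counterpart of VDPA_CMD_DEV_NEW, and the class, state
and method names are invented for illustration.

```python
from enum import Enum, auto


class State(Enum):
    ABSENT = auto()      # no device exists
    CONFIGURED = auto()  # created and configured, not yet registered
    ENABLED = auto()     # registered, i.e. vdpa_register_device() was called


class VdpaDevModel:
    """Toy model of one device's lifecycle (hypothetical, for discussion)."""

    # (current state, command) -> next state
    TRANSITIONS = {
        (State.ABSENT, "VDPA_CMD_DEV_NEW"): State.CONFIGURED,
        (State.CONFIGURED, "VDPA_CMD_DEV_ENABLE"): State.ENABLED,
        (State.ENABLED, "VDPA_CMD_DEV_DISABLE"): State.CONFIGURED,
        # Assumption: deletion is only modelled from the configured state.
        (State.CONFIGURED, "VDPA_CMD_DEV_DEL"): State.ABSENT,
    }

    def __init__(self):
        self.state = State.ABSENT
        # True once userspace has opened the chardev or fetched the fd.
        self.backend_attached = False

    def attach_backend(self):
        if self.state is not State.CONFIGURED:
            raise RuntimeError("backend can only attach to a configured device")
        self.backend_attached = True

    def handle(self, cmd):
        nxt = self.TRANSITIONS.get((self.state, cmd))
        if nxt is None:
            raise RuntimeError(f"{cmd} is not valid in state {self.state.name}")
        if cmd == "VDPA_CMD_DEV_ENABLE" and not self.backend_attached:
            raise RuntimeError("enable requires an attached userspace backend")
        if nxt is State.ABSENT:
            self.backend_attached = False
        self.state = nxt
        return self.state
```

The point of the model is the ordering: the userspace process attaches
between NEW and ENABLE, which a single combined message cannot express.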
Thanks
> Thanks,
> Yongji
>
> [1] https://www.spinics.net/lists/linux-mm/msg231576.html