Message-ID: <20210210133452.GW4247@nvidia.com>
Date: Wed, 10 Feb 2021 09:34:52 -0400
From: Jason Gunthorpe <jgg@...dia.com>
To: "Tian, Kevin" <kevin.tian@...el.com>
CC: Max Gurtovoy <mgurtovoy@...dia.com>,
"cohuck@...hat.com" <cohuck@...hat.com>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"alex.williamson@...hat.com" <alex.williamson@...hat.com>,
"liranl@...dia.com" <liranl@...dia.com>,
"oren@...dia.com" <oren@...dia.com>,
"tzahio@...dia.com" <tzahio@...dia.com>,
"leonro@...dia.com" <leonro@...dia.com>,
"yarong@...dia.com" <yarong@...dia.com>,
"aviadye@...dia.com" <aviadye@...dia.com>,
"shahafs@...dia.com" <shahafs@...dia.com>,
"artemp@...dia.com" <artemp@...dia.com>,
"kwankhede@...dia.com" <kwankhede@...dia.com>,
"ACurrid@...dia.com" <ACurrid@...dia.com>,
"gmataev@...dia.com" <gmataev@...dia.com>,
"cjia@...dia.com" <cjia@...dia.com>,
"mjrosato@...ux.ibm.com" <mjrosato@...ux.ibm.com>,
"yishaih@...dia.com" <yishaih@...dia.com>,
"aik@...abs.ru" <aik@...abs.ru>,
"Zhao, Yan Y" <yan.y.zhao@...el.com>
Subject: Re: [PATCH v2 0/9] Introduce vfio-pci-core subsystem
On Wed, Feb 10, 2021 at 07:52:08AM +0000, Tian, Kevin wrote:
> > This subsystem framework will also ease on adding vendor specific
> > functionality to VFIO devices in the future by allowing another module
> > to provide the pci_driver that can setup number of details before
> > registering to VFIO subsystem (such as inject its own operations).
>
> I'm a bit confused about the change from v1 to v2, especially about
> how to inject module specific operations. From live migration p.o.v
> it may requires two hook points at least for some devices (e.g. i40e
> in original Yan's example):
IMHO, it was too soon to give up on putting the vfio_device_ops in the
final driver. We should try to define a reasonable public/private
split of vfio_pci_device, as is the norm in the kernel. No reason we
can't achieve that.
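As a rough illustration of what a public/private split can look like, here is a minimal user-space sketch in C. Every name in it (demo_core_device, demo_vendor_device, demo_core_add_region, demo_vendor_setup, demo_container_of) is hypothetical and invented for this example. It is not the layout of vfio_pci_device or any kernel API. It only shows the common embedding pattern: the core exposes a public structure, the vendor driver embeds it in its own state, and the core's private data stays behind a pointer the driver never touches.

```c
#include <assert.h>
#include <stddef.h>

/* All names below are hypothetical stand-ins, not kernel definitions. */

/* Public part: what a vendor driver is allowed to see and embed. */
struct demo_core_device {
    unsigned int num_regions;   /* readable by drivers */
    void *core_private;         /* owned by the core; drivers must not touch it */
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Vendor driver state embeds the public core structure. */
struct demo_vendor_device {
    struct demo_core_device core;
    int migration_region_index;   /* vendor-specific state */
};

/* Core helper: works only on the public structure, returns the new index. */
static unsigned int demo_core_add_region(struct demo_core_device *core)
{
    assert(core != NULL);
    return core->num_regions++;
}

/* Vendor code: receives a core pointer and recovers its own state from it. */
static int demo_vendor_setup(struct demo_core_device *core)
{
    struct demo_vendor_device *vdev =
        demo_container_of(core, struct demo_vendor_device, core);

    vdev->migration_region_index = (int)demo_core_add_region(core);
    return vdev->migration_region_index;
}
```

A caller would allocate a demo_vendor_device, pass &dev.core to demo_vendor_setup(), and read back the region index the core assigned. The vendor driver never needs the contents behind core_private.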
> register a migration region and intercept guest writes to specific
> registers. [PATCH 4/9] demonstrates the former but not the latter
> (which is allowed in v1).
And this is why the ROI of wrapping every vfio op in a PCI op just to
keep vfio_pci_device completely private is poor :(
> Then another question. Once we have this framework in place, do we
> mandate this approach for any vendor specific tweak or still allow
> doing it as vfio_pci_core extensions (such as igd and zdev in this
> series)?
I would say no to any further vfio_pci_core extensions that are tied
to specific PCI devices. Things like zdev are platform features; they
are not tied to specific PCI devices.
> If the latter, what is the criteria to judge which way is desired? Also what
> about the scenarios where we just want one-time vendor information,
> e.g. to tell whether a device can tolerate arbitrary I/O page faults [1] or
> the offset in VF PCI config space to put PASID/ATS/PRI capabilities [2]?
> Do we expect to create a module for each device to provide such info?
> Having those questions answered is helpful for better understanding of
> this proposal IMO. 😊
>
> [1] https://lore.kernel.org/kvm/d4c51504-24ed-2592-37b4-f390b97fdd00@huawei.com/T/
SVA is a platform feature, so no problem. Don't see a vfio-pci change
in here?
> [2] https://lore.kernel.org/kvm/20200407095801.648b1371@w520.home/
This one could have been done as a broadcom_vfio_pci driver. I'm not sure
exposing the entire config space unprotected is safe: it is hard to know
what the device has put in there, and whether it is secure to share with a
guest.
> MDEV core is already a well defined subsystem to connect mdev
> bus driver (vfio-mdev) and mdev device driver (mlx5-mdev).
mdev is two things:
- a driver core bus layer and sysfs that makes a lifetime model
- a vfio bus driver that doesn't do anything but forward ops to the
  main ops
> vfio-mdev is just the channel to bring VFIO APIs through mdev core
> to underlying vendor specific mdev device driver, which is already
> granted flexibility to tweak whatever needs through mdev_parent_ops.
This is the second thing, and it could just be deleted. The actual
final mdev driver can just use vfio_device_ops directly. The
redirection shim in vfio_mdev.c doesn't add value.
> Then what exact extension is talked here by creating another subsystem
> module? or are we talking about some general library which can be
> shared by underlying mdev device drivers to reduce duplicated
> emulation code?
IMHO it is more a design philosophy that the end driver should
implement the vfio_device_ops directly vs having a stack of ops
structs.
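To sketch the difference, here is a small user-space C example of an end driver that fills in the ops table itself and calls a core helper for the default behaviour, rather than sitting behind a second ops struct. All names (demo_device, demo_device_ops, demo_core_write_reg, demo_vendor_write_reg, DEMO_MIGRATION_REG) are hypothetical and chosen only for illustration. This is not the real vfio_device_ops, and the register interception is a toy stand-in for the "intercept guest writes to specific registers" case raised earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; not the kernel's vfio_device_ops. */
struct demo_device;

struct demo_device_ops {
    int (*write_reg)(struct demo_device *dev, unsigned int reg,
                     unsigned int val);
};

struct demo_device {
    const struct demo_device_ops *ops;
    unsigned int regs[4];
    unsigned int intercepted;   /* writes the driver handled itself */
};

/* Core library helper: default behaviour that any driver may call. */
static int demo_core_write_reg(struct demo_device *dev, unsigned int reg,
                               unsigned int val)
{
    if (reg >= 4)
        return -1;              /* out of range */
    dev->regs[reg] = val;
    return 0;
}

#define DEMO_MIGRATION_REG 3u   /* register this driver wants to intercept */

/* The end driver implements the op directly and falls back to the core. */
static int demo_vendor_write_reg(struct demo_device *dev, unsigned int reg,
                                 unsigned int val)
{
    (void)val;
    if (reg == DEMO_MIGRATION_REG) {
        dev->intercepted++;     /* handled by the driver, not stored */
        return 0;
    }
    return demo_core_write_reg(dev, reg, dev->regs[reg] = val);
}

/* One ops table, owned by the end driver; no forwarding layer in between. */
static const struct demo_device_ops demo_vendor_ops = {
    .write_reg = demo_vendor_write_reg,
};

/* What the framework side does: a single indirect call into the driver. */
static int demo_dispatch_write(struct demo_device *dev, unsigned int reg,
                               unsigned int val)
{
    assert(dev != NULL && dev->ops != NULL && dev->ops->write_reg != NULL);
    return dev->ops->write_reg(dev, reg, val);
}
```

With a stack of ops structs, demo_vendor_write_reg would instead be reached through an intermediate table whose only job is to forward the call. Here the driver decides per register whether to handle the write or hand it to the shared helper.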
Jason