Message-ID: <AM0PR05MB48660E49342EC2CD6AB825F8D1750@AM0PR05MB4866.eurprd05.prod.outlook.com>
Date: Sun, 10 Nov 2019 19:48:31 +0000
From: Parav Pandit <parav@...lanox.com>
To: Jason Gunthorpe <jgg@...pe.ca>
CC: Alex Williamson <alex.williamson@...hat.com>,
Jakub Kicinski <jakub.kicinski@...ronome.com>,
Jiri Pirko <jiri@...nulli.us>,
David M <david.m.ertman@...el.com>,
"gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
"davem@...emloft.net" <davem@...emloft.net>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
Saeed Mahameed <saeedm@...lanox.com>,
"kwankhede@...dia.com" <kwankhede@...dia.com>,
"leon@...nel.org" <leon@...nel.org>,
"cohuck@...hat.com" <cohuck@...hat.com>,
Jiri Pirko <jiri@...lanox.com>,
"linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>,
Or Gerlitz <gerlitz.or@...il.com>,
"Jason Wang (jasowang@...hat.com)" <jasowang@...hat.com>
Subject: RE: [PATCH net-next 00/19] Mellanox, mlx5 sub function support
> From: Jason Gunthorpe <jgg@...pe.ca>
> Sent: Friday, November 8, 2019 6:57 PM
> > We should be creating 3 different buses, instead of mdev bus being
> > de-multiplexer of that?
> >
> > Hence, depending on the device flavour specified, create such device on
> > the right bus?
> >
> > For example,
> > $ devlink create subdev pci/0000:05:00.0 flavour virtio name foo subdev_id 1
> > $ devlink create subdev pci/0000:05:00.0 flavour mdev <uuid> subdev_id 2
> > $ devlink create subdev pci/0000:05:00.0 flavour mlx5 id 1 subdev_id 3
>
> I like the idea of specifying what kind of interface you want at sub device
> creation time. It fits the driver model pretty well and doesn't require abusing
> the vfio mdev for binding to a netdev driver.
>
> > $ devlink subdev pci/0000:05:00.0/<subdev_id> config <params>
> > $ echo <respective_device_id> <sysfs_path>/bind
>
> Is explicit binding really needed?
No.
> If you specify a vfio flavour why shouldn't
> the vfio driver autoload and bind to it right away? That is kind of the point
> of the driver model...
>
If some configuration is needed that cannot be passed at device creation time, an explicit bind can be used later.
> (kind of related, but I don't get why all that GUID and lifecycle stuff in mdev
> should apply for something like a SF)
>
GUID is just the name of the device.
But let's set this aside for a moment.
> > Implement power management callbacks also on all above 3 buses?
> > Abstract out mlx5_bus into more generic virtual bus (vdev bus?) so
> > that multiple vendors can reuse?
>
> In this specific case, why does the SF in mlx5 mode even need a bus?
> Is it only because of devlink? That would be unfortunate
>
Devlink is one part of it, because a devlink instance is identified by bus/dev.
How do we refer to the devlink instance of an SF without a bus/device?
Can we extend devlink_register() to optionally accept an sf_id?
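To make the identification point concrete, here is a rough sketch (plain Python, illustration only) of how a bus/dev handle splits into its parts. The two-part form is the pci/0000:05:00.0 style used in the commands quoted above. The three-part form with a trailing subdev_id is only the syntax proposed in this thread, not something devlink accepts today, and Handle and parse_handle() are made-up names.

```python
from typing import NamedTuple, Optional


class Handle(NamedTuple):
    """Parts of a handle; subdev_id is None when no sub function is named."""
    bus: str
    dev: str
    subdev_id: Optional[int]


def parse_handle(handle: str) -> Handle:
    """Split 'bus/dev' or the proposed 'bus/dev/subdev_id' form.

    The three-part form is an assumption taken from the proposal in this
    thread, not an existing devlink handle format.
    """
    parts = handle.split("/")
    if len(parts) not in (2, 3) or not parts[0] or not parts[1]:
        raise ValueError(f"expected bus/dev[/subdev_id], got {handle!r}")
    # int() raises ValueError on a non-numeric subdev_id, which is intended.
    subdev_id = int(parts[2]) if len(parts) == 3 else None
    return Handle(parts[0], parts[1], subdev_id)
```

Without the bus and dev components there is nothing left for the first two fields, which is exactly the gap an sf_id argument would have to fill.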
If we don't have a bus, creating a sub function (a device) without a 'struct device' that has a BAR, resources, etc. is odd.
Now if we cannot see a 'struct device' in sysfs, how do we persistently name them?
Are we OK to add
/sys/class/net/sf_netdev/subdev_id
and
/sys/class/infiniband/<rdma_dev>/subdev_id
so that systemd/udev can rename them as en<X?><subdev_id> and roce<X><subdev_id>?
If so, what will X be without a bus type?
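As an illustration of the naming question only, a small Python sketch of how such names could be composed from a subdev_id. The function name sf_names() and the prefix argument are hypothetical; the value of X is exactly the open question, so whatever the caller passes is a placeholder. The one hard constraint used here is the kernel's IFNAMSIZ of 16, which leaves 15 usable characters for a netdev name.

```python
from typing import Tuple

IFNAMSIZ = 16  # kernel limit for interface names, including the trailing NUL


def sf_names(prefix_x: str, subdev_id: int) -> Tuple[str, str]:
    """Return (netdev_name, rdma_name) in the en<X><subdev_id> /
    roce<X><subdev_id> pattern.

    prefix_x stands in for the undefined 'X'; it is a placeholder, not a
    proposal for what X should actually be.
    """
    if subdev_id < 0:
        raise ValueError("subdev_id must be non-negative")
    netdev = f"en{prefix_x}{subdev_id}"
    rdma = f"roce{prefix_x}{subdev_id}"
    # Only the netdev name is checked against IFNAMSIZ here.
    if len(netdev) > IFNAMSIZ - 1:
        raise ValueError(f"netdev name {netdev!r} exceeds {IFNAMSIZ - 1} chars")
    return netdev, rdma
```

For example, sf_names("f", 3) gives ("enf3", "rocef3"), where "f" is an arbitrary stand-in for X.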
Going without a bus certainly helps to overcome the IOMMU limitation, where the IOMMU only listens to the pci bus type for DMAR setup, in
dmar_register_bus_notifier(), and in
intel_iommu_init() -> bus_set_iommu(&pci_bus_type, &intel_iommu_ops);
Other IOMMUs do a similar PCI/AMBA binding.
This is currently overcome using workaround (WA) dma_ops.