[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20191023120046.0f53b744@cakuba.netronome.com>
Date: Wed, 23 Oct 2019 12:00:46 -0700
From: Jakub Kicinski <jakub.kicinski@...ronome.com>
To: Yuval Avnery <yuvalav@...lanox.com>
Cc: netdev@...r.kernel.org, jiri@...lanox.com, saeedm@...lanox.com,
leon@...nel.org, davem@...emloft.net, shuah@...nel.org
Subject: Re: [PATCH net-next 0/9] devlink vdev
On Tue, 22 Oct 2019 20:43:01 +0300, Yuval Avnery wrote:
> This patchset introduces devlink vdev.
>
> Currently, legacy tools do not provide a comprehensive solution that can
> be used in both SmartNic and non-SmartNic mode.
> Vdev represents a device that exists on the ASIC but is not necessarily
> visible to the kernel.
>
> Using devlink ports is not suitable because:
>
> 1. Those devices aren't necessarily network devices (such as NVMe devices)
> and doesn’t have E-switch representation. Therefore, there is need for
> more generic representation of PCI VF.
> 2. Some attributes are not necessarily pure port attributes
> (number of MSIX vectors)
> 3. It creates a confusing devlink topology, with multiple port flavours
> and indices.
>
> Vdev will be created along with flavour and attributes.
> Some network vdevs may be linked with a devlink port.
>
> This is also aimed to replace "ip link vf" commands as they are strongly
> linked to the PCI topology and allow access only to enabled VFs.
> Even though current patchset and example is limited to MAC address
> of the VF, this interface will allow to manage PF, VF, mdev in
> SmartNic and non SmartNic modes, in unified way for networking and
> non-networking devices via devlink instance.
>
> Example:
>
> A privileged user wants to configure a VF's hw_addr, before the VF is
> enabled.
>
> $ devlink vdev set pci/0000:03:00.0/1 hw_addr 10:22:33:44:55:66
>
> $ devlink vdev show pci/0000:03:00.0/1
> pci/0000:03:00.0/1: flavour pcivf pf 0 vf 0 port_index 1 hw_addr 10:22:33:44:55:66
>
> $ devlink vdev show pci/0000:03:00.0/1 -jp
> {
> "vdev": {
> "pci/0000:03:00.0/1": {
> "flavour": "pcivf",
> "pf": 0,
> "vf": 0,
> "port_index": 1,
> "hw_addr": "10:22:33:44:55:66"
> }
> }
> }
I don't trust this is a good design.
We need some proper ontology and decisions what goes where. We have
half of port attributes duplicated here, and hw_addr which honestly
makes more sense in a port (since port is more of a networking
construct, why would ep storage have a hw_addr?). Then you say you're
going to dump more PCI stuff in here :(
"vdev" sounds entirely meaningless, and has a high chance of becoming
a dumping ground for attributes.
I'm kind of sour about the debug interfaces that were added to devlink.
Seems like the health API superseded the region stuff to a certain
extent and the two don't interact with each other.
I'm slightly worried there is too much "learning by doing" going on
with these new devlink uABIs.
I'm sure you guys thought this all through in detail and have more
documentation and design docs internally. Could you provide more of
this information here? How things fit together? Bigger picture?
The 20 lines of cover letters and no Documentation/ is definitely not
going to cut it this time.
Powered by blists - more mailing lists