lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 23 Oct 2019 15:14:09 -0700
From:   Jakub Kicinski <jakub.kicinski@...ronome.com>
To:     Jiri Pirko <jiri@...nulli.us>
Cc:     Yuval Avnery <yuvalav@...lanox.com>, netdev@...r.kernel.org,
        jiri@...lanox.com, saeedm@...lanox.com, leon@...nel.org,
        davem@...emloft.net, shuah@...nel.org
Subject: Re: [PATCH net-next 0/9] devlink vdev

On Wed, 23 Oct 2019 21:25:12 +0200, Jiri Pirko wrote:
> Wed, Oct 23, 2019 at 09:00:46PM CEST, jakub.kicinski@...ronome.com wrote:
> >On Tue, 22 Oct 2019 20:43:01 +0300, Yuval Avnery wrote:  
> >> This patchset introduces devlink vdev.
> >> 
> >> Currently, legacy tools do not provide a comprehensive solution that can
> >> be used in both SmartNic and non-SmartNic mode.
> >> Vdev represents a device that exists on the ASIC but is not necessarily
> >> visible to the kernel.
> >> 
> >> Using devlink ports is not suitable because:
> >> 
> >> 1. Those devices aren't necessarily network devices (such as NVMe devices)
> >>    and doesn’t have E-switch representation. Therefore, there is need for
> >>    more generic representation of PCI VF.
> >> 2. Some attributes are not necessarily pure port attributes
> >>    (number of MSIX vectors)
> >> 3. It creates a confusing devlink topology, with multiple port flavours
> >>    and indices.
> >> 
> >> Vdev will be created along with flavour and attributes.
> >> Some network vdevs may be linked with a devlink port.
> >> 
> >> This is also aimed to replace "ip link vf" commands as they are strongly
> >> linked to the PCI topology and allow access only to enabled VFs.
> >> Even though current patchset and example is limited to MAC address
> >> of the VF, this interface will allow to manage PF, VF, mdev in
> >> SmartNic and non SmartNic modes, in unified way for networking and
> >> non-networking devices via devlink instance.
> >> 
> >> Example:
> >> 
> >> A privileged user wants to configure a VF's hw_addr, before the VF is
> >> enabled.
> >> 
> >> $ devlink vdev set pci/0000:03:00.0/1 hw_addr 10:22:33:44:55:66
> >> 
> >> $ devlink vdev show pci/0000:03:00.0/1
> >> pci/0000:03:00.0/1: flavour pcivf pf 0 vf 0 port_index 1 hw_addr 10:22:33:44:55:66
> >> 
> >> $ devlink vdev show pci/0000:03:00.0/1 -jp
> >> {
> >>     "vdev": {
> >>         "pci/0000:03:00.0/1": {
> >>             "flavour": "pcivf",
> >>             "pf": 0,
> >>             "vf": 0,
> >>             "port_index": 1,
> >>             "hw_addr": "10:22:33:44:55:66"
> >>         }
> >>     }
> >> }  
> >
> >I don't trust this is a good design. 
> >
> >We need some proper ontology and decisions what goes where. We have
> >half of port attributes duplicated here, and hw_addr which honestly
> >makes more sense in a port (since port is more of a networking
> >construct, why would ep storage have a hw_addr?). Then you say you're
> >going to dump more PCI stuff in here :(  
> 
> Well basically what this "vdev" is is the "port peer" we discussed
> couple of months ago. It provides possibility for the user on bare metal
> to cofigure things for the VF - for example.
> 
> Regarding hw_addr vs. port - it is not correct to make that a devlink
> port attribute. It is not port's hw_addr, but the port's peer hw_addr.

Yeah, I remember us arguing with others that "the other side of the
wire" should not be a port.

> >"vdev" sounds entirely meaningless, and has a high chance of becoming 
> >a dumping ground for attributes.  
> 
> Sure, it is a madeup name. If you have a better name, please share.

IDK. I think I started the "peer" stuff, so it made sense to me.
Now it sounds like you'd like to kill a lot of problems with this 
one stone. For PCIe "vdev" is def wrong because some of the config 
will be for PF (which is not virtual). Also for PCIe the config has 
to be done with permanence in mind from day 1, PCI often requires
HW reset to reconfig.

> Basically it is something that represents VF/mdev - the other side of
> devlink port. But in some cases, like NVMe, there is no associated
> devlink port - that is why "devlink port peer" would not work here.

What are the NVMe parameters we'd configure here? Queues etc. or some
IDs? Presumably there will be a NVMe-specific way to configure things?
Something has to point the NVMe VF to a backend, right? 

(I haven't looked much into NVMe myself in case that's not obvious ;))

Powered by blists - more mailing lists