Date:   Thu, 28 Feb 2019 08:24:04 -0800
From:   Jakub Kicinski <jakub.kicinski@...ronome.com>
To:     Jiri Pirko <jiri@...nulli.us>
Cc:     davem@...emloft.net, oss-drivers@...ronome.com,
        netdev@...r.kernel.org, parav@...lanox.com, jgg@...lanox.com
Subject: Re: [PATCH net-next 4/8] devlink: allow subports on devlink PCI
 ports

On Thu, 28 Feb 2019 09:56:24 +0100, Jiri Pirko wrote:
> Wed, Feb 27, 2019 at 07:30:00PM CET, jakub.kicinski@...ronome.com wrote:
> >On Wed, 27 Feb 2019 13:37:53 +0100, Jiri Pirko wrote:  
> >> Tue, Feb 26, 2019 at 07:24:32PM CET, jakub.kicinski@...ronome.com wrote:  
> >> >PCI endpoint corresponds to a PCI device, but such device
> >> >can have one or more logical device ports associated with it.
> >> >We need a way to distinguish those. Add a PCI subport in the
> >> >dumps and print the info in phys_port_name appropriately.
> >> >
> >> >This is not equivalent to port splitting, there is no split
> >> >group. It's just a way of representing multiple netdevs on
> >> >a single PCI function.
> >> >
> >> >Note that the quality of being multiport pertains only to
> >> >the PCI function itself. A PF having multiple netdevs does
> >> >not mean that its VFs will also have multiple, or that VFs
> >> >are associated with any particular port of a multiport VF.  
> >> 
> >> We've been discussing the problem of subport (we call it "subfunction"
> >> or "SF") for some time internally. Turned out, this is probably harder
> >> task to model. Please prove me wrong.
> >> 
> >> The nature of VF makes it a logically separate entity. It has a separate
> >> PCI address, it should therefore have a separate devlink instance.
> >> You can pass it through to VM, then the same devlink instance should be
> >> created inside the VM and disappear from the host.  
> >
> >Depends what a devlink instance represents :/  On one hand you may want
> >to create an instance for a VF to allow it to spawn soft ports, on the
> >other you may want to group multiple functions together.
> >
> >IOW if devlink instance is for an ASIC, there should be one per device
> >per host.  So if we start connecting multiple functions (PFs and/or VFs)
> >to one host we should probably introduce the notion of devlink aliases
> >or some such (so that multiple bus addresses can target the same  
> 
> Hmm. Like VF address -> PF address alias? That would be confusing to see
> eswitch ports under VF devlink instance... I probably did not get you
> right.

No eswitch ports under VF, more in case of multi-PF.  Bus addresses of
all PFs aliasing to the same devlink instance.
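
A minimal sketch of the aliasing idea, purely as an illustration: several
bus addresses resolve to one shared instance object. The names
DevlinkInstance and AliasTable and the PCI addresses are hypothetical, not
an existing devlink API.

```python
from dataclasses import dataclass, field


@dataclass
class DevlinkInstance:
    """One instance per ASIC (assumption made for this sketch)."""
    name: str                                  # illustrative label only
    ports: list = field(default_factory=list)  # all ports of the device


class AliasTable:
    """Maps bus addresses to a shared instance (hypothetical helper)."""

    def __init__(self):
        self._by_addr = {}

    def add_alias(self, bus_addr, instance):
        # Several addresses may point at the same instance object.
        self._by_addr[bus_addr] = instance

    def lookup(self, bus_addr):
        return self._by_addr[bus_addr]


asic = DevlinkInstance("asic0")
table = AliasTable()
# Two PFs of the same multi-PF device, made-up addresses.
for addr in ("pci/0000:82:00.0", "pci/0000:82:00.1"):
    table.add_alias(addr, asic)
```

Looking up either PF address returns the same object, so a port added
through one alias is visible through the other.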

> >devlink instance).  Those less pipelined NICs can forward between
> >ports, but still want a function per port (otherwise user space
> >sometimes gets confused).  If we have multiple functions which are on
> >the same "switchid" they should have a single devlink instance if you
> >ask me.  That instance will have all the ports of the device.  
> 
> Okay, that makes sense. But the question is, can the same devlink
> instance contain ports that do not have "Switchid"?

No strong preference if switchid is different.  To me devlink is an ASIC
instance, if the multiport card is constructed by copy-pasting the same
IP twice onto a die, and the ports really are completely separate, there
is no reason to require single devlink instance.

> I think it would be beneficial to have the switchid shown for devlink
> ports too. Then it is clear that the devlink ports with the same
> switchid belong to the same switch, and other ports under the same
> devlink instance (like PF itself) are separate, but still under the same
> ASIC.

Sure, you mean in terms of UI - user space can do a link dump or get
that from sysfs, right?

> >You say disappear from the host - what do you mean?  Are you referring
> >to the VF port disappearing?  But on the switch the port is still  
> 
> No, VF itself. eswitch port will be still there on the host.
> 
> 
> >there, and you should show the subports on the PF side IMHO.  Devlink
> >ports should allow users to understand the topology of the switch.  
> 
> What do you mean by "topology"?

Mostly which ports are part of the switch and what's their "flavour".
Also (less importantly) which host netdevs are "peers" of eswitch ports.
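
As a rough sketch of what "topology" means here, assuming each port carries
a flavour, an optional switch ID and an optional peer netdev. The Port type,
the flavour strings and the netdev names are illustrative, not taken from
the patches.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Port:
    index: int
    flavour: str                       # illustrative labels, e.g. "physical"
    switch_id: Optional[str]           # None: port is not part of a switch
    peer_netdev: Optional[str] = None  # host netdev peering with this port


def group_by_switch(ports):
    """Group ports by switch ID so switch membership is visible."""
    groups = defaultdict(list)
    for port in ports:
        groups[port.switch_id].append(port)
    return dict(groups)


ports = [
    Port(0, "physical", "sw0"),
    Port(1, "pci_pf", "sw0", peer_netdev="eth0"),
    Port(2, "pci_vf", "sw0", peer_netdev="eth1"),
    Port(3, "pci_pf", None),           # same instance, no switch ID
]
topology = group_by_switch(ports)
```

Ports 0 to 2 end up under "sw0", while port 3 sits under the same instance
with no switch ID.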

> >Is spawning VMDq sub-instances the only thing we can think of that VMs
> >may want to do?  Are there any other uses?
> >  
> >> SF (or subport) feels similar to that. Basically it is exactly the same
> >> thing as VF, only it resides under the PF PCI function.
> >> 
> >> That is why I think, for sake of consistency, it should have a separate
> >> devlink entity as well. The problem is correct sysfs modelling and
> >> devlink handle derived from that. Parav is working on a simple soft
> >> bus for this purpose called "subbus". There is an RFC floating around on
> >> Mellanox internal mailing list, looks like it is time to send it
> >> upstream.
> >> 
> >> Then each PF driver which has SFs would register subbus devices
> >> according to SFs/subports and they would be properly handled by bus
> >> probe, devlink and devlink port and netdev instances created.
> >> 
> >> Ccing Parav and Jason.  
> >
> >You guys come from the RDMA side of the world, with which I'm less
> >familiar, and the soft bus + spawning devices seems to be a popular
> >design there.  Could you describe the advantages of that model for 
> >the sake of the netdev-only folks? :)  
> 
> I'll try to draw some ascii art :)

Yess :)

> >Another term that gets thrown into the mix here is mediated devices,
> >right?  If you wanna pass the sub-spawn-soft-port to a VM.  Or run 
> >DPDK on some queues.
> >
> >To state the obvious, AF_XDP and macvlan offload were our previous
> >answers to some of those use cases.  What is the forwarding model
> >for those subports?  Are we going to allow flower rules from VMs?
> >Is it going to be dst MAC only?  Or is the hypervisor going to forward
> >as it sees appropriate (OvS + "repr"/port netdev)?  
